To facilitate the exchange of ongoing work, projects, and plans, the workshop allowed participants to give very short lightning talks and project pitches of at most 5 minutes each.

Part 1
Chair: Martijn Kleppe (National Library of the Netherlands)

1. How can Caffe be used to segment historical images into different categories?
Thomas Smits (Radboud University)

Slides
Thomas’s personal blog: Illustrated News
Data: 1.5M historical images from a Dutch newspaper database;
Interim results:

Number of images by identified categories.

Challenge: how to tackle the “unknown” category and make the data more discoverable?

2. The Analysis Of Colors By Means Of Contrasts In Movies
Niels Walkowski (BBAW / KU Leuven)

Slides
Cinemetrics, Colour Analysis & Digital Humanities:
- Brodbeck (2011) “Cinemetrics”: the project measures and visualizes movie data in order to reveal the characteristics of films and to create a visual “fingerprint” for each of them. Information such as editing structure, color, speech, and motion is extracted, analyzed, and transformed into graphic representations so that movies can be seen as a whole and easily interpreted or compared side by side.
  Film Data Visualization
- Burghardt (2016) “Movieanalyzer”
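
The idea of a color “fingerprint” can be illustrated with a minimal sketch. This is not the method used by Cinemetrics or Movieanalyzer; it only assumes that frames have already been decoded into lists of RGB pixels, and it reduces each contiguous segment of the movie to its average color. The function names (`average_color`, `color_fingerprint`, `to_hex`) are hypothetical and chosen for illustration.

```python
def average_color(pixels):
    """Return the mean (R, G, B) of a list of pixels, rounded to integers."""
    if not pixels:
        raise ValueError("need at least one pixel")
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))


def color_fingerprint(frames, segments):
    """Split the frames into contiguous segments and average each one.

    frames:   list of frames, each frame a list of (R, G, B) tuples
    segments: number of color swatches wanted in the fingerprint
    """
    if not frames or segments < 1:
        raise ValueError("need at least one frame and one segment")
    # Never ask for more segments than there are frames.
    segments = min(segments, len(frames))
    fingerprint = []
    for i in range(segments):
        start = i * len(frames) // segments
        end = (i + 1) * len(frames) // segments
        # Pool all pixels of the segment, then average them.
        pooled = [pixel for frame in frames[start:end] for pixel in frame]
        fingerprint.append(average_color(pooled))
    return fingerprint


def to_hex(color):
    """Format an (R, G, B) tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*color)


if __name__ == "__main__":
    # Toy "movie": two red frames followed by two blue frames.
    red = [(255, 0, 0)] * 4
    blue = [(0, 0, 255)] * 4
    for swatch in color_fingerprint([red, red, blue, blue], segments=2):
        print(to_hex(swatch))
```

Drawing the resulting swatches side by side gives a strip that can be compared across films. A plain average tends to wash out toward grey on real footage, so an actual analysis would more likely use dominant colors or a perceptual color space.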

3. New project announcement INSIGHT: Intelligent Neural Networks as Integrated Heritage Tools
Mike Kestemont (Universiteit Antwerpen)

Slides
Data from two museums: the Royal Museums of Fine Arts of Belgium and the Royal Museums of Art and History;
Research opportunity: how can multimodal representation learning (NLP + vision) help to organize and explore this data?
Knowledge transfer approach:
- Large players in the field have massive datasets;
- How easily can we transfer knowledge from large to small collections? E.g. automatic dating or object description;
Partner up: the Departments of Literature and Linguistics (Faculty of Arts and Philosophy) of the University of Antwerp and the Montefiore Institute (Faculty of Applied Sciences) of the University of Liège are seeking to fill two full-time (100%) vacancies for Doctoral Grants in the area of machine/deep learning, language technology, and/or computer vision for enriching heritage collections. More information.

4. Introduction of CODH computer vision and machine learning datasets such as old Japanese books and characters
Asanobu KITAMOTO (CODH, National Institute of Informatics)

Slides;
Center for Open Data in the Humanities (CODH);
It’s a research center in Tokyo, Japan, officially launched on April 1, 2017;
Scope: (1) humanities research using information technology and (2) other fields of research using humanities data.
Released datasets:
- Dataset of Pre-Modern Japanese Text (PMJT): image and text data of pre-modern Japanese texts owned by the National Institute of Japanese Literature, released as open data. In addition, some texts have description, transcription, and tagging data.
  Pre-Modern Japanese Text Dataset: currently 701 books
- PMJT Character Shapes;
- IIIF Curation Viewer
  Curation Viewer
CODH is looking for a project researcher who is interested in applying computer vision to humanities data. Contact: http://codh.rois.ac.jp/recruit/

5. Introduction to the new South African Centre for Digital Language Resources (SADiLaR)
Juan Steyn

Slides;
SADiLaR is a new research infrastructure set up by the Department of Science and Technology (DST) as part of the new South African Research Infrastructure Roadmap (SARIR).
Officially launched in October 2016;
SADiLaR runs two programs:
- A Digitisation program, which entails the systematic creation of relevant digital text, speech, and multi-modal resources related to all official languages of South Africa, as well as the development of appropriate natural language processing software tools for research and development purposes;
- A Digital Humanities program, which facilitates research capacity building by promoting and supporting the use of digital data and innovative methodological approaches within the Humanities and Social Sciences. (See http://www.digitalhumanities.org.za)

DH2017 – Computer Vision in DH workshop (lightning talks part 1)