Conserve the sound is an online museum for vanishing and endangered sounds. The sounds of a dial telephone, a walkman, an analog typewriter, a pay phone, a 56k modem, a nuclear power plant or even a cell phone keypad are already partly gone or are about to disappear from our daily life.
Conserve the sound is a project from CHUNDERKSEN and is funded by the Film & Medienstiftung NRW, Germany.
Cinemetrics, Colour Analysis & Digital Humanities:
Brodbeck (2011) “Cinemetrics”: the project is about measuring and visualizing movie data, in order to reveal the characteristics of films and to create a visual “fingerprint” for them. Information such as the editing structure, color, speech or motion is extracted, analyzed and transformed into graphic representations so that movies can be seen as a whole and easily interpreted or compared side by side.
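As a rough illustration of the color part of such a "fingerprint", here is a minimal sketch that reduces each frame to its average color and returns one hex color per frame. This is not Brodbeck's actual method: the function names and the toy frames are hypothetical, and a real pipeline would first decode video frames with a video library.

```python
def mean_color(frame):
    """Average RGB of one frame, given as a list of (r, g, b) tuples (0-255)."""
    n = len(frame)
    # Integer division keeps every channel inside the 0-255 range.
    return tuple(sum(pixel[i] for pixel in frame) // n for i in range(3))


def fingerprint(frames):
    """One hex color per frame: a crude 'color barcode' of a film."""
    return ["#{:02x}{:02x}{:02x}".format(*mean_color(f)) for f in frames]


# Hypothetical toy input: two frames of two pixels each.
frames = [
    [(255, 0, 0), (255, 0, 0)],      # a pure red frame
    [(0, 0, 0), (255, 255, 255)],    # half black, half white
]
print(fingerprint(frames))
```

Drawing the resulting colors as adjacent vertical stripes gives a simple image that lets two films be compared side by side.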
Data from two museums: Royal Museums of Fine Arts of Belgium and Royal Museums of Art and History;
Research opportunity: how can multimodal representation learning (NLP + Vision) help to organize and explore this data;
Transfer knowledge approach:
Large players in the field have massive datasets;
How easily can we transfer knowledge from large to small collections? E.g. automatic dating or object description;
Partner up: the Departments of Literature and Linguistics (Faculty of Arts and Philosophy) of the University of Antwerp and the Montefiore Institute (Faculty of Applied Sciences) of the University of Liège are seeking to fill two full-time (100%) vacancies for Doctoral Grants in the area of machine/deep learning, language technology, and/or computer vision for enriching heritage collections.
4. Introduction of CODH computer vision and machine learning datasets such as old Japanese books and characters
Asanobu KITAMOTO (CODH -National Institute of Informatics)
CODH is a research center in Tokyo, Japan, officially launched on April 1, 2017;
Scope: (1) humanities research using information technology and (2) other fields of research using humanities data.
Dataset of Pre-Modern Japanese Text (PMJT): Pre-Modern Japanese Text, owned by the National Institute of Japanese Literature, is released as open image and text data. In addition, some texts have description, transcription, and tagging data.
SADiLaR is a new research infrastructure set up by the Department of Science and Technology (DST) forming part of the new South African Research Infrastructure Roadmap (SARIR).
Officially launched in October 2016;
SADiLaR runs two programs:
A Digitisation program, which entails the systematic creation of relevant digital text, speech and multi-modal resources related to all official languages of South Africa, as well as the development of appropriate natural language processing software tools for research and development purposes;
A Digital Humanities program, which facilitates research capacity building by promoting and supporting the use of digital data and innovative methodological approaches within the Humanities and Social Sciences. (See http://www.digitalhumanities.org.za)
In this work, the researchers train machine learning algorithms to match images from book scans with the text in the pages surrounding those images.
Using 400K images collected from 65K volumes published between the 14th and 20th centuries released to the public domain by the British Library, they build information retrieval systems capable of performing cross-modal retrieval, i.e., searching images using text, and vice-versa.
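Cross-modal retrieval is commonly implemented by embedding images and text in a shared vector space and ranking candidates by similarity to the query. The sketch below assumes such embeddings already exist; the vectors, the identifiers and the choice of cosine similarity are illustrative assumptions, not details taken from the work described here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(query_vec, candidates):
    """Return candidate ids sorted from most to least similar to the query."""
    return sorted(candidates,
                  key=lambda key: cosine(query_vec, candidates[key]),
                  reverse=True)


# Hypothetical image embeddings in a shared 3-dimensional space.
image_vecs = {
    "map_plate": [0.9, 0.1, 0.0],
    "portrait": [0.1, 0.9, 0.2],
    "ship": [0.0, 0.2, 0.9],
}
# Hypothetical embedding of a text query such as "sailing vessel".
text_query = [0.1, 0.1, 0.95]

print(rank(text_query, image_vecs))
```

The same `rank` function works in the other direction: pass an image vector as the query and a dictionary of text embeddings as the candidates.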
Previous multi-modal work:
Datasets: Microsoft Common Objects in Context (COCO) and Flickr (images with user-provided tags);
Tasks: Cross-modal information retrieval (ImageCLEF) and Caption search / generation
Use text to provide context for the images we see in digital libraries, and as a noisy “label” for computer vision tasks
Use images to provide grounding for text.
Why is this hard? Most relationships between text and images are weakly aligned, that is, only loosely connected. A caption is an example of strong alignment between text and image. An article is an example of weak alignment.
The research project FilmColors, funded by an Advanced Grant of the European Research Council, aims at a systematic investigation into the relationship between film color technologies and aesthetics.
Initially, the research team analyzed a large group of 400 films from 1895 to 1995 with a protocol that consists of about 600 items per segment to identify stylistic and aesthetic patterns of color in film.
This human-based approach is now being extended by advanced software that is able to detect the figure-ground configuration and to plot the results into corresponding color schemes based on a perceptually uniform color space (see Flueckiger 2011 and Flueckiger 2017, in press).
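The notes do not say which perceptually uniform color space FilmColors uses. As an assumption for illustration, the sketch below converts sRGB values to CIELAB (D65 white point), one widely used space of this kind, in which Euclidean distance roughly tracks perceived color difference. The function names are hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to CIELAB (L*, a*, b*) under a D65 white point."""
    def linearize(c):
        # Undo the sRGB transfer curve.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear RGB to CIE XYZ using the sRGB primaries.
    x = 0.4124 * rl + 0.3576 * gl + 0.1805 * bl
    y = 0.2126 * rl + 0.7152 * gl + 0.0722 * bl
    z = 0.0193 * rl + 0.1192 * gl + 0.9505 * bl

    # Reference white: the row sums of the matrix above.
    xn, yn, zn = 0.9505, 1.0, 1.0890

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e(lab1, lab2):
    """CIE76 color difference: Euclidean distance in CIELAB."""
    return sum((p - q) ** 2 for p, q in zip(lab1, lab2)) ** 0.5


print(srgb_to_lab(255, 0, 0))
print(delta_e(srgb_to_lab(255, 0, 0), srgb_to_lab(250, 10, 10)))
```

Plotting the colors of a figure and its background as points in this space is one way to compare color schemes across films.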
A drop-in module that facilitates the creation and sharing of time-based media annotations on the Web.
Knight News Challenge Prototype Grant
Knight Foundation has awarded a Prototype Grant for Media Innovation to The Media Ecology Project (MEP) and Prof. Lorenzo Torresani’s Visual Learning Group at Dartmouth, in conjunction with The Internet Archive and the VEMI Lab at The University of Maine.
“Unlocking Film Libraries for Discovery and Search” will apply existing software for algorithmic object, action, and speech recognition to a varied collection of 100 educational films held by the Internet Archive and Dartmouth Library. We will evaluate the resulting data to plan future multimodal metadata generation tools that improve video discovery and accessibility in libraries.
This presentation is about Decomposing Bodies, a large-scale, lab-based, digital humanities project housed in the Visual Media Workshop at the University of Pittsburgh that is examining the system of criminal identification introduced in France in the late 19th century by Alphonse Bertillon.
Data: criminal identification records of American prisoners from Ohio.
Tool: OpenFace. Free and open source face recognition with deep neural networks.
Goal: An end-to-end system for extracting handwritten text and numbers from scanned Bertillon cards in a semi-automated fashion and also the ability to browse through the original data and generated metadata using a web interface.
“Mechanical Turk: we need to talk about it”: consider Mechanical Turk if the data is in the public domain and the task is easy.
Findings: Humans deal very well with understanding discrepancies. We should not ask the computer to find these discrepancies for us, but we should build visualizations that allow us to visually compare images and identify the similarities and discrepancies.
2) Distant Viewing TV (Taylor Arnold and Lauren Tilton, University of Richmond)
Distant Viewing TV applies computational methods to the study of television series, utilizing and developing cutting-edge techniques in computer vision to analyze moving image culture on a large scale.
This presentation related the University of Oxford’s Visual Geometry Group’s experience in making images computationally addressable for humanities research.
The Visual Geometry Group has built a number of systems for humanists, variously implementing (i) visual search, in which an image is made retrievable; (ii) comparison, which assists the discovery of similarity and difference; (iii) classification, which applies a descriptive vocabulary to images; and (iv) annotation, in which images are further described for both computational and offline analysis.
Idea: Visual Search for the Era of Big Data is a large research project based in the Department of Engineering Science, University of Oxford. It is funded by the EPSRC (Engineering and Physical Sciences Research Council), and will run from 2015 – 2020.
Objectives: to carry out fundamental research to develop next generation computer vision methods that are able to analyse, describe and search image and video content with human-like capabilities, and to transfer these methods to industry and to other academic disciplines (such as Archaeology, Art, Geology, Medicine, Plant sciences and Zoology).
This is a technical demo of the large-scale on-the-fly web search technologies which are under development in the Oxford University Visual Geometry Group, using data provided by BBC R&D comprising over five years of prime-time news broadcasts from six channels. The demo consists of three different components, which can be used to query the dataset on-the-fly for three different query types: object search, image search and people search.
The objective of this research is to find objects in paintings by learning classifiers from photographs on the internet. There is a live demo that allows a user to search for an object of their choosing (such as “baby”, “bird”, or “dog”) in a dataset of over 200,000 paintings, in a matter of seconds.
It allows computers to recognize objects in images. What is distinctive about our work is that we also recover the 2D outline of the object. Currently, the project has trained this model to recognize 20 classes. The demo allows the user to test our algorithm on their images.