DH2017 – Computer Vision in DH workshop (Keynote)

Robots Reading Vogue Project

A keynote by Lindsay King & Peter Leonard (Yale University) on “Processing Pixels: Towards Visual Culture Computation”.

SLIDES HERE

Abstract: This talk will focus on an array of algorithmic image analysis techniques, from simple to cutting-edge, on materials ranging from 19th century photography to 20th century fashion magazines. We’ll consider colormetrics, hue extraction, facial detection, and neural network-based visual similarity. We’ll also consider the opportunities and challenges of obtaining and working with large-scale image collections.

Project: Robots Reading Vogue, at the Digital Humanities Lab, Yale University Library

1. The project:

  • 121 years of Vogue (2,700 covers, 400,000 pages, 6 TB of data). First experiments: n-grams, topic modeling.
  • Humans are better at seeing patterns with their own eyes in “distant vision” (images) than in “distant reading” (text)
  • A simple interface laying out covers by month and year reveals Vogue’s seasonal patterns
  • The interface is not technically difficult to implement
  • It does not use computer vision for analysis

2. Image analysis in RRV (sorting covers by color to enable browsing)

    • Media visualization (Manovich) to show saturation and hue by month. Result: clear differences across the seasons of the year. Tool used: ImagePlot
    • “The average color problem”: reducing a whole cover to a single average color flattens out its variety. Solution:
    • Slice histograms: the visualization Peter showed

The slice histograms give us a zoomed-out view unlike any other visualizations we’ve tried. We think of them as “visual fingerprints” that capture a macroscopic view of how the covers of Vogue changed through time.
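As a rough illustration of the technique (not the project’s actual code), a slice histogram can be approximated by collapsing each cover into a one-pixel-wide column of average row colors and concatenating the columns in date order. The folder and file naming below are hypothetical; this is a minimal scikit-image sketch.

    # Sketch: "slice histogram" = one averaged vertical slice per cover,
    # concatenated chronologically into a single wide image.
    import glob
    import numpy as np
    from skimage import io, transform

    columns = []
    for path in sorted(glob.glob("covers/*.jpg")):  # hypothetical folder; sorted = chronological
        img = transform.resize(io.imread(path), (400, 300), anti_aliasing=True)
        columns.append(img.mean(axis=1, keepdims=True))  # average each row -> 1-px-wide slice

    fingerprint = np.concatenate(columns, axis=1)   # shape: (400, n_covers, 3)
    io.imsave("slice_histogram.png", (fingerprint * 255).astype(np.uint8))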
  • “Face detection is kind of a hot topic people talk about, but I think it is only of use when it is combined with other techniques”; see e.g. the face detection experiments below

3. Experiment: Face Detection + Geography / Composition

  • Photogrammar (face detection + geography)
  • Code on GitHub
  • Idea: place each image as a thumbnail on a map
  • Face detection + composition (see the sketch after this list)
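A minimal sketch of the detection step behind both experiments, using dlib (the tool introduced in the hands-on session below). The file name is hypothetical, and joining detections to geographic coordinates is assumed rather than shown.

    # Sketch: detect faces and record where they sit in the frame,
    # the raw material for the geography and composition experiments.
    import dlib
    from skimage import io

    detector = dlib.get_frontal_face_detector()
    img = io.imread("photo.jpg")          # hypothetical input image
    faces = detector(img, 1)              # 1 = upsample once to catch small faces

    for rect in faces:
        # Normalized face center: its position within the composition.
        cx = (rect.left() + rect.right()) / 2 / img.shape[1]
        cy = (rect.top() + rect.bottom()) / 2 / img.shape[0]
        print(f"face centered at ({cx:.2f}, {cy:.2f}) of the frame")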

4. Visual Similarity 

  • What if we could search for pictures that are visually similar to a given image?
  • Neural network approach
  • Demo of the visual similarity experiment: in the main interface, you select an image and it shows its closest neighbors (a sketch of the general technique follows below)
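A minimal sketch of the general neural-network approach (not the project’s actual code), using Keras as listed in the hands-on tools: activations from a pretrained CNN serve as feature vectors, and neighbors are ranked by cosine similarity. The image folder is hypothetical.

    # Sketch: visual similarity from pretrained-CNN features.
    import glob
    import numpy as np
    from keras.applications.vgg16 import VGG16, preprocess_input
    from keras.preprocessing import image

    model = VGG16(weights="imagenet", include_top=False, pooling="avg")

    def features(path):
        img = image.load_img(path, target_size=(224, 224))
        x = preprocess_input(np.expand_dims(image.img_to_array(img), 0))
        return model.predict(x)[0]              # 512-dim feature vector

    paths = sorted(glob.glob("images/*.jpg"))   # hypothetical folder
    feats = np.array([features(p) for p in paths])
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)

    query = 0                                   # index of the selected image
    scores = feats @ feats[query]               # cosine similarity
    print([paths[i] for i in np.argsort(-scores)[1:6]])  # 5 closest neighbors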

Other related work on visual similarity:

  • John Resig’s Ukiyo-e project (Japanese woodblock prints). Article: Resig, John. “Aggregating and Analyzing Digitized Japanese Woodblock Prints.” Japanese Association for Digital Humanities conference, 2013.
  • TinEye MatchEngine, used in Resig’s project (finds duplicate, modified and even derivative images in an image collection).
  • Carl Stahmer – Arch Vision (Early English Broadside / Ballad Impression Archive)
  • Article: Stahmer, Carl. (2014). “Arch-V: A platform for image-based search and retrieval of digital archives.” Digital Humanities 2014: Conference Abstracts
  • ARCHIVE-VISION Github code here
  • Peter refers to the paper Benoit presented in Kraków.

5. Final thoughts and next steps

  • Towards Visual Culture Computation
  • NNs are “indescribable”… but we can dig in to look at pixels that contribute to classifications: http://cs231n.github.io/understanding-cnn/
  • The Digital Humanities Lab at Yale University Library is currently working with an image dataset from the Yale library, using a deep learning approach to detect visual similarities.
  • This project is called Neural Neighbors; there is a live demo of neural network visual similarity on 19th-century photos.
  • Neural Neighbors seeks to show visual similarities across 80,000 19th-century photographs
  • The idea is to combine signal from pixels with signal from text (see the sketch after this list)
  • Question: how to organize this logistically?
  • Consider intrinsic metadata of available collections
  • Approaches to handling copyright licensing restrictions (perpetual license and transformative use)
  • Increase the number of open image collections available: museums, government collections, social media
  • Engage computer science departments working on computer vision, along with their training datasets.
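One loudly hypothetical way to fuse the two signals mentioned above, assuming a CNN feature vector and a caption already exist for every image: normalize each modality and concatenate before the nearest-neighbor search.

    # Sketch (hypothetical): combine pixel features with text features
    # so that neighbors reflect both what an image shows and what its
    # metadata says.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.preprocessing import normalize

    image_feats = np.load("cnn_features.npy")            # hypothetical, shape (n_images, 512)
    captions = open("captions.txt").read().splitlines()  # hypothetical, one caption per image

    text_feats = TfidfVectorizer().fit_transform(captions).toarray()
    fused = np.hstack([normalize(image_feats), normalize(text_feats)])

    query = 0                                            # selected image index
    scores = fused @ fused[query]
    print(np.argsort(-scores)[1:6])                      # neighbors by both signals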

 

DH2017 – Computer Vision in DH workshop (Hands-on)

Hands-on Part I – Computer Vision basics, theory, and tools

SLIDES HERE

Instructor: Benoit Seguin (Image and Visual Representation Lab, École Polytechnique Fédérale de Lausanne)

An introduction to basic notions about the challenges of computer vision, and a feel for the simple, low-level operations necessary for the next stage.

Tools:
Python
Basic image operations: scikit-image (see the warm-up sketch after this list)
Face and object detection + identification: dlib
Deep Learning: Keras
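As a warm-up for those basic operations, a minimal scikit-image sketch (the file name is hypothetical):

    # Minimal scikit-image warm-up: load, inspect, convert, resize.
    from skimage import io, color, transform

    img = io.imread("example.jpg")     # hypothetical file; RGB array of shape (H, W, 3)
    print(img.shape, img.dtype)        # an image is just a numpy array
    gray = color.rgb2gray(img)         # float array in [0, 1]
    small = transform.resize(gray, (128, 128), anti_aliasing=True)
    io.imsave("example_small.png", (small * 255).astype("uint8"))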

What is CV?
How to gain high-level understanding from digital images or videos.
It seeks to automate tasks that the human visual system can do (Wikipedia)

Human Visual System (HVS) versus digital image processing (what the computer sees)

Our human understanding of images is far more complex than their digital representation (arrays of pixels)
Convolution illustrated

Practice:
– Jupyter notebooks (an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text);
– perform basic image operations;
– play with different convolutions to develop intuition (see the sketch below).
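For instance, one might convolve a sample image with a couple of small kernels to see blurring and edge detection at work; the kernel values below are just the usual textbook examples.

    # Sketch: build intuition by convolving an image with small kernels.
    import numpy as np
    from scipy.ndimage import convolve
    from skimage import data, color

    img = color.rgb2gray(data.astronaut())   # sample image bundled with skimage

    box_blur = np.ones((5, 5)) / 25.0        # local average -> blurring
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]])         # horizontal gradient -> vertical edges

    blurred = convolve(img, box_blur)
    edges = convolve(img, sobel_x)
    print(edges.min(), edges.max())          # strong responses at vertical edges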

Hands-on Part II – Deep Learning and its application

 

DH2017 – Computer Vision in DH workshop

During the DH2017 conference in Montreal, I attended the ‘Computer Vision in Digital Humanities’ workshop organized by the AVinDH SIG (Special Interest Group AudioVisual material in Digital Humanities). All information about the workshop can be found here.

An abstract about the workshop was published in the DH2017 proceedings and can be found here.

Computer Vision in Digital Humanities Workshop: Keynote by Lindsay King & Peter Leonard.
Workshop Computer Vision in Digital Humanities: hands-on session.

This workshop focused on how computer vision can be applied within the realm of audiovisual materials in the digital humanities. The workshop included:

  • A keynote by Lindsay King & Peter Leonard (Yale University) on “Processing Pixels: Towards Visual Culture Computation”.
  • Paper presentations (papers were selected by a review committee).
  • A hands-on session to experiment with open-source computer vision tools.
  • Lightning talks allowing participants to share their ideas, projects or ongoing work in a short presentation of two minutes.