PixPlot is a project by the Yale Digital Humanities Lab. The tool facilitates the dynamic exploration of tens of thousands of images. Inspired by Benoît Seguin et al.'s paper at DH Krakow (2016), PixPlot uses the penultimate layer of a pre-trained convolutional neural network for image captioning to derive a robust featurization space in 2,048 dimensions.
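In practice, "using the penultimate layer" means running each image through a network trained on another task and keeping the activations just before the final task-specific layer. The numpy sketch below illustrates only the idea, not PixPlot's actual model: the layer sizes are real (2,048 penultimate units) but the random weights stand in for a trained CNN.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-in for a pre-trained network: two dense layers.
# In PixPlot the features come from a real trained CNN; random weights
# here merely show where the 2,048-dim penultimate activations live.
W_hidden = rng.standard_normal((4096, 2048)) * 0.01  # body of the network
W_final = rng.standard_normal((2048, 1000)) * 0.01   # final task-specific layer

def featurize(image_vector):
    """Return the penultimate-layer activations (2,048-dim) for one image."""
    hidden = np.maximum(image_vector @ W_hidden, 0.0)  # ReLU activations
    return hidden  # stop before W_final: this is the featurization space

image = rng.standard_normal(4096)  # stand-in for a preprocessed image
features = featurize(image)
print(features.shape)              # (2048,)
```

The key design point survives the simplification: the final layer is discarded because it is specialized to the original training task, while the layer beneath it encodes general-purpose visual features useful for comparing images.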
Improved Dimensionality Reduction
To collapse those 2,048 dimensions into something that can be rendered on a computer screen, we turned to Uniform Manifold Approximation and Projection (UMAP), a dimensionality reduction technique similar to t-Distributed Stochastic Neighbor Embedding (t-SNE) that seeks to preserve both local clusters and an interpretable global shape.
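With the umap-learn library the reduction is a single call, e.g. `umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1).fit_transform(features)`. Because umap-learn may not be installed, the dependency-free sketch below uses a plain PCA projection as a stand-in to show the same shape transformation, many dimensions in, two out (UMAP additionally optimizes for neighborhood preservation, which PCA does not); the 64-dim input is for speed only.

```python
import numpy as np

def project_2d(X):
    """Project rows of X to 2-D via PCA (a stand-in for UMAP here)."""
    Xc = X - X.mean(axis=0)                      # center the features
    # Eigenvectors of the covariance matrix give the principal axes.
    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    top2 = eigvecs[:, np.argsort(eigvals)[-2:]]  # two largest components
    return Xc @ top2

rng = np.random.default_rng(1)
features = rng.standard_normal((500, 64))  # 500 images, 64-dim for speed
coords = project_2d(features)
print(coords.shape)                        # (500, 2)
```

Either way, each image ends up with an (x, y) coordinate, and it is these coordinates that the WebGL front end renders.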
The resulting WebGL-powered visualization consists of a two-dimensional projection within which similar images cluster together. Users can navigate the space by panning and zooming in and out of clusters of interest, or they can jump to designated “hotspots” that feature a representative image from each cluster, as identified by the computer.
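The hotspot idea, clusters in the 2-D projection, each represented by the image nearest its cluster center, can be sketched with a hand-rolled k-means (PixPlot's own clustering choice and parameters are not specified here, so treat this as an illustration of the technique, not its implementation):

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means: returns cluster centroids and per-point labels."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return centroids, labels

rng = np.random.default_rng(2)
coords = rng.standard_normal((300, 2))  # stand-in for the 2-D projection
centroids, labels = kmeans(coords, k=5)

# A hotspot's representative image: the one whose projected position
# lies closest to its cluster's centroid.
reps = [int(np.argmin(np.linalg.norm(coords - c, axis=1))) for c in centroids]
print(len(reps))  # 5 representative image indices
```

Jumping to a hotspot then just means panning the camera to the representative image's coordinates.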
PixPlot provides new ways of engaging with large-scale visual collections. Initial experiments underway at Yale use the tool to explore thousands of cultural heritage images held in the Beinecke Rare Book & Manuscript Library, the Yale Center for British Art, and the Medical Historical Library.