The Visible Archive

In this TEDxCanberra talk, Mitchell Whitelaw discusses the limitations of the search-box paradigm and presents a project he is involved in to preserve and visualize rich data about Australian cultural assets, including photographs, prints and documents.

The Visible Archive (now discontinued) was a research project on the visualisation of archival datasets, supported by the National Archives of Australia under the 2008 Ian Maclean Award. As part of this work, Whitelaw developed two prototype visualisations of the Archives' collection:

1. The Series Browser, visualising all 65,000 archival series in the collection
2. The A1 Explorer, showing some 64,000 records in series A1

Building on techniques developed in the Visible Archive project, the Flickr Commons Explorer was created. The Explorer (also discontinued) presents a three-pane interface consisting of a term cloud, a single image view, and a thumbnail grid, with a central strip providing navigation and orientation.

Flickr Commons Explorer’s main interface

Video recordings of Information Plus Conference

Information Plus is a biennial conference on interdisciplinary practices in information design and visualization. The latest edition took place in Potsdam, Germany, from 19 to 21 October.

Organizers have just updated the website with video recordings of the first conference day and photo documentation of the workshops, exhibition and dialog dinner. The remaining videos will follow over the coming weeks.

Presentations I have watched so far:

Digital technologies to preserve and disseminate Brazilian cultural collections

The fire that destroyed the National Museum in Rio in September this year raised the alarm about the state of conservation of Brazilian collections and has motivated initiatives from different sectors of civil society. Tomorrow (September 27), a group of researchers, curators, and educators will meet to discuss how digital technologies can help preserve, disseminate and popularize national cultural collections.

Luiz Velho, coordinator of the Vision and Computer Graphics Laboratory of IMPA (Visgraf-IMPA) and one of the guests of the “I Panorama in Digital Technologies for Museums” (“I Panorama em Tecnologias Digitais para Museus”), knows the theme well. For over two decades, Visgraf has developed projects related to different processes of safeguarding, researching and disseminating museum collections.

At the round table “State of the art of technological solutions, reflections on experiences implemented”, Velho will present part of the work done by Visgraf. One example is the 3D Museum, a modeling and visualization project that resulted in a website and a CD with a virtual exhibition of clay sculptures from the collection of the Folklore Museum of Rio.

Velho will also present projects created for the Astronomy Museum (MAST) and the Antônio Carlos Jobim Institute. Recently, Visgraf partnered with the Moreira Salles Institute (IMS) to research and develop applications regarding IMS’s photographic cultural heritage.

The event will be held at the auditorium of the FGV headquarters (Praia de Botafogo, 190), from 8:30 a.m. to 5:00 p.m.

Registration is free and can be done here

Updated (28 November)

Slides of the presentation Media Technologies in the New Museum

Participants of the round table “State of the art of technological solutions, reflections on experiences implemented”

Luiz Velho's talk

Velho presents the IMPA and IMS partnership project on IMS's photographic collections.

Crotos: a project on visual artworks powered by Wikidata and Wikimedia Commons

Crotos is a search and display engine for visual artworks based on Wikidata and using Wikimedia Commons files.

The Wikidata extraction contains more than 133,866 artworks (September 2018), including 66,271 with an HD image. The extraction is automatically and regularly updated from Wikidata based on the nature of the items, and covers visual artworks such as paintings, photographs, prints, illuminated manuscripts and much more.
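As an illustration of how such an extraction can be defined, here is a hypothetical sketch of a Wikidata SPARQL query for paintings with images. This is not Crotos's actual query: its real criteria cover many more artwork types.

```python
# Hypothetical sketch of a Wikidata extraction similar in spirit to
# Crotos's: select items that are paintings (Q3305213) and have an
# image file on Wikimedia Commons (property P18).
ENDPOINT = "https://query.wikidata.org/sparql"

query = """
SELECT ?item ?image WHERE {
  ?item wdt:P31 wd:Q3305213 ;   # instance of: painting
        wdt:P18 ?image .        # image hosted on Wikimedia Commons
}
LIMIT 100
"""

# The query text could be sent to ENDPOINT with any HTTP client
# (results come back as SPARQL JSON); only the query is built here.
print(query.strip().splitlines()[0])
```

Adding further `wdt:P31` values (photographs, prints, illuminated manuscripts…) would broaden the selection toward what Crotos actually indexes.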

The interface

Searches can be made as free-text or indexed searches through the user interface. Results are displayed in chronological order as thumbnails. Clicking a thumbnail opens a viewer with the image hosted on Wikimedia Commons.

It is possible to filter the results by type (painting, sculpture, print…) or to specify a period as a criterion.

By default, without criteria, a random selection is displayed. In addition, the Cosmos interface makes it possible to discover artworks by index (type of work, creator, movement, genre, collection and more).


For each resulting image, the interface displays the title, the creator(s) and the collection or location where the artwork is held. This information comes from Wikidata, a free, collaborative, multilingual, secondary database that collects structured data to support Wikipedia, Wikimedia Commons, the other wikis of the Wikimedia movement, and anyone in the world.

Additional descriptors are date or period, nature of work, material used, inventory number, movement, genre, depicts, main subject, and so on. A full list of descriptors is available here.

Contribution mode

The project has a contribution mode, useful for identifying missing information with facets. Finally, the source code is on GitHub and the Crotos database can be downloaded; both are under a free licence.

IMS Photography Research Grant

In order to stimulate research into the history of Brazilian photography and into the works in its collection, Instituto Moreira Salles, holder of one of the most relevant photographic collections in Brazil, is promoting the first edition of the IMS Photography Research Grant. An original research project will be selected on the production of Marc Ferrez (1843-1923), one of the most important Brazilian photographers of the 19th century, whose work has been under the custody of IMS since 1998.

Entries are open from July 16 to August 31, 2018, and the final result will be announced in October.

Grant announcement, application form and additional information here (Portuguese only)

Multiplicity project at the 123data exhibition in Paris

Multiplicity is a collective photographic portrait of Paris. Conceived and designed by Moritz Stefaner for the 123data exhibition, this interactive installation provides an immersive dive into the image space spanned by hundreds of thousands of photos taken across the Paris city area and shared on social media.

Content selection and curation aspects

The original image dataset consisted of 6.2 million geo-located social media photos posted in Paris in 2017. However, for reasons that are not fully explained (perhaps technical), a custom selection of 25,000 photos was made according to a list of criteria. Moritz highlights that his intention was not to measure the city, but to portray it. He says: “Rather than statistics, the project presents a stimulating arrangement of qualitative contents, open for exploration and to interpretation — consciously curated and pre-arranged, but not pre-interpreted.” This curatorial method was used not only for data selection but also to bridge the t-SNE visualization and the grid visualization. Watch the transition effect in the video below. As a researcher interested in user interfaces and visualization techniques that support knowledge discovery in digital image collections, I wonder whether such a curation-based method could be adopted in a Digital Humanities approach.

Data Processing

Using machine learning techniques, the images are organized by similarity and image content, allowing viewers to explore niches and microgenres of image styles and subjects. More precisely, the installation uses t-SNE dimensionality reduction on features from the last layer of a pre-trained neural network to cluster images of Paris. The author says: “I used feature vectors normally intended for classification to calculate pairwise similarities between the images. The map arrangement was calculated using t-SNE — an algorithm that finds an optimal 2D layout so that similar images are close together.”
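The pairwise-similarity step Stefaner describes can be sketched with toy data. The feature vectors below are random stand-ins; in the real project they come from the last layer of a pre-trained network.

```python
import numpy as np

# Toy stand-in for CNN feature vectors: 6 "images", 128-dim features.
# (Real features would come from a pre-trained classification network.)
rng = np.random.default_rng(0)
features = rng.normal(size=(6, 128))

# Pairwise cosine similarity between images: normalize each feature
# vector to unit length, then take dot products.
unit = features / np.linalg.norm(features, axis=1, keepdims=True)
similarity = unit @ unit.T

print(similarity.shape)  # (6, 6) matrix, 1.0 on the diagonal
```

A t-SNE implementation (for instance scikit-learn's `TSNE`) would then turn these relations into a 2D layout where similar images sit close together.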

While the t-SNE algorithm takes care of the clustering and neighborhood structure, manual annotations help identify curated map areas. These areas can be zoomed on demand, enabling close viewing of similar photos.


Curating photography with neural networks

“Computed Curation” is a 95-foot-long accordion photobook created by a computer. Taking the human editor out of the loop, it uses machine learning and computer vision tools to curate a series of photos from Philipp Schmitt's personal archive.

The book features 207 photos taken between 2013 and 2017. Considering both image content and composition, the algorithms uncover unexpected connections among photographs, and interpretations that a human editor might have missed.

A spread of the accordion book looks like this: on one page, a photograph is centered with a caption above it: “a harbor filled with lots of traffic” [confidence: 56.75%]. Location and date appear next to the photo, as a credit: Los Angeles, USA. November 2016. Below the photo, some tags are listed: “marina, city, vehicle, dock, walkway, sport venue, port, harbor, infrastructure, downtown”. On the next page, the same layout with different content: a picture is captioned “a crowd of people watching a large umbrella” [confidence: 67.66%]. Location and date: Berlin, Germany. August 2014. Tags: “crowd, people, spring, festival, tradition”.

Metadata from the camera device (date and location) is collected using Adobe Lightroom. Visual features (tags and colors) are extracted from the photos using Google’s Cloud Vision API. Automated captions, with their corresponding confidence scores, are generated using Microsoft’s Cognitive Services API. Finally, image composition is analyzed using histograms of oriented gradients (HOGs). These components were then fed to a t-SNE algorithm, which sorted the images in a two-dimensional space according to similarity. A genetic TSP algorithm then computes the shortest path through the arrangement, thereby defining the page order. You can check out the process in the video below:
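The final page-ordering step can be illustrated with a simplified stand-in. Schmitt's project uses a genetic TSP solver over the t-SNE layout; the sketch below substitutes a greedy nearest-neighbour tour over made-up 2D positions, which captures the idea (visit every image once, always hopping to the closest remaining one) without the genetic optimization.

```python
import numpy as np

# Made-up 2D positions standing in for a t-SNE layout of 8 images.
rng = np.random.default_rng(1)
positions = rng.random((8, 2))

def greedy_tour(points, start=0):
    """Greedy nearest-neighbour tour: a simplified stand-in for the
    genetic TSP solver used in the actual project."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: np.linalg.norm(points[i] - last))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

page_order = greedy_tour(positions)
print(page_order)  # a permutation of 0..7 defining the page sequence
```

A genetic (or any exact/heuristic) TSP solver would shorten the total path further, but the output plays the same role: a linear ordering of the 2D arrangement that becomes the sequence of book pages.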