Digital technologies for museums

The School of Applied Mathematics (EMAp) and the School of Social Sciences (CPDOC) of the Getúlio Vargas Foundation (FGV) will organize and host the First Panorama in Digital Technologies for Museums (I Panorama em Tecnologias Digitais para Museus) on November 27, 2018.

The objective of this Panorama is to present the demands of the museological sector, as well as reflections on previous experiences. Given the recent disaster at the National Museum of UFRJ, a response is needed from all the actors involved in the theme: managers, researchers, educators and other sectors of society.

The event will discuss the strengthening of a knowledge network around the use of digital technologies in the museum context. It is also necessary to consider the impact of disseminating these museums’ collections, understanding that society’s engagement with the issue, together with a close relationship between the population and museums, is one of the ways of preserving these institutions and of attracting and maintaining investment in them.

Representatives of diverse institutions will participate as speakers in this event, among them my Ph.D. co-advisor and coordinator of the Visgraf Laboratory, Luiz Velho.

Experiments by Google explore how machines understand artworks

The Google Arts & Culture initiative promotes experiments at the crossroads of art and technology, created by artists and creative coders. I selected two experiments that apply Machine Learning methods to detect objects in photographs and artworks and generate machine-based tags. These tags are then used to enhance accessibility and exploration of cultural collections.

Tags and Life Tags

These two demo experiments explore how computers read and tag artworks through a Machine Learning approach.

Tags: without the intervention of humans, keywords were generated by an algorithm also used in Google Photos, which analyzed the artworks by looking at the images without any metadata.

The user interface shows a list of tags (keywords), each followed by its number of occurrences in the artwork collection. Selecting the tag ‘man’ reveals artworks containing what an intelligent machine understands to be a man. Hovering over an artwork reveals other tags detected in that specific representation.

Life Tags: organizes over 4 million images from the Life magazine archives into an interactive interface that looks like an encyclopedia. The terms of the “encyclopedia” were generated by an algorithm based on a deep neural network used in Google photo search that has been trained on millions of images and labels to recognize categories for labels and pictures.

Labels were clustered into categories using a nearest neighbor algorithm, which finds related labels based on image feature vectors. Each image has multiple labels linked to the elements that are recognized. The full-size image viewer shows dotted lines revealing the objects detected by the computer.
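To make the nearest neighbor idea concrete, here is a minimal sketch of how related labels could be found from feature vectors. It is not the Life Tags implementation: the label names, the toy three-dimensional vectors, and the assumption that each label is summarized by a single vector (for example, an average of the feature vectors of its images) are all hypothetical, and cosine similarity is just one reasonable choice of distance.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_labels(query, label_vectors, k=2):
    """Return the k labels whose vectors are most similar to the query label."""
    scores = [
        (cosine_similarity(label_vectors[query], vector), label)
        for label, vector in label_vectors.items()
        if label != query
    ]
    scores.sort(reverse=True)  # most similar first
    return [label for _, label in scores[:k]]


# Hypothetical toy vectors; real image features have hundreds of dimensions.
label_vectors = {
    "kitchen": [0.9, 0.1, 0.0],
    "stove": [0.8, 0.2, 0.1],
    "refrigerator": [0.7, 0.3, 0.0],
    "horse": [0.0, 0.1, 0.9],
    "saddle": [0.1, 0.0, 0.8],
}

related_to_kitchen = nearest_labels("kitchen", label_vectors, k=2)
```

With these toy vectors, the labels closest to "kitchen" are the other kitchen-related ones, which is the kind of grouping the encyclopedia-style categories rely on.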

The overall interface of Life Tags looks like an encyclopedia
Kitchen is a category that clusters labels using a nearest neighbor algorithm.
Selecting a specific photo expands it and reveals the labels recognized by the machine.

Digital Humanities 2018: a selection of sessions I would like to attend

As Digital Humanities 2018 is approaching, I took some time to look at its program. Unfortunately, I didn’t have contributions to submit this year, so I won’t attend the Conference. But I had the pleasure of being a reviewer for this edition, and I’ll also stay tuned on Twitter during the Conference!

My main topic of interest in Digital Humanities bridges the analysis of large-scale visual archives and the graphical user interfaces used to browse and make sense of them. So I selected the following contributions I would like to attend if I were at DH2018.


Distant Viewing with Deep Learning: An Introduction to Analyzing Large Corpora of Images

by Taylor Arnold, Lauren Tilton (University of Richmond)

Taylor and Lauren coordinate Distant Viewing, a laboratory that develops computational techniques to analyze moving image culture on a large scale. Previously, they contributed to Photogrammar, a web-based platform for organizing, searching, and visualizing 170,000 photographs. That project was first presented at Digital Humanities 2016 (abstract here), and I mentioned it in my presentation at HDRIO2018 (slides here, Portuguese only).

  • Beyond Image Search: Computer Vision in Western Art History, with Miriam Posner, Leonardo Impett, Peter Bell, Benoit Seguin and Bjorn Ommer;
  • Computer Vision in DH, with Lauren Tilton, Taylor Arnold, Thomas Smits, Melvin Wevers, Mark Williams, Lorenzo Torresani, Maksim Bolonkin, John Bell, Dimitrios Latsis;
  • Building Bridges With Interactive Visual Technologies, with Adeline Joffres, Rocio Ruiz Rodarte, Roberto Scopigno, George Bruseker, Anaïs Guillem, Marie Puren, Charles Riondet, Pierre Alliez, Franco Niccolucci

Paper session: Art History, Archives, Media

  • The (Digital) Space Between: Notes on Art History and Machine Vision Learning, by Benjamin Zweig (from Center for Advanced Study in the Visual Arts, National Gallery of Art);
  • Modeling the Fragmented Archive: A Missing Data Case Study from Provenance Research, by Matthew Lincoln and Sandra van Ginhoven (from Getty Research Institute);
  • Urban Art in a Digital Context: A Computer-Based Evaluation of Street Art and Graffiti Writing, by Sabine Lang and Björn Ommer (from Heidelberg Collaboratory for Image Processing);
  • Extracting and Aligning Artist Names in Digitized Art Historical Archives by Benoit Seguin, Lia Costiner, Isabella di Lenardo, Frédéric Kaplan (from EPFL, Switzerland);
  • Métodos digitales para el estudio de la fotografía compartida. Una aproximación distante a tres ciudades iberoamericanas en Instagram, by Gabriela Elisa Sued

Paper session: Visual Narratives

  • Computational Analysis and Visual Stylometry of Comics using Convolutional Neural Networks, by Jochen Laubrock and David Dubray (from University of Potsdam, Germany);
  • Automated Genre and Author Distinction in Comics: Towards a Stylometry for Visual Narrative, by Alexander Dunst and Rita Hartel (from University of Paderborn, Germany);
  • Metadata Challenges to Discoverability in Children’s Picture Book Publishing: The Diverse BookFinder Intervention, by Kathi Inman Berens, Christina Bell (from Portland State University and Bates College, United States of America)

Poster sessions:

  • Chromatic Structure and Family Resemblance in Large Art Collections — Exemplary Quantification and Visualizations (by Loan Tran, Poshen Lee, Jevin West and Maximilian Schich);
  • Modeling the Genealogy of Imagetexts: Studying Images and Texts in Conjunction using Computational Methods (by Melvin Wevers, Thomas Smits and Leonardo Impett);
  • A Graphical User Interface for LDA Topic Modeling (by Steffen Pielström, Severin Simmler, Thorsten Vitt and Fotis Jannidis)

Data mining with historical documents

The last seminar held by the Vision and Graphics Laboratory was about data mining with historical documents. Marcelo Ribeiro, a master’s student at the Applied Mathematics School of the Getúlio Vargas Foundation (EMAp/FGV), presented the results obtained by applying topic modeling and natural language processing to the analysis of historical documents. This work was previously presented at the first International Digital Humanities Conference held in Brazil (HDRIO2018) and had Renato Rocha Souza (professor and researcher at EMAp/FGV) and Alexandre Moreli (professor and researcher at USP) as co-authors.

The database used is part of the CPDOC-FGV collection and essentially comprises historical documents from the 1970s belonging to Antonio Azeredo da Silveira, former Minister of Foreign Affairs of Brazil.

The documents:

• +10 thousand documents
• +66 thousand pages
• +14 million tokens / words (whether found in a dictionary or not)
• 5 languages, mainly Portuguese

• Physical documents
• Images (.tif and .jpg)
• Texts (.txt)

The presentation addressed the steps of the project, from document digitization to the integration of results into the History-Lab platform.

The images below refer to the explanation of the OCR (Optical Character Recognition) phase and the topic modeling phase:

Presentation slides (in Portuguese) can be accessed here. This initiative is part of the History Lab project, organized by Columbia University, which uses data science methods to investigate history.

Gugelmann Galaxy

Gugelmann Galaxy is an interactive demo by Mathias Bernhard exploring items from the Gugelmann Collection, a group of 2336 works by the Schweizer Kleinmeister, Swiss 18th-century masters. Gugelmann Galaxy is built on Three.js, a lightweight JavaScript library for creating animated 3D visualizations in the browser using WebGL.

The images are grouped according to specific parameters that are automatically calculated by image analysis and text analysis from metadata. A high-dimensional space is then projected onto a 3D space, while preserving topological neighborhoods between images in the original space. More explanation about the dimensionality reduction can be read here.
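As a rough illustration of what "projecting onto 3D while preserving neighborhoods" means, the sketch below maps high-dimensional feature vectors to 3D and then measures how many of each item's nearest neighbors survive the projection. The source does not say which method Gugelmann Galaxy uses, so a Gaussian random projection is swapped in here purely because it fits in a few lines; the function names, the random toy features, and the overlap score are my own assumptions, not the project's code.

```python
import math
import random


def random_projection(points, target_dim=3, seed=0):
    """Project points to target_dim dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    matrix = [
        [rng.gauss(0, 1) / math.sqrt(target_dim) for _ in range(source_dim)]
        for _ in range(target_dim)
    ]
    return [[sum(w * x for w, x in zip(row, p)) for row in matrix] for p in points]


def knn(points, index, k):
    """Indices of the k points closest (Euclidean) to points[index]."""
    distances = sorted(
        (math.dist(points[index], q), j) for j, q in enumerate(points) if j != index
    )
    return {j for _, j in distances[:k]}


def neighborhood_overlap(original, projected, k=3):
    """Average share of each point's k nearest neighbors kept after projection."""
    shared = sum(
        len(knn(original, i, k) & knn(projected, i, k)) for i in range(len(original))
    )
    return shared / (k * len(original))


# Hypothetical stand-in for image features: 20 items with 10 attributes each.
rng = random.Random(42)
features = [[rng.random() for _ in range(10)] for _ in range(20)]
positions = random_projection(features)  # one (x, y, z) position per item
score = neighborhood_overlap(features, positions)
```

A score of 1.0 would mean every item keeps exactly the same nearest neighbors in 3D as in the original space. Methods designed for neighborhood preservation aim to push this kind of score as high as possible, which a plain random projection does not guarantee.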

The user interface allows four types of image arrangement: by color distribution, by technique, by description and by composition. As the mouse hovers over the items, an info box with some metadata is displayed on the left. The user can also perform rotation, zooming, and panning.

The author wrote on his site:

The project renounces to come up with a rigid ontology and forcing the items to fit in premade categories. It rather sees clusters emerge from attributes contained in the images and texts themselves. Groupings can be derived but are not dictated.


My presentation at HDRio2018

During the paper session “Social networks and visualizations”, held on April 11 at the HDRio2018 Congress, I presented the work “Perspectivas para integração do Design nas Humanidades Digitais frente ao desafio da análise de artefatos visuais” (“Perspectives for integrating Design in Digital Humanities in the face of the challenge of visual artifacts analysis”).

In this work, I outline initial considerations from broader, ongoing research that seeks to reflect on the contributions offered by the field of Design to the conception of a graphical user interface that, along with computer vision and machine learning technologies, supports browsing and exploration of large collections of images.

I believe my contribution raises three main discussions for the field of Digital Humanities:

  1. The investigation of large collections of images (photographs, paintings, illustrations, videos, GIFs, etc.) using image recognition techniques through a Machine Learning approach;
  2. The valorization of texts and media produced on social networks as a valid source of cultural heritage for Digital Humanities studies;
  3. Integration of Design principles and methodologies (HCI and visualization techniques) in the development of tools to retrieve, explore and visualize large image collections.

Slides from this presentation can be accessed here (Portuguese only).

First Digital Humanities Conference in Brazil

The I International Congress on Digital Humanities – HDRio2018, held at the Getulio Vargas Foundation (FGV), Rio de Janeiro, from April 9 to 13, 2018, initiated in Brazil a broad and international debate on this relevant and emerging field. It was a timely opportunity for academics, scientists and technologists of Arts, Culture and Social Sciences, Humanities and Computation to reflect on, among other topics, the impact of information technologies, communication networks and the digitization of collections and processes on individuals’ daily lives, and their effects on local and global institutions and societies, especially in the Brazilian context.

HDRio2018’s program included opening and closing ceremonies, 6 workshops, 8 panels, 8 paper sessions (featuring 181 presentations) and 1 poster session. Accepted papers can be found here.

Organizers: The Laboratory of Digital Humanities – LHuD from the Centre for Research and Documentation of Contemporary History of Brazil (CPDOC) at the Getulio Vargas Foundation (FGV) and the Laboratory for Preservation and Management of Digital Collections (LABOGAD) at the Federal University of the State of Rio de Janeiro (UNIRIO).

Visualizing time, texture and themes in historical drawings

Past Visions is a collection of historical drawings visualized in a thematic and temporal arrangement. The interface highlights general trends in the overall collection and gives access to rich details of individual items.

The case study examines the potential of visualization when applied to, and developed for, cultural heritage collections. It specifically explores how techniques aimed at visualizing the quantitative structure of a collection can be coupled with a more qualitative mode that allows for detailed examination of the artifacts and their contexts by displaying high-resolution views of digitized cultural objects with detailed art historical research findings.

Past Visions is a research project by the Urban Complexity Lab at the Potsdam University of Applied Sciences.

Reference: “Past Visions and Reconciling Views: Visualizing Time, Texture and Themes in Cultural Collections.” ResearchGate. Accessed March 8, 2018.

Visualizing cultural collections

Browsing the content from the Information Plus Conference (2016 edition), I bumped into a really interesting presentation regarding the use of graphical user interfaces and data visualization to support the exploration of large-scale digital cultural heritage.

One View is Not Enough: High-level Visualizations of Large Cultural Collections is a contribution by the Urban Complexity Lab, from the University of Applied Sciences Potsdam. Check the talk by Marian Dörk:

As many cultural heritage institutions, such as museums, archives, and libraries, digitize their assets, a pressing question arises: how can we give access to these large-scale and complex inventories? How can we present them in a way that lets people draw meaning from them and be inspired, entertained and maybe even educated?

The Urban Complexity Lab tackles this open problem by investigating and developing graphical user interfaces and different kinds of data visualizations that explore cultural collections and show their high-level patterns and relationships.

In this specific talk, Marian presents two projects conducted at the Lab. The first, DDB visualized, is a project in partnership with the Deutsche Digitale Bibliothek. Four interactive visualizations make the vast extent of the German Digital Library visible and explorable. Periods, places and persons are three of the categories, while keywords provide links to browsable pages of the library itself.


The second, GEI – Digital, is a project in partnership with the Georg Eckert Institute. This data dossier provides multi-faceted perspectives on GEI-Digital, a digital library of historical schoolbooks created and maintained by the Georg Eckert Institute for International Textbook Research.


The problem of gender bias in the depiction of activities such as cooking and sports in images

The challenge is teaching machines to understand the world without reproducing prejudices. Researchers from the University of Virginia found that intelligent systems link the action of cooking in images much more to women than to men.

Gender bias test with artificial intelligence for the action “cook”: women are more strongly associated with it, even when there is a man in the image.

Just as search engines, of which Google is the prime example, do not work under absolute neutrality, free of any bias or prejudice, machines equipped with artificial intelligence trained to identify and categorize what they see in photos do not work in a neutral way either.
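To show how such a bias can be quantified, here is a small sketch based on my reading of the bias score in the paper referenced below: for a given activity, the score of a gender is the share of that activity's images associated with that gender. Comparing the score in the training data with the score in the model's predictions indicates whether the model amplifies the imbalance. All counts below are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
def bias_score(counts, activity, gender):
    """Share of an activity's images that are associated with the given gender."""
    per_gender = counts[activity]
    total = sum(per_gender.values())
    return per_gender[gender] / total


# Hypothetical counts of images labeled "cooking", split by the agent's gender.
training_counts = {"cooking": {"woman": 60, "man": 40}}
# Hypothetical counts taken from a model's predictions on new images.
predicted_counts = {"cooking": {"woman": 80, "man": 20}}

train_bias = bias_score(training_counts, "cooking", "woman")
predicted_bias = bias_score(predicted_counts, "cooking", "woman")

# A positive difference means the model exaggerates the bias found in its data.
amplification = predicted_bias - train_bias
```

In this toy case the training data already leans toward women for "cooking", and the predictions lean even further, which is the amplification effect the researchers describe.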

Article on Wired.

Article on Nexo (Portuguese)

Reference: Zhao, Jieyu, Tianlu Wang, Mark Yatskar, Vicente Ordonez, and Kai-Wei Chang. “Men Also Like Shopping: Reducing Gender Bias Amplification Using Corpus-Level Constraints.” arXiv:1707.09457 [Cs, Stat], July 28, 2017.