“Urban histories can be told in a thousand ways. The archaeological research project of the North/South metro line lends the River Amstel a voice in the historical portrayal of Amsterdam. The Amstel was once the vital artery, the central axis, of the city. Along the banks of the Amstel, at its mouth in the IJ, a small trading port originated about 800 years ago. At Damrak and Rokin in the city center, archaeologists had a chance to physically access the riverbed, thanks to the excavations for the massive infrastructure project of the North/South metro line between 2003 and 2012”.
The Below the Surface website presents the scope and methods of this research project, detailing the processing of approximately 700,000 finds. The website provides access to photographs of more than 19,000 finds. The complete dataset has not been released yet, but a disclaimer notes that it will be available shortly.
In the main user interface (the object overview page), thumbnails of all 18,978 objects are arranged by estimated time of creation (periods range from 2005 AD to 119,000 BC). Users can scroll vertically through time with the mouse. A panel of facets filters the objects by time range, object function (communication & exchange, personal artifacts & clothing, etc.), material (metal, synthetics, etc.) and location (Damrak or Rokin metro stations). The time range facet has an interesting feature: it doubles as a visual variable that shows distribution patterns at a glance. The other facets indicate the objects' occurrence through absolute numbers. Facets don't require a preceding search and enable refinement (selecting an option in one facet updates the occurrence counts in the other facets).
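This facet refinement behavior can be sketched in a few lines. The records and field names below are hypothetical toy data, not the site's actual schema; the point is only how selecting one facet option recomputes the counts shown in the other facets.

```python
from collections import Counter

# Toy records standing in for finds; field names are hypothetical,
# not the actual Below the Surface data schema.
finds = [
    {"function": "personal artifacts & clothing", "material": "metal", "location": "Damrak"},
    {"function": "communication & exchange", "material": "metal", "location": "Rokin"},
    {"function": "communication & exchange", "material": "synthetics", "location": "Damrak"},
]

def facet_counts(records, facet):
    """Absolute occurrence counts shown next to each facet option."""
    return Counter(r[facet] for r in records)

# Selecting "metal" in the material facet refines the other facets:
selection = [r for r in finds if r["material"] == "metal"]
print(facet_counts(selection, "location"))  # counts update after refinement
```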
Selecting a thumbnail of an object reveals detailed information about it (close viewing). A larger photograph is shown, followed by detailed information about the object's properties.
This research was conducted by the Department of Monuments and Archaeology (MenA), City of Amsterdam, in cooperation with the Chief Technology Office (CTO), City of Amsterdam.
The Institute of Pure and Applied Mathematics (IMPA), one of the most important Brazilian research centers, and the Moreira Salles Institute (IMS), holder of one of the most significant cultural collections in Brazil, signed a research agreement focused on IMS's photographic heritage. The two institutes are apparently of quite distinct natures, but they have embraced the challenge of bridging research and development in Mathematics, Design and Culture. I hope this partnership illustrates how productive and valuable interdisciplinary and collaborative knowledge can be! I feel privileged to contribute as a designer and researcher on this project. Hands on!
The article that follows has been translated from IMPA’s website (original article in Portuguese).
IMPA and IMS team up with focus on photographic heritage
Imagine strolling down a Rua do Ouvidor filled with perfumeries and shop windows with the latest Paris fashions […] as photographed by Marc Ferrez in 1890. Thanks to a partnership with IMPA, new forms of enjoyment of the invaluable photographic collection of Instituto Moreira Salles (IMS) will emerge, perhaps including immersive experiences in iconic spaces of the history of Rio de Janeiro.
Renowned in their respective fields, IMPA and IMS signed a collaborative research agreement on Monday (25), a promising step that will show how productive and valuable the intersection between institutes dedicated to Mathematics and Culture can be.
The collaboration will be led by Visgraf, the Vision and Computer Graphics Laboratory of IMPA – created in 1989, with ample expertise in the field and a series of national and international partnerships – and IMS Photography Coordination.
With 2 million images, IMS holds one of the most important collections of 19th-century photography in Brazil, including that of the pioneer Marc Ferrez (1843-1923), the foremost photographer of the period. IMS also has significant collections spanning almost the entire 20th century and spares no effort to incorporate records from the current century.
The IMPA general director, Marcelo Viana, praised the convergence between the two institutes. The Visgraf coordinator, Luiz Velho, considered the partnership a way to validate and transfer to society the work developed at IMPA: “It’s a way of empowering the culture of Brazil. The IMS collection is invaluable and, frankly, we can do unprecedented things,” he said, estimating that the collaborative agreement would be long-term.
The IMS executive director, Flávio Pinheiro, observed that, faced with the institution’s massive collection, it is inevitable to think about the impact of new technologies: “It’s a huge job and a path that we needed to break through. I think georeferencing photographs is a start, something important, but there are unsuspected things ahead, such as what artificial intelligence and machine learning can do with images.”
Platform allows immersive experience
Dedicated to computational mathematics applied to media, Visgraf conducts research and development in several fields, such as visualization and image processing, animation and multimedia, and virtual and augmented reality, with projects ranging from audiovisual narratives and virtual reality systems to data monitoring and visualization platforms, among others.
For those who still wonder about the connection between Mathematics and photography, it is worth knowing that algorithms can be developed, for example, to extract information from images. Such techniques allow computers to recognize faces or objects in large-scale visual databases, such as the IMS photographic collection.
“Essentially, you have a mathematical model that is an abstraction of representations of some sort, and everything revolves around that. The area where we work at Visgraf goes from very technical things, like super-resolution, which takes an image and enlarges it beyond the limit at which the photo was captured by the equipment, to image analysis and understanding. A photo has metadata that can be referenced,” noted Velho.
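The core idea that feature vectors make images mathematically comparable can be shown with a toy sketch. The vectors below are random stand-ins, not real CNN features; in practice they would come from, say, the last layer of a pre-trained network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for feature vectors a network would extract from photos;
# real systems would use e.g. the last layer of a pre-trained CNN.
photo_a, photo_b, photo_c = rng.normal(size=(3, 512))

def cosine_similarity(u, v):
    """Similarity score in [-1, 1]; higher means more alike."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# A slightly perturbed copy of a photo should score much higher
# than an unrelated photo.
photo_a2 = photo_a + rng.normal(scale=0.1, size=512)
print(cosine_similarity(photo_a, photo_a2))  # close to 1
print(cosine_similarity(photo_a, photo_b))   # near 0 for random vectors
```

Face or object recognition over a large collection reduces, at its last step, to exactly this kind of comparison between vectors.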
Under the agreement, IMPA lent to IMS a unique device in Brazil, developed by Google, which combines several applications for viewing content. The Liquid Galaxy is a platform consisting of screens with an angle of about 180 degrees that allow a panoramic view of videos and photos, enabling interactive tours in an immersive 3D environment. It will be used in research and demonstrations, said Velho.
Regarding the equipment, Sergio Burgi, coordinator of photography at IMS, highlighted the enormous potential a platform has when it integrates several databases, such as photographs and cartography, among others. “There are many possibilities that are not yet effective in the collections. It’s an absurdly new challenge in the field of information processing, conciliating intellectual and technical metadata,” he emphasized.
The partnership will be managed by Visgraf’s assistant researcher, Julia Giannella, and IMS’s technical assistant, Bruno Buccalon. With the outstanding IMS collection and IMPA’s resources, the collaboration will be long and productive, estimates Velho: “It will work very well and with scientific rigor. I think we are in a holistic moment of humanity, of integration of the Exact and Social sciences. This type of partnership we are doing here is innovative. The sky is the limit,” concluded the coordinator of Visgraf, about the results of the agreement.
Google Arts & Culture initiative promotes experiments at the crossroads of art and technology created by artists and creative coders. I selected two experiments that apply Machine Learning methods to detect objects in photographs and artworks and generate machine-based tags. These tags are then used to enhance accessibility and exploration of cultural collections.
Tags and Life Tags
These two demo experiments explore how computers read and tag artworks through a Machine Learning approach.
Tags: without the intervention of humans, keywords were generated by an algorithm also used in Google Photos, which analyzed the artworks by looking at the images without any metadata.
The user interface shows a list of tags (keywords), each followed by its number of occurrences in the artwork collection. Selecting the tag ‘man’ reveals artworks containing what the machine understands to be a man. Hovering over an artwork reveals the other tags detected in that specific representation.
Life Tags: organizes over 4 million images from the Life magazine archive into an interactive interface that looks like an encyclopedia. The terms of the “encyclopedia” were generated by an algorithm based on a deep neural network used in Google photo search, trained on millions of images and labels to recognize categories of labels and pictures.
Labels were clustered into categories using a nearest neighbor algorithm, which finds related labels based on image feature vectors. Each image has multiple labels linked to the elements that are recognized. The full-size image viewer shows dotted lines revealing the objects detected by the computer.
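The nearest-neighbor grouping described above can be sketched with scikit-learn. The label names and feature vectors here are hypothetical stand-ins (one vector per label, with "wolf" deliberately placed near "dog"), not the actual Life Tags model.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(42)

# Hypothetical feature vectors for labels; in Life Tags these would come
# from the deep network's image feature space.
labels = ["dog", "wolf", "cat", "skyscraper", "bridge"]
features = rng.normal(size=(5, 64))
features[1] = features[0] + rng.normal(scale=0.05, size=64)  # "wolf" near "dog"

# Find each label's nearest neighbors in feature space.
nn = NearestNeighbors(n_neighbors=2).fit(features)
_, idx = nn.kneighbors(features[:1])  # neighbors of "dog" (itself comes first)
print([labels[i] for i in idx[0]])
```

Clustering related labels is then a matter of walking these neighbor relations.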
As Digital Humanities 2018 approaches, I took some time to look at its program. Unfortunately, I didn’t have a contribution to submit this year, so I won’t attend the conference. But I had the pleasure of being a reviewer for this edition, and I’ll stay tuned on Twitter during the conference!
My main topic of interest in Digital Humanities bridges the analysis of large-scale visual archives and the graphical user interfaces used to browse and make sense of them. So I selected the following contributions, which I would like to attend if I were at DH2018.
Distant Viewing with Deep Learning: An Introduction to Analyzing Large Corpora of Images
Taylor and Lauren coordinate Distant Viewing, a laboratory which develops computational techniques to analyze moving image culture on a large scale. Previously, they contributed to Photogrammar, a web-based platform for organizing, searching, and visualizing 170,000 photographs. This project was first presented at Digital Humanities 2016 (abstract here), and I mentioned it in my presentation at HDRIO2018 (slides here, Portuguese only).
Beyond Image Search: Computer Vision in Western Art History, with Miriam Posner, Leonardo Impett, Peter Bell, Benoit Seguin and Bjorn Ommer;
Computer Vision in DH, with Lauren Tilton, Taylor Arnold, Thomas Smits, Melvin Wevers, Mark Williams, Lorenzo Torresani, Maksim Bolonkin, John Bell, Dimitrios Latsis;
Building Bridges With Interactive Visual Technologies, with Adeline Joffres, Rocio Ruiz Rodarte, Roberto Scopigno, George Bruseker, Anaïs Guillem, Marie Puren, Charles Riondet, Pierre Alliez, Franco Niccolucci
Extracting and Aligning Artist Names in Digitized Art Historical Archives by Benoit Seguin, Lia Costiner, Isabella di Lenardo, Frédéric Kaplan (from EPFL, Switzerland);
Métodos digitales para el estudio de la fotografía compartida. Una aproximación distante a tres ciudades iberoamericanas en Instagram (by Gabriela Elisa Sued)
Paper session: Visual Narratives
Computational Analysis and Visual Stylometry of Comics using Convolutional Neural Networks, by Jochen Laubrock and David Dubray (from University of Potsdam, Germany);
Automated Genre and Author Distinction in Comics: Towards a Stylometry for Visual Narrative, by Alexander Dunst and Rita Hartel (from University of Paderborn, Germany);
Metadata Challenges to Discoverability in Children’s Picture Book Publishing: The Diverse BookFinder Intervention, by Kathi Inman Berens, Christina Bell (from Portland State University and Bates College, United States of America)
Chromatic Structure and Family Resemblance in Large Art Collections — Exemplary Quantification and Visualizations (by Loan Tran, Poshen Lee, Jevin West and Maximilian Schich);
Modeling the Genealogy of Imagetexts: Studying Images and Texts in Conjunction using Computational Methods (by Melvin Wevers, Thomas Smits and Leonardo Impett);
A Graphical User Interface for LDA Topic Modeling (by Steffen Pielström, Severin Simmler, Thorsten Vitt and Fotis Jannidis)
The last seminar held by the Vision and Graphics Laboratory was about data mining with historical documents. Marcelo Ribeiro, a master’s student at the Applied Mathematics School of the Getúlio Vargas Foundation (EMAp/FGV), presented the results obtained by applying topic modeling and natural language processing to the analysis of historical documents. This work was previously presented at the first International Digital Humanities Conference held in Brazil (HDRIO2018) and had Renato Rocha Souza (professor and researcher at EMAp/FGV) and Alexandre Moreli (professor and researcher at USP) as co-authors.
The database used is part of the CPDOC-FGV collection and essentially comprises historical documents from the 1970s belonging to Antonio Azeredo da Silveira, former Minister of Foreign Affairs of Brazil.
• +10 thousand documents
• +66 thousand pages
• +14 million tokens / words (dictionary words or not)
• 5 languages, mainly Portuguese
Existing visualization-based interfaces (interfaces that enable navigation through visualization) for cultural collections usually focus on making their content more accessible to specialists and the public.
Possibly one of the first attempts to explore new forms of knowledge discovery in cultural collections was SFMOMA ArtScope, developed by Stamen Design in 2007 (now decommissioned). The interface allows users to explore more than 6,000 artworks in a grid-based, zoomable visualization. Navigating the collection follows a visualization-first paradigm which is mainly exploratory (although the interface enables navigation through keyword search, the visualization canvas is clearly the protagonist). The artworks’ thumbnails are visually organized by when the works were purchased by the museum. The user can pan the canvas by dragging it, and the lens serves as a selection tool, which magnifies the selected work and reveals detailed information about the selected piece.
ArtScope is an attractive interface which offers the user an overview of the size and content of SFMOMA’s collection. However, the artworks in the canvas are organized only by time of acquisition, a feature not very informative for users (except perhaps museum staff). Other dimensions (authorship, creation date, technique, subject, etc.) can be neither filtered nor used to visually organize the canvas.
The video below illustrates the interface navigation:
Multiplicity is a collective photographic portrait of Paris. Conceived and designed by Moritz Stefaner on the occasion of the 123 data exhibition, this interactive installation provides an immersive dive into the image space spanned by hundreds of thousands of photos taken across the Paris city area and shared on social media.
Content selection and curation aspects
The original image dataset consisted of 6.2 million geo-located social media photos posted in Paris in 2017. However, for reasons not fully clarified (perhaps technical), a custom selection of 25,000 photos was made according to a list of criteria. Moritz highlights that his intention was not to measure, but to portray the city. He says: “Rather than statistics, the project presents a stimulating arrangement of qualitative contents, open for exploration and to interpretation — consciously curated and pre-arranged, but not pre-interpreted.” This curatorial method wasn’t used just for data selection but also for bridging the t-SNE visualization and the grid visualization. Watch the transition effect in the video below. As a researcher interested in user interfaces and visualization techniques to support knowledge discovery in digital image collections, I wonder whether such a curated method could be considered in a Digital Humanities approach.
Using machine learning techniques, the images are organized by similarity and image content, allowing users to visually explore niches and microgenres of image styles and content. More precisely, it uses t-SNE dimensionality reduction to visualize features from the last layer of a pre-trained neural network and cluster images of Paris. The author says: “I used feature vectors normally intended for classification to calculate pairwise similarities between the images. The map arrangement was calculated using t-SNE — an algorithm that finds an optimal 2D layout so that similar images are close together.”
While the t-SNE algorithm takes care of the clustering and neighborhood structure, manual annotations help with identification of curated map areas. These areas can be zoomed on demand enabling close viewing of similar photos.
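The mapping step can be sketched with scikit-learn's t-SNE. The feature vectors below are random stand-ins in two loose groups (real input would be CNN activations per photo); the algorithm returns one 2D map position per image, with similar images landing close together.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(7)

# Random stand-ins for CNN feature vectors of 60 photos, in two loose
# groups; a real pipeline would extract these from a pre-trained network.
group_a = rng.normal(loc=0.0, size=(30, 128))
group_b = rng.normal(loc=3.0, size=(30, 128))
features = np.vstack([group_a, group_b])

# t-SNE finds a 2D layout in which similar items end up close together.
xy = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(features)
print(xy.shape)  # one (x, y) map position per photo
```

The resulting coordinates are what gets drawn as the image map; the curated region labels are added manually on top.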
The US military is funding an effort to determine whether AI-generated video and audio will soon be indistinguishable from the real thing, even for another AI.
The Defense Advanced Research Projects Agency (DARPA) is holding a contest this summer to generate the most convincing AI-created videos and the most effective tools to spot the counterfeits.
Some of the most realistic fake footage is created by generative adversarial networks, or GANs. GANs pit AI systems against each other to refine their creations and make a product realistic enough to fool the other AI. In other words, the final videos are literally made to dupe detection tools.
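The adversarial loop can be shown in miniature. This toy sketch (numpy only, one-dimensional "data", hand-derived logistic gradients) is nowhere near video synthesis, but it has the same two-player structure: a generator tries to mimic real samples while a discriminator learns to tell real from fake, and each one's updates push against the other's.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy setup: "real" samples come from N(4, 1); the generator maps noise
# z ~ N(0, 1) through fake = w_g * z + b_g.
w_g, b_g = 1.0, 0.0    # generator parameters
w_d, b_d = 0.1, 0.0    # discriminator parameters (a logistic classifier)
lr = 0.02

for step in range(2000):
    real = rng.normal(4.0, 1.0)
    z = rng.normal()
    fake = w_g * z + b_g

    d_real = sigmoid(w_d * real + b_d)  # P(real sample judged real)
    d_fake = sigmoid(w_d * fake + b_d)  # P(fake sample judged real)

    # Discriminator ascends log D(real) + log(1 - D(fake)).
    w_d += lr * ((1 - d_real) * real - d_fake * fake)
    b_d += lr * ((1 - d_real) - d_fake)

    # Generator ascends log D(fake): it is rewarded for fooling D.
    d_fake = sigmoid(w_d * fake + b_d)
    grad_fake = (1 - d_fake) * w_d
    w_g += lr * grad_fake * z
    b_g += lr * grad_fake

print(b_g)  # tends to drift toward the real mean as G learns to imitate
```

Real video GANs replace these two scalar maps with deep networks, but the tug-of-war between generator and detector is the same, which is why detection tools trained today can be fooled tomorrow.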
Why does it matter? The software to create these videos is becoming increasingly advanced and accessible, which could cause real harm. Earlier this year, actor and filmmaker Jordan Peele warned of the dangers of deepfakes by manipulating a video of a Barack Obama speech.
The images are grouped according to specific parameters that are automatically computed by image analysis and by text analysis of the metadata. The high-dimensional space is then projected onto a 3D space, while preserving topological neighborhoods between images in the original space. More explanation about the dimensionality reduction can be read here.
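The projection step can be illustrated with a generic dimensionality reduction; PCA is used below as a simple stand-in (the project itself uses a neighborhood-preserving method, and the descriptors here are random placeholders for the computed image and metadata features).

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Placeholder high-dimensional descriptors (e.g. color, technique and
# composition features per image, computed by image/metadata analysis).
descriptors = rng.normal(size=(200, 50))

# Project the 50-dimensional feature space down to 3D for display.
coords_3d = PCA(n_components=3).fit_transform(descriptors)
print(coords_3d.shape)  # one (x, y, z) position per image
```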
The user interface allows four types of image arrangement: by color distribution, by technique, by description and by composition. As the mouse hovers over an item, an info box with some metadata is displayed on the left. The user can also rotate, zoom, and pan.
The author wrote on his site:
The project renounces coming up with a rigid ontology and forcing the items to fit premade categories. Rather, it lets clusters emerge from attributes contained in the images and texts themselves. Groupings can be derived but are not dictated.
Have you heard about the so-called deepfakes? The word, a portmanteau of “deep learning” and “fake”, refers to a new AI-assisted human image synthesis technique that generates realistic face-swaps.
The technology behind deepfakes is relatively easy to understand. In short, you show a machine (a computer program or an app such as FakeApp) a set of images of an individual and, through an artificial intelligence approach, it finds common ground between two faces and stitches one over the other.
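The classic face-swap setup is an autoencoder with one shared encoder and a separate decoder per identity; the swap comes from routing one person's face through the other person's decoder. A minimal linear sketch of that data flow (random untrained weights, purely illustrative; real deepfakes use trained deep convolutional networks):

```python
import numpy as np

rng = np.random.default_rng(3)

D, LATENT = 64 * 64, 256  # flattened face image size, latent code size

# Shared encoder plus one decoder per identity (untrained random weights).
encoder = rng.normal(scale=0.01, size=(LATENT, D))
decoder_a = rng.normal(scale=0.01, size=(D, LATENT))
decoder_b = rng.normal(scale=0.01, size=(D, LATENT))

def reconstruct(face, decoder):
    """Encode a face into the shared latent space, then decode it."""
    return decoder @ (encoder @ face)

face_a = rng.normal(size=D)  # a face of person A (flattened pixels)

# Training teaches decoder_a to rebuild A's faces and decoder_b to rebuild
# B's. Swapping then encodes A's face but decodes with B's decoder, so A's
# pose and expression come out rendered with B's identity.
swapped = reconstruct(face_a, decoder_b)
print(swapped.shape)  # same image dimensions as the input face
```

Because the encoder is shared between the two identities, the latent code captures pose and expression rather than identity, which is what makes the swap work.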
The deepfake phenomenon started to draw attention after a 2017 porn scandal, when an anonymous Reddit user under the pseudonym “Deepfakes” posted several manipulated porn videos on the Internet.
Deepfakes in politics
Deepfakes have been used to misrepresent well-known politicians on video portals or chatrooms. For example, the face of the Argentine President Mauricio Macri has been replaced by the face of Adolf Hitler:
Also, Angela Merkel’s face was replaced with Donald Trump’s.
In April 2018, Jordan Peele, from BuzzFeed, demonstrated the dangerous potential of deepfakes, with a video where a man who looks just like Barack Obama says the following: “So, for instance, they could have me say things like ‘Killmonger was right’ or ‘Ben Carson is in the Sunken Place,’ or ‘President Trump is a total and complete dipshit.'”