ArtScope: a grid-based visualization for the SFMOMA cultural collection

Existing projects in visualization-based interfaces (interfaces that enable navigation through visualization) for cultural collections usually focus on making their content more accessible to specialists and the public.

Possibly one of the first attempts to explore new forms of knowledge discovery in cultural collections was SFMOMA ArtScope, developed by Stamen Design in 2007 (now decommissioned). The interface allows users to explore more than 6,000 artworks in a grid-based, zoomable visualization. Navigating the collection follows a visualization-first paradigm that is mainly exploratory: although the interface offers keyword search, the visualization canvas is clearly the protagonist. The artworks’ thumbnails are arranged by the date they were purchased by the museum. The user can pan the canvas by dragging it, while the lens serves as a selection tool that magnifies the selected work and reveals detailed information about it.

ArtScope is an attractive interface that offers the user an overview of the size and content of SFMOMA’s collection. However, the artworks on the canvas are organized only by time of acquisition, which is not very informative for most users (perhaps only for museum staff). Other dimensions (authorship, creation date, technique, subject, etc.) cannot be filtered or used to visually organize the structure of the canvas.
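To make this critique concrete, here is a minimal sketch (my own illustration; the artwork records and field names are hypothetical) of how the same grid canvas could be re-ordered by any metadata dimension, not only acquisition date:

```python
# Minimal sketch: place artwork thumbnails on a grid, ordered by any
# metadata field (acquisition date, creation date, artist, ...).
# The artwork records and field names below are hypothetical.

def grid_layout(artworks, sort_key, columns=100):
    """Return (artwork, row, col) tuples for a grid ordered by sort_key."""
    ordered = sorted(artworks, key=lambda a: a.get(sort_key) or "")
    return [(art, i // columns, i % columns) for i, art in enumerate(ordered)]

collection = [
    {"title": "Untitled", "artist": "R. Rauschenberg", "acquired": "1998-03-01", "created": "1955"},
    {"title": "Orange Sweater", "artist": "E. Bischoff", "acquired": "1963-07-12", "created": "1955"},
]

# ArtScope orders only by acquisition; the same canvas could be re-sorted
# by creation date, artist, technique, and so on.
for art, row, col in grid_layout(collection, sort_key="created", columns=2):
    print(f"{art['title']:>15} -> cell ({row}, {col})")
```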

The video below illustrates navigation in the interface:

My presentation at HDRio2018

During the paper session “Social networks and visualizations”, held on April 11 at the HDRio2018 congress, I presented the work “Perspectivas para integração do Design nas Humanidades Digitais frente ao desafio da análise de artefatos visuais” (“Perspectives for integrating Design into Digital Humanities in the face of the challenge of visual artifact analysis”).

In this work, I outline initial considerations from a broader, ongoing research project that reflects on the contributions the field of Design can offer to the conception of a graphical user interface that, together with computer vision and machine learning technologies, supports browsing and exploring large collections of images.

I believe my contribution raises three main discussions for the field of Digital Humanities:

  1. The investigation of large collections of images (photographs, paintings, illustrations, videos, GIFs, etc.) using image recognition techniques through a Machine Learning approach;
  2. The valorization of texts and media produced on social networks as a valid source of cultural heritage for Digital Humanities studies;
  3. The integration of Design principles and methodologies (HCI and visualization techniques) into the development of tools to retrieve, explore, and visualize large image collections.

Slides from this presentation can be accessed here (Portuguese only).

PAIR Symposium 2017

The first People + AI Research (PAIR) Symposium brings together academics, researchers, and artists to discuss topics such as augmented intelligence, model interpretability, and human–AI collaboration.


The symposium is part of the PAIR initiative, a Google artificial intelligence project, and will be livestreamed on September 26, 2017, at 9 am (GMT-4).

The livestream will be available at this link.

Morning Program:
1) Welcome: John Giannandrea (4:55); Martin Wattenberg and Fernanda Viégas (20:06)
2) Jess Holbrook, PAIR, Google (UX lead for the AI project; talks about the concept of human-centered machine learning)
3) Karrie …, University of Illinois
4) Hae Won Park, MIT
5) Maya Gupta, Google
6) Antonio Torralba, MIT
7) John Zimmerman, Carnegie Mellon University

Machine Learning and Logo Design

The rise of neural networks and generative design is impacting the creative industry. One recent example of this is Adobe using AI to automate some of designers’ tasks.

This Fast Company article looks at the application of machine learning to logo design and touches on the question of whether robots and automation are coming to take designers’ jobs.

More specifically, the article describes Mark Maker, a web-based platform that generates logo designs.

In Mark Maker, you start by typing in a word, and the system then generates logos for it.

But how does it work? I’ll quote Fast Company’s explanation: “In Mark Maker, you type in a word. The system then uses a genetic algorithm–a kind of program that mimics natural selection–to generate an endless succession of logos. When you like a logo, you click a heart, which tells the system to generate more logos like it. By liking enough logos, the idea is that Mark Maker can eventually generate one that suits your needs, without ever employing a human designer”.
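To make the quoted mechanism a bit more concrete, here is a minimal sketch of that kind of loop in Python. It is my own illustration, not Mark Maker’s actual code: a “logo” is reduced to a parameter vector, and the user’s likes are the fitness signal that decides which candidates get mutated into the next batch.

```python
import random

# Hypothetical sketch of the selection loop the article describes:
# a "logo" is just a parameter vector (font id, hue, icon id, ...),
# and user likes drive which candidates get mutated into the next batch.

def random_logo():
    return {"font": random.randrange(50), "hue": random.random(), "icon": random.randrange(200)}

def mutate(logo, rate=0.3):
    child = dict(logo)
    if random.random() < rate:
        child["hue"] = min(1.0, max(0.0, child["hue"] + random.uniform(-0.1, 0.1)))
    if random.random() < rate:
        child["font"] = random.randrange(50)
    return child

def next_generation(liked, size=12):
    # No likes yet: keep exploring at random.
    if not liked:
        return [random_logo() for _ in range(size)]
    # Otherwise breed new candidates from the liked ones.
    return [mutate(random.choice(liked)) for _ in range(size)]

# One simulated round: the "user" likes the first two candidates.
batch = next_generation([])
liked = batch[:2]
print(next_generation(liked)[:3])
```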

I’m not sure we can say this tool actually applies design to create logos. Either way, it’s still a fun web toy. Give it a try!

“Human-augmented” design: how Adobe is using AI to automate designers’ tasks

According to a Fast Company article, Adobe is applying machine learning and image recognition to graphic and web design. Using Sensei, the company has created tools that automate designers’ tasks, like cropping photos and designing web pages.

Instead of a designer deciding on layout, colors, photos, and photo sizes, the software platform automatically analyzes all the input and recommends design elements to the user. Using image recognition techniques, basic photo editing such as cropping is automated, and an AI makes design recommendations for the pages. Drawing on photos already in the client’s database (and the metadata attached to them), the AI, which is layered into Adobe’s CMS, recommends which elements to include and which customizations the designer could make.
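As a naive illustration of what “automated cropping” can mean in practice (this is not Adobe Sensei’s method, just a toy example of mine), one can compute a crop box with a target aspect ratio around a focal point supplied by face or saliency detection:

```python
# Naive illustration of automated cropping (not Adobe Sensei's method):
# given an image size and a focal point (e.g. from face/saliency detection),
# compute a crop box with the requested aspect ratio centered on that point.

def auto_crop_box(width, height, focus_x, focus_y, target_ratio=16 / 9):
    # Largest box with the target ratio that fits inside the image.
    if width / height > target_ratio:
        crop_h, crop_w = height, int(height * target_ratio)
    else:
        crop_w, crop_h = width, int(width / target_ratio)
    # Center the box on the focal point, then clamp it to the image bounds.
    left = min(max(focus_x - crop_w // 2, 0), width - crop_w)
    top = min(max(focus_y - crop_h // 2, 0), height - crop_h)
    return left, top, left + crop_w, top + crop_h

# A 3000x2000 photo whose subject sits near the upper-left third.
print(auto_crop_box(3000, 2000, focus_x=900, focus_y=600))
```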

Should designers be worried? I guess not. Machine learning helps automate tedious and boring tasks. The vast majority of graphic designers don’t have to worry about algorithms stealing their jobs.

While machine learning is great for understanding large data sets and making recommendations, it’s awful at analyzing subjective things such as taste.

Designing ML-driven products

The People + AI Research initiative (PAIR), launched on July 10, 2017 by the Google Brain team, brings together researchers across Google to study and redesign the ways people interact with AI systems.

The article “Human-Centered Machine Learning” by Jess Holbrook¹ addresses how ML is causing UX designers to rethink, restructure, displace, and consider new possibilities for every product or service they build.

Both texts made me think about the image search and comparison engine I’m proposing from a user-centered point of view. I can take the following user needs identified by Martin Wattenberg and Fernanda Viégas and try to apply them to the product I plan to implement and evaluate:

  • Engineers and researchers: AI is built by people. How might we make it easier for engineers to build and understand machine learning systems? What educational materials and practical tools do they need?
  • Domain experts: How can AI aid and augment professionals in their work? How might we support doctors, technicians, designers, farmers, and musicians as they increasingly use AI?
  • Everyday users: How might we ensure machine learning is inclusive, so everyone can benefit from breakthroughs in AI? Can design thinking open up entirely new AI applications? Can we democratize the technology behind AI?

In my opinion, my research aims to address the needs of “domain experts” (e.g., designers and other professionals interested in visual discovery) and everyday users. But how do we design this image search and comparison engine through an ML-driven approach, or what Jess Holbrook calls “human-centered machine learning”? His text lists seven steps for staying focused on the user when designing with ML. However, I want to highlight a distinction between what I see as a fully ML-driven product (in the way Google builds them) and what I understand as a product that takes an ML approach in its conception but not in its entirety (that is, the engine proposed in my research).

A fully ML-driven product results in an interface that dynamically responds to user input: the pre-trained model performs its tasks during user interaction and the interface presents the output for each input. It can go even further: the model can be retrained on the user’s data during interaction, and the interface dynamically shows the results.
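Schematically, and purely as a hypothetical illustration (not any specific product’s code), a fully ML-driven product keeps the model inside the interaction loop, something like this:

```python
# Schematic contrast (hypothetical code, not any specific product): in a
# "fully" ML-driven product, the model runs inside the interaction loop.

class DummyModel:
    def predict(self, text):
        return f"prediction for {text!r}"
    def update(self, text, correction):
        print(f"retraining on ({text!r}, {correction!r})")

class FullMLDrivenApp:
    def __init__(self, model):
        self.model = model                      # pre-trained model loaded at startup

    def on_user_input(self, user_input):
        # The model performs its task *during* the interaction,
        # and the interface immediately shows its output.
        return self.model.predict(user_input)

    def on_feedback(self, user_input, correction):
        # Optionally, the model is even updated from user data while in use.
        self.model.update(user_input, correction)

app = FullMLDrivenApp(DummyModel())
print(app.on_user_input("sunset photo"))
app.on_feedback("sunset photo", "beach photo")
```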

In my research, on the other hand, the ML approach is used only during the image classification phase, which does not involve the end user. After we collect all images from Twitter (or Instagram), these data are categorized by the Google Vision API, which is driven by ML algorithms. The results of Google’s classification are then selected and used to organize the images in a multimedia interface. Finally, the user is able to search for images through text queries or by selecting filters based on the ML image classification. During user interaction, however, no ML tasks are being performed.
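A minimal sketch of that offline pipeline, assuming the google-cloud-vision Python client and images already downloaded to disk (the file paths, score threshold, and query below are illustrative, and exact class names may vary between client versions):

```python
# Sketch of the offline classification step, assuming the google-cloud-vision
# client library (pip install google-cloud-vision) and already-downloaded
# image files; paths and thresholds are illustrative.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def label_image(path, min_score=0.7):
    """Return Vision API labels for one image as (description, score) pairs."""
    with open(path, "rb") as f:
        image = vision.Image(content=f.read())
    response = client.label_detection(image=image)
    return [(label.description.lower(), label.score)
            for label in response.label_annotations if label.score >= min_score]

# Classification happens once, before any user interaction.
index = {path: label_image(path) for path in ["img/0001.jpg", "img/0002.jpg"]}

def search(query):
    """At interaction time, only this plain text/filter lookup runs: no ML."""
    query = query.lower()
    return [path for path, labels in index.items()
            if any(query in description for description, _ in labels)]

print(search("street art"))
```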

 

¹ UX Manager and UX Researcher in the Research and Machine Intelligence group at Google