A web-based platform for monitoring, structuring and visualizing the online response to the Rio 2016 Olympic Games on Twitter
OBSERVATÓR!O2016 collected and analysed around 1 million Twitter messages via Twitter’s public API from 18 April to 25 August 2016. Approximately 180 thousand of these tweets included images, which were also stored in our database. In order to gather different perspectives on the debate about the Olympics, we created seven different Twitter search queries, each with its own particularities. As the project unfolded and the Games drew closer, OBSERVATÓR!O2016 processed this data stream and visualized the textual messages, hashtags and images, seeking a better understanding of the audience response. The results were presented in eight main visualizations.
In order to capture tweets related to the Olympics, a custom infrastructure was developed and search query categories were set up based on predefined topics. Each of these search query categories has its own collection scripts and semantic particularities. Some categories collected tweets from the moment the web portal was launched, while others captured tweets only during the period of the Olympics. Furthermore, the categories Rio2016, Crítica and Oficial collected tweets via the Streaming API, while the categories Tocha, Paes, Atletas and Equipe captured tweets via the REST API. Finally, the scripts for collecting tweets were developed in Python, following the structure of Twitter API objects (Tweets, Users, Entities, Entities in Objects and Places).
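The split between streaming and REST collection can be sketched as a small routing table. The seven category names below come from the project; the query terms themselves are not given in the text and are purely illustrative placeholders.

```python
# Sketch of the seven search query categories and the Twitter API each one
# uses. Category names are from the project; query terms are hypothetical.
STREAMING = "streaming"  # continuous delivery via the Streaming API
REST = "rest"            # periodic polling via the REST Search API

CATEGORIES = {
    "Rio2016": {"api": STREAMING, "query": "..."},  # query terms not published
    "Critica": {"api": STREAMING, "query": "..."},
    "Oficial": {"api": STREAMING, "query": "..."},
    "Tocha":   {"api": REST, "query": "..."},
    "Paes":    {"api": REST, "query": "..."},
    "Atletas": {"api": REST, "query": "..."},
    "Equipe":  {"api": REST, "query": "..."},
}

def categories_for(api_kind):
    """Return, sorted, the category names collected through one API kind."""
    return sorted(name for name, cfg in CATEGORIES.items()
                  if cfg["api"] == api_kind)
```

A per-category collector script would look up its entry here to decide whether to open a streaming connection or poll the search endpoint.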
The storage infrastructure includes two SQL databases installed on a remote server: DB_Raw and DB_OO. The first, DB_Raw, stores all tweets together with their entire metadata. A crawler script then consults DB_Raw, selects only the tweets and metadata that will later be used in the visualisations, and stores them in DB_OO. The same script also recognises image links (pic.twitter.com), follows each of them to download the attached images, and saves these images in DB_OO. The search query category under which each tweet was captured is also saved as metadata.
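The DB_Raw-to-DB_OO selection step could be sketched as below. The function name and the exact set of retained fields are assumptions; the pic.twitter.com recognition mirrors how attached images appear as media entities in Twitter API objects.

```python
# Hedged sketch of the crawler's selection step: reduce a full raw tweet
# object to the subset of fields kept in DB_OO, and surface pic.twitter.com
# links so a later step can download the attached images.
def select_for_db_oo(raw_tweet, category):
    """Return the reduced record stored in DB_OO (illustrative fields)."""
    entities = raw_tweet.get("entities", {})
    # Attached images appear as media entities whose display URL is on
    # pic.twitter.com; downloading them is a separate step.
    image_links = [
        m["display_url"]
        for m in entities.get("media", [])
        if m.get("display_url", "").startswith("pic.twitter.com")
    ]
    return {
        "id": raw_tweet["id"],
        "text": raw_tweet["text"],
        "hashtags": [h["text"] for h in entities.get("hashtags", [])],
        "image_links": image_links,
        # The search query category the tweet was captured under is kept
        # as metadata, as described above.
        "category": category,
    }
```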
The OBSERVATÓR!O2016 web portal was developed on a Model-View-Controller (MVC) software architecture. The server side was developed using Django, a high-level Python web framework, and is directly connected to the database. The client side was designed using a responsive web development approach: a Content Management System (CMS) called Mezzanine, built on the Django framework, was used, and HTML5/CSS/JavaScript were applied to customize the web interface for multiple devices (desktop, tablet and mobile).
Most of the OBSERVATÓR!O2016 visualizations were developed by building on code libraries such as D3.js. A full list of the frameworks used is acknowledged on the website. Each visualisation represents a different collection of texts, images, connections or hashtags, depending on the search query categories consulted.
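A typical server-side aggregation feeding such a visualisation might count hashtag frequencies for one search query category and emit the list-of-records shape that D3.js charts commonly bind to. The function and output format below are assumptions for illustration, not the portal's actual API.

```python
# Hedged sketch: aggregate hashtag counts per category into the kind of
# JSON-serialisable structure a D3.js bar chart or word cloud could consume.
from collections import Counter


def top_hashtags(tweets, category, n=10):
    """Top-n hashtags (case-folded) among tweets of one category."""
    counts = Counter(
        tag.lower()
        for t in tweets
        if t["category"] == category
        for tag in t["hashtags"]
    )
    return [{"hashtag": h, "count": c} for h, c in counts.most_common(n)]
```

On the client side, D3.js would fetch this structure as JSON and map each record to a visual element.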