infographics – Parerga und Paralipomena
http://www.michelepasin.org/blog
At the core of all well-founded belief lies belief that is unfounded - Wittgenstein
Mon, 04 Jan 2016 18:44:15 +0000

Nature.com Subjects Stream Graph
http://www.michelepasin.org/blog/2016/01/03/nature-com-subjects-stream-graph/
Sun, 03 Jan 2016 00:28:08 +0000

The nature.com subjects stream graph displays the distribution of content across the subject areas covered by the nature.com portal.

This is an experimental interactive visualisation based on a freely available dataset from the nature.com linked data platform, which I’ve been working on in the last few months.

streamgraph

The main visualization provides an overview of selected content within the level 2 disciplines of the NPG Subjects Ontology. By clicking on these, it is then possible to explore more specific subdisciplines and their related articles.

For those of you who are not familiar with the Subjects Ontology: this is a categorization of scholarly subject areas which is used for indexing content on nature.com. It includes subject terms of varying levels of specificity such as Biological sciences (top level), Cancer (level 2), or B-2 cells (level 7). In total there are more than 2500 subject terms, organized into a polyhierarchical tree.

Starting in 2010, the various journals published on nature.com have adopted the subject ontology to tag their articles (note: different journals have started doing this at different times, hence some variations in the graph starting dates).

streamgraph2

streamgraph3

The visualization makes use of various d3.js modules, plus some simple customizations here and there. The hardest part of the work was putting the different page components together, so as to achieve a more fluent ‘narrative’ by gradually zooming into the data.

The back end is a Django web application with a relational database. The original dataset is published as RDF, so in order to use the Django APIs I’ve recreated it as a relational model. That also let me add a few extra data fields containing search indexes (e.g. article counts per month), so as to make the stream graph load faster.
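As a rough illustration of the "article counts per month" idea, here is a minimal sketch in plain Python. It is not the actual Django code behind the site: the sample records, the `monthly_counts` helper and the `"YYYY-MM"` key format are all assumptions made for the example. The point is simply that the counts are computed once, ahead of time, so the graph does not have to scan every article on each page load.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records: (subject term, publication date).
# The real dataset comes from the nature.com linked data platform.
articles = [
    ("Cancer", date(2012, 3, 14)),
    ("Cancer", date(2012, 3, 30)),
    ("Cancer", date(2012, 4, 2)),
    ("Genetics", date(2012, 3, 9)),
]


def monthly_counts(records):
    """Count articles per (subject, "YYYY-MM") pair in a single pass."""
    counts = Counter()
    for subject, published in records:
        counts[(subject, published.strftime("%Y-%m"))] += 1
    return counts


# Computed once and stored; the stream graph would then read these
# small aggregates instead of the full article table.
counts = monthly_counts(articles)
```

In a Django setting the same aggregates could be stored in an extra field or table and refreshed whenever the dataset is re-imported.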

Comments or suggestions are, as always, very welcome.

 

Infographics Course, Week 2
http://www.michelepasin.org/blog/2012/11/14/infographics-course-week-2/
Wed, 14 Nov 2012 23:16:15 +0000

Here are the materials related to the second week of the Introduction to Infographics and Data Visualization online course. This week we talked about two topics: a) Visual Perception and Graphic Design Principles and b) Planning for Infographics and Visualizations. The exercise focused on an interactive visualisation available on the New York Times website.

Key Concepts from Lesson 2

 


A) Visual Perception and Graphic Design Principles
-----------------------------------------

- The principles of design are grounded on how perception works
- Visual perception works differently from what we think 
	- we don't have photographs in our heads!
	- the eye passes on signals to the brain, which elaborates and makes assumptions about the perception
		- eg think of the classic geometrical visual illusions 
- Visual perception is an active process. The brain is not a passive receptor of information, but it completes, organizes and creates priorities (or hierarchies) and relationships to extract meaning
- If we know that, the goal of the designer should be to arrange compositions anticipating what the user's brain will most likely do

- The key to any visual design is the presentation of a cohesive, structured, readable and understandable composition.

==> check online examples by John Grimwade

First goal when arranging elements on a page:
	- think about a composition
	- Structure / Order / Hierarchy / Harmony / Balance
	
Main principles of Graphic Design
	Unity: presentation of a composition as an integrated whole
	Variety: is the opposite of unity, but also its complement. With too much variety, a composition will look random; with too much unity, it will look boring
	Hierarchy: the balance between unity and variety can lead to a good hierarchy
		- answers the question: where should I start reading the infographic?
		
Strategies for achieving Unity, Variety and Hierarchy
	Grids, Color, and Type
	
Grids
	- they can support unity thanks to a sense of alignment
	- help keep the consistency
	- key to using grids: think of the composition as a set of rectangles
	- first step in building an infographic: divide up the space into functional rectangles
		- headline / intro / map / chart / timeline etc...
		- tip: things that are stacked on top of each other should have the same width
		- tip: if objects are side by side, they should have the same height
		
Fonts, Colors
	- different fonts can be used to achieve variety, and support the creation of a hierarchy
	- same with colors: eg using a Copy Color (eg black), a Highlight Color (eg yellow) and a series of Neutral Tones (grey shades)
	
	

B) Planning for Infographics and Visualizations
-----------------------------------------

Creating a chart consists of making a data set adopt a visual shape

==> seminal paper by William Cleveland and Robert McGill on infovis
	- scale of charts that allow more accurate/generic judgements
		eg barcharts are accurate and facilitate comparison
		eg color gradation graphs or bubbles are generic (=show big trends)
	
- correlation coefficient
	- a formula that represents the correlation between two series as a value between -1 and 1
	- scatter plot: good at representing correlation between two variables
	- slope graph: another way to represent correlation (although it's usually employed to represent change over time)
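
The notes above do not say which correlation coefficient the lesson meant; assuming it is the Pearson coefficient (the most common one), a minimal sketch of the computation could look like this. The `pearson` function name and the sample series are made up for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric series.

    Returns a value between -1 (perfect inverse relationship) and
    1 (perfect direct relationship); values near 0 mean no linear relation.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of the products of the deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical series: the second is exactly twice the first,
# so the two are perfectly correlated.
r_direct = pearson([1, 2, 3, 4], [2, 4, 6, 8])
# Hypothetical series moving in opposite directions.
r_inverse = pearson([1, 2, 3], [3, 2, 1])
```

On a scatter plot, a coefficient close to 1 or -1 shows up as points lying near a straight line.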


- Types of charts
	- line charts: display variation of one or more magnitudes over a time period by means of rising and falling lines
	- Comparison Charts: visualization of amounts, each represented by a bar (or other objects)
	- Distribution Chart: division of a whole into its components. It can be represented by a circle ('pie') or by other objects, such as a divided bar
	- Correlation chart: shows the correlation between two (or more) variables. Also called scatter plots

- Common components of a graphic
	- headline: clearly states what the graphic is about, or makes a point
	- values
	- axes
	- sources/attribution info
	- byline: who made the graphic
	- legend
	
	
Styling a graphic with colors and fonts makes the graphic more readable (= create the hierarchy)

Things to avoid
- Don't distort charts (eg with 3D effects), especially pie charts, as it makes it more difficult to compare areas
- Avoid vertical labels
- Avoid backgrounds that distract attention from the main graphics (eg photographs etc)
- Avoid creating overloaded compositions

The Design Process
- Learn as much as you can about the topic
- Identify goals and challenges
- Prototype and sketch
- Test and tweak
- Turn the project in

 

The exercise

Infographics_week2

See the following graphic, by The New York Times, an interesting project that allows you to compare the words that were used in the National conventions. Imagine that you are hired by Steve Duenes, infographics director at the Times, to make a constructive critique of that piece. What would you say about it?

Here are my comments about the NYT visualization:

– The infographic lets you compare the two parties’ usage of a given word rather intuitively, so in that sense it is functional. The visualisation based on the bubbles’ areas is usually clear; in cases where the two areas are very similar you can still get the ratio right thanks to the percentage numbers displayed.

– Being able to move the bubbles around is eye-catching and fun, but it adds no real value to the tool (= no extra functionality other than maybe organising the words some other way). It is actually detrimental to understanding the structure and purpose of the visualisation: the bubbles’ locations seem to imply a semantic correlation of some sort, but unfortunately there isn’t one.

– I’d expect to be able to filter the quotes by speaker, so as to compare the usage of a certain word only between two specified people (e.g. Obama and Romney). Unfortunately that is not possible. Also, it’d be nice to be able to order or re-organise the quotations on a timeline, so as to explore potential patterns in the increase/decrease of use of a word. All of this could easily be achieved by introducing a ‘filter by’ panel right on top of the quotation columns. Other types of filters are possible too, eg geographical ones (by state or region), or by the importance of the speakers’ ‘roles’ (mayors, governors).

– The main issue from the interactivity point of view is that when you click on a bubble (assuming a user understands that’s what he/she has to do – btw no tooltips at all!) it’s not immediately obvious that the bottom part of the screen gets updated. I’d add some mechanism, such as a partial screen refresh, or a ‘loading’ icon, that would make this process more transparent. 

– There is no way to remove a bubble once you’ve added it. So if you’re trying to compose your own ‘view’ of the tool by selecting only the words you are interested in, once you get something wrong you’re stuck with it (you can only restart from scratch by reloading the page).

– The 4 static captions at the bottom (AUTO, WOMEN etc..) are ok at the beginning of the visualisation, but once you start moving things around they don’t update at all which is not really the expected behaviour. 

– If the full transcripts the quotations derive from are available online, it’d be nice to be able to link directly to them, e.g. by clicking on the quotations themselves. This would make it possible to investigate further the original context of use of a word.

– Having small photos next to the speakers’ names would make it easier to identify these people; also, it shouldn’t be too difficult to include links to each person’s home page or Wikipedia entry.

 

Infographics Course, Week 1
http://www.michelepasin.org/blog/2012/11/06/infographics-course-week-1/
Tue, 06 Nov 2012 23:18:38 +0000

This is a short summary of the activities in week 1 of the Introduction to Infographics and Data Visualization massive online course offered by the Knight Center for Journalism at the University of Texas at Austin. I’ll be posting the course materials and exercises here on the blog, so stay tuned if you want more.

The course is hosted by Alberto Cairo, author of the book ‘The Functional Art’. It’s been only a week since we started, but I can definitely tell that the quality of both the teaching materials and the overall e-learning platform is very high. So I’d highly recommend it to anyone interested in deepening his/her knowledge of such topics. It’s too late to sign up for it now, but there will be another class running in early 2013 so keep an eye on their site if you don’t want to miss the next enrollment.

Key Concepts from Lesson 1

 

- Infographics is a piece of functional art (different from pure art)
- the stuff in the world is shapeless and useless, it requires people to give it a form (to model it) according to some specific purpose
- our world is not about ideology anymore, it's about complexity (Matt Taibbi in Griftopia)
- the role of information designers is to model that raw material and make sense of it
- infographics is not just about summarizing and organizing data, but it's also about letting readers explore those data
- a graphic is a tool: it extends our skills and capacities
- any good infographics is as functional as a hammer - the design predetermines the function the tool should have.
- any good infographics is as multilayered as an onion - eg as in a summary of main points, plus more in depth examinations
- any good infographics is as beautiful and true as a mathematical equation

- function doesn't dictate form, but it restricts the variety of forms that are acceptable to use for each set of data

- classic distinction:
	- infographics: presents information in a way that becomes meaningful; it's an edited story based on data
	- information visualization: fine-tuned so to support exploration; doesn't tell a particular story, but it allows users to create their own story (it's unedited).
- for Cairo, the difference is very fuzzy, often the two things are mixed together

- Definition: a good infographics lets you answer questions more efficiently 

- considering infographics only as art is wrong: infographics are tools. Often graphics with no structure or no context are presented as infographics, but they are not a visual representation of the data, just a simple page layout exercise with a bunch of unrelated numbers

- numbers have little meaning if they can't be compared with other data (eg summaries) or if I can't relate them to my life (eg contextual information)

- infographics should be constructed in many layers so that data could be cross-compared in many ways

- choosing the correct 'visual metaphor' is essential for an infographics

- if your goal is to let users compare numbers, it's better not to use bubble charts but bar charts! The human brain is not good at comparing sizes of bubbles. Bubbles are good for presenting overall patterns, but not good for precise comparisons.

- the 'onion' approach allows you to represent data in different ways to facilitate different kinds of tasks

- Infographics and visualization must be considered as visual tools for communication, understanding and analysis

- Charles Joseph Minard 1869: considered by many (eg Tufte) the best visualization ever!
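
A quick bit of arithmetic helps to see why the notes above recommend bars over bubbles for precise comparisons. The sketch below assumes bubbles are drawn with their area proportional to the value (the usual convention); the `bubble_radius` helper and the two sample values are made up for the example.

```python
import math


def bubble_radius(value, scale=1.0):
    """Radius of a circle whose area is scale * value."""
    return math.sqrt(scale * value / math.pi)


# Two hypothetical values, the second twice as big as the first.
small, big = 100, 200

# A bar's length grows in step with the value: the bar doubles.
bar_ratio = big / small

# A bubble's radius grows only with the square root of the value:
# the bubble gets about 41% wider, not twice as wide.
radius_ratio = bubble_radius(big) / bubble_radius(small)
```

So a value that is twice as large produces a bar that is visibly twice as long, but a bubble whose width increases by a factor of only about 1.41, which makes the exact ratio much harder to judge by eye.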

The exercise

Infographics_week1

See the following graphic (socialwebinvolvement.jpg) and try to answer the questions

1) Is this infographic really “functional” in the sense of facilitating basic, predictable tasks (comparing, relating variables, etc.)?
Not really. If we think of an infographics as a ‘technology’ this is certainly a very poor one. Apart from the fact that China and USA have the biggest audiences (in absolute terms) it’s extremely difficult (if not impossible) to use the graphics as a tool in any other way. Several variables are presented, but we can’t compare, organize or correlate the data because they’re all expressed in a way that doesn’t support those actions. Moreover the choice of using bubble charts is not appropriate, as they make it harder for people to make comparisons among surface sizes.

2) If not, how could it be improved?
Similarly to what is discussed in chapter two of the handouts (see the armed forces employees example), I’d do the following things:
a) eliminate the bubble charts, and replace them with bar charts
b) create bar charts for both absolute and relative values using a derived variable
c) improve the rendering of the labels, since now they’re a bit too small and hard to read. This could be done for example by using a specialized type of bar chart where the ‘bar’ is divided into 5 continuous coloured segments corresponding to the types of social web involvement.
d) keep the geographical map in order to provide some context; however, it could be shrunk, positioned at the bottom and used primarily to highlight which are the countries being examined in the experiment (eg by having their areas in a different color)

3) What kind of headlines, intro copy, and labels could it include to make it meaningful for a broad audience?
I think that the correlation between the label colors and their meaning should be made more explicit.

4) What other variables (if any) should be gathered/analyzed if we want to give an accurate portrait of Internet users across countries? Could we go beyond what is currently presented? Can we provide a better context for the data?
It’d be nice to have a sense of the total number of users per country, versus people that admittedly don’t use the internet (or social media). Also, it’s not clear whether the 32000 users interviewed have been split proportionally to the total population of the countries taken into consideration, or not.
Other variables that it’d be nice to investigate are
– means of access to the internet: eg mobile phone, computer, tablet
– age distribution
– overall context of internet usage: eg leisure, work, education

Other Approaches

Here is the work of other students (not me) who tried (with impressive results) to redesign the graphic above:

http://www.flickr.com/photos/89317425@N05/8150814858/
http://dl.dropbox.com/u/43885573/Draft2.jpg
http://www.flickr.com/photos/aaugur/8144159956/sizes/k/in/photostream/
http://www.flickr.com/photos/rubenvalero/8139950164/
http://n79.org/infographics/asg1/
http://public.tableausoftware.com/views/SocialWebInvolvement_1/Dashboard1?:embed=y

 
