21 July 2020

How to make a coronavirus data visualization that counts

It’s easy to get it wrong, so here are 5 principles for getting it right.

Gemma Conroy

matejmo/Getty

The coronavirus crisis has shown how easily a poorly drawn or chosen map, chart, or data visualization can be misinterpreted, with potentially grave consequences.

In one recent example, researchers used a map of global airline routes from 2012 in a tweet promoting their new coronavirus preprint. Their preprint estimated how COVID-19 spread from Wuhan, China to the rest of the world through air travel.

The researchers, from the University of Southampton in the United Kingdom, did not give any context for the 2012 map in their tweet. This led social media audiences to wrongly interpret it as a map of the spread of COVID-19.

News outlets and Twitter users described the map as “horrifying” and “alarming”.

The tweet has since been deleted by the researchers.

Hassan Vally, an applied epidemiologist at La Trobe University in Melbourne, Australia, says coronavirus visualizations have the potential to make or break public health decisions.

“People can either underestimate what’s going on and underreact, or they become overly fearful and overreact,” says Vally. “But when communicated right, data visualizations can be the most powerful tools for changing the way people think.”

Below are guidelines for designing COVID-19 graphs, charts, and maps.

1. Communicate what you don’t know

Capturing the fast-evolving scale of the pandemic in a single visualization can be a challenge. There is still a lot of uncertainty in the data on the clinical effects of the virus and its spread.

To avoid misleading readers, such knowledge gaps should be communicated, particularly when you’re visualizing projections based on modelling rather than empirical data.

“Everything can look like the rock-solid truth, when in reality there is a huge amount of uncertainty that isn’t being communicated,” says Vally. “If this isn’t addressed on day one, and the story changes on day two, it can undermine people’s confidence [in the science].”

Vally recommends telling readers what is known, what is not known, what they need to do, and how your research team or organization is trying to improve the situation.

“That’s the basic framework for communication in these emergency situations,” he says.

2. Understand the data you’re working with

A deluge of COVID-19 data is being recorded, but it isn’t always perfect or consistent. A common mistake is to underestimate how these limitations can skew the message of a visualization.

For example, individual countries (and states) report COVID-19 deaths in different ways, so comparing them without acknowledging those differences can present an inaccurate picture.

The wrong data may also be visualized, for example when raw case counts, rather than counts normalized by population, are used to map the progression of the pandemic.
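
As a rough illustration of why normalization matters, the sketch below converts raw counts into cases per 100,000 residents. The region names and figures are invented for the example and are not real COVID-19 data; the helper name cases_per_100k is likewise hypothetical.

```python
# Illustrative only: the regions and figures below are made up,
# not real COVID-19 data.
regions = {
    "Region A": {"cases": 5000, "population": 10_000_000},
    "Region B": {"cases": 1200, "population": 400_000},
}


def cases_per_100k(cases: int, population: int) -> float:
    """Return the number of cases per 100,000 residents."""
    return cases * 100_000 / population


# Normalize every region so they can be compared on the same footing.
rates = {
    name: cases_per_100k(r["cases"], r["population"])
    for name, r in regions.items()
}

for name, rate in rates.items():
    print(f"{name}: {regions[name]['cases']} cases, {rate:.0f} per 100,000")
```

In this made-up case, Region A has more than four times as many cases as Region B, yet Region B's rate per 100,000 residents is six times higher. A map shaded by raw counts and one shaded by rates would tell opposite stories.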

“There are lots of people making maps now, and generally that’s a good thing,” says Amy Griffin, a geospatial scientist at the Royal Melbourne Institute of Technology in Australia. “But when they don’t really understand the data that they’re mapping, they can make maps that are misleading.”

3. Choose colours wisely

Using colour is a powerful way to get a message across quickly, but ingrained visual cues need to be taken into account.

It is a convention of data visualization to use darker colours for high values and lighter shades for low values. As a result, most people associate pale colours with fewer cases of the disease and darker colours with more cases.
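
The light-for-low, dark-for-high convention can be sketched as a simple mapping from a value to a shade of grey. This is a minimal illustration, not the method used for any map discussed here; the function name sequential_shade, the grey range, and the case counts are all assumptions made for the example.

```python
def sequential_shade(value: float, vmin: float, vmax: float) -> str:
    """Map a value to a hex grey: light for low values, dark for high."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Clamp into [0, 1] so out-of-range values still get a valid colour.
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Grey level runs from 240 (near white) down to 40 (near black).
    level = round(240 - t * 200)
    return f"#{level:02x}{level:02x}{level:02x}"


# Hypothetical case counts, for illustration only.
for count in (0, 50, 100):
    print(count, sequential_shade(count, 0, 100))
```

Because the shade darkens steadily as the value rises, a reader never meets a dark area that stands for a low number.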

The map below of active cases in Victoria, Australia, shows how the use of dark shading to indicate both high and low numbers of COVID-19 cases is easily misinterpreted.

Active COVID-19 Cases in Victoria, 22 June 2020. Source: Victorian Government Department of Health and Human Services via The Conversation

The map has since been redesigned with more conventional colour shading.

Active COVID-19 Cases in Victoria, 20 July 2020. Source: Victorian Government Department of Health and Human Services

The bright red highlighting global flight paths in the 2012 map mentioned earlier was appropriate when it was originally published, but for a tweet about the coronavirus pandemic, it appears alarming.

“Different colour schemes are good for different purposes,” says Griffin. “A good question to ask yourself is, ‘Why am I making this map, and what do I want people to understand?’”

4. Tell your audience how to read your visualization

The fact that “flatten the curve” became a catchphrase of the pandemic shows how effective line graphs can be in communicating the pattern of coronavirus cases over time.

But without an adequate explanation of the data behind such graphs, it’s easy for readers to misunderstand the message.

For instance, logarithmic visualizations – which show exponential growth over long periods of time – can make coronavirus cases appear as if they are slowing down, rather than increasing. This is because the values between intervals on the y-axis increase in a non-linear way, so the largest numbers in the scale may be hundreds or thousands of times greater than the smallest.

See the example below, showing the number of confirmed COVID-19 cases in the United States, Brazil, European Union, India, Russia, and the United Kingdom as of 19 July 2020.

Source: Johns Hopkins University CSSE 91-DIVOC

Linear graphs show values plotted in even intervals (1, 2, 3 or 10, 20, 30, for example), resulting in a much steeper curve. The example below shows the number of confirmed COVID-19 cases in the United States, Brazil, European Union, India, Russia, South Africa, and the United Kingdom as of 19 July 2020.

Source: Johns Hopkins University CSSE 91-DIVOC
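
The difference between the two scales can be checked with a little arithmetic. The sketch below uses a hypothetical series in which cases double at every step (not real data) and compares how much height each step adds on a linear axis with how much it adds on a base-10 logarithmic axis.

```python
import math

# Hypothetical series: cases double at every step, starting from 100.
cases = [100 * 2 ** step for step in range(8)]

# On a linear axis, the height gained at each step is the raw difference.
linear_steps = [b - a for a, b in zip(cases, cases[1:])]

# On a log10 axis, the height gained is the difference of the logarithms.
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(cases, cases[1:])]

print(linear_steps)                       # each step is larger than the last
print([round(s, 3) for s in log_steps])   # every step is the same size
```

Steady doubling therefore climbs ever more steeply on a linear chart but appears as a straight line on a logarithmic one. A line that bends towards the horizontal on a logarithmic chart indicates that the growth rate is falling, even though cases are still rising, which is why the scale needs to be spelled out for readers.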

Griffin says that misunderstandings can be avoided by doing something as simple as adding caveats or definitions in an obvious place alongside the graph, map, or visualization, or adding detailed descriptions in the accompanying text.

“Don’t just give them a map, tell them how to interpret it,” she says.

5. Don’t forget about the context

COVID-19 data is often visualized without enough context, says Griffin. For example, fatality rates in different countries may be reported without any indication of how many tests have been conducted.

“You may have numbers, but do they reflect what’s actually happening in the population?” says Griffin.
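
One way to see why testing matters is to compute deaths as a share of confirmed cases for two hypothetical countries with identical death tolls but very different amounts of testing. All names and figures below are invented for illustration, and the helper case_fatality_ratio is an assumption of this sketch rather than a method described by Griffin.

```python
def case_fatality_ratio(deaths: int, confirmed_cases: int) -> float:
    """Return deaths as a percentage of confirmed cases."""
    return deaths * 100 / confirmed_cases


# Hypothetical countries: same number of deaths, different testing effort.
countries = {
    "Country X": {"deaths": 100, "confirmed": 8000, "tests": 200_000},
    "Country Y": {"deaths": 100, "confirmed": 2000, "tests": 20_000},
}

ratios = {
    name: case_fatality_ratio(c["deaths"], c["confirmed"])
    for name, c in countries.items()
}

for name, c in countries.items():
    print(f"{name}: {ratios[name]:.2f}% of confirmed cases died "
          f"({c['tests']:,} tests conducted)")
```

In this invented example, Country Y appears four times deadlier than Country X, but the gap may simply reflect that it tested less and so confirmed fewer cases. Showing the number of tests alongside the rate gives readers the means to judge that for themselves.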

Vally adds that it’s important to consider how the data you are working with could be used to inform public health decisions.

“It’s about simplifying complexity, without oversimplifying what’s going on,” says Vally.

“There’s no shortage of people who can find data and put it into a visualization, but we need people to think very carefully about the data before making interpretations.”
