
Visualizing Data

Seeing patterns. Spotting trends. Going beyond Excel.

Data visualization is necessary because you don’t know anything about my data, and vice versa. A chart of crime rate is not an actual picture of a victimized column killed by an error bar to the head, X and Y axes acting as police tape to keep back the bystanders of numbers and units. You know that. When one sees a graph or chart or map, one is seeing a visualization of data. Presented below in no particular order of importance (except perhaps the first one) are resources that I have found helpful in visualizing numeric data.

FlowingData

FlowingData is a site all about visualizing data and teaching others how to do it. And to do it well. Created by Nathan Yau, this site, at minimum, offers graphic design inspiration to the visitor. I personally began following this site because of his presentation of datasets that anyone would find interesting, even if the data presented is sometimes linked in from other sites (example). FlowingData offers a paid membership, which unlocks a four-week training course in R (a “statistics” program) and numerous, constantly added tutorials on data analysis and visualization. Much of the training utilizes R. Yau really, really likes R. In addition, he has written some books and provides guides to certain topics in data visualization.

R + ggplot2

R is the statistics and programming language mentioned just above. But rather than explain its uses here, I’ll explain over here. Here is an example of plotting with R and the ggplot2 package.

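library(dplyr)    # provides the %>% pipe
library(ggplot2)

# `soil` is my data frame of soil test results, loaded beforehand.
# Fit Mehlich III phosphorus against Olsen phosphorus; the model is named
# `fit` so it does not mask the lm() function.
fit <- lm(p_meh3 ~ p_olsen, data = soil)

soil %>%
  ggplot(aes(x = p_olsen,
             y = p_meh3)) +
  geom_smooth(
    method = "lm",
    color = "#900000",
    se = FALSE,
    linewidth = 0.5  # use `size` on ggplot2 versions before 3.4.0
  ) +
  geom_point() +
  annotate(
    geom = "text",
    x = 10,
    y = 150,
    label = paste("R^2 == ", round(summary(fit)$r.squared,
                                   digits = 2)),
    parse = TRUE,  # render the label as a plotmath expression
    family = "Roboto Condensed"
  ) +
  labs(
    x = "Olsen (ppm)",
    y = "Mehlich III (ppm)",
    title = "Correlation of Olsen and Mehlich III soil phosphorus extractions",
    caption = "source: gradcylinder"
  )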
## `geom_smooth()` using formula 'y ~ x'
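
The soil data behind that plot is not included here, so the following is only a rough sketch of where the annotated R² value comes from. It uses made-up numbers in a hypothetical `soil_demo` data frame (the slope, noise level, and sample size are all assumptions for illustration) and sticks to base R.

```r
# Simulated stand-in for the soil data; these values are invented for illustration.
set.seed(42)
soil_demo <- data.frame(p_olsen = runif(30, min = 5, max = 60))
soil_demo$p_meh3 <- 2.5 * soil_demo$p_olsen + rnorm(30, sd = 10)

# Same kind of model as in the plot: one response, one predictor
fit <- lm(p_meh3 ~ p_olsen, data = soil_demo)
r2 <- summary(fit)$r.squared

# With a single predictor, R^2 equals the squared Pearson correlation
r2_from_cor <- cor(soil_demo$p_olsen, soil_demo$p_meh3)^2

# The string handed to annotate(); parse = TRUE turns it into a plotmath expression
label <- paste("R^2 ==", round(r2, digits = 2))
print(label)
```

The two R² values should agree, which is a handy sanity check before putting a number on a chart.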

Google Sheets

I love using Google products for one main reason: I can access the data confidently from any computer. Because the data is stored on Google servers, I don’t have to worry (as much) about accidentally deleting and losing an important spreadsheet file from an external drive.

Google Sheets is more or less a full spreadsheet program that runs in a browser. Sheets also allows for real-time collaboration on a dataset. Simply create a shareable link and give it to a colleague. Making charts is quite simple, and Google has quite a lot of chart options, including basic maps and motion charts. If you don’t know what a motion chart is, check this out. Google Sheet charts can be embedded into websites, updating whenever changes to the chart data are made.

Microsoft Excel

Is there much for me to explain here? You know what Excel is, right? When creating graphs in Excel, I do like being able to quickly change the overall appearance of the graph by selecting different templates provided by Microsoft. With that said, not all the templates are visually appealing. Some of the automatically assigned colors are kind of bland. And exporting graphs? In my experience, that can only really be done by copying and pasting into PowerPoint or Word.

Datawrapper

Datawrapper quickly and intuitively takes data, even data that is just copied and pasted, and provides easy chart-making tools, no coding required. Datawrapper has been growing its features and abilities for a while now, and expanding its knowledge base for those just learning how to make visualizations. For example, here are some basic cartography instructions for creating quick choropleth maps. Since I last used it, the site has added more chart types, and it has chart annotation options, too. And they are German, so things work like a Mercedes-Benz F1 car.

Tableau

Tableau offers data visualization and analysis software, including Tableau Public, a free program and platform that streamlines visualization and online publishing. The visualizations are far more comprehensive than those in Sheets or Excel, giving the user the ability to create dashboards or stories out of multiple graphs. Here is an example of a dashboard that recently won a student contest. As for me, I haven’t used this software much, but I was impressed by how quickly I could make a simple map.

RAW

RAW is a nifty little site for designers and vis geeks that:

“…aims at providing a missing link between spreadsheet applications (e.g. Microsoft Excel, Apple Numbers, OpenRefine) and vector graphics editors (e.g. Adobe Illustrator, Inkscape, Sketch).”

Creating a graph is easy and intuitive, as explained here. Labeling axes and adding titles and legends are not options. The idea here is to export the graph as a vector for post-design. But at least they have improved RAW to ignore null values in my data, whereas before it plotted them as points along y = 0.00 (except for whatever is happening in the top left corner).

[Figure: chart exported from RAW]

DrawMyData

And if one needs a quick chart to demonstrate something (e.g. a correlation, or a basic pattern from a chart one recently saw but doesn’t have the data to reproduce), then one should try DrawMyData. This site makes it really easy to add points to a chart and export the data to a .csv file.
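
Once the points are exported, pulling them into R takes only a few lines. This is a minimal sketch, assuming the exported file has two numeric columns named `x` and `y` (I have not confirmed the exact column names of the export, so treat them as placeholders); the inline text stands in for a real downloaded file.

```r
# Hypothetical stand-in for a .csv exported from DrawMyData.
# The column names x and y are an assumption about the export format.
csv_text <- "x,y
1,2.1
2,3.9
3,6.2
4,7.8
5,10.1"

# With a real export, this would be read.csv("path/to/exported-file.csv")
drawn <- read.csv(text = csv_text)

# Pearson correlation of the hand-drawn points
r <- cor(drawn$x, drawn$y)
print(r)

# Quick base R scatter plot to confirm the pattern looks as intended
plot(drawn$x, drawn$y, xlab = "x", ylab = "y")
```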

ChartAccent

ChartAccent is a really useful tool for creating basic graphs that need annotation. Sometimes, graphs speak for themselves, but often you’ll need to draw the eyes of viewers to a certain data point, trend, or pattern, or at least acknowledge the data source if it is not your own.
