Chapter 5 ggplot2 for history: advantages and challenges

ggplot2 provides many tools for creating almost any kind of graph from a dataset, a coordinate system and geoms, i.e. the visual marks that represent the data.

Accordingly, ggplot2 consists of several basic components (data, aesthetics and coordinates), which are combined as follows:

ggplot(data) + <geom_function>(mapping = aes(<mappings>), stat = <stat>, position = <position>) + <coordinate_function> + <facet_function> + <scale_function> + <theme_function>
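As a minimal sketch of how this template can be filled in, the example below plots a small invented dataset. The data frame `events` and its columns (`year`, `participants`, `type`) are hypothetical and serve only as an illustration. The particular geom, facet, scale and theme are arbitrary choices, not a recommendation.

```r
library(ggplot2)

# Hypothetical data: a handful of invented events, for illustration only
events <- data.frame(
  year         = c(1789, 1830, 1848, 1871, 1905, 1917),
  participants = c(12, 5, 20, 8, 15, 30),  # invented counts, in thousands
  type         = c("urban", "urban", "mixed", "urban", "mixed", "mixed")
)

p <- ggplot(data = events) +                            # data
  geom_point(mapping = aes(x = year, y = participants,  # geom and aesthetics
                           colour = type),
             stat = "identity", position = "identity") +
  coord_cartesian() +                                   # coordinate system
  facet_wrap(~ type) +                                  # facets
  scale_colour_brewer(palette = "Dark2") +              # scale
  theme_minimal()                                       # theme

print(p)
```

Each line after `ggplot()` corresponds to one slot of the template. Most of them can be omitted, in which case ggplot2 falls back on its defaults.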

It is highly customizable and can render print-ready graphs. However, ggplot2 graphics are static rather than interactive. This makes ggplot2 relatively fast, but also less user-friendly when one needs more than an image and wishes to add dynamic parameters. A static approach works perfectly for plotting thousands of entities on one graph and for showing tendencies or the overall picture. Nevertheless, a user cannot interact with individual entities in a static image. Such interaction can be essential for historical visualization, when the aim is to show a particular element that carries a lot of information in the context of the whole system, or its relationships with other elements.

For instance, even such a simple element as a popup for a point, which shows more detailed information about that particular entity (for example the year, the title in the source, a brief description), requires an additional JavaScript extension, because the standard ggplot2 library does not provide this function by default. Selecting a particular entity is not possible by default either. However, it can be added manually in the form of a brush, where the selected data are shown in a separate output element such as a data table or plain text. Brushing is commonly used with ggplot2 visualizations, and in the shiny documentation the corresponding function looks like this:

brushedPoints(df, brush, xvar = NULL, yvar = NULL, panelvar1 = NULL,
  panelvar2 = NULL, allRows = FALSE)

Implementing this element takes a couple of steps. First of all, it can be done only within the shiny framework, which I will consider in detail in the next part of the tutorial. Secondly, at a very basic level this approach assumes that each entity in the dataset has exactly one value for the x axis and one for the y axis, which is often not the case for historical data. For example, suppose we would like to visualize a dataset that contains information about revolutions in some period of history. We have titles, years and key groups of participants. Everything works fine until we discover that some entities have more than one group of participants. How do we tell R that such an entity belongs to two groups? Let’s discuss this question in the next part, which focuses on the specifics of historical data and how to approach them in an optimal way.
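To make the brush mechanism concrete, here is a rough sketch of a shiny app in which dragging a rectangle over the plot lists the selected rows in a table. The data frame `revolutions`, its columns and the input id `plot_brush` are invented for illustration. The sketch covers only the simple case, where every entity has exactly one x value and one y value.

```r
library(shiny)
library(ggplot2)

# Hypothetical data: one row per event, one x and one y value each
revolutions <- data.frame(
  title        = c("Event A", "Event B", "Event C", "Event D"),
  year         = c(1789, 1830, 1848, 1917),
  duration     = c(120, 3, 18, 9),  # invented values, in months
  participants = c("peasants", "workers", "students", "soldiers")
)

ui <- fluidPage(
  # brush = "plot_brush" exposes the selection as input$plot_brush
  plotOutput("plot", brush = "plot_brush"),
  tableOutput("selected")
)

server <- function(input, output) {
  output$plot <- renderPlot({
    ggplot(revolutions, aes(x = year, y = duration)) +
      geom_point(size = 3)
  })

  # brushedPoints() returns the rows of the data frame inside the brush;
  # with xvar and yvar left NULL they are inferred from the ggplot2 mapping
  output$selected <- renderTable({
    brushedPoints(revolutions, input$plot_brush)
  })
}

app <- shinyApp(ui, server)
# Start the app interactively with: runApp(app)
```

Before anything is brushed, `input$plot_brush` is `NULL` and `brushedPoints()` returns zero rows, so the table starts out empty.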

5.1 Historical data and graphic libraries

Historical datasets contain a significant element of uncertainty, which distinguishes them from other types of data. It is caused by sources that can give different information about events, persons and locations. In addition, historical entities often correspond to several categories at once, and this needs to be reflected in the visual presentation and in the search options.

In our projects I have worked with entities of this sort, which both contained uncertain data and corresponded to several parameters at once. Here is an example:

This is a presentation of a .bib file in the form of a data table. In the ‘Keywords’ column we see that journal articles correspond to several categories. How could we present them graphically? For example, we might like to see only the sources among these entities and exclude the other categories. In this case ggplot2 would need some extension, because otherwise R treats the whole comma-separated string as a single value and does not see the comma as a separator.
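One way to see the problem, and one possible way around it, is to split the keyword strings explicitly. The sketch below uses only base R and a hypothetical stand-in for the table above: the data frame `bib` and its contents are invented. It is not the shiny-based approach suggested in this chapter, only an illustration of how a multi-valued column can be turned into one row per category.

```r
# Hypothetical stand-in for a .bib file converted to a data table
bib <- data.frame(
  Title    = c("Article A", "Article B", "Article C"),
  Year     = c(1998, 2005, 2011),
  Keywords = c("source, law", "historiography", "source, economy, trade"),
  stringsAsFactors = FALSE
)

# Split each Keywords string on commas and trim surrounding whitespace
keyword_list <- lapply(strsplit(bib$Keywords, ",", fixed = TRUE), trimws)

# Repeat each row once per keyword: one row per (entry, keyword) pair
row_index <- rep(seq_len(nrow(bib)), lengths(keyword_list))
bib_long <- bib[row_index, c("Title", "Year")]
bib_long$Keyword <- unlist(keyword_list)
rownames(bib_long) <- NULL

# Keep only the entries tagged as sources
sources <- bib_long[bib_long$Keyword == "source", ]
print(sources)
```

In the long table an article with several keywords appears several times, so it can be counted or filtered under each of its categories.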

I suggest using an interactive shiny app. Let’s create a simple shiny structure with our data, a UI and a server. The only libraries we need are shiny, ggplot2 and DT.