obsplot is still at an early stage. In particular, its API may change in the future, either to improve the package itself or to follow the evolution of Observable Plot. It may not be suitable for production use right now.
Also note that obsplot is not suitable for charting very large datasets: the generated plots are in SVG format, and when it is used in RMarkdown or Shiny the underlying data are included in the output as JSON.
obsplot is not on CRAN yet, but it can be installed from GitHub with:
Or from R-universe with:
install.packages("obsplot", repos = "https://juba.r-universe.dev")
Don’t forget to load the library with:
Suppose we want to create a very simple dot chart from the penguins dataset of the palmerpenguins package:
To create such a chart, we first initialise it with obsplot(). We pass the data frame containing the data to plot as an argument:
We then add a graphical mark to create our chart. Here we use the dot mark by piping the mark_dot function. We pass as arguments the x and y channels, giving the corresponding data frame columns:
Here we passed the data frame columns as symbols, but we can also use character strings instead:
We can add other channels, for example by changing the dot color according to another variable:
We can also add constant options to a mark to modify an attribute in the same way for all dots:
We can also add global options to the chart with the opts() function:
Finally, we can modify the way variable values are mapped to graphical attributes by using scale functions:
obsplot(penguins) |>
  mark_dot(x = bill_length_mm, y = bill_depth_mm, stroke = island, r = 2) |>
  scale_color(scheme = "set1") |>
  opts(grid = TRUE)
To go a bit deeper, we have to look at the fundamental concepts of Observable Plot: marks, faceting, scales and transforms.
Marks are the fundamental building blocks of Observable Plot charts. Each mark is a graphical representation of some data in a specific form: dot, line, area, text…
In Observable Plot, marks are defined by giving a marks option to the Plot.plot() function. In obsplot, it is done by piping one or more of the mark_* family of functions. In the following example, we add three different marks to create a scatterplot with two rules for the x and y mean values:
mean_length <- mean(penguins$bill_length_mm, na.rm = TRUE)
mean_depth <- mean(penguins$bill_depth_mm, na.rm = TRUE)
obsplot(penguins) |>
  mark_ruleY(y = mean_depth) |>
  mark_ruleX(x = mean_length) |>
  mark_dot(x = bill_length_mm, y = bill_depth_mm)
A mark function takes several arguments. The first one is an optional data object. If not specified, it is inherited from the one passed to obsplot. Other named arguments are called mark constructors and can be of several types, for example a column of data, given as a string ("col") or a symbol (col), or a JS() function, evaluated at runtime.
In the following example, both x and y are column channels, whereas stroke is a constant. In fact, values passed to a color constructor (stroke or fill) are automatically treated as constants if they look like a CSS color name or definition.
If we want to highlight some points by adding a text label, we can do it by giving a specific data argument to the text mark:
We can also provide data directly to one of the channels (in Observable Plot, you can do it only by specifying a corresponding indexed data argument of the same length; this is done automatically in obsplot):
The rules to determine a channel type are as follows (this may be subject to change in the future):
You can explicitly specify that a channel is a vector channel by using the as_data() helper function. In the following example, without as_data the code would raise an error, as it would look for a "Paris" column in the data:
When a column or vector channel is of type
POSIXt in R, it is automatically converted to
Here is the list of the different mark functions currently available in
Faceting creates a grid of comparable grouped charts. In Observable Plot, faceting is enabled by adding a facet option to Plot.plot(). In obsplot, it is achieved by piping the facet() function.
Here, we create a horizontal set of scatterplots by passing an x channel to facet():
To get vertical faceting, define y instead of x. We can also add a frame around each subchart by using mark_frame():
obsplot(penguins) |>
  mark_dot(x = bill_length_mm, y = bill_depth_mm, stroke = sex) |>
  mark_frame() |>
  facet(y = island)
Finally, it is also possible to create a trellis of charts by specifying both x and y:
obsplot(penguins) |>
  mark_dot(x = bill_length_mm, y = bill_depth_mm, stroke = sex) |>
  mark_frame() |>
  facet(x = species, y = island)
Scales are a family of functions that modify the way a data value is mapped to a visual attribute such as position, size or color.
Modifying scales in obsplot is done by piping one of the scale_* family of functions:
scale_x and scale_y allow changing the x and y scales
scale_color and scale_opacity modify the mappings on color and opacity
scale_r modifies the scale of the radius
scale_fx and scale_fy are used to modify the band scales added when using faceting
For example, we could make the x and y scales logarithmic and change their labels:
For a comprehensive list of scales arguments, see the scale options API reference.
Transforms are used to filter, modify or compute new data before plotting them.
Every mark accepts a set of basic transforms:
The transforms notebook provides more examples of these three transforms.
Transform functions are a set of functions that take mark channels and options as input and compute a new set of channels and options. They are used, for example, to bin data to create a histogram, or to group data to compute a bar chart.
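As a rough illustration of what binning and grouping with a count reducer compute, here is a minimal sketch in base R. It only mimics the idea on small made-up vectors: in obsplot the actual computation is delegated to Observable Plot in JavaScript, and the names values, categories, bin_counts and group_counts are hypothetical.

```r
# Hypothetical toy data, not taken from the penguins dataset
values <- c(1.2, 1.8, 2.5, 3.1, 3.3, 4.9)
categories <- c("a", "b", "a", "c", "a", "b")

# Binning: assign each value to an interval, then count values per interval
bins <- cut(values, breaks = 1:5, right = FALSE)
bin_counts <- as.vector(table(bins))

# Grouping: count the number of rows for each distinct category
group_counts <- table(categories)
```

The bin counts are what a histogram would draw as bar heights, and the group counts are what a bar chart of a categorical variable would draw.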
In Observable Plot, transforms are functions (Plot.windowX…) passed as an option to a mark. In obsplot, a corresponding transform function (transform_windowX()) is called and passed as an argument to a mark function.
For example, if we want to create a histogram, we have to apply binning by calling transform_binX inside a mark_rectY() call:
Note that data columns can be passed as symbols (bill_depth_mm), but other arguments have to be character strings (such as "count").
To create a cell chart of the cross tabulation of two categorical variables, we have to apply a transform_group when calling mark_cell:
obsplot(penguins) |>
  mark_cell(
    transform_group(fill = "count", x = island, y = species)
  ) |>
  mark_text(
    transform_group(text = "count", x = island, y = species)
  ) |>
  scale_color(scheme = "PuRd")
Some transform functions take a specific first argument: either outputs for transform_map, or a map for transform_mapY. By default, the first argument passed is considered to be the unique output or map, whereas the other ones are options. If you must specify several outputs, or if an output has the same name as an option, wrap them in a list():
obsplot(penguins) |>
  mark_dot(y = species, x = body_mass_g) |>
  mark_ruleY(
    transform_groupY(
      list(x1 = "min", x2 = "max"),
      y = species, x = body_mass_g
    )
  ) |>
  mark_tickX(
    transform_groupY(
      list(x = "median"),
      y = species, x = body_mass_g, stroke = "red"
    )
  ) |>
  scale_x(inset = 6) |>
  scale_y(label = NULL)
Transforms can be composed, and you can store a transform in an R object and reuse it.
df <- data.frame(
  index = 1:100,
  value = rnorm(100)
)
xy <- transform_mapY("cumsum", y = value, x = index, k = 20)
obsplot(df) |>
  mark_lineY(xy) |>
  mark_lineY(
    transform_windowY(xy),
    stroke = "red"
  )
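To make the composition above more concrete, the following base R sketch computes by hand roughly what the two chained steps produce: a cumulative sum, followed by a rolling summary over windows of k = 20 values. It assumes that the window step averages each complete window; the reducer and the handling of incomplete windows at the edges are assumptions, not a description of what Observable Plot does internally.

```r
set.seed(42)  # only to make the sketch reproducible
value <- rnorm(100)
k <- 20

# Map step: cumulative sum of the series
cum <- cumsum(value)

# Window step (assumed reducer: mean) over each complete window
# of k consecutive values
n_windows <- length(cum) - k + 1
rolling <- vapply(
  seq_len(n_windows),
  function(i) mean(cum[i:(i + k - 1)]),
  numeric(1)
)
```

The point of the sketch is the order of operations: the window is applied to the output of the cumulative sum, not to the raw values.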
opts can also be used to add a caption:
Plot sizing can be specified by giving height and width arguments in obsplot(). The default height and width value is "auto": in this case height and width are computed by htmlwidgets and passed to Observable Plot, which should give a plot adjusted to its container’s size:
By specifying height or width values, both Observable Plot and htmlwidgets will use these values:
If height and width are set to NULL, the chart dimensions in pixels will be determined by Observable Plot. Note that these dimensions may not be the same as the HTML widget dimensions, which can produce big margins:
When obsplot is used in a Shiny app with a responsive layout such as fluidPage, it is recommended to use "auto" (the default) at least for width, so that the chart redraws itself when its container is resized.
Data and options are passed from R to JavaScript by htmlwidgets via JSON serialization. As a general rule, a data.frame in R is converted to a d3-style data array (an array of objects), a list in R is converted to an object, a vector of size > 1 is converted to an array, and a vector of size 1 is converted to a number or character string.
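The conversion rules above can be reproduced with the jsonlite package, as in the sketch below. The toJSON() arguments used here (dataframe = "rows", auto_unbox = TRUE) are chosen only to illustrate the described mapping; they are an assumption and not necessarily the exact settings used internally by obsplot or htmlwidgets.

```r
library(jsonlite)

df <- data.frame(a = 1:2, b = c("x", "y"))

# data.frame -> array of objects (one object per row)
json_df <- as.character(toJSON(df, dataframe = "rows"))

# list -> object
json_list <- as.character(toJSON(list(a = 1, b = "x"), auto_unbox = TRUE))

# vector of size > 1 -> array
json_vec <- as.character(toJSON(c(1, 2, 3)))

# vector of size 1 -> number or character string
json_num <- as.character(toJSON(1, auto_unbox = TRUE))
json_chr <- as.character(toJSON("a", auto_unbox = TRUE))
```

Printing json_df shows why large data frames are costly: every row repeats every column name.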
obsplot includes some helpers to automatically detect when an object is of class
There are several differences between obsplot and Observable Plot, mainly:
data can be declared in obsplot() and inherited by the chart marks, whereas in Observable Plot it must be declared for each mark.
When data is passed directly to a channel and no data has been declared, an indexed data argument of the same length is automatically added.
When the plotted data are stored in a data frame, obsplot currently has no way to determine which columns are actually used. This is not a problem in an interactive session, but when used in an RMarkdown document, the whole dataset is embedded in the output document in JSON format, which can make the document size grow quickly.
One solution is to preselect the needed data in R before calling obsplot():
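The effect of preselecting columns can be checked without any plotting code. The sketch below uses the built-in mtcars dataset as a stand-in for any data frame with unused columns, and jsonlite to estimate the size of the JSON that would be embedded; the dataset, the selected columns and the serialization settings are all illustrative assumptions.

```r
library(jsonlite)

# mtcars is used here as a stand-in for any data frame with unused columns
full <- mtcars
needed <- full[, c("mpg", "wt")]

# Compare the size (in characters) of the JSON that would be embedded
size_full <- nchar(toJSON(full, dataframe = "rows"))
size_needed <- nchar(toJSON(needed, dataframe = "rows"))
```

The reduced data frame (needed) can then be passed to obsplot() in place of the full one.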
You can predefine transform arguments in a list for reuse:
xy <- list(x = "island", y = "species")
obsplot(penguins, height = 100) |>
  mark_cell(
    transform_group(fill = "count", xy)
  ) |>
  mark_text(
    transform_group(text = "count", xy)
  ) |>
  scale_color(scheme = "PuRd")
Note that in this case, all arguments including data column names must be passed as strings, not as symbols.
If you want to add new arguments to this predefined list, you’ll have to use append and put the new arguments themselves in a list:
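The list manipulation itself is plain base R, as the following sketch shows. The added stroke = "red" option is only an illustrative choice of new argument.

```r
xy <- list(x = "island", y = "species")

# append() keeps the existing named elements and adds the new ones;
# the new arguments must themselves be wrapped in a list
xy_stroke <- append(xy, list(stroke = "red"))
```

The resulting list can then be used wherever xy was passed to a transform function.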
To make interactive usage simpler, obsplot lets you pass column names as symbols instead of character strings.
If the symbol matches both a data column and an environment object, the data column has priority.
Only single symbols can be used as data columns; any other type of expression will be evaluated in the current environment.
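These two rules can be illustrated with a small base R sketch. The resolve_channel() function below is hypothetical and is not part of obsplot; it only shows one way such a lookup could be written with substitute(), under the assumption that the function has access to the data frame.

```r
# Hypothetical helper, not part of obsplot
resolve_channel <- function(data, arg) {
  expr <- substitute(arg)
  env <- parent.frame()
  # A single symbol naming a column of `data` is a column channel,
  # even if an object of the same name exists in the environment
  if (is.symbol(expr) && as.character(expr) %in% names(data)) {
    return(list(type = "column", value = as.character(expr)))
  }
  # Any other expression is evaluated in the calling environment
  list(type = "value", value = eval(expr, env))
}

df <- data.frame(x = 1:3, y = 4:6)
x <- 100
my_color <- "red"

r1 <- resolve_channel(df, x)         # data column has priority
r2 <- resolve_channel(df, x * 2)     # expression: uses the environment's x
r3 <- resolve_channel(df, my_color)  # not a column: environment object
```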
The same rules apply when symbols are used in facet(). In transform functions, data columns can also be passed as symbols, but in this case the rules are a bit different because the transform doesn’t have direct access to the data to check whether the symbol name is a column.
df <- data.frame(
  v1 = rnorm(100)
)
obsplot(df, height = 120) |>
  mark_rectY(
    transform_binX(y = "count", x = v1)
  )
df <- data.frame(
  max = rnorm(100)
)
obsplot(df, height = 120) |>
  mark_rectY(
    transform_binX(y = "count", x = max)
  )
What may be confusing here is that the priority is reversed compared to mark and facet functions: if a symbol exists in the calling environment, it has priority over a data column of the same name.
df <- data.frame(
  x = rnorm(100)
)
x <- 1000:1100
obsplot(df, height = 120) |>
  mark_rectY(
    transform_binX(y = "count", x = x)
  )
In this case you can use a character string instead of a symbol if you want to be sure that a channel will be seen as a data column.
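The reversed priority can be sketched in base R as follows. The resolve_transform_column() function is hypothetical and not part of obsplot: it assumes that a transform only sees the calling environment, and that a symbol which is unbound, or bound to a function (such as max in the example above), is read as a column name. The treatment of function-valued symbols is an assumption inferred from that example.

```r
# Hypothetical helper, not part of obsplot
resolve_transform_column <- function(arg) {
  expr <- substitute(arg)
  env <- parent.frame()
  if (is.symbol(expr)) {
    name <- as.character(expr)
    # Assumption: an unbound symbol, or one bound to a function,
    # is treated as a data column name
    if (!exists(name, envir = env) || is.function(get(name, envir = env))) {
      return(list(type = "column", value = name))
    }
  }
  value <- eval(expr, env)
  # A single character string is read as a column name
  if (is.character(value) && length(value) == 1) {
    return(list(type = "column", value = value))
  }
  list(type = "vector", value = value)
}

x <- 1000:1100
r_env <- resolve_transform_column(x)              # environment object wins
r_chr <- resolve_transform_column("x")            # string forces a column
r_col <- resolve_transform_column(bill_depth_mm)  # unbound symbol: column
r_fun <- resolve_transform_column(max)            # function symbol: column
```

This is why passing "x" as a string is the safe choice when an object named x also exists in the environment.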