Some types of R objects can be used only in the R session in which they were created. If used as-is in another R process, such objects often result in an immediate error or in obscure and hard-to-troubleshoot outcomes. Because of this, they cannot be saved to file and re-used at a later time, nor can they be exported to a parallel worker when doing parallel processing. These objects are sometimes referred to as non-exportable or non-serializable objects. For example, assume we load an HTML document using the xml2 package:
file <- system.file("extdata", "r-project.html", package = "xml2")
doc <- xml2::read_html(file)
Next, imagine that we save this document object doc to a file and quit R:
saveRDS(doc, "html.rds")
quit()
Then, if we try to use this saved xml2 object in another R session, we’ll find that it does not work:
doc2 <- readRDS("html.rds")
xml2::xml_length(doc2)
#> Error in xml_length.xml_node(doc2) : external pointer is not valid
This is because xml2 objects only work in the R process that created them.
One solution to this problem is “marshalling”: encoding the R object into an exportable representation that can then be used, in another R process, to re-create a copy that imitates the original object.
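To make the idea concrete, here is a minimal sketch of marshalling an xml2 document by hand. It is only an illustration and not necessarily how the marshal package does it: the document is encoded as a plain character string, which serializes without problems, and is parsed again to obtain a fresh, valid object. The variable names html_text and rds_file, and the file name html-text.rds, are assumptions made for this example.

```r
# Manual marshalling sketch (illustration only)
file <- system.file("extdata", "r-project.html", package = "xml2")
doc <- xml2::read_html(file)

# "Marshal": encode the document as a character string, which holds no
# external pointer and can therefore be saved safely
html_text <- as.character(doc)
rds_file <- file.path(tempdir(), "html-text.rds")  # hypothetical file name
saveRDS(html_text, rds_file)

# "Unmarshal": typically done in another R session; parsing the string
# creates a new xml2 document that is valid in that session
doc2 <- xml2::read_html(readRDS(rds_file))
xml2::xml_length(doc2)
```

The reconstructed doc2 is a copy, not the original object: it has the same content, but it is backed by a new external pointer.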
The marshal package provides generic functions marshal() and unmarshal() for marshalling and unmarshalling R objects of certain classes. This makes it possible to save otherwise non-exportable objects to file and use them in a future R session, or to transfer them to another R process to be used there.
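As a rough sketch of the second use case, a marshalled object could be sent to a parallel worker and reconstructed there. The example assumes that the xml2 and marshal packages are installed and available to the worker process, and it uses the base parallel package purely for illustration; payload and n_worker are names made up for this example.

```r
# Sketch: transfer a marshalled xml2 document to another R process.
# Assumes xml2 and marshal are installed where the worker can find them.
library(parallel)

file <- system.file("extdata", "r-project.html", package = "xml2")
doc <- xml2::read_html(file)

# Encode the document in the main R session
payload <- marshal::marshal(doc)

cl <- makeCluster(1L)  # one background R process

# The worker reconstructs its own copy before using it
n_worker <- clusterCall(cl, function(x) {
  doc2 <- marshal::unmarshal(x)
  xml2::xml_length(doc2)
}, payload)[[1]]

stopCluster(cl)

n_worker
```

Passing doc itself instead of payload would be expected to fail on the worker with the same kind of invalid external pointer error as in the readRDS() example above.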
The long-term goal with this package is for it to provide a de-facto standard and API for marshalling and unmarshalling objects in R. To achieve this, this package proposes three generic functions:
marshallable() - check whether an R object can be marshalled or not
marshal() - marshal an R object
unmarshal() - reconstruct a marshalled R object
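The division of labour between three such generics can be sketched in base R with a toy class. Everything below is hypothetical: the toy_ prefixed generics are stand-ins defined for this example and are not the marshal package's functions, and line_reader is an invented class that holds an open file connection, which makes it a simple case of an object that does not survive serialization.

```r
# Hypothetical stand-ins for the three generics (NOT the marshal package's own)
toy_marshallable <- function(x, ...) UseMethod("toy_marshallable")
toy_marshal      <- function(x, ...) UseMethod("toy_marshal")
toy_unmarshal    <- function(x, ...) UseMethod("toy_unmarshal")

# Without a dedicated method, report that the object cannot be marshalled
toy_marshallable.default <- function(x, ...) FALSE

# Invented example class: a reader that holds an open file connection
line_reader <- function(path) {
  structure(list(path = path, con = file(path, open = "r")),
            class = "line_reader")
}

toy_marshallable.line_reader <- function(x, ...) TRUE

# Keep only what is needed to rebuild the object; drop the connection
toy_marshal.line_reader <- function(x, ...) {
  structure(list(path = x$path), class = "line_reader_marshalled")
}

# Rebuild a working object, with a new connection, from the payload
toy_unmarshal.line_reader_marshalled <- function(x, ...) {
  line_reader(x$path)
}

# Usage: marshal, round-trip through serialization, then unmarshal
path <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), path)
reader <- line_reader(path)
payload <- toy_marshal(reader)
close(reader$con)

payload2 <- unserialize(serialize(payload, connection = NULL))
reader2 <- toy_unmarshal(payload2)
readLines(reader2$con, n = 1L)
#> [1] "alpha"
close(reader2$con)
```

In this sketch the marshalled payload is an ordinary list, so it can be saved or sent to another process freely. State that lives only in the connection, such as the current read position, is not carried over; the reconstructed object imitates the original rather than being identical to it.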
If we return to our xml2 object, the marshal package implements S3 marshal() methods for different xml2 classes that take care of everything for us. We can use this when we save the object:
file <- system.file("extdata", "r-project.html", package = "xml2")
doc <- xml2::read_html(file)
saveRDS(marshal::marshal(doc), "html.rds")
quit()
Later, in another R session, we can reconstruct this xml2 HTML document using:
doc2 <- marshal::unmarshal(readRDS("html.rds"))
xml2::xml_length(doc2)
In order to test the proposed solution and API, this package will implement S3 marshal() methods for some common R packages and their non-exportable classes. Note that the long-term goal is for these S3 methods to be implemented by those packages themselves, such that the marshal package will only provide a light-weight API.
The “A Future for R: Non-Exportable Objects” vignette has a collection of packages and classes that cannot be exported out of the box. This package has marshalling prototypes for objects from the following packages:
It also has implementations that will throw an error for objects from the following packages, because they cannot be marshalled, at least not at the moment:
The plan is to add support for more R packages and object classes.
The marshal package is not yet on CRAN. In the meantime, it can be installed from the R Universe as:
options(repos = c("https://futureverse.r-universe.dev", getOption("repos")))
install.packages("marshal")
To install the pre-release version that is available in Git branch develop on GitHub, use:
remotes::install_github("futureverse/marshal", ref = "develop")
This will install the package from source.