R models are bundled and serialized using the .Rdata format. The yhatr client inspects the user's model and identifies which functions, source code, modules, objects, etc. it needs in order to execute the model.predict function. Once it has that list, it uses the save function in R to bundle them into an .Rdata file, which can then be transported to ScienceOps.
Python models are bundled and serialized using the yhat Python package. During a deployment, yhat inspects the user's workspace and identifies the source code, functions, modules, objects, etc. that it needs in order to run the execute method of the user's model class (either a YhatModel or a SplitTestModel). Once it has identified the list of requirements, the yhat package uses terragon (yes, it's spelled wrong) to serialize those objects.
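The inspection step described above can be illustrated with a minimal sketch. This is not the yhat package's actual implementation: find_dependencies, ExampleModel, scale, and COEFFICIENTS are hypothetical names, and the approach shown (reading the names recorded on the method's code object and looking them up in a workspace dictionary) is only one plausible way to discover what an execute method refers to.

```python
import math
import types


def find_dependencies(func, namespace):
    """Return the names in `namespace` that `func` refers to, grouped by kind.

    Hypothetical helper for illustration; not part of the yhat package.
    """
    code = func.__code__
    # co_names holds global and attribute names used by the function;
    # co_freevars holds names captured from an enclosing scope.
    candidates = code.co_names + code.co_freevars
    found = {"modules": [], "functions": [], "objects": []}
    for name in candidates:
        if name not in namespace:
            continue  # attribute name or builtin, not a workspace entry
        value = namespace[name]
        if isinstance(value, types.ModuleType):
            found["modules"].append(name)
        elif isinstance(value, types.FunctionType):
            found["functions"].append(name)
        else:
            found["objects"].append(name)
    return found


# Hypothetical workspace contents.
COEFFICIENTS = [0.5, 1.5]


def scale(x):
    return x * COEFFICIENTS[0]


class ExampleModel:
    def execute(self, data):
        return math.floor(scale(data["x"]) + COEFFICIENTS[1])


# The workspace is passed explicitly as a dictionary of name -> object.
workspace = {"math": math, "scale": scale, "COEFFICIENTS": COEFFICIENTS}
deps = find_dependencies(ExampleModel.execute, workspace)
```

The sketch only looks one level deep. A fuller version would also have to follow the dependencies of scale itself, and would need a way to capture the source code of each function it finds.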
terragon uses a combination of Python serialization formats in order to accommodate as many different libraries as possible. The preferred format is pickle, but not all Python objects can be serialized this way. For example, models built using TensorFlow and PySpark have their own serialization formats, which terragon uses for those objects.
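The "pickle first, library-specific format otherwise" idea can be sketched as follows. This is not terragon's API: serialize, LockedCounter, and dump_counter are hypothetical names, and a class holding a thread lock stands in for a model object that pickle cannot handle.

```python
import pickle
import threading


def serialize(obj, fallbacks=()):
    """Try pickle first, then each (format_name, dumps) fallback in order.

    Hypothetical helper for illustration; not terragon's actual interface.
    Returns a (format_name, payload_bytes) pair.
    """
    try:
        return "pickle", pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError) as pickle_error:
        for format_name, dumps in fallbacks:
            try:
                return format_name, dumps(obj)
            except Exception:
                continue  # this fallback cannot handle the object either
        raise TypeError(
            f"no serializer could handle {type(obj).__name__}"
        ) from pickle_error


class LockedCounter:
    """Stand-in for an object pickle rejects (it holds a thread lock)."""

    def __init__(self, count):
        self.count = count
        self.lock = threading.Lock()


def dump_counter(obj):
    # Assumed custom format: just the count, encoded as ASCII digits.
    return str(obj.count).encode("ascii")


# A plain dictionary goes through pickle.
fmt_a, payload_a = serialize({"weights": [1, 2, 3]})

# The lock makes pickle fail, so the custom format is used instead.
fmt_b, payload_b = serialize(
    LockedCounter(3), fallbacks=[("counter-bytes", dump_counter)]
)
```

Returning the format name alongside the payload lets the receiving side choose the matching loader without guessing.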