Using 3rd Party Python Packages

Importing Python packages and modules

The Yhat Python client allows for packages and modules to be installed into a model's environment on deployment.

  • Packages and modules can be listed explicitly or auto-detected.
  • Modules and packages can be added explicitly to a REQUIREMENTS array in the model class.
  • Auto-detect can capture most dependencies installed using conda or pip.
  • Modules and packages will be installed from available conda channels or pip.

  • Private conda channels can be added on the "Package Sources" page in ScienceOps (Admin > Package Sources).

Specifying packages:

Packages can be installed on ScienceOps using both pip and conda. Pip freeze files and packages with specified versions can also be used.

ScienceOps will first attempt to conda install the package and otherwise default to pip install.
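
The conda-first, pip-fallback order described above can be pictured with a small sketch. This is not the actual ScienceOps installer: `install_package` and both stand-in installer functions are hypothetical names, and the set of packages treated as available from conda is invented for illustration.

```python
def install_package(spec, conda_install, pip_install):
    """Return the name of the installer that handled `spec`.

    conda is tried first; pip is used only if conda fails.
    Both installers are callables that return True on success.
    """
    if conda_install(spec):
        return "conda"
    if pip_install(spec):
        return "pip"
    raise RuntimeError(f"could not install {spec!r} with conda or pip")


# Stand-in installers (assumptions for this sketch, not real ScienceOps code).
CONDA_PACKAGES = {"numpy", "pandas", "scikit-learn"}


def fake_conda_install(spec):
    # Drop any "==version" pin before looking the package up.
    return spec.split("==")[0] in CONDA_PACKAGES


def fake_pip_install(spec):
    # Pretend pip can install anything conda cannot.
    return True


for spec in ["pandas==0.18.0", "git+https://github.com/yhat/ggplot@master"]:
    print(spec, "->", install_package(spec, fake_conda_install, fake_pip_install))
```

In this sketch the pinned pandas entry is handled by the conda stand-in, while the git URL falls through to pip.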

Valid requirements array:

    REQUIREMENTS=[
      'git+https://www.github.com/username/packagerepo@branch',
      'path/to/pip-freeze-file.txt',
      'pandas==0.18.0',
      'scikit-learn',
      'numpy'
    ]

A model class that declares these requirements explicitly:

from yhat import Yhat, YhatModel, preprocess, df_to_json

class MyModel(YhatModel):
    # Below, our specified requirements
    REQUIREMENTS=[
      'git+https://www.github.com/user/package@branch', # install from github
      'path/to/pip-freeze-file.txt', # install from a pip-freeze txt file
      'pandas==0.18.0', # install specific version of a package
      'scikit-learn',   # install latest version of a package
      'numpy'           # install latest version of a package
    ]
    def execute(self, data):
        ...
        return result

yh = Yhat("USERNAME", "APIKEY", "https://scienceops_url.com/")

## NOTE: Because autodetect=False,
## we must specify ALL packages our model requires
yh.deploy("MyModelname", MyModel, globals(), autodetect=False)

Example Model and Package Requirements

pandas==0.18.0, the latest version of scikit-learn, and the latest version of numpy will all be installed into the model environment on deployment.

If the version of pandas specified in REQUIREMENTS differs from the version installed on your local system, a warning is printed, but the REQUIREMENTS are left as specified.

import numpy as np
import pandas as pd
from sklearn.svm import SVC
from sklearn.datasets import load_iris

iris = load_iris()

X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.DataFrame(iris.target, columns=["flower_types"])

clf = SVC()
clf.fit(X, y["flower_types"])

####################DEPLOYMENT#########################

from yhat import Yhat, YhatModel, preprocess, df_to_json

class MySVC(YhatModel):
    # Below, our specified requirements
    REQUIREMENTS=[
      'pandas==0.18.0',
      'scikit-learn',
      'numpy'
    ]
    # The execute function that will run on each POST request
    @preprocess(in_type=pd.DataFrame, out_type=pd.DataFrame)
    def execute(self, data):
        prediction = clf.predict(pd.DataFrame(data))
        species = ['setosa', 'versicolor', 'virginica']
        result = [species[i] for i in prediction]
        return result


yh = Yhat("USERNAME", "APIKEY", "https://scienceops_url.com/")

##  NOTE: Because autodetect=False,
##  we must specify ALL packages our model requires
yh.deploy("SVC", MySVC, globals(), autodetect=False)
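
To see ahead of deployment which pinned entries would trigger the version warning mentioned above, a local check along these lines may help. It is a sketch only: `pinned_versions`, `local_version`, and `find_version_mismatches` are hypothetical helpers, not part of the Yhat client, and the comparison is a plain string match on `==` pins.

```python
from importlib import metadata


def pinned_versions(requirements):
    """Map package name -> pinned version for 'name==version' entries."""
    pins = {}
    for entry in requirements:
        name, sep, version = entry.partition("==")
        if sep:
            pins[name.strip()] = version.strip()
    return pins


def local_version(name):
    """Return the locally installed version of `name`, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_version_mismatches(requirements, lookup=local_version):
    """List (name, pinned, installed) where the local version differs."""
    mismatches = []
    for name, wanted in pinned_versions(requirements).items():
        installed = lookup(name)
        if installed is not None and installed != wanted:
            mismatches.append((name, wanted, installed))
    return mismatches


REQUIREMENTS = ['pandas==0.18.0', 'scikit-learn', 'numpy']
for name, wanted, installed in find_version_mismatches(REQUIREMENTS):
    print(f"{name}: REQUIREMENTS pins {wanted}, local version is {installed}")
```

Unpinned entries such as 'scikit-learn' are ignored, since there is no specific version to compare against.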

Requirements

Valid inputs in the REQUIREMENTS array consist of:

  • package_name
  • package_name==version
  • git+https://www.github.com/username/packagerepo@branch
  • path/to/pip-freeze.txt

Note that a pip freeze file should list no more than 20 packages. Installing from a freeze file also significantly increases the size of the model environment.
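
The four accepted forms, and the 20-package guideline for freeze files, can be checked locally before deploying. The sketch below is an assumption-laden illustration: `classify_requirement` and `count_freeze_entries` are hypothetical helpers, the package-name pattern is a guess at what counts as a bare name, and ScienceOps may accept or reject entries differently.

```python
import re

# Assumed pattern for a bare package name (not taken from the Yhat client).
PACKAGE_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")
FREEZE_FILE_LIMIT = 20  # guideline from the note above


def classify_requirement(entry):
    """Return 'git', 'freeze-file', 'pinned', or 'latest' for one entry."""
    if entry.startswith("git+"):
        return "git"
    if entry.endswith(".txt"):
        return "freeze-file"
    name, sep, version = entry.partition("==")
    if not PACKAGE_NAME.match(name) or (sep and not version):
        raise ValueError(f"unrecognized requirement: {entry!r}")
    return "pinned" if sep else "latest"


def count_freeze_entries(text):
    """Count non-blank, non-comment lines in the text of a freeze file."""
    lines = (line.strip() for line in text.splitlines())
    return sum(1 for line in lines if line and not line.startswith("#"))


freeze_text = "numpy==1.11.0\n# pinned for the model\npandas==0.18.0\n"
count = count_freeze_entries(freeze_text)
print(count, "entries; within limit:", count <= FREEZE_FILE_LIMIT)
print(classify_requirement("pandas==0.18.0"))
```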

Installing from GitHub

Packages can be installed directly from GitHub:

from yhat import Yhat, YhatModel, preprocess
import pulp

class SomeModel(YhatModel):
    REQUIREMENTS = [
      "git+https://github.com/yhat/ggplot@master"
    ]
    # model definition

Autodetect

While autodetect will detect most packages automatically, there are certain edge cases when it will not find all requirements.

For example, Yhat cannot automatically discover a module from which only a constant is imported, so such modules must be added explicitly to REQUIREMENTS. Autodetect requires that the module, or an instance of one of its classes, be in the same scope as the Yhat.deploy call. For instance, the following code does not allow automatic discovery of the dependency on PuLP, because it imports only the constant LpBinary:

from yhat import Yhat, YhatModel, preprocess
from pulp import LpBinary

class SomeModel(YhatModel):
    # model definition
    ...
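
Why a constant-only import goes unnoticed can be illustrated with a simplified scan of a namespace. This is not Yhat's detection code: `visible_packages` is a hypothetical function that only mimics the rule stated above (a module, or an instance of one of its classes, must be in scope), and standard-library modules stand in for third-party packages.

```python
import types


def visible_packages(namespace):
    """Return top-level package names that a scope-based scan could see.

    A package is visible if the namespace holds the module itself or an
    instance of a class defined in it. Plain constants (str, int, float)
    report 'builtins' as their origin, so they reveal nothing.
    """
    found = set()
    for name, value in namespace.items():
        if name.startswith("__"):
            continue  # skip interpreter bookkeeping such as __loader__
        if isinstance(value, types.ModuleType):
            origin = value.__name__
        else:
            origin = type(value).__module__
        top_level = origin.split(".")[0]
        if top_level not in ("builtins", "__main__"):
            found.add(top_level)
    return found


import json
from math import pi
from fractions import Fraction

print(visible_packages({"json": json}))            # module object in scope
print(visible_packages({"pi": pi}))                # constant only
print(visible_packages({"half": Fraction(1, 2)}))  # instance of a class
```

Under this rule a constant imported from a package leaves no trace of where it came from, which matches the advice to list such packages in REQUIREMENTS explicitly.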

ScienceOps v2.4.1 and earlier

Packages can be installed on ScienceOps using conda. Based on packages listed in the REQUIREMENTS array, ScienceOps will first attempt to conda install using the available conda channels.

Valid requirements:

    REQUIREMENTS=[
      'pandas==0.18.0',
      'scikit-learn',
      'numpy'
    ]
