PySpark

Setup

PySpark models are slightly different from "regular" Python models in that they require a custom runtime environment. You'll need to configure your ScienceOps cluster to use the yhat/scienceops-python-pyspark:1.5.2 base image.

Deployment

Deploying works the same way as for a regular Python model, except that you call deploy_spark instead of deploy.

In this example we'll build a very simple recommendation engine using the MLlib recommendation module.

from pyspark.mllib.recommendation import ALS, MatrixFactorizationModel
from pyspark import SparkContext

sc = SparkContext(appName="Stuffs")

# Toy ratings data: (user, product, rating) triples
r1 = (1, 1, 1.0)
r2 = (1, 2, 2.0)
r3 = (2, 1, 2.0)
ratings = sc.parallelize([r1, r2, r3])

# Train an implicit-feedback ALS model with rank 1
model = ALS.trainImplicit(ratings, 1, seed=10)


def make_prediction(user, product):
    # Score a single (user, product) pair with the trained ALS model
    outcome = model.predict(user, product)
    return {"result": outcome}
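Since make_prediction is an ordinary Python function, you can sanity-check it locally before wrapping it in a model. A quick illustrative check (the exact score depends on the toy ratings and the ALS seed above):

# Illustrative local check: score user 2 against product 2
print(make_prediction(2, 2))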

from yhat import Yhat, YhatModel

class HelloSpark(YhatModel):
    def execute(self, data):
        # data is the request payload sent to the deployed model
        user = data['user']
        product = data['product']
        return make_prediction(user, product)

yh = Yhat(YOUR_USERNAME, YOUR_APIKEY, SCIENCEOPS_URL)
print(yh.deploy_spark("HelloSpark", HelloSpark, globals(), sc, sure=True))

Note the extra sc argument (your SparkContext) passed to deploy_spark; ScienceOps uses it when deploying your model to the Spark cluster.
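Once deployed, you can query the model like any other ScienceOps model. A minimal sketch using the Yhat client's predict method (the user and product values here are hypothetical; the payload keys mirror what execute expects):

# Query the deployed model; the input dict becomes the `data`
# argument of HelloSpark.execute on the server
print(yh.predict("HelloSpark", {"user": 1, "product": 2}))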
