Streaming API

In addition to the standard REST API, models deployed to ScienceOps can be queried by streaming multiple data points to the endpoint in a single request.

Why would I do this?

The Streaming API is designed for use cases that involve scoring or predicting on large amounts of data. Rather than sending 1,000,000 separate API requests to update a database, you can use the Streaming API to send large, chunked segments of data to be scored simultaneously. The master node breaks up the request body and distributes it across model replicas for faster processing.

Code snippets below demonstrate how this can be done:

/<username>/models/<modelname>/?bulk=true

Makes predictions using <username>'s model <modelname>, with bulk=true set in the query string.

When using the Streaming API:

  1. JSON objects must be separated by line breaks (\n)

  2. Each JSON object must NOT include line breaks

A valid example:

{"name": "jim"}
{"name": "bob"}
{"name": "sarah"}
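
The two formatting rules above can be satisfied mechanically when the request body is built in code. The following is a minimal Python sketch, not part of the ScienceOps client; the helper name to_line_delimited_json is hypothetical. It relies on the fact that json.dumps, when called without indent, writes each object on a single line and escapes any line breaks inside string values.

```python
import json


def to_line_delimited_json(rows):
    """Serialize an iterable of dicts as one JSON object per line.

    Hypothetical helper for illustration only.
    """
    lines = []
    for row in rows:
        # Without `indent`, json.dumps emits a single line. Line breaks
        # inside string values are escaped, so no raw "\n" appears
        # within an object.
        lines.append(json.dumps(row))
    # Objects are separated by a single line break.
    return "\n".join(lines)


body = to_line_delimited_json(
    [{"name": "jim"}, {"name": "bob"}, {"name": "sarah"}]
)
print(body)
```

The resulting string can be sent as the request body of a bulk request.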

Authentication

  • HTTP headers (HTTP Basic authentication with your username and API key, as in the curl example below)

Parameters:

  • username: Your username (e.g. kermit, kermit@themuppets.com)
  • name: The name of your deployed model (e.g. HelloWorld)
  • data: Data required to make multiple predictions. This is sent in the request body as line-delimited JSON (e.g. {"name": "Kermit"})
  • non_vectorized (optional): If non_vectorized=true, the response values will not be arrays (NOTE: applies to R models only)

Example Streaming POST request:

curl -X POST -H "Content-Type: application/json" \
    --user kermit:API_KEY \
    --data-binary '{"name": "jim"}
{"name": "bob"}
{"name": "sarah"}' \
    "http://localhost:9000/brandon/models/HelloWorld/?bulk=true"

{"bulk_id":"250863ffc05e62f7bfb04bc2dc62ca5f","end_time":"2016-03-30 16:52:43.825976228 -0400 EDT","result":{"greeting":"Hello jim"},"start_time":"2016-03-30 16:52:43.823628183 -0400 EDT","version":1,"yhat_id":"cbf769702d9cc14bc3edccec2ecfd5ae","yhat_model":"HelloWorld"}
{"bulk_id":"250863ffc05e62f7bfb04bc2dc62ca5f","end_time":"2016-03-30 16:52:43.827155622 -0400 EDT","result":{"greeting":"Hello sarah"},"start_time":"2016-03-30 16:52:43.826193035 -0400 EDT","version":1,"yhat_id":"dcb0dcb66f69b3a032e661b83ed234f9","yhat_model":"HelloWorld"}
{"bulk_id":"250863ffc05e62f7bfb04bc2dc62ca5f","end_time":"2016-03-30 16:52:43.825743005 -0400 EDT","result":{"greeting":"Hello bob"},"start_time":"2016-03-30 16:52:43.823007877 -0400 EDT","version":1,"yhat_id":"3f8a474532346f965accd0efb34201d3","yhat_model":"HelloWorld"}
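
The same kind of request can be assembled from Python using only the standard library. This is a sketch under stated assumptions rather than an official client: the function name build_bulk_request is hypothetical, the host, username, model name, and API key are placeholders, and the server is assumed to accept HTTP Basic authentication, which is what curl's --user flag sends. The request is only constructed here, not sent.

```python
import base64
import urllib.request


def build_bulk_request(base_url, username, model_name, api_key, body):
    """Build (but do not send) a bulk POST request.

    Hypothetical helper; all argument values are placeholders.
    """
    url = f"{base_url}/{username}/models/{model_name}/?bulk=true"
    # Assumption: HTTP Basic auth, equivalent to `curl --user name:key`.
    credentials = f"{username}:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Basic {token}",
    }
    return urllib.request.Request(
        url, data=body.encode("utf-8"), headers=headers, method="POST"
    )


request = build_bulk_request(
    "http://localhost:9000",
    "kermit",
    "HelloWorld",
    "API_KEY",
    '{"name": "jim"}\n{"name": "bob"}',
)
print(request.get_method(), request.full_url)
```

To send it, pass the request to urllib.request.urlopen and read the response one line at a time, parsing each line as a separate JSON object.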

Note that the response differs slightly from the standard REST API response. Each output line also includes:

  • bulk_id: a random hexadecimal string, shared by every result of the same bulk request
  • start_time: the time the request was received by the model
  • end_time: the time the request was completed by the model

Note:

The output lines of a bulk prediction are not guaranteed to be returned in the order in which the input lines were sent (in the example above, the result for sarah is returned before the result for bob). To match input lines with output lines, send a unique yhat_id value with each input line (e.g. yhat_id: '1234abcd'). This value is returned with the corresponding result.
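
The matching step described above might look like the following Python sketch. It assumes that yhat_id is sent as a top-level key of each input JSON object; the text above does not state exactly where the value goes, so treat that placement as an assumption. The function names tag_rows and match_results are hypothetical. The response is assumed to carry yhat_id and result at the top level of each line, as in the example output.

```python
import json
import uuid


def tag_rows(rows):
    """Return copies of the input rows, each with a unique yhat_id.

    Assumption: yhat_id is a top-level key of the input object.
    """
    return [{**row, "yhat_id": uuid.uuid4().hex} for row in rows]


def match_results(tagged_rows, response_text):
    """Pair each tagged input row with its result, in input order."""
    results_by_id = {}
    for line in response_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        results_by_id[record["yhat_id"]] = record["result"]
    # Rows with no matching output line are paired with None.
    return [(row, results_by_id.get(row["yhat_id"])) for row in tagged_rows]
```

Because the lookup is keyed on yhat_id, the order in which output lines arrive does not affect the pairing.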
