Developer Documentation
Third-party model integration, API specification and upload instructions
The Chief AI Marketplace is an open platform for hosting and consuming machine learning models.
We utilise a Docker engine on the backend, so we ask suppliers to package their models as Docker images (or 'Docker-ize' their algorithms). We also have a specification for structuring your algorithm, which allows it to integrate seamlessly with our platform. Currently, we only support the Python programming language, including all standard machine learning libraries: TensorFlow, Keras, PyTorch & scikit-learn.
Once you have correctly ‘Docker-ized’ & structured your model, you will be able to upload it directly to the platform via our web interface. Steps for both ‘Docker-izing’ your model and correctly structuring your model scripts are outlined in this document.
Table of Contents
Breakdown of Dockerfile commands
Importing models from 3rd party ML training platforms
Structuring your algorithm
Once you’ve trained your algorithm either locally or on your platform of choice, it’s time to package it up so it can be used efficiently for inference.
Your algorithm directory should have a flat structure with any required scripts, helper functions & model weights stored in it. We ask you to include 2 files in particular:
run_classification.py
This is a script used to execute your model via the command line. This script should accept either a single test image or a directory of images, and it should write its predictions, formatted as JSON, to a results file as specified below. There are a number of rules this file must adhere to:
This file must be named: run_classification.py
This file must contain a function which runs a prediction using your machine learning model. The script must accept the following command line argument:
--imageDir
the input to this argument can be a single test input file or, if your model can handle it, a directory of images to be passed to the model for inference
The output of your model must be written to a local text file named: result.txt
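As a rough sketch only (the platform does not prescribe your internals), a minimal run_classification.py might look like the following. We assume a Keras binary classifier saved as mdc_compound_fig_classifier_v1.h5 (the weights file from our example model) and a 224x224 input size; the preprocessing and JSON output layout shown here are illustrative, not requirements.

import argparse
import json
import os

from tensorflow import keras  # assumes a Keras .h5 model; swap in your framework

def run_prediction(image_dir):
    # Load the serialised weights shipped alongside this script
    model = keras.models.load_model("mdc_compound_fig_classifier_v1.h5")

    # Accept either a single image file or a directory of images
    if os.path.isdir(image_dir):
        paths = [os.path.join(image_dir, f) for f in sorted(os.listdir(image_dir))]
    else:
        paths = [image_dir]

    results = {}
    for path in paths:
        image = keras.preprocessing.image.load_img(path, target_size=(224, 224))
        array = keras.preprocessing.image.img_to_array(image) / 255.0
        score = model.predict(array.reshape((1,) + array.shape))
        results[path] = float(score[0][0])

    # Write predictions to result.txt, as the platform requires
    with open("result.txt", "w") as f:
        json.dump(results, f)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--imageDir", required=True,
                        help="a single image file or a directory of images")
    args = parser.parse_args()
    run_prediction(args.imageDir)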
Dockerfile
This file gives instructions for building a Docker image. This enables us to deploy a secure executable Docker container with your model, which also ensures the model’s original functionality is replicated for users accessing your model via our platform.
We give further instructions and an example Dockerfile below.
Aside from these 2 files, how the internals of your model work is entirely up to you. At model upload time, you will be able to specify the input and output format of your model, including whether it’s a regression or classification model. This enables us to present customers with the correct UI when using the model.
You can view an example model: a binary image classifier trained to predict whether an image is a compound image (an image made up of multiple smaller images, graphs, etc.) or a single image. It is hosted publicly on GitHub...
Breakdown of Dockerfile commands
A Dockerfile provides instructions for creating a Docker image, which in turn enables us to run a Docker container (see here for more on Docker containers). Often, a Docker image is based on another image, with some additional customisation. For example, you may build an image based on the ubuntu image which also installs TensorFlow, along with the configuration details needed to make your application run.
Your Dockerfile should include all your model dependencies, and any further instructions required to build & run your machine learning algorithm. Check out Docker’s reference page for more information. The following is a template Dockerfile for an example model we trained. This can be copied, edited and used for your model:
FROM python:3.7
We inherit from the official Python 3.7 Docker image on Docker Hub. We suggest inheriting from official base images, such as TensorFlow’s, rather than manually installing dependencies - this may optimise your model build and upload time.
RUN mkdir -p /srv
We create a directory called ‘srv’
WORKDIR /srv
We set ‘srv’ as the working directory within a running Docker container
COPY requirements.txt /srv
We copy a text file named requirements.txt, which lists your model’s Python dependencies, into the working directory (an illustrative example is given after this breakdown)
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
We upgrade the local installation of pip and subsequently install all the dependencies defined in requirements.txt
COPY run_classification.py /srv
COPY mdc_compound_fig_classifier_v1.h5 /srv
We copy your execution script (run_classification.py) and our example model weights file (mdc_compound_fig_classifier_v1.h5) into the working directory
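Putting the commands above together, the complete template Dockerfile reads:

# Template Dockerfile - substitute your own weights file for our example one
FROM python:3.7
RUN mkdir -p /srv
WORKDIR /srv
COPY requirements.txt /srv
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
COPY run_classification.py /srv
COPY mdc_compound_fig_classifier_v1.h5 /srv

For reference, an illustrative requirements.txt to accompany it might read as follows - the packages and pinned versions here are placeholders, so list whatever your model actually imports, pinned to the versions it was trained with:

tensorflow==2.1.0
numpy==1.18.1
Pillow==7.0.0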
We recommend you build your Docker image locally before uploading, to verify there are no errors in your Dockerfile. This will save you time iterating on errors during the upload process.
Following these instructions, you should be able to upload your machine learning model to our platform via our web interface or directly by API, and set your model loose on the wider world! Steps for uploading via the web interface or API are outlined in the following section.
Suppliers have 2 options to share their Docker-ized model:
Option 1: Your packaged model is saved as a zip archive and uploaded directly via the web platform.
Option 2: Your packaged model is saved as a zip archive and uploaded directly via our RESTful API.
Option 1 is as simple as it sounds! By following the earlier instructions around Structuring your algorithm, you should now have a folder containing all the files needed to run your algorithm.
To speed up the upload process, we ask you to save this folder as a ‘.zip’ archive which can be done in the command line or by right-clicking the folder and choosing ‘compress’. This may be named differently on different operating systems. This zipped archive can then be uploaded directly via our web upload form.
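If you prefer to script this step, Python’s standard library can create the archive for you. A quick sketch, assuming your model folder is named my_model:

import shutil

# Create my_model.zip from the contents of the my_model/ folder
shutil.make_archive("my_model", "zip", root_dir="my_model")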
In addition to the zipped model folder and some descriptive fields, you will be asked for ‘model name’, ‘price’ and ‘tag’. Please refer to the table below for a description of these three fields.
For Option 2, our upload endpoint accepts a POST request and allows you to upload a model directly to our system. As with Option 1, you must provide a zipped archive of your model directory.
The request body should be encoded as form-data. The following variables are required:
In the request header, you must include your API key. This should be passed in a field named x-api-key.
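To illustrate, an upload in Python using the requests library might look like the sketch below. The endpoint URL and form-data field names are assumptions made for illustration - we have assumed they mirror the web form’s ‘model name’, ‘price’ and ‘tag’ fields - so consult the full API specification for the exact values.

import requests

# NOTE: the URL and field names below are illustrative placeholders
url = "https://api.example.com/models/upload"  # hypothetical endpoint
headers = {"x-api-key": "YOUR_API_KEY"}  # API key header, as described above

data = {
    "model_name": "compound-figure-classifier",  # assumed field names
    "price": "10",
    "tag": "image-classification",
}

# Supplying 'files' makes requests encode the body as multipart form-data
with open("my_model.zip", "rb") as archive:
    response = requests.post(url, headers=headers, data=data,
                             files={"model": archive})  # assumed file field name

print(response.status_code, response.text)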
A full API specification for interacting with the platform will be published online very shortly!
Our system currently accepts file uploads of up to approximately 10GB, which may take from 5 to 60 minutes to transfer depending on your internet connection. Larger files can be uploaded, but uploads are set to time out after 90 minutes. This applies whether you upload via the web platform or directly via the RESTful API.
Typically, we see files of around 3GB upload in 5-15 minutes; the Docker build process can take another 20 minutes or so, depending on dependencies.
Importing models from 3rd party ML training platforms
If you’re training your model on a platform like AWS SageMaker or Paperspace Gradient, you can still export your model and publish it on our platform. This process consists of serialising your model into either a ‘pickle’ or ‘h5’ file and downloading it locally. You can then follow the same steps for Structuring your algorithm to get your model into a format ready for serving on our platform.
Please refer to the instructions below for serialising and exporting your model from the following platforms:
AWS SageMaker
The steps for this example workflow can be summarised as:
Build and train your model in SageMaker Notebooks
Serialise your model weights to a .h5 or .pkl file (see here for steps, and the sketch after this list)
Structure your algorithm as described earlier
Upload via API or website
Let your model roam free!
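As a sketch of the serialisation step (step 2 above), here is an illustrative example for a scikit-learn model, with the toy training code standing in for your own; Keras models use the built-in saver shown in the final comment:

import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Stand-in for your real training code
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=200).fit(X, y)

# Serialise the trained estimator to a .pkl file, ready to download locally
with open("my_model.pkl", "wb") as f:
    pickle.dump(clf, f)

# For Keras/TensorFlow models, save an .h5 file instead:
# model.save("my_model.h5")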