Developer Documentation
Third-party model integration, API specification and upload instructions
The Chief AI Marketplace is an open platform for hosting and consuming machine learning models.
We utilise a Docker engine on the backend, so we ask suppliers to package their models as Docker images (or 'Docker-ize' their algorithms). We also have a specification for structuring your algorithm, which allows it to integrate seamlessly with our platform. Currently, we only support the Python programming language, including all standard machine learning libraries: TensorFlow, Keras, PyTorch & scikit-learn.
Once you have correctly ‘Docker-ized’ & structured your model, you will be able to upload it directly to the platform via our web interface. Steps for both ‘Docker-izing’ your model and correctly structuring your model scripts are outlined in this document.
Table of Contents
Breakdown of Dockerfile commands
Importing models from 3rd party ML training platforms
Structuring your algorithm
Once you’ve trained your algorithm either locally or on your platform of choice, it’s time to package it up so it can be used efficiently for inference.
Your algorithm directory should have a flat structure with any required scripts, helper functions & model weights stored in it. We ask you to include 2 files in particular:
run_classification.py
This is a script used to execute your model via the command line. This script should accept either a single test image or a directory of images, and it should write its predictions, formatted as JSON, to a results file as specified below. There are a number of rules this file must adhere to:
This file must be named: run_classification.py
This file must contain a function which runs a prediction using your machine learning model. The script must accept the following command line argument:
--imageDir
the input to this argument can be a single test input file or, if your model can handle it, a directory of images to be passed to the model for inference
The output of your model must be written to a local text file named: result.txt
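As a rough sketch only (the platform does not prescribe your internals), a minimal run_classification.py might look like the following. We assume a Keras binary classifier saved as mdc_compound_fig_classifier_v1.h5 (the weights file from our example model) and a 224x224 input size; the preprocessing and JSON output layout shown here are illustrative, not requirements.

import argparse
import json
import os

from tensorflow import keras  # assumes a Keras .h5 model; swap in your framework

def run_prediction(image_dir):
    # Load the serialised weights shipped alongside this script
    model = keras.models.load_model("mdc_compound_fig_classifier_v1.h5")

    # Accept either a single image file or a directory of images
    if os.path.isdir(image_dir):
        paths = [os.path.join(image_dir, f) for f in sorted(os.listdir(image_dir))]
    else:
        paths = [image_dir]

    results = {}
    for path in paths:
        image = keras.preprocessing.image.load_img(path, target_size=(224, 224))
        array = keras.preprocessing.image.img_to_array(image) / 255.0
        score = model.predict(array.reshape((1,) + array.shape))
        results[path] = float(score[0][0])

    # Write predictions to result.txt, as the platform requires
    with open("result.txt", "w") as f:
        json.dump(results, f)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--imageDir", required=True,
                        help="a single image file or a directory of images")
    args = parser.parse_args()
    run_prediction(args.imageDir)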
Dockerfile
This file gives instructions for building a Docker image. This enables us to deploy a secure executable Docker container with your model, which also ensures the model’s original functionality is replicated for users accessing your model via our platform.
We give further instructions and an example Dockerfile below.
Aside from these 2 files, how the internals of your model work is entirely up to you. At model upload time, you will be able to specify the input and output format of your model, including whether it’s a regression or classification model. This enables us to present customers with the correct UI when using the model.
You can view an example model: a binary image classifier trained to predict whether an image is a compound image (an image made up of multiple smaller images, graphs, etc.) or a single image. It is hosted publicly on GitHub...
Breakdown of Dockerfile commands
A Dockerfile provides instructions for creating a Docker image, which in turn enables us to run a Docker container (see here for more on Docker containers). Often, a Docker image is based on another image, with some additional customisation. For example, you may build an image based on the ubuntu image which also installs TensorFlow, along with the configuration details needed to make your application run.
Your Dockerfile should include all your model dependencies, and any further instructions required to build & run your machine learning algorithm. Check out Docker’s reference page for more information. The following is a template Dockerfile for an example model we trained. This can be copied, edited and used for your model:
FROM python:3.7
We inherit from the official Python 3.7 Docker image on Docker Hub. We suggest inheriting from official base images, such as TensorFlow’s, rather than manually installing dependencies - this may optimise your model build and upload time.
RUN mkdir -p /srv
We create a directory called ‘srv’
WORKDIR /srv
We set ‘srv’ as the working directory within a running Docker container
COPY requirements.txt /srv
We copy a text file named requirements.txt, which lists your model’s Python dependencies, into the working directory (an illustrative example is given after this breakdown)
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
We upgrade the local installation of pip and subsequently install all the dependencies defined in requirements.txt
COPY run_classification.py /srv
COPY mdc_compound_fig_classifier_v1.h5 /srv
We copy your execution script (run_classification.py) and our example model weights file (mdc_compound_fig_classifier_v1.h5) into the working directory
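Putting the commands above together, the complete template Dockerfile reads:

# Template Dockerfile - substitute your own weights file for our example one
FROM python:3.7
RUN mkdir -p /srv
WORKDIR /srv
COPY requirements.txt /srv
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
COPY run_classification.py /srv
COPY mdc_compound_fig_classifier_v1.h5 /srv

For reference, an illustrative requirements.txt to accompany it might read as follows - the packages and pinned versions here are placeholders, so list whatever your model actually imports, pinned to the versions it was trained with:

tensorflow==2.1.0
numpy==1.18.1
Pillow==7.0.0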
We recommend you build your Docker image locally before uploading, to verify there are no errors in your Dockerfile. This will save you time iterating on errors during the upload process.
Following these instructions, you should be able to upload your machine learning model to our platform via our web interface or directly by API, and set your model loose on the wider world! Steps for uploading via the web interface or API are outlined in the following section.
Suppliers have 2 options to share their Docker-ized model:
Option 1: Your packaged model is saved as a zip archive and uploaded directly via the web platform.
Option 2: Your packaged model is saved as a zip archive and uploaded directly via our RESTful API.
Option 1 is as simple as it sounds! By following the earlier instructions around Structuring your algorithm, you should now have a folder containing all the files needed to run your algorithm.
To speed up the upload process, we ask you to save this folder as a ‘.zip’ archive which can be done in the command line or by right-clicking the folder and choosing ‘compress’. This may be named differently on different operating systems. This zipped archive can then be uploaded directly via our web upload form.
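If you prefer to script this step, Python’s standard library can create the archive for you. A quick sketch, assuming your model folder is named my_model:

import shutil

# Create my_model.zip from the contents of the my_model/ folder
shutil.make_archive("my_model", "zip", root_dir="my_model")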
In addition to the zipped model folder and some descriptive fields, you will be asked for ‘model name’, ‘price’ and ‘tag’. Please refer to the table below for a description of these three fields.
For Option 2, our upload endpoint accepts a POST request and allows you to upload a model directly to our system. As with Option 1, you must provide a zipped archive of your model directory.
The request body should be encoded as form-data. The following variables are required:
In the request header, you must include your API key. This should be passed in a field named x-api-key.
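To illustrate, an upload in Python using the requests library might look like the sketch below. The endpoint URL and form-data field names are assumptions made for illustration - we have assumed they mirror the web form’s ‘model name’, ‘price’ and ‘tag’ fields - so consult the full API specification for the exact values.

import requests

# NOTE: the URL and field names below are illustrative placeholders
url = "https://api.example.com/models/upload"  # hypothetical endpoint
headers = {"x-api-key": "YOUR_API_KEY"}  # API key header, as described above

data = {
    "model_name": "compound-figure-classifier",  # assumed field names
    "price": "10",
    "tag": "image-classification",
}

# Supplying 'files' makes requests encode the body as multipart form-data
with open("my_model.zip", "rb") as archive:
    response = requests.post(url, headers=headers, data=data,
                             files={"model": archive})  # assumed file field name

print(response.status_code, response.text)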
A full API specification for interacting with the platform will be published online very shortly!
Our system currently accepts file uploads of up to approximately 10GB, which may take from 5 to 60 minutes to transfer depending on your internet connection. Larger files can be uploaded, but uploads are set to time out after 90 minutes. This applies whether you upload via the web platform or directly via the RESTful API.
Typically, we see files of around 3GB upload in 5-15 minutes; the Docker build process can take another 20 minutes or so, depending on dependencies.
Importing models from 3rd party ML training platforms
If you’re training your model on a platform like AWS SageMaker or Paperspace Gradient, you can still export your model and publish it on our platform. This process consists of serialising your model into either a ‘pickle’ or ‘h5’ file and downloading it locally. You can then follow the same steps for Structuring your algorithm to get your model into a format ready for serving on our platform.
Please refer to the instructions below for serialising and exporting your model from the following platforms:
AWS SageMaker
The steps for this example workflow can be summarised as:
Build and train your model in SageMaker Notebooks
Serialise your model weights to a .h5 or .pkl file (see here for steps, and the sketch after this list)
Structure your algorithm as described earlier
Upload via API or website
Let your model roam free!
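As a sketch of the serialisation step (step 2 above), here is an illustrative example for a scikit-learn model, with the toy training code standing in for your own; Keras models use the built-in saver shown in the final comment:

import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Stand-in for your real training code
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=200).fit(X, y)

# Serialise the trained estimator to a .pkl file, ready to download locally
with open("my_model.pkl", "wb") as f:
    pickle.dump(clf, f)

# For Keras/TensorFlow models, save an .h5 file instead:
# model.save("my_model.h5")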