Deploy your Keras or TensorFlow Machine Learning Model to AWS using Amazon SageMaker — How to create an API for your cloud-hosted ML model.

Baraa Al-Bourghli
6 min read · May 21, 2020


This practical guide is aimed at builders who already have a trained Keras or TensorFlow machine learning model and want to operationalize it by deploying it behind an API, so it can be used from their apps.

Note: the AWS services used in this guide are not free; there is a cost involved in what we are about to do. So please do not forget to shut down and delete everything you created after you finish experimenting.

This blog post is the result of me trying to follow this tutorial and a few others, only to get stuck in a few places without finding a clear answer on how the API input was formatted or how to call the API from outside of a notebook. So I took the good parts of that tutorial and added what worked for me.

Steps overview:

1- Create a notebook instance

2- Prepare the trained model for deployment

3- Deploy the model to an API

4- Use Postman to call the API

Create a notebook instance:

Sign in to the AWS console, then search for the SageMaker service; you will land on the SageMaker dashboard:

SageMaker dashboard

Click the Notebook instances link and create a notebook instance:

You will be asked to fill in some information about your notebook. For now we will leave most settings at their defaults: just pick a name, and then under the “Permissions and encryption” section you will be asked to create a new IAM role; make sure you select the “Any S3 bucket” option:

Once the IAM role is created, you can create your notebook:

Your notebook instance will now appear under “Notebook instances”; when its status turns to “InService” you will see an option to open JupyterLab.

You will then need to set the kernel to conda_tensorflow_p36, which can run Keras or TensorFlow.

Now we have our workspace ready, so let’s upload our notebook.

Now rerun your notebook as is and you will be ready for the next step.

Prepare the trained model for deployment:

We will now need to export our model to the TensorFlow ProtoBuf format; if you have already done that, skip to step 2.

1- If you are using Keras: we need to save your model’s weights and serialize your model to JSON, so at the bottom of your notebook add the following code:

# Save the weights and serialize the architecture to JSON
model.save_weights('example_model_weights.h5')
model_json = model.to_json()
with open("example_model.json", "w") as json_file:
    json_file.write(model_json)

We will now create our deployment script. We are going to use the default notebook created for us when we created the instance; you will see that it has the default name “Untitled.ipynb”, so let’s rename it to something more meaningful: right-click on it and rename it to something like “deployment.ipynb”:

Now we need to convert the model to TensorFlow ProtoBuf, so let’s start writing our “deployment.ipynb” notebook to do that:

import boto3
import re
import keras
from keras.models import model_from_json
from sagemaker import get_execution_role

role = get_execution_role()

Let’s create a directory to save our exports:

!mkdir keras_model

Now move your model files (“example_model_weights.h5” & “example_model.json”) into this new directory.
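
One way to do that from a notebook cell, assuming the files were saved in the notebook’s working directory (a sketch; adjust the paths if yours differ), is:

!mv example_model_weights.h5 example_model.json keras_model/

Then run the following commands to load the model: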

# Read the architecture JSON and load the saved weights back into a Keras model
json_file = open('/home/ec2-user/SageMaker/keras_model/' + 'example_model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
loaded_model = model_from_json(loaded_model_json)
loaded_model.load_weights('/home/ec2-user/SageMaker/keras_model/example_model_weights.h5')

Now let’s convert it to the target format. Note that the “export/Servo/{version_number}” directory structure must be kept unchanged, because that is where SageMaker’s TensorFlow serving container looks for the model:

# Export the Keras model to the TensorFlow ProtoBuf format
from tensorflow.python.saved_model import builder
from tensorflow.python.saved_model.signature_def_utils import predict_signature_def
from tensorflow.python.saved_model import tag_constants

version_number = '1'
export_dir = 'export/Servo/' + version_number
builder = builder.SavedModelBuilder(export_dir)

# Describe the model's inputs and outputs for TensorFlow Serving
signature = predict_signature_def(
    inputs={"inputs": loaded_model.input},
    outputs={"score": loaded_model.output})

from keras import backend as K
with K.get_session() as sess:
    builder.add_meta_graph_and_variables(
        sess=sess,
        tags=[tag_constants.SERVING],
        signature_def_map={"serving_default": signature})
    builder.save()
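
Optionally, you can inspect the exported signatures with TensorFlow’s saved_model_cli tool (it ships with TensorFlow, so it should be available in this environment) to confirm the input and output tensor names:

!saved_model_cli show --dir export/Servo/1 --all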

2- If you already have a TensorFlow ProtoBuf export, you just need to copy your export files into “export/Servo/1/”.

The result in both cases should be like this:
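
If you want to verify the layout from a cell, a SavedModel export prepared for SageMaker should contain roughly the following files (the exact variables file names can differ):

!find export
# export
# export/Servo
# export/Servo/1
# export/Servo/1/saved_model.pb
# export/Servo/1/variables
# export/Servo/1/variables/variables.index
# export/Servo/1/variables/variables.data-00000-of-00001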

The last step is to compress the files and upload them to S3:

import tarfile

# Compress the whole export/ directory into model.tar.gz
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)

import sagemaker

# Upload the archive to the session's default S3 bucket under the "model" prefix
sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')
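
upload_data returns the S3 URI of the uploaded archive; printing it lets you confirm the path that the next step rebuilds by hand:

print(inputs)  # e.g. s3://<your-default-bucket>/model/model.tar.gz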

Deploy the model to an API:

Let’s prepare the deployment:

# The SDK requires an entry point script; an empty file works for this deployment
!touch train.py

from sagemaker.tensorflow.model import TensorFlowModel

sagemaker_model = TensorFlowModel(model_data='s3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role=role,
                                  framework_version='1.12',
                                  entry_point='train.py')

Now let’s actually deploy it to an API by creating a new cell in our notebook then writing:

%%time
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')

Please note that the `%%time` magic has to be the first line of the cell, so we must put this code in a new cell.

The “ml.m4.xlarge” instance type is not the smallest instance available; you can find the list of available instance types on the Amazon SageMaker pricing page.

The output of the last cell will give you your API endpoint similar to this:

endpoint with name sagemaker-tensorflow-2020-04-26-10-42-56-305

Your API is deployed successfully.
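
Before leaving the notebook, you can sanity-check the endpoint with the predictor object returned by deploy. A minimal sketch, assuming a model that takes three numeric inputs (replace the sample with data matching your model’s input shape):

import numpy as np

# Hypothetical sample: one row with three features
sample = np.array([[0.5, 1.2, 3.4]])
print(predictor.predict(sample))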

Use Postman to call the API:

Now, if you go to the Endpoints section:

You will find your API ready for you. Copy the API link:

Start your Postman client and paste it in as a POST request URL, then open the Authorization section and under TYPE select “AWS Signature”:

Then enter your “AccessKey” and “SecretKey”, plus the AWS region you deployed your API to, and set the service name to “sagemaker”:

Under the Headers tab, set the Content-Type to “application/json”:

Finally, we need to give our API the input. But how do we know what the expected input is?

It turns out that our API accepts a JSON array as input, shaped much like the way we designed our model; for example, my model had 3 inputs:

And since the API expects the equivalent of a NumPy matrix, my input had the following shape:
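
If you prefer to call the endpoint programmatically rather than through Postman, here is a minimal boto3 sketch. The region and the three-value payload are assumptions to match the example model above; the endpoint name is the one printed at deploy time:

import json
import boto3

# Call the SageMaker endpoint from outside the notebook.
# Region, credentials, and the 3-feature payload are assumptions;
# substitute your own values.
runtime = boto3.client('sagemaker-runtime', region_name='us-east-1')

payload = json.dumps([[0.5, 1.2, 3.4]])  # a JSON array: one row, three inputs

response = runtime.invoke_endpoint(
    EndpointName='sagemaker-tensorflow-2020-04-26-10-42-56-305',
    ContentType='application/json',
    Body=payload)

print(json.loads(response['Body'].read()))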

Conclusion

This article hopefully provides a good introduction to deploying your Keras/TensorFlow machine learning model to AWS. Keep in mind that this is a first version, so it may not cover every use case. Any feedback or questions are welcome.
