
Create & Deploy A Deep Learning App - PyTorch Model Deployment With Flask & Heroku


Create and deploy your first deep learning app! In this PyTorch tutorial we learn how to deploy a PyTorch model with Flask and Heroku. We create a simple Flask app with a REST API that returns the result as JSON data, and then we deploy it to Heroku. As an example PyTorch app we do digit classification, and at the end we can draw our own digits and predict them with our live running app.

Technology we will be using


Create project with virtual environment (Commands might slightly differ on Windows).

$ mkdir pytorch-deploy
$ cd pytorch-deploy
$ python3 -m venv venv

Activate it

$ . venv/bin/activate

or on Windows

> venv\Scripts\activate

Install Flask and PyTorch

$ pip install Flask
$ pip install torch torchvision

Create the app

Create a new directory app, and then inside it the file main.py, and insert:

from flask import Flask, request, jsonify
app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    if request.method == 'POST':
        return jsonify({'test': 'test_result'})
We only need one endpoint, @app.route('/predict'). Here we want to receive an image, predict it with our PyTorch model, and then return the result as JSON data. For now we only return dummy JSON data with Flask's jsonify method.
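As a quick check that does not need a running server, the dummy endpoint can also be exercised with Flask's built-in test client. This is only a sketch: it redefines the same minimal app inline so it is self-contained, and the helper name call_predict is made up for illustration.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    # Same dummy payload as in the tutorial's first version of the endpoint
    return jsonify({'test': 'test_result'})

def call_predict():
    """POST to /predict through the test client and return (status, JSON body)."""
    with app.test_client() as client:
        resp = client.post('/predict')
        return resp.status_code, resp.get_json()

if __name__ == '__main__':
    print(call_predict())
```

Because the route only lists POST in methods, a GET request to the same URL is rejected by Flask before predict() runs.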

Run and test the app


$ export FLASK_APP=app/main.py
$ export FLASK_ENV=development
$ flask run

Create a folder named test, and inside it a Python file, and insert:

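import requests

# Send an empty POST request to the local development server
resp = requests.post("http://localhost:5000/predict")

# Show the decoded JSON response
print(resp.json())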

In a second terminal run your file (You may need to install requests: pip install requests). If everything is working correctly, this should print the dummy json data {'test': 'test_result'}.

Train and save your model.

Grab the code from my PyTorch course here. This is a simple feed forward neural net that is trained on the MNIST dataset and used for digit classification. The only modification we make is to add another dataset transformation, because I want to demonstrate that we need this same transformation in our app, too. So at the beginning, use this transformation and apply it to the datasets:

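transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])

# MNIST dataset
train_dataset = torchvision.datasets.MNIST(root='./data',
                                           train=True,
                                           transform=transform,
                                           download=True)

test_dataset = torchvision.datasets.MNIST(root='./data',
                                          train=False,
                                          transform=transform)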
This is just a normalization with the global mean and standard deviation of the MNIST dataset.
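To make the effect of this normalization concrete, here is a small plain-Python sketch of the arithmetic, (value - mean) / std, applied to a single pixel that has already been scaled to the range [0, 1]. It assumes the commonly quoted MNIST mean 0.1307 and standard deviation 0.3081; the function name normalize_pixel is only for illustration and is not part of torchvision.

```python
MNIST_MEAN = 0.1307  # assumed global mean of MNIST pixels in [0, 1]
MNIST_STD = 0.3081   # assumed global standard deviation

def normalize_pixel(value, mean=MNIST_MEAN, std=MNIST_STD):
    """Apply (value - mean) / std to one pixel value in [0, 1]."""
    return (value - mean) / std

if __name__ == '__main__':
    print(normalize_pixel(0.0))  # a black background pixel becomes negative
    print(normalize_pixel(1.0))  # a white stroke pixel becomes clearly positive
```

This is why the app has to apply the same transformation: a model trained on normalized inputs expects background pixels near -0.42, not 0.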

Now at the very end of the file, insert this line to save the model after training:

torch.save(model.state_dict(), "mnist_ffn.pth")

Then run this file. This should train the model, print a high accuracy, and save the model to our specified file. Grab this file and copy it into the app folder.

Create PyTorch utility functions

Create a new file torch_utils.py inside the app folder. Here we want to do three things:

  • Load the model
  • Preprocess the image and convert it to a torch tensor
  • Do the prediction

Load the model

To learn about saving and loading, you can also have a look at my PyTorch course here. We create the same model as in our original file, load the state dictionary, and set it to eval mode. We only use the CPU version here, otherwise our package is too large for Heroku. So if you trained on the GPU, make sure to load it correctly. You'll find the code in the saving/loading tutorial, too. You can remove all .to(device) calls from the code. No worries, the CPU is fine and fast enough for model inference in this application.

import io
import torch 
import torch.nn as nn 
import torchvision.transforms as transforms 
from PIL import Image

class NeuralNet(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNet, self).__init__()
        self.input_size = input_size
        self.l1 = nn.Linear(input_size, hidden_size) 
        self.relu = nn.ReLU()
        self.l2 = nn.Linear(hidden_size, num_classes)  

    def forward(self, x):
        out = self.l1(x)
        out = self.relu(out)
        out = self.l2(out)
        # no activation and no softmax at the end
        return out

input_size = 784 # 28x28
hidden_size = 500 
num_classes = 10
model = NeuralNet(input_size, hidden_size, num_classes)

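PATH = "mnist_ffn.pth"
# Load the trained weights and switch to inference mode
model.load_state_dict(torch.load(PATH))
model.eval()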
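If the model was trained on a GPU, one way to load it on a CPU-only machine is to pass map_location to torch.load. The sketch below is hedged: it uses a tiny nn.Linear as a stand-in model and a temporary file instead of the real MNIST checkpoint, and the helper name load_on_cpu is hypothetical.

```python
import os
import tempfile

import torch
import torch.nn as nn

def load_on_cpu(model, path):
    """Load a saved state dict onto the CPU and put the model in eval mode."""
    state_dict = torch.load(path, map_location=torch.device('cpu'))
    model.load_state_dict(state_dict)
    model.eval()
    return model

# Stand-in model so the sketch runs without the tutorial's checkpoint file
source = nn.Linear(4, 2)
with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, 'demo.pth')
    torch.save(source.state_dict(), checkpoint_path)
    restored = load_on_cpu(nn.Linear(4, 2), checkpoint_path)
```

In the app itself the same call would take the NeuralNet instance and the path to mnist_ffn.pth.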

Load and transform the image

Here we want to make sure our tensor has the same properties as in the MNIST dataset. So we apply the same transformations as in the original file. Additionally we want to resize our image to (28,28), and convert it to a grayscale image. Add this function to the file:

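def transform_image(image_bytes):
    # Same normalization as in training, plus grayscale conversion and resizing
    transform = transforms.Compose([transforms.Grayscale(num_output_channels=1),
                                    transforms.Resize((28, 28)),
                                    transforms.ToTensor(),
                                    transforms.Normalize((0.1307,), (0.3081,))])

    image = Image.open(io.BytesIO(image_bytes))
    # Add a batch dimension: (1, 28, 28) -> (1, 1, 28, 28)
    return transform(image).unsqueeze(0)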


Now we use the same code as in the original file to predict the image and return the prediction in a new helper function:

def get_prediction(image_tensor):
    images = image_tensor.reshape(-1, 28*28)
    outputs = model(images)
    # max returns (value, index)
    _, predicted = torch.max(outputs, 1)
    return predicted
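The call to torch.max along dimension 1 returns, for every row of scores, the largest value and the index where it occurs; the index is the predicted class. A plain-Python illustration of that idea (not PyTorch code, and the function name argmax_per_row is made up for this sketch):

```python
def argmax_per_row(rows):
    """Return (max_values, max_indices) for a list of score rows."""
    values, indices = [], []
    for row in rows:
        # Index of the largest score in this row
        best_index = max(range(len(row)), key=lambda i: row[i])
        values.append(row[best_index])
        indices.append(best_index)
    return values, indices

if __name__ == '__main__':
    scores = [[0.1, 2.5, -1.0], [3.0, 0.2, 0.4]]
    values, predicted = argmax_per_row(scores)
    print(values, predicted)
```

The raw scores need no softmax for this step, since softmax does not change which entry is the largest.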

Put everything together

In main.py, import these helper functions and put everything together in the predict method. Additionally, we include some error checking and only allow certain file types:

from flask import Flask, request, jsonify

from torch_utils import transform_image, get_prediction

app = Flask(__name__)

ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg'}
def allowed_file(filename):
    # xxx.png
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

@app.route('/predict', methods=['POST'])
def predict():
    if request.method == 'POST':
        file = request.files.get('file')
        if file is None or file.filename == "":
            return jsonify({'error': 'no file'})
        if not allowed_file(file.filename):
            return jsonify({'error': 'format not supported'})

        try:
            # Read the uploaded file, preprocess it, and run the model
            img_bytes = file.read()
            tensor = transform_image(img_bytes)
            prediction = get_prediction(tensor)
            data = {'prediction': prediction.item(), 'class_name': str(prediction.item())}
            return jsonify(data)
        except Exception:
            return jsonify({'error': 'error during prediction'})
Note that for this dataset prediction and class_name are the same. Normally we would have to do a mapping from the index to the actual class name here.
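For a dataset whose labels are not digits, such a mapping could look like the following sketch. The label list is purely hypothetical and only serves to show the lookup; the function name class_name_for is likewise an assumption.

```python
# Hypothetical label list; MNIST itself does not need one
CLASS_NAMES = ['cat', 'dog', 'bird']

def class_name_for(index):
    """Map a predicted class index to its human-readable name."""
    if not 0 <= index < len(CLASS_NAMES):
        raise ValueError(f"unknown class index: {index}")
    return CLASS_NAMES[index]

if __name__ == '__main__':
    print(class_name_for(1))
```

In the endpoint, the value for 'class_name' would then come from this lookup instead of str(prediction.item()).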

Test the model

Now grab some example images or draw your own with a simple paint program. In my case I used Paintbrush on the Mac, created a new image with size 100x100, filled the background with black, and drew digits in white. Save it as a png or jpg, and copy the files into the test folder. Now include this image in your POST request, for example with an image called eight.png:

import requests 

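# Upload the image under the form field name 'file', which the endpoint expects
resp = requests.post("http://localhost:5000/predict", files={'file': open('eight.png', 'rb')})

print(resp.json())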

This should print {'prediction': 8, 'class_name': '8'} (the key order may differ). Congratulations! You now have a running PyTorch web app! As a last step we deploy it to Heroku.
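Since the endpoint answers with either an 'error' key or the 'prediction' and 'class_name' keys, the test script can tell the two cases apart. A minimal sketch, assuming the response has already been decoded into a dict (for example with resp.json()); the helper name summarize_response is hypothetical:

```python
def summarize_response(payload):
    """Turn the JSON dict returned by /predict into a short message."""
    if 'error' in payload:
        return f"Request failed: {payload['error']}"
    return (f"Predicted digit: {payload['prediction']} "
            f"(class name: {payload['class_name']})")

if __name__ == '__main__':
    print(summarize_response({'prediction': 8, 'class_name': '8'}))
    print(summarize_response({'error': 'no file'}))
```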

Deploy to Heroku

For production we want to have a proper web server, so we install gunicorn:

$ pip install gunicorn

All the following files should be added to the base directory. First, create a file wsgi.py and insert this line:

from app.main import app

Create a Procfile and insert this:

web: gunicorn wsgi:app

Modify path names to take the app package as base:

# in main.py:
from app.torch_utils import get_prediction, transform_image

# in torch_utils.py:
PATH = "app/mnist_ffn.pth"

Create a runtime.txt and insert the Python version you are using:
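One way to get that string, assuming the python-X.Y.Z format that Heroku has used for runtime.txt, is to print it from the interpreter inside your virtual environment. The helper name runtime_line is made up for this sketch.

```python
import sys

def runtime_line(version_info=sys.version_info):
    """Format the running interpreter's version as python-X.Y.Z."""
    return f"python-{version_info.major}.{version_info.minor}.{version_info.micro}"

if __name__ == '__main__':
    print(runtime_line())
```

Check the printed version against the list of Python runtimes Heroku currently supports before committing the file.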


Make sure you are in the root folder of your package again. Now add all the packages to the requirements.txt file using pip freeze:

$ pip freeze > requirements.txt

Since we can only use the CPU version, modify the file so that it installs PyTorch's CPU-only build. The command for the CPU-only version can be taken from the PyTorch installation guide here. Select Linux, pip, and CUDA None. The download command may be added as the first line in your requirements.txt file:


Add a .gitignore. You can take this version for Python from GitHub. Also add the testing folder, so the file may have this as first lines:


test/

# Byte-compiled / optimized / DLL files

Create a Heroku app. For this you need to have the Heroku CLI installed. You can get it here. Login and then create a new app with the name you want:

$ heroku login -i
$ heroku create your-app-name

Test your app locally:

$ heroku local

Now add a git repository, commit all the files, and push it to Heroku:

$ git init
$ heroku git:remote -a your-app-name
$ git add .
$ git commit -m "initial commit"
$ git push heroku master

This should deploy your app to Heroku and show you the link to your live running app. Now let's use this URL in the test file like this:

import requests

resp = requests.post("",  # insert the URL of your Heroku app here, ending in /predict
                     files={"file": open('eight.png','rb')})

print(resp.json())

Congratulations! You now have a live running app with a PyTorch model that can do digit classification! Note that the first time we send a request this may take a few seconds, since Heroku has to wake up our app first if we only use the free tier.

I hope you enjoyed this tutorial!
