---
title: "How to deploy MLFlow on Heroku"
description: "How to deploy MLFlow on Heroku using PostgreSQL as the database, S3 as the artifact storage and with BasicAuth authentication"
author: "Bartosz Mikulski"
author_bio: "Principal AI Engineer & MLOps Architect. I bridge the gap between \"it works in a notebook\" and \"it works for 200 million users.\""
author_url: https://mikulskibartosz.name
author_linkedin: https://www.linkedin.com/in/mikulskibartosz/
author_github: https://github.com/mikulskibartosz
canonical_url: https://mikulskibartosz.name/how-to-deploy-mlflow-on-heroku
---

This article will show you how to deploy MLFlow on Heroku using PostgreSQL as the database and S3 as the artifact storage. In addition to the deployment, I'll also demonstrate how to set up BasicAuth authentication to protect access to your MLFlow instance.

## Challenges

There are two problems when running MLFlow on Heroku.

First, Heroku instances are publicly available unless you want to pay $1000 per month for deploying your app in Heroku Private Spaces. However, even if you deploy the application in a private network, you will probably want to restrict access to it anyway. **MLFlow does not support authentication out of the box, so we'll have to configure a proxy server.**

The second issue is the Heroku ephemeral filesystem. We can't store the artifact in the filesystem of the machine running MLFlow because it gets reset at least once a day. Because of that, we'll **store the models in S3**.

## Building Heroku Dynos From Dockerfiles

To deploy a Docker image as a Heroku dyno, we need two things:

* we have to change the Heroku stack to `container` using the `heroku stack:set container -a [app-name]` command

* we must include the `heroku.yml` file in the root path of the application repository. The file tells Heroku what should be deployed as the web application, and in the case of deploying Docker images built from Dockerfiles, the configuration looks like this:

```yaml
build:
  docker:
    web: Dockerfile
```
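Assuming the Heroku CLI is installed and you are logged in, the whole setup boils down to a few commands. The app name `mlflow-server` is just an example, and addon plan names change over time, so check the current PostgreSQL plans before running this:

```shell
# Create the app and switch its stack to container-based deployments
heroku create mlflow-server
heroku stack:set container -a mlflow-server

# Attach the PostgreSQL addon (it sets the DATABASE_URL variable)
heroku addons:create heroku-postgresql:hobby-dev -a mlflow-server

# Deploy: Heroku builds the Dockerfile referenced in heroku.yml
git push heroku main
```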

## Preparing the Dockerfile

Now, we must define the Dockerfile. In the file, we will use the Python 3.6 Alpine Linux image as the base, and we will install the required software:

```Dockerfile
FROM python:3.6-alpine

RUN apk update
RUN apk add make automake gcc g++ subversion python3-dev musl-dev postgresql-dev nginx gettext apache2-utils

RUN pip install boto3 psycopg2 mlflow
# ...
```

In the second part of the Dockerfile, we copy the run script and the Nginx configuration template, and remove the default Nginx configuration file. We'll use the `run.sh` script as the Docker entry point.

```Dockerfile
# the second part of the Dockerfile
COPY run.sh run.sh
RUN chmod u+x run.sh

COPY nginx.conf_template /etc/nginx/sites-available/default/nginx.conf_template

RUN rm /etc/nginx/http.d/default.conf

CMD ./run.sh
```

## Configuring Nginx Proxy With Basic Auth

Because Heroku randomly assigns the application HTTP port when it starts a new dyno, we cannot hardcode the port in the Nginx configuration. Instead, we'll use a placeholder variable, which the `run.sh` script will replace with the actual port before Nginx starts.

An Nginx configuration file is quite long, so I will not include the entire file here. To prepare a working Nginx configuration, you need to:

1. Copy the main Nginx configuration file (`/etc/nginx/nginx.conf`) and save it as the `nginx.conf_template` file. We will use it as the starting point for the template. (The generated file is later passed to Nginx with `-c`, so it must be a complete configuration, not just a server block.)

2. Remove the virtual host import:

```
# Includes virtual hosts configs.
include /etc/nginx/http.d/*.conf;
```

3. In the `http` block, add a new `server` configuration. We use a placeholder instead of a hardcoded listening port, and we configure the BasicAuth user file and the proxy target.

```conf
server {
        listen  $HEROKU_PORT;

        access_log /var/log/nginx/reverse-access.log;
        error_log /var/log/nginx/reverse-error.log;

        location / {
            auth_basic "Restricted Content";
            auth_basic_user_file /etc/nginx/.htpasswd;

            proxy_pass                          http://127.0.0.1:$MLFLOW_PORT/;
            proxy_set_header Host               $host;
            proxy_set_header X-Real-IP          $remote_addr;
            proxy_set_header X-Forwarded-For    $proxy_add_x_forwarded_for;
        }
    }
```

## Implementing the Run Script

In the run script, we must generate the actual Nginx configuration file from the template and the assigned ports, configure the BasicAuth user, and start both MLFlow and Nginx.

However, first, we have to deal with an annoying Heroku quirk. The PostgreSQL Heroku addon generates an environment variable with a SQL connection string. The connection string starts with the protocol `postgres://`, which newer `sqlalchemy` versions no longer accept. Therefore, the first two lines of the script rewrite the `DATABASE_URL` into the supported format.
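The rewrite itself is a simple string operation. With a made-up connection string (not a real credential), it looks like this:

```shell
# Hypothetical Heroku-style connection string, for illustration only
DATABASE_URL="postgres://user:secret@ec2-1-2-3-4.compute-1.amazonaws.com:5432/dbname"

# Drop the first 8 characters ("postgres") and prepend the scheme
# that sqlalchemy accepts
database_without_protocol=$(echo "$DATABASE_URL" | cut -c 9-)
BACKEND="postgresql$database_without_protocol"

echo "$BACKEND"
# prints: postgresql://user:secret@ec2-1-2-3-4.compute-1.amazonaws.com:5432/dbname
```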

In the subsequent lines, we generate the Nginx configuration file by replacing `$HEROKU_PORT` with the content of the `$PORT` environment variable. However, Heroku may assign port 5000 to the Docker container. In this case, we cannot run MLFlow using port 5000. Because of that, we include an if statement to change the MLFlow port in case of a conflict.

The `htpasswd` command creates a new user with the username and password passed as Heroku environment variables.

The last three commands stop any running Nginx process, start the MLFlow server in the background, and run Nginx in the foreground as the container's main process.

```bash
# Heroku's DATABASE_URL starts with "postgres://", which newer sqlalchemy
# versions reject. Strip the first 8 characters ("postgres") and prepend
# the supported "postgresql" scheme.
database_without_protocol=$(echo "$DATABASE_URL" | cut -c 9-)
export BACKEND="postgresql$database_without_protocol"
export HEROKU_PORT="$PORT"

# MLFlow defaults to port 5000. If Heroku assigned 5000 to the dyno,
# move MLFlow to 4000 to avoid the conflict.
if [[ $PORT -eq 5000 ]]
then
  export MLFLOW_PORT=4000
else
  export MLFLOW_PORT=5000
fi

# Render the Nginx configuration, replacing only the two listed placeholders
envsubst '$HEROKU_PORT,$MLFLOW_PORT' < /etc/nginx/sites-available/default/nginx.conf_template > /etc/nginx/sites-available/default/nginx.conf

# Create the BasicAuth user from the Heroku environment variables
htpasswd -bc /etc/nginx/.htpasswd "$BASIC_AUTH_USER" "$BASIC_AUTH_PASSWORD"

# Stop Nginx if it is already running
killall nginx

# Start MLFlow in the background, bound to localhost only
mlflow ui --port "$MLFLOW_PORT" --host 127.0.0.1 --backend-store-uri "$BACKEND" --default-artifact-root "$S3_LOCATION" &

# Run Nginx in the foreground as the container's main process
nginx -g 'daemon off;' -c /etc/nginx/sites-available/default/nginx.conf
```

## Creating an AWS Account With S3 Access Using Terraform

The Terraform file below contains a definition of a new IAM user who has access to the S3 location used as the artifact storage. The Terraform configuration will store the user's access key and the secret key in the AWS Secrets Manager.

```hcl
resource "aws_iam_user" "mlflow_username" {
  name = "mlflow-username"
}

resource "aws_iam_access_key" "mlflow_access_key" {
  user = aws_iam_user.mlflow_username.name
}

resource "aws_secretsmanager_secret" "mlflow_access_api_keys" {
  name = "mlflow_access_api_keys"
  description = "API Key and Secret Key for MLFLow"
}

resource "aws_secretsmanager_secret_version" "mlflow_access_api_keys_v1" {
  secret_id     = aws_secretsmanager_secret.mlflow_access_api_keys.id
  secret_string = jsonencode({"AccessKey" = aws_iam_access_key.mlflow_access_key.id, "SecretAccessKey" = aws_iam_access_key.mlflow_access_key.secret})
}

resource "aws_iam_user_policy" "mlflow_access_policy" {
  name = "mlflow_access_policy"
  user = aws_iam_user.mlflow_username.name

  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:PutObject"
      ],
      "Resource": ["arn:aws:s3:::bucket_name/parent/key/name/*"]
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": ["arn:aws:s3:::bucket_name"]
    }
  ]
}
EOF
}
```

After applying the Terraform changes, you will find the user's access keys in the AWS Secrets Manager.
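If you prefer the command line to the AWS console, the stored keys can be read back with the AWS CLI (assuming your CLI profile has permission to read the secret):

```shell
# Fetch the JSON payload stored by the Terraform configuration above
aws secretsmanager get-secret-value \
  --secret-id mlflow_access_api_keys \
  --query SecretString \
  --output text
```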

## Setting Heroku Environment Variables

In the `run.sh` script, we have used several environment variables. All of them must be set up in Heroku. Heroku automatically provides the `$PORT` variable at deployment time, so we don't need to worry about it. The `$DATABASE_URL` is added automatically when we configure a PostgreSQL addon.

In addition to those variables, we need:

* `$S3_LOCATION` defines the S3 path to the bucket and parent key used to store the artifacts. For example: `s3://bucket_name/parent/key/name`
* `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` contain the credentials of the AWS user
* `BASIC_AUTH_USER` and `BASIC_AUTH_PASSWORD` are the username and password used to access MLFlow
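All of these can be set with a single `heroku config:set` call. Every value below is a placeholder, including the app name:

```shell
# Set all required variables at once (placeholders, not real values)
heroku config:set \
  S3_LOCATION=s3://bucket_name/parent/key/name \
  AWS_ACCESS_KEY_ID=your-access-key \
  AWS_SECRET_ACCESS_KEY=your-secret-key \
  BASIC_AUTH_USER=mlflow \
  BASIC_AUTH_PASSWORD=change-me \
  -a mlflow-server
```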

## What About HTTPS?

Fortunately, Heroku terminates HTTPS at its routing layer for every app on the default `herokuapp.com` domain, so we don't need to configure anything ourselves.
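A quick way to verify the whole setup, including BasicAuth, is to query the app with and without credentials. Replace the app name, username, and password with your own:

```shell
# Without credentials Nginx should reject the request with 401
curl -s -o /dev/null -w "%{http_code}\n" https://mlflow-server.herokuapp.com/

# With valid credentials the MLFlow UI should answer with 200
curl -s -o /dev/null -w "%{http_code}\n" -u "mlflow:change-me" https://mlflow-server.herokuapp.com/
```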