Dmytro Hrimov

Senior software engineer

Set up a private PyPi repository

In this tutorial, I will go through setting up a private PyPi server. I will be using pypiserver, Docker, Terraform and AWS EC2.

Why?

I could not find free or cheap private PyPi repository options, so I started my own. It can run on a small instance such as `t2.micro` and costs around $10 monthly. It is also an AWS Free Tier eligible instance type, so it can even be free for the first year of usage. As far as I know, running a private PyPi server on a `t2.micro` EC2 instance is the cheapest option available at the moment of writing.

Scope

I am going to cover the process of setting up an EC2 instance, configuring it, installing a pypiserver, and protecting access with a password.

I will leave out of the scope of this tutorial the process of signing up on AWS, Terraform code execution and management, Route53 domain configuration, and HTTP connectivity. Please let me know via the contacts page if you want me to cover any of these.

Alternatives

Since the private server is not entirely free, I want to quickly show you an alternative to running a private PyPi server: pointing your dependency at a git repository with a specific release ref. In Poetry, it looks like this:

some-library = {git = "git@github.com:owner/some-library.git", rev = "v1.0.0"}

The cons of this solution are:

  • You need to update refs for every release manually
  • You need to configure your build system with git keys so it can load the data from the repository
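For pip-based projects, the equivalent pin is a direct git reference in `requirements.txt` (the repository URL below is a placeholder, and the SSH form assumes your build system has git keys configured):

```
some-library @ git+ssh://git@github.com/owner/some-library.git@v1.0.0
```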

Set up an EC2 instance

Now, let me start setting up the private server. Before I even get to the server part, I need a virtual machine up and running. But to start a machine, I need to do three other things first.

Generate an SSH key and create a key pair

I will access a virtual machine via SSH using an SSH key. To generate a key, I need to run a command:

ssh-keygen -t ed25519 -C "your_email@example.com"

If you are using a Windows machine, you want to use this command instead:

ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

Executing this command will generate a key pair for me.

One more step is to limit the permissions on the private key file:

chmod 400 ~/.ssh/<key_name>

Now, I need to grab the public part of the key (stored in the file with the `.pub` extension) and use it in the following Terraform resource:

resource "aws_key_pair" "pypi_key_pair" {

  key_name = "pypi-key"

  public_key = "<public_key_goes_here>"

}
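Instead of pasting the key inline, Terraform can also read it from disk with the built-in `file()` and `pathexpand()` functions; the path below is an assumption about where you saved the key:

```hcl
resource "aws_key_pair" "pypi_key_pair" {
  key_name   = "pypi-key"
  public_key = file(pathexpand("~/.ssh/pypi_key.pub"))
}
```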

Create a security group

Next, I need to create a security group that AWS will attach to the virtual machine.

resource "aws_security_group" "pypi_security_group" {

  name = "pypi-server-security-group"

  

  # Allow ssh access into the machine

  ingress {

    from_port = 22

    to_port = 22

    protocol = "tcp"

    cidr_blocks = ["0.0.0.0/0"]

    ipv6_cidr_blocks = ["::/0"]

  }



  # Allow the machine to go to the internet

  egress {

    from_port = 0

    to_port = 0

    protocol = "-1"

    cidr_blocks = ["0.0.0.0/0"]

    ipv6_cidr_blocks = ["::/0"]

  }

}

One thing I intentionally omitted here is the port that the server will listen on. The port depends on how you configure the environment. The easiest solution is to allow access on port 80 or 8080. So, if you just want this to work, insert another ingress block for the port you want the server to listen on, and I will move on.
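For example, since I will map the server to port 8082 later in the docker-compose file, the extra ingress block could look like this (adjust the port to whatever you choose):

```hcl
  # Allow HTTP access to the PyPi server
  ingress {
    from_port        = 8082
    to_port          = 8082
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
```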

Create an EBS volume

Since I want to be safe and not rely too much on the virtual machine, I want my package data to go to a separate EBS volume. If something happens to the machine, or I want to upgrade the server or do anything else, I can attach this same volume to a different machine and not lose my packages.

To create an EBS volume, I’ll be using this piece of Terraform code that creates a 10GB volume of `gp3` type, which is general-purpose storage provided by Amazon. Note that the volume must be created in the same availability zone as the instance that will use it:

resource "aws_ebs_volume" "pypi_storage" {

  availability_zone = "<availability_zone>"

  size = 10

  type = "gp3"

}

Start an EC2 instance

Finally, I am ready to write the Terraform code for my VM:

resource "aws_instance" "pypi" {

  ami = "<ami_of_choice>"

  availability_zone = "<availability_zone>"

  instance_type = "t2.micro"

  associate_public_ip_address = true

  key_name = aws_key_pair.pypi_key_pair.key_name

  security_groups = [aws_security_group.pypi_security_group.name]

}



resource "aws_volume_attachment" "ebs_att" {

  device_name = "/dev/sdf"

  volume_id = aws_ebs_volume.pypi_storage.id

  instance_id = aws_instance.pypi.id

}

You can use any AMI you want; I used the AWS console to find the one I needed, which was an Amazon Linux AMI. Grab the latest AMI ID from the AWS console rather than copying a potentially outdated one from this post. The ID looks like this: `ami-1a2b3c`.
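Alternatively, Terraform can look up the latest AMI for you with an `aws_ami` data source; the name filter below is an assumption targeting Amazon Linux 2023 on x86_64, so adjust it to your preferred distribution:

```hcl
# Resolve the most recent Amazon Linux 2023 AMI published by Amazon
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-2023*-x86_64"]
  }
}

# Then, in the instance resource:
#   ami = data.aws_ami.amazon_linux.id
```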

After executing the Terraform code above, the EC2 instance should be up and running with the public IP address that you can find in the AWS console.

Log into the machine and install Docker

I will be running the pypi server in a Docker container, so I need to install Docker on the virtual machine.

First, I will SSH into the machine:

ssh -i ~/.ssh/<key_name> ec2-user@<vm_public_ip_address>

Once connected, I will update the machine:

sudo yum update -y

Since I am using a separate storage device to store packages, I need to mount it. First, let’s find out what drives are available:

[ec2-user@ip-192-168-0-1 ~]$ lsblk

NAME       MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS

xvda       202:0   0  8G   0 disk 

├─xvda1    202:1   0  8G   0 part  /

├─xvda127  259:0   0  1M   0 part 

└─xvda128  259:1   0  10M  0 part  /boot/efi

xvdf       202:80  0  10G  0 disk

Here, I see my 10GB drive at the end, so I need to format it. Format the drive only once, and only on an empty volume; running this on a volume that already contains data will erase it.

$ sudo mkfs -t xfs /dev/xvdf

If the previous format command does not work, you need to install an additional package:

$ sudo yum install xfsprogs

Now I mount my drive:

$ sudo mkdir /data

$ sudo mount /dev/xvdf /data
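Note that a manual mount does not survive a reboot. To remount automatically, you can add a line to /etc/fstab; using the volume UUID from `sudo blkid` is more robust, but assuming the device keeps the name /dev/xvdf, the entry would look like this:

```
/dev/xvdf  /data  xfs  defaults,nofail  0  2
```

The `nofail` option lets the machine boot even if the volume is detached.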

When the drive is mounted, I will install Docker:

$ sudo yum install -y docker

$ sudo service docker start 

$ sudo curl -L "https://github.com/docker/compose/releases/download/v2.24.6/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose 

$ sudo chmod +x /usr/local/bin/docker-compose

Quick check that everything is up and running:

[ec2-user@ip-192-168-0-1 ~]$ docker --version

Docker version 24.0.5, build ced0996



[ec2-user@ip-192-168-0-1 ~]$ docker-compose --version

Docker Compose version v2.24.6

Now I need to add `ec2-user` to the `docker` user group to run Docker commands without the need to use `sudo`:

$ sudo usermod -a -G docker ec2-user

For this last change to take effect, I need to exit the SSH session and log back in again.

PyPi setup

Before I get to the server setup itself, I need to do some preparation steps first.

Create a user

As I don’t want this server to run unprotected, I need to create a user and a password first:

$ sudo yum install -y httpd-tools

$ mkdir -p /home/ec2-user/pypi/auth

$ cd /home/ec2-user/pypi/auth

$ htpasswd -sc .htpasswd <SOME-USERNAME>

Then, enter the password when prompted.

Packages directory

I will create the packages directory where the server will be storing all of the Python packages:

$ mkdir /data/packages

Server setup

To set up the PyPi server, I need to navigate to the directory I created earlier:

$ cd /home/ec2-user/pypi

Here, I need to create a `docker-compose.yml` file with the following content:

version: '3.7'



services:

  pypi-server:

    image: pypiserver/pypiserver:latest

    ports:

      - 8082:8080

    volumes:

      - type: bind

        source: /home/ec2-user/pypi/auth

        target: /data/auth

      - type: bind

        source: /data/packages

        target: /data/packages

    command: -P /data/auth/.htpasswd -a update,download,list /data/packages

    restart: always

Notes on the `docker-compose.yml`:

  • I map port 8082 on the VM to port 8080 inside the Docker container, which is the port pypiserver listens on.
  • I bind the auth folder so that the server can use it for authentication.
  • I also bind the packages folder instead of using a Docker volume so that pypiserver stores packages on the external volume I created.

Finally, I can start the server:

$ docker-compose up -d --build

If I now go to my browser and open the following URL, I should see my PyPi server interface:

http://<your_public_ip_address>:8082
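If some consumers of your packages use plain pip instead of Poetry, they can point pip at the same server through its configuration file (`~/.config/pip/pip.conf` on Linux). The credentials-in-URL form below is an assumption for brevity; in practice, a keyring is a safer place for the password:

```ini
[global]
extra-index-url = http://<username>:<password>@<your_public_ip_address>:8082/simple/
trusted-host = <your_public_ip_address>
```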

Configure Poetry

I use Poetry in most of my projects and can recommend it if you want to try it as well. To be able to upload and download packages, I need to do some configuration.

First, I want to add my repository:

$ poetry config repositories.mypypi http://<your_public_ip_address>:8082

And then configure the authentication:

$ poetry config http-basic.mypypi <username>

Enter the password when prompted.
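In CI, where an interactive prompt is not available, the same credentials can be supplied through Poetry's environment variable convention (`POETRY_HTTP_BASIC_<REPO>_USERNAME` and `_PASSWORD`, with the repository name uppercased). The placeholders below stand in for your actual credentials:

```shell
# Poetry reads credentials for the "mypypi" repository from these variables
export POETRY_HTTP_BASIC_MYPYPI_USERNAME="<username>"
export POETRY_HTTP_BASIC_MYPYPI_PASSWORD="<password>"
```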

Publish the package

To publish the package, you just need to tell Poetry which repo to publish to:

$ poetry build

$ poetry publish -r mypypi

Use the package

To use a package from the private repository, I need to add this to the `pyproject.toml` file:

[[tool.poetry.source]]

name = "mypypi"

url = "http://<your_public_ip_address>:8082"

After this block is added, I can run `poetry add <your_private_package_name>` and it should work.
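To make sure a dependency always resolves from the private repository rather than the public index, you can also pin it to the source explicitly in `pyproject.toml` (the package name and version below are placeholders):

```toml
[tool.poetry.dependencies]
some-library = { version = "^1.0.0", source = "mypypi" }
```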

Conclusion

In this tutorial, I went through the steps of setting up a private PyPi repository, which should cost very little to keep up and running. If you have questions or think I could cover some pieces better, please message me on social media. Thank you so much for taking the time to read this!