
Fun with FTP and AWS

Introduction

I just moved into a house we've been renovating for the past year and as part of this we did a complete re-wire. This gave me the option to specify data cabling to be installed as part of the build and I had some run to the eaves of the house so I could install PoE cameras.

The camera I'm using is the Reolink RLC-410, which supports PoE, is IP66 rated and is quite easy to mount. It is also quite cheap and easy to get hold of.

These cameras support motion detection, and when they detect motion they can either send an email or upload the photos to an FTP server. Email is not useful and FTP is a pretty old and horrible protocol. I wanted to find a way to get the images into an S3 bucket for later processing. One option is to use AWS Transfer Family, but this is really expensive to keep running 24x7: more than $200 USD per month.

FTP to S3

The first building block of the solution is a quick app I built in Python which uses pyftpdlib to expose an FTP server which waits for files to be uploaded and then in turn uploads them to an S3 bucket. You can find the application here https://github.com/richardjkendall/ftp-to-s3.
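As a rough illustration of the idea (not the code from the repository above), here is a minimal sketch of an FTP server that pushes each completed upload to S3. It assumes pyftpdlib and boto3 are installed; the function names, the "elmw" permission string and the choice to delete the local file after upload are assumptions made for this example.

```python
import os


def s3_key_for(local_path: str, ftp_root: str, prefix: str = "") -> str:
    """Map an uploaded file's local path to an S3 object key."""
    relative = os.path.relpath(local_path, ftp_root)
    # S3 keys use forward slashes regardless of the host OS.
    return prefix + relative.replace(os.sep, "/")


def build_server(bucket: str, ftp_root: str, user: str, password: str, port: int = 21):
    """Create (but do not start) an FTP server that copies uploads to S3."""
    # Third-party imports are kept local so s3_key_for works without them.
    import boto3
    from pyftpdlib.authorizers import DummyAuthorizer
    from pyftpdlib.handlers import FTPHandler
    from pyftpdlib.servers import FTPServer

    s3 = boto3.client("s3")

    class S3UploadHandler(FTPHandler):
        def on_file_received(self, file):
            # pyftpdlib calls this hook once an upload has completed.
            s3.upload_file(file, bucket, s3_key_for(file, ftp_root))
            # Assumption for this sketch: the local copy is no longer needed.
            os.remove(file)

    authorizer = DummyAuthorizer()
    # e = change directory, l = list, m = make directory, w = store files.
    authorizer.add_user(user, password, ftp_root, perm="elmw")
    S3UploadHandler.authorizer = authorizer
    return FTPServer(("0.0.0.0", port), S3UploadHandler)
```

Calling `build_server(...).serve_forever()` would start the server. Note that in this sketch the S3 upload runs inside pyftpdlib's event loop, so a slow upload would hold up other clients; a real service would probably hand the upload to a worker thread.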

FTP and Docker

I run most of my services on an AWS ECS cluster, so I wanted to package the application as a Docker image. FTP is a very old protocol and there are some issues with running it in Docker. The protocol supports two data transfer modes:

  • Active: clients connect to the server on port 21 for control communication. When a data transfer is needed, the client nominates a local port and tells the server the port number. The server then initiates a connection to the client on that port, and this connection is used to transfer data. Generally this mode is not well supported because most clients are behind NAT or firewalls.
  • Passive: clients connect to the server on port 21 for control communication. When a data transfer is needed, the server nominates a port and the client connects to the server on this new port. This connection is used for data transfers.
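To make the two modes concrete, here is a small sketch of how the data port is communicated on the control connection. In both cases the address travels as six comma-separated numbers: four for the IPv4 address and two for the port, where port = p1 * 256 + p2. The function names are made up for this example.

```python
import re
from typing import Tuple


def encode_port_command(ip: str, port: int) -> str:
    """Build the PORT command an active-mode client sends to the server."""
    high, low = divmod(port, 256)
    return "PORT " + ",".join(ip.split(".") + [str(high), str(low)])


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract the host and port from a server's 227 reply to PASV."""
    match = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if match is None:
        raise ValueError(f"not a valid PASV reply: {reply!r}")
    numbers = [int(n) for n in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]
    return host, port


# Active: the client tells the server where to connect back to.
print(encode_port_command("192.168.1.20", 50000))
# Passive: the server tells the client which port to connect to.
print(parse_pasv_reply("227 Entering Passive Mode (10,0,0,5,234,96)"))
```

The passive reply is the important one here: the port in it is chosen by the server at run-time, and the client will open a brand new connection to it.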

Given the challenges with active transfers, lots of setups use passive. This creates a small challenge for Docker because, as well as opening port 21, you also need to be able to accept inbound connections on arbitrary ports at run-time. You can open ranges of ports when using Docker directly, but at the time of writing container managers like K8s and ECS do not support this. The workaround is to use the host networking mode, which connects the running container directly to the network interface of the host.

Service Discovery

The next challenge with containerised services is integrating with Service Discovery. With my ECS cluster I use the AWS Cloud Map Service Discovery service, and each of my services is registered with this tool. For my HTTP services I use a build of haproxy which I integrated with the AWS Service Discovery API to find the backend instances and route traffic to them - you can see that here https://github.com/richardjkendall/haproxy. This does not work for FTP though: although it can find the backend and route traffic towards the command port (21), it does not work for the data connections (which get made directly from the client to the server).
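For illustration, a lookup against the Cloud Map API might look something like the sketch below. It assumes a boto3 `servicediscovery` client and its `discover_instances` call; the function name is made up for this example, and it is not the code used in the haproxy build linked above.

```python
from typing import List, Tuple


def discover_backends(client, namespace: str, service: str) -> List[Tuple[str, int]]:
    """Return (ip, port) pairs for the instances registered under a service.

    `client` is expected to be boto3.client("servicediscovery").
    """
    response = client.discover_instances(
        NamespaceName=namespace,
        ServiceName=service,
    )
    backends = []
    for instance in response.get("Instances", []):
        attributes = instance.get("Attributes", {})
        ip = attributes.get("AWS_INSTANCE_IPV4")
        port = attributes.get("AWS_INSTANCE_PORT")
        # Skip instances that were registered without an address or port.
        if ip and port:
            backends.append((ip, int(port)))
    return backends
```

A lookup like this only yields the address of the control port, which is why it is enough for HTTP but not for FTP data connections.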

FTP Reverse Proxy

So we need a reverse proxy which can discover the backend instance running on ECS and then sit in between the cameras and the backend service to manage all the connections. There is an FTP proxy module available for the proftpd server which does what I need: https://github.com/Castaglia/proftpd-mod_proxy

As with haproxy, I have created a dockerised version which discovers the backend service (using the DNS SRV record) and then creates a configuration file to tell proftpd to reverse proxy connections to the FTP backend service. You can see that code here https://github.com/richardjkendall/ftp-rproxy and the docker image here https://hub.docker.com/r/richardjkendall/ftp-rproxy.
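The config generation step could be sketched roughly as follows. This is not the template from the ftp-rproxy repository: the function name, the default tables path and the ordering rule are assumptions for this example, and the DNS lookup itself is left out (a library such as dnspython could supply the SRV records). The directives used (ProxyEngine, ProxyTables, ProxyRole, ProxyReverseServers) come from mod_proxy.

```python
from typing import Iterable, Tuple

# One DNS SRV record as (priority, weight, port, target).
SrvRecord = Tuple[int, int, int, str]


def render_proxy_config(records: Iterable[SrvRecord],
                        tables_dir: str = "/var/ftp/proxy") -> str:
    """Render a mod_proxy reverse-proxy block for the discovered backends."""
    # Simplification: lowest priority first, then highest weight first.
    ordered = sorted(records, key=lambda r: (r[0], -r[1]))
    if not ordered:
        raise ValueError("no SRV records found for the backend service")
    servers = " ".join(
        # SRV targets are usually fully qualified, so drop the trailing dot.
        f"ftp://{target.rstrip('.')}:{port}"
        for _priority, _weight, port, target in ordered
    )
    lines = [
        "<IfModule mod_proxy.c>",
        "  ProxyEngine on",
        f"  ProxyTables {tables_dir}",
        "  ProxyRole reverse",
        f"  ProxyReverseServers {servers}",
        "</IfModule>",
    ]
    return "\n".join(lines) + "\n"
```

The rendered block would then be written into the proftpd configuration before the server starts, so the proxy always points at whatever backend the SRV record currently advertises.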

Overall Solution

My home network is connected to my AWS VPC via VPN. This means that my AWS ECS services are accessible from my local network without a lot of difficulty. My local DNS is also integrated with AWS Route 53 (subject of a later article) so I can resolve service discovery addresses locally. I typically run most of my local workloads on Raspberry Pis - but I've hit an issue with this service that I'll cover later.

[Figure: solution diagram v2]

So now when my cameras detect motion they upload the photos to a local FTP server, which is actually a reverse proxy for an FTP server running on an AWS ECS cluster, which is in fact not really an FTP server at all: it is an application server which uploads any file sent to it to an AWS S3 bucket. Once my files are in S3 I can trigger workflows to analyse them and create alerts as needed. I'll cover that work in a later article.

Problems with ARMv7

When I initially built this, I created a prototype on an AWS a1 series machine, which uses their 64-bit ARM Graviton processor. I did this because my target runtime environment is a Raspberry Pi running as a 32-bit ARMv7 system (I'm using a Raspberry Pi 3B+ in this instance). To compile the proxy as a 32-bit binary I used a container running the arm32v7/alpine image.

Using this approach I was able to get something which compiled, but I started hitting a strange bug where FTP LIST commands timed out. It turns out this is a documented issue from 2018 https://github.com/Castaglia/proftpd-mod_proxy/issues/130. I've added some detail to the issue hoping that the author will be able to fix it :)

In the meantime I'm running my local FTP proxy on an amd64 machine.

-- Richard, Sep 2020