Terraform Tutorial: Build a Transparent Proxy in AWS VPC
In this article I’m going to be setting up an example network and deploying a transparent proxy to it.
To make this repeatable and to show exactly how it can be deployed in AWS VPC, I am using Terraform.
Terraform is an excellent tool for describing and automating cloud infrastructure.
All of the Terraform code in this project can be found here: https://github.com/nearform/aws-proxy-pattern
I have chosen to use AWS for this article and the associated Terraform code, but the concept is very general and equivalent Terraform code could be written for any supported IaaS platform (e.g. Google Cloud or Azure).
A Quick Overview
Let me start by defining some terms and explaining what a proxy is good for.
A transparent proxy is one where there is no configuration required on the applications using the proxy. The hosts may have some routing configuration to make it work, but applications are unaware.
This is in contrast to an explicit proxy, where applications are made aware and direct their traffic to the proxy’s IP address e.g. via a browser PAC (proxy autoconfiguration) file.
In large deployments, systems like Windows Group Policy and WPAD (PAC file discovery via DHCP or DNS) are used to configure a large number of hosts automatically.
Using a proxy can provide a degree of control over outbound web traffic.
For example, the proxy can monitor and keep audit logs of that traffic, or intervene and block traffic that is likely to pose a threat to the network.
There is also a very significant non-security related benefit in the form of caching, where repeated downloads of the same content can be served from the proxy instead.
However, intentionally making a proxy the sole means of internet access introduces a bottleneck and a single point of failure, so typically some kind of high availability setup would be used, such as a load balancer in front of a proxy group.
Historically, web proxies have been most useful to system administrators in corporate networks, but even a network hosting something like a SaaS product in a microservices architecture can benefit.
One of the first things a piece of malware will do (perhaps after attaining persistence on a host) will be to contact command and control infrastructure over the internet.
A proxy can make you aware of this, block the attempt and give you a chance to identify the affected machine.
I chose Squid for the proxy software. Squid is a caching web proxy and is a very popular and mature project. It has been around for more than 20 years and is easy to install and configure.
The diagram above shows the key components – a private subnet with an example host that will be trying to make requests to the internet.
The proxy receives traffic via a network default route, logging requests and filtering based on domain name.
A management host is used just to provide a means to SSH to the example host, which only has a private IP address.
Management networks are a topic for another article – but this simulates something which would normally happen over a VPN, if at all.
There is a private subnet with a default route to the proxy’s network interface. This means all internet-bound traffic will arrive at the proxy server first.
Disabling source and destination checks on the proxy instance is also necessary to allow it to receive traffic not destined for the instance’s own IP address.
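A minimal Terraform sketch of these two pieces might look like the following. The resource names, variables and instance type are illustrative assumptions, not the code from the linked repository.

```hcl
# Hypothetical inputs; the real project may wire these up differently.
variable "proxy_ami_id" {
  type = string
}

variable "proxy_subnet_id" {
  type = string
}

variable "private_route_table_id" {
  type = string
}

# The proxy instance. Disabling the source/destination check lets it
# accept packets addressed to internet hosts rather than to itself.
resource "aws_instance" "proxy" {
  ami               = var.proxy_ami_id
  instance_type     = "t3.micro" # assumed size
  subnet_id         = var.proxy_subnet_id
  source_dest_check = false
}

# Default route for the private subnet: send all internet-bound
# traffic to the proxy's primary network interface.
resource "aws_route" "private_default" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  network_interface_id   = aws_instance.proxy.primary_network_interface_id
}
```

Routing to the network interface rather than to an IP address is what makes the proxy transparent: hosts in the private subnet need no proxy settings at all.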
Finally, there are some iptables rules on the proxy instance which redirect the packets into the squid server (which is listening on port 3129 for HTTP and port 3130 for HTTPS).
From there, the fact that the proxy instance has a public IP address and the default route for the public subnet is the VPC’s internet gateway is sufficient to send the traffic out to the internet.
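The public side of the routing could be sketched in Terraform roughly as follows. The variable and resource names here are hypothetical, and the sketch assumes the VPC and public subnet already exist.

```hcl
# Hypothetical inputs identifying an existing VPC and public subnet.
variable "vpc_id" {
  type = string
}

variable "public_subnet_id" {
  type = string
}

resource "aws_internet_gateway" "main" {
  vpc_id = var.vpc_id
}

resource "aws_route_table" "public" {
  vpc_id = var.vpc_id
}

# Default route for the public subnet: straight out via the internet gateway.
resource "aws_route" "public_default" {
  route_table_id         = aws_route_table.public.id
  destination_cidr_block = "0.0.0.0/0"
  gateway_id             = aws_internet_gateway.main.id
}

# Attach the route table to the subnet that hosts the proxy.
resource "aws_route_table_association" "public" {
  subnet_id      = var.public_subnet_id
  route_table_id = aws_route_table.public.id
}
```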
Provisioning the Proxy Instance
There were many ways I could have provisioned the EC2 instance used for the proxy.
In this excellent article on the AWS security blog, the proxy was built from source on the running instance.
Many package managers like apt or yum have a squid package available, though you often don’t get to run the latest version or with particular compiled-in features.
There are Chef and Puppet modules for squid to follow a more ‘declarative’ paradigm, where I specify the machine configuration I need and let the provisioning layer translate that into ‘imperative’ commands to achieve the goal.
Whichever provisioning method is used, a machine imaging solution like Packer could be used to build an AWS AMI ahead of time. This gets an instance serving traffic quickly after boot, which is particularly valuable in a load-balanced proxy group setup.
In my case, I opted to use my own Docker image, based on Alpine Linux and using the apk package manager to install squid. This produced an image around 15 MB in size and, along with Terraform, contributed to making this repeatable for others.
Just pulling a Docker image and running it is a lot easier and more repeatable across a variety of environments than the alternatives.
It also allowed me to try out some interesting solutions that NearForm are working on for vulnerability scanning of docker images, and it was good to see that the low attack surface of Alpine fed through to a clean bill of health on my image (at least for now).
The server still needed some provisioning to get to the point where it could run the squid container, and I used an AWS user data script for this.
The script runs on instance boot: it installs Docker, creates the squid configuration file, sets up an x509 certificate that squid needs for SSL inspection, runs the squid container and creates the aforementioned iptables rules.
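As a rough illustration, a user data script covering those steps could be held in a Terraform local like this. It is a sketch under several assumptions: an Ubuntu AMI, a hypothetical squid_image variable standing in for the Docker image, and file paths of my choosing. It is not the script from the repository.

```hcl
variable "squid_image" {
  type        = string
  description = "Docker image containing squid (hypothetical; substitute your own)"
}

locals {
  # Shell script run by cloud-init on first boot. Assumes an Ubuntu AMI.
  proxy_user_data = <<-EOT
    #!/bin/bash
    set -euo pipefail

    # Install Docker from the distribution repositories
    apt-get update
    apt-get install -y docker.io

    # Self-signed certificate and key, combined into one PEM file for squid
    mkdir -p /etc/squid/ssl
    openssl req -new -newkey rsa:2048 -days 365 -nodes -x509 \
      -subj "/CN=squid" \
      -keyout /etc/squid/ssl/squid.key -out /etc/squid/ssl/squid.crt
    cat /etc/squid/ssl/squid.key /etc/squid/ssl/squid.crt > /etc/squid/ssl/squid.pem

    # (squid.conf would be written to /etc/squid/squid.conf at this point)

    # Run squid on the host network so the redirected ports reach it
    docker run -d --restart always --net host \
      -v /etc/squid:/etc/squid \
      ${var.squid_image}

    # Redirect routed web traffic into squid's intercept ports
    iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 3129
    iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-ports 3130
  EOT
}

# The proxy's aws_instance would then set: user_data = local.proxy_user_data
```

Running the container with host networking keeps the iptables rules simple, because the REDIRECT target delivers packets to ports on the instance itself.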
Proxy Configuration and SSL Inspection
Most of the squid configuration is straightforward: it sets up ports and domain whitelists. The interesting part is the ssl_bump directives.
Intercepting HTTPS traffic is essentially a form of man-in-the-middle attack. To avoid certificate warnings and client rejections, a proxy that wants to decrypt HTTPS traffic usually has clients install a root certificate (owned by the proxy) in advance, and then issues new certificates signed by this root for HTTPS domains on the fly.
This is a complex topic and is a bit of a security ‘can of worms’ which you can read more about here.
For this article, I’m doing something less controversial and just doing domain name filtering during the TLS handshake rather than decrypting traffic.
This is where the ssl_bump directives come in:
- Step 1 is the initial TCP connection.
- Step 2 is the TLS ClientHello message, where the client may specify the domain name they want to contact as part of the Server Name Indication extension.
- Step 3 is the TLS ServerHello message where the server provides a certificate which will include the domain name of the server.
By asking Squid to peek, peek and splice respectively, I’m instructing it to open an onward TCP connection, forward the ClientHello and make a note of the SNI domain (if any), extract the domain from the ServerHello and then make a decision.
Either it will splice the rest of the connection (pass through without decryption), or it will terminate the connection based on domain name.
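To make the mapping between the three steps and the directives concrete, here is a sketch of a peek-and-splice squid configuration, held in a Terraform local so a user data script could write it out. The .example.com allow-list, the certificate path and the extra port 3128 are assumptions for illustration. The exact option names can vary between squid versions, so check them against the documentation for the version you run.

```hcl
locals {
  # Hypothetical allow-list: ".example.com" matches the domain and its subdomains.
  squid_conf = <<-EOT
    # A regular forward-proxy port, which squid expects alongside intercept ports
    http_port 3128

    # Intercepted plain HTTP, filtered on the destination domain
    http_port 3129 intercept
    acl allowed_http_sites dstdomain .example.com
    http_access allow allowed_http_sites

    # Intercepted HTTPS; a certificate is required on the port even though
    # spliced connections are never decrypted
    https_port 3130 cert=/etc/squid/ssl/squid.pem ssl-bump intercept
    acl SSL_port port 443
    http_access allow SSL_port
    acl allowed_https_sites ssl::server_name .example.com

    acl step1 at_step SslBump1
    acl step2 at_step SslBump2
    acl step3 at_step SslBump3

    # Step 1: TCP connection made, keep going
    ssl_bump peek step1 all
    # Step 2: ClientHello seen, continue only if the SNI is allowed
    ssl_bump peek step2 allowed_https_sites
    # Step 3: ServerHello seen, pass the connection through untouched
    ssl_bump splice step3 allowed_https_sites
    # Anything not allowed at step 2 is dropped
    ssl_bump terminate step2 all

    http_access deny all
  EOT
}
```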
Squid did a great job of explaining this peek and splice method in their online documentation.
Testing the Proxy
Bring up the Infrastructure
So that was plenty of theory! Let’s see if it actually works.
I bring up the infrastructure using terraform plan and terraform apply, ensuring that I have AWS credentials set up at ~/.aws/credentials with the named profile that the Terraform configuration expects.
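For reference, a named profile is usually selected in the AWS provider block. The sketch below uses hypothetical variables for the profile and region rather than the values from the repository.

```hcl
# Hypothetical inputs; set these to match your own credentials file.
variable "aws_profile" {
  type        = string
  description = "Name of a profile in ~/.aws/credentials"
}

variable "aws_region" {
  type = string
}

provider "aws" {
  region  = var.aws_region
  profile = var.aws_profile
}
```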
Once the infrastructure is created, I should see some IP addresses in the output.
Open SSH Connections to Run the Tests
Then I open an SSH connection through the management host to the example host, ensuring that I avoid the common pitfalls:
- The AWS security group for the management host needs an SSH rule from your IP to allow inbound SSH access.
- The username for Ubuntu instances on AWS is ubuntu.
- The SSH key pair you used to create the proxy should be added to your SSH agent, with agent forwarding enabled for the connection.
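The first pitfall can be handled in Terraform itself. This is a sketch with hypothetical variable names that assumes the management host's security group already exists.

```hcl
# Hypothetical inputs.
variable "management_security_group_id" {
  type = string
}

variable "my_ip_cidr" {
  type        = string
  description = "Your public IP in CIDR form, e.g. 203.0.113.10/32"
}

# Allow inbound SSH to the management host from a single address only.
resource "aws_security_group_rule" "management_ssh_in" {
  type              = "ingress"
  security_group_id = var.management_security_group_id
  protocol          = "tcp"
  from_port         = 22
  to_port           = 22
  cidr_blocks       = [var.my_ip_cidr]
}
```

Restricting the rule to a /32 keeps the management host from being reachable over SSH by the whole internet.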
If you have done all that, you should be able to get through to the example host.
In a separate window, you can SSH to the proxy instance to monitor the logs.
Make Some Allowed Requests
The sha512.badssl.com domain is one of the test endpoints on the BadSSL service which offers up a valid SSL connection – the proxy is allowing through connections on both HTTP and HTTPS for domains in the whitelist.
The proxy should have logged these requests in the other window.
Make Some Disallowed Requests
Requests to a non-whitelisted domain are prevented, as is a connection to a whitelisted domain where the certificate has expired. Both rejections are recorded in the proxy’s access.log and cache.log.
Don’t forget to terraform destroy at the end so you don’t rack up costs in your AWS account for EC2 servers you don’t need!
Also check out our post about writing reusable terraform modules for an AWS based infrastructure.