I have chosen to use AWS for this article and the associated Terraform code, but the concept is very general, and Terraform code could be written for any supported IaaS platform (e.g. Google Cloud or Azure).
A Quick Overview
Let me start by defining some terms and explaining what a proxy is good for.
A transparent proxy is one where there is no configuration required on the applications using the proxy. The hosts may have some routing configuration to make it work, but applications are unaware.
This is in contrast to an explicit proxy, where applications are made aware and direct their traffic to the proxy’s IP address, e.g. via a browser PAC (proxy auto-configuration) file.
In large deployments, systems like Windows Group Policy and WPAD (PAC file discovery via DHCP or DNS) are used to configure a large number of hosts automatically.
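To make the distinction concrete, here is a minimal Python sketch of what "explicit" means from the application's side. The proxy address is a hypothetical placeholder, not a value from this article's setup. With a transparent proxy, none of this code would be needed, because the application would simply open its connection as usual.

```python
import urllib.request

# Hypothetical proxy address, used only for illustration.
PROXY_URL = "http://10.0.1.10:3128"


def build_explicit_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


opener = build_explicit_proxy_opener(PROXY_URL)
# opener.open("http://aws.amazon.com") would now be directed to the proxy.
```

A PAC file plays the same role for browsers: it tells the application which proxy to use for a given URL.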
Using a proxy can provide a degree of control over outbound web traffic.
For example, the proxy can monitor and keep audit logs of that traffic, or intervene and block traffic that is likely to pose a threat to the network.
There is also a very significant non-security related benefit in the form of caching, where repeated downloads of the same content can be served from the proxy instead.
However, intentionally making a proxy the sole means of internet access introduces a bottleneck and a single point of failure, so typically some kind of high-availability setup would be used, such as a load balancer and proxy group.
Historically, web proxies have been most useful to system administrators in corporate networks, but even a network hosting something like a SaaS product in a microservices architecture can benefit.
One of the first things a piece of malware will do (perhaps after attaining persistence on a host) will be to contact command and control infrastructure over the internet.
A proxy can make you aware of this, block the attempt and give you a chance to identify the affected machine.
I chose Squid for the proxy software. Squid is a caching web proxy and is a very popular and mature project. It has been around for more than 20 years and is easy to install and configure.
The diagram above shows the key components – a private subnet with an example host that will be trying to make requests to the internet.
The proxy receives traffic via a network default route, logging requests and filtering based on domain name.
A management host is used just to provide a means to SSH to the example host, which only has a private IP address.
Management networks are a topic for another article – but this simulates something which would normally happen over a VPN, if at all.
There is a private subnet with a default route to the proxy’s network interface. This means all internet-bound traffic will arrive at the proxy server first.
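The effect of that default route can be sketched with a small longest-prefix-match lookup in Python. The CIDR ranges and target names below are assumptions chosen for illustration and are not taken from the article's Terraform code.

```python
import ipaddress

# Hypothetical route table for the private subnet: VPC-local traffic stays
# local, everything else follows the default route to the proxy's interface.
ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "proxy-eni"),
]


def next_hop(destination: str, routes=ROUTES) -> str:
    """Pick the target of the most specific route that contains the destination."""
    addr = ipaddress.ip_address(destination)
    matching = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if addr in ipaddress.ip_network(cidr)
    ]
    # Longest prefix wins, so 0.0.0.0/0 is only used when nothing else matches.
    _, target = max(matching, key=lambda item: item[0].prefixlen)
    return target
```

Under these assumptions, an address inside the VPC resolves to `local`, while any public address resolves to the proxy's network interface.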
There were many ways I could have provisioned the EC2 instance used for the proxy.
In this excellent article on the AWS security blog, the proxy was built from source on the running instance.
Many package managers like apt or yum have a squid package available, though you often don’t get the latest version or a build with particular compiled-in features.
There are Chef and Puppet modules for squid that follow a more ‘declarative’ paradigm, where I specify the machine configuration I need and let the provisioning layer translate that into ‘imperative’ commands to achieve the goal.
Whichever provisioning method is used, a machine imaging solution like Packer could be used to build an AWS AMI ahead of time. This gets an instance serving traffic quickly after boot, which is particularly useful in a load-balanced proxy group setup.
In my case, I opted to use my own Docker image, based on Alpine Linux and using the apk package manager to install squid. This produced an image around 15 MB in size and, along with Terraform, contributed to making this repeatable for others.
Just pulling a Docker image and running it is a lot easier and more repeatable across a variety of environments than the alternatives.
It also allowed me to try out some interesting solutions that NearForm are working on for vulnerability scanning of docker images, and it was good to see that the low attack surface of Alpine fed through to a clean bill of health on my image (at least for now).
The server still needed some provisioning before it could run the squid container, and I used an AWS user data script for this. The script installs Docker, creates the squid configuration file, sets up an x509 certificate that squid needs for SSL inspection, runs the squid container and creates the aforementioned iptables rules.
Most of the configuration is straightforward and sets up ports and domain whitelists. The interesting part is the ssl_bump directives.
Intercepting HTTPS traffic is essentially a form of man-in-the-middle attack. To avoid certificate warnings and client rejections, a proxy that wants to decrypt HTTPS traffic usually has clients install a root certificate (owned by the proxy) in advance, and then issues new certificates signed by this root for HTTPS domains on the fly.
This is a complex topic and is a bit of a security ‘can of worms’ which you can read more about here.
For this article, I’m doing something less controversial: filtering on domain name during the TLS handshake rather than decrypting traffic.
This is where the ssl_bump directives come in:
Step 1 is the initial TCP connection.
Step 2 is the TLS ClientHello message, where the client may specify the domain name they want to contact as part of the Server Name Indication extension.
Step 3 is the TLS ServerHello message and the certificate the server sends with it, which will include the domain name of the server.
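Step 2 is the first point at which a domain name becomes visible to the proxy. As a rough illustration of what peeking at a ClientHello involves, the Python sketch below pulls the SNI host name out of a raw TLS record. It is not Squid's code, and it assumes the whole ClientHello arrives in a single, unfragmented record.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a raw TLS ClientHello record, or None."""
    # TLS record header: content type (1 byte), version (2), length (2).
    # Content type 0x16 means "handshake".
    if len(record) < 9 or record[0] != 0x16:
        return None
    # Handshake header: message type (1 byte), length (3). Type 1 is ClientHello.
    if record[5] != 0x01:
        return None
    pos = 9
    pos += 2 + 32  # skip client_version and random
    try:
        pos += 1 + record[pos]  # skip session_id (1-byte length prefix)
        (cipher_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cipher_len  # skip cipher_suites (2-byte length prefix)
        pos += 1 + record[pos]  # skip compression_methods (1-byte length prefix)
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:  # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                if name_type != 0:  # 0 is host_name, the only defined type
                    return None
                return record[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        # Truncated or malformed input: treat as "no SNI available".
        return None
    return None
```

A client is free to omit the extension, which is why the function can return `None` and why the certificate in step 3 is still worth looking at.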
By asking Squid to peek, peek and splice respectively, I’m instructing it to open an onward TCP connection, forward the ClientHello and make a note of the SNI domain (if any), extract the domain from the ServerHello and then make a decision.
Either it will splice the rest of the connection (pass through without decryption), or it will terminate the connection based on domain name.
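The final decision can be pictured with a short Python sketch. This is a simplification for illustration, not Squid's actual algorithm: the whitelist entries and function names are hypothetical, and the matching follows the convention of Squid's domain ACLs, where a leading dot matches the domain itself and all of its subdomains.

```python
from typing import Iterable, Optional

# Hypothetical whitelist, written in the leading-dot style of Squid domain ACLs.
WHITELIST = [".amazon.com", ".badssl.com"]


def domain_matches(host: str, pattern: str) -> bool:
    """Match a host against one whitelist entry."""
    host = host.lower().rstrip(".")
    pattern = pattern.lower()
    if pattern.startswith("."):
        # ".example.com" matches example.com and any subdomain of it.
        return host == pattern[1:] or host.endswith(pattern)
    # Without a leading dot, only an exact match counts.
    return host == pattern


def decide(server_name: Optional[str], whitelist: Iterable[str] = WHITELIST) -> str:
    """Return "splice" to pass the connection through, or "terminate" to drop it.

    server_name is the best name learned during the handshake (the SNI value
    or a name from the server certificate), or None if nothing was learned.
    """
    if server_name and any(domain_matches(server_name, p) for p in whitelist):
        return "splice"
    return "terminate"
```

With this hypothetical whitelist, `decide("sha512.badssl.com")` returns `"splice"`, and `decide("baddomain.com")` returns `"terminate"`.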
The document has moved <a href="http://aws.amazon.com">here.
$ curl https://sha512.badssl.com
<meta name="viewport" content="width=device-width, initial-scale=1"><link rel="shortcut icon" href="/icons/favicon-green.ico"/><link rel="apple-touch-icon" href="/icons/icon-green.png"/><link rel="stylesheet" href="/style.css">
The sha512.badssl.com domain is one of the test endpoints on the BadSSL service which offers up a valid SSL connection – the proxy is allowing through connections on both HTTP and HTTPS for domains in the whitelist.
The proxy should have logged these requests in the other window:
Access control configuration prevents your request from being allowed at this time. Please contact your service provider if you feel this is incorrect.
...
$ curl https://baddomain.com
curl: (35) gnutls_handshake() failed: The TLS connection was non-properly terminated.
$ curl https://expired.badssl.com
curl: (60) server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none
Requests to a non-whitelisted domain are prevented, as is a connection to a whitelisted domain where the certificate has expired. The access.log and cache.log record this: