January 14 2012

Proxy securely through ANY corporate proxy/firewall

DISCLAIMER: Okay, probably still almost any firewall. There are a few posts on the internet about how SSH tunnels bypass "almost any firewall", and I believe this proxy will bypass a whole lot more of them, so I had to come up with something better than "almost any".

When is this useful?

ProxyTunnel is awesome, as it allows you to tunnel SSH through, for example, port 443. And because SSH supports port forwarding, you can go from there to wherever you want. If I am correct, it requires that the proxy in question supports the HTTP CONNECT method.

Sometimes, however, proxies are more restricted than that: CONNECT may not be supported; connections may not be allowed to stream (i.e., file downloads are first downloaded in full by the proxy server and scanned for viruses, and executables and other file types may be blocked); base64 may actually be decoded to see if it contains anything that isn't allowed. The proxy may go as far as inspecting the contents of zip files, and it may restrict the maximum file size for downloads (XX MB limit). In that case ProxyTunnel won't suffice.

If you're unfortunate enough to be behind such a firewall, no worries, because now there is a way to tunnel through it! The only requirement is that you can receive plain text from a webpage and post data to it, a webpage that you own or have access to. Well, if you can't do that, I suggest you look for another job, because this is REALLY important!!!!1 (Not really, but then this proxy solution won't work.) Do not expect it to perform very well with bandwidth-heavy stuff, by the way.

How it works in short

It works with three PHP scripts. Just like with ProxyTunnel, you need to run one of them on your local computer: localclient.php. This script binds to a local port, and you connect your program to that port. Each local client is configured to establish a connection with one destination host + port. But the cool part is that it does so by simply reading plain old HTML from a URL and posting some form data back to it. Well, actually it only appears to be plain old HTML: it's an HTML tag, followed by the connection identifier and the DES-encrypted data (converted into base64).

The curl proxy (as I call it, because I use the cURL extension in PHP) retrieves HTML pages like this:

For example, a packet with the data "PONG :leguin.freenode.net" is sent as the following HTML:

<PACKET>a5bc97ba2f6574612MNIoHM6FyG0VuU6BTF/Pv/UcVkSXM5AbiUrF4BDBB4Q=
|______||_______________||__________________________________________|
       |                |                                           `=BASE64 OF ENCRYPTED DATA
       |                `=Session id / socket id
       `=Fake HTML tag

POSTing back sends a string with the same syntax, only prefixed with "POST_DATA=".
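The packet layout above can be sketched in a few lines of PHP. This is only an illustration, not the code from the repository: the function names are hypothetical, and the fixed identifier width of 17 characters is an assumption read off the diagram.

```php
<?php
// Assumption: the connection id is always 17 characters wide, as in the diagram.
const PACKET_TAG = '<PACKET>';
const PACKET_ID_LENGTH = 17;

// Assemble one packet: fake HTML tag + connection id + base64 payload.
function build_packet(string $id, string $payloadB64): string
{
    return PACKET_TAG . $id . $payloadB64;
}

// Split a packet into [id, base64 payload]; returns null when it is malformed.
function parse_packet(string $packet): ?array
{
    if (strncmp($packet, PACKET_TAG, strlen(PACKET_TAG)) !== 0) {
        return null;
    }
    $rest = substr($packet, strlen(PACKET_TAG));
    if (strlen($rest) < PACKET_ID_LENGTH) {
        return null;
    }
    return [substr($rest, 0, PACKET_ID_LENGTH), substr($rest, PACKET_ID_LENGTH)];
}

// Form-encode a packet for POSTing back, prefixed with "POST_DATA=".
function build_post_body(string $packet): string
{
    return 'POST_DATA=' . urlencode($packet);
}
```

Feeding the example packet from the diagram into parse_packet() should give back the id and the base64 string separately, and build_post_body() shows which characters of the packet get percent-escaped on the way out.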

In order for this to work, a second script has to be reachable on the web: you must be able to access it, and the machine hosting it has to be able to make the connections you want. For example: http://your-server/proxy.php (you could rename it to something less suspicious; there are some smart things you can do here, but I'll leave that to your imagination). All proxy.php does is write and read files in a directory, nothing more.

Then a shell script has to be started in the background, with access to the same directory. This script scans that directory for instructions, specifically starting server.php processes for new connections. The actual connection is made in the server.php script. All this script does is read received packets from the same directory and send them to its socket; any data it reads back from that socket is written to the directory, and proxy.php will eventually send it back to the client.
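To make the directory hand-off between proxy.php and server.php a bit more concrete, here is a rough sketch. It is not the actual implementation: the function names and the "<id>.in" file naming are made up for illustration, and it uses flock() on the data file itself where the real scripts use separate lock files.

```php
<?php
// Append one packet line to a connection's inbox file under an exclusive lock.
// The "<id>.in" naming is an assumption of this sketch.
function spool_write(string $dir, string $id, string $packet): void
{
    $fh = fopen("$dir/$id.in", 'ab');
    flock($fh, LOCK_EX);
    fwrite($fh, $packet . "\n");
    fflush($fh);
    flock($fh, LOCK_UN);
    fclose($fh);
}

// Return all pending packets and empty the file within the same locked section.
function spool_drain(string $dir, string $id): array
{
    $path = "$dir/$id.in";
    if (!is_file($path)) {
        return [];
    }
    $fh = fopen($path, 'c+b'); // read/write, no truncation on open
    flock($fh, LOCK_EX);
    $data = stream_get_contents($fh);
    ftruncate($fh, 0);
    flock($fh, LOCK_UN);
    fclose($fh);

    return $data === '' ? [] : explode("\n", rtrim($data, "\n"));
}
```

In this picture one side would call spool_write() for every packet it receives, while the other side polls spool_drain() and forwards whatever it finds. A second file per connection would carry the traffic in the other direction.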

Graphical explanation

You should follow the arrows in the same order as presented in the Legend.

Design decisions

When I had the idea to make it, I didn't feel like spending a lot of time on it, so I hacked it together in a few hours. Then I tested it, it worked, and it got me excited enough to refactor it and make a blog post out of it.

After the encryption of the packets I use base64 encoding, which increases the size of the messages, but it looks more HTML-like. If I wanted to send the encrypted data raw I'd have to do some more exotic stuff, maybe disguise it as a file upload, because AFAIK a plain old form-encoded POST cannot carry raw binary data without escaping it.
I use BASE64 and not urlencode on the encrypted data, because when I tested it urlencode produced even more overhead. Of course the BASE64 string is still "urlencoded" before POST, but only a few chars are affected.
I don't use a socket for communicating between proxy.php and server.php, but files and some lock files, simply because I preferred them. A database would be nicer, but implies more configuration hassle.
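The base64-versus-urlencode overhead mentioned above is easy to check for yourself. The snippet below is a small sketch, not part of the scripts; encoding_sizes() is a hypothetical helper, and the sample (every byte value once) merely stands in for encrypted output.

```php
<?php
// Return how many bytes each encoding of the same binary string takes.
function encoding_sizes(string $binary): array
{
    return [
        'raw'              => strlen($binary),
        'urlencode'        => strlen(urlencode($binary)),
        'base64'           => strlen(base64_encode($binary)),
        'base64+urlencode' => strlen(urlencode(base64_encode($binary))),
    ];
}

// Every possible byte value once, as a stand-in for encrypted data.
$sample = implode('', array_map('chr', range(0, 255)));
print_r(encoding_sizes($sample));
```

On data like this, urlencode has to escape most bytes as three characters, while base64 grows the data by about a third and only its "+", "/" and "=" characters need escaping afterwards.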

Encryption used

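// Requires the mcrypt extension (removed from PHP core as of PHP 7.2).
// $crypt_key is the passphrase, set elsewhere in the script.
define('CRYPT_KEY', pack('H*', substr(md5($crypt_key),0,16)));

// Pad to the DES block size (PKCS#5 style), encrypt, and base64-encode.
function encrypt_fn($str)
{
    $block = mcrypt_get_block_size('des', 'ecb');
    $pad = $block - (strlen($str) % $block);
    $str .= str_repeat(chr($pad), $pad);

    return base64_encode(mcrypt_encrypt(MCRYPT_DES, CRYPT_KEY, $str, MCRYPT_MODE_ECB));
}

// Base64-decode, decrypt, and strip the padding added by encrypt_fn().
function decrypt_fn($str)
{
    $str = mcrypt_decrypt(MCRYPT_DES, CRYPT_KEY, base64_decode($str), MCRYPT_MODE_ECB);

    // The last byte holds the number of padding bytes.
    $pad = ord($str[strlen($str) - 1]);

    return substr($str, 0, strlen($str) - $pad);
}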

If you prefer something else, simply re-implement the functions. You'll have to copy them to all three scripts (sorry, I wanted all three scripts to be fully self-contained).
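As an illustration of such a re-implementation, here is a sketch that swaps DES/ECB for AES-256-CBC from PHP's openssl extension. None of this is in the repository: the "_aes" function names are hypothetical, the key is passed as a parameter instead of a constant, and deriving it with SHA-256 is just one assumption that yields the 32 bytes AES-256 wants.

```php
<?php
// Assumption: hash the shared passphrase with SHA-256 to get a 32-byte key.
function aes_key(string $passphrase): string
{
    return hash('sha256', $passphrase, true);
}

// Encrypt with a fresh random IV and return base64(IV . ciphertext).
// OpenSSL applies PKCS#7 padding itself, so no manual padding is needed.
function encrypt_fn_aes(string $str, string $key): string
{
    $iv = random_bytes(openssl_cipher_iv_length('aes-256-cbc'));
    $cipher = openssl_encrypt($str, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);

    return base64_encode($iv . $cipher);
}

// Reverse of encrypt_fn_aes(); returns an empty string if decryption fails.
function decrypt_fn_aes(string $str, string $key): string
{
    $raw = base64_decode($str);
    $ivLen = openssl_cipher_iv_length('aes-256-cbc');
    $plain = openssl_decrypt(
        substr($raw, $ivLen),
        'aes-256-cbc',
        $key,
        OPENSSL_RAW_DATA,
        substr($raw, 0, $ivLen)
    );

    return $plain === false ? '' : $plain;
}
```

Because of the random IV, the same plaintext encrypts to a different string every time, and each packet grows by the 16 bytes of the IV before base64 encoding.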

I found my "ASCII key → md5 → 16 hexadecimal display chars → actual binary" a pretty cool find by the way. Did you notice it?
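Spelled out step by step, that key derivation looks roughly like the sketch below. The function name derive_des_key() and the example passphrase are made up for illustration; the steps themselves mirror the define() line in the encryption code.

```php
<?php
// Turn an ASCII passphrase into the 8-byte binary key that DES expects.
function derive_des_key(string $passphrase): string
{
    $hex = substr(md5($passphrase), 0, 16); // first 16 hex characters = 64 bits
    return pack('H*', $hex);                // hex digits -> 8 raw bytes
}

$key = derive_des_key('my secret passphrase'); // hypothetical passphrase
echo strlen($key), " bytes: ", bin2hex($key), "\n";
```

The same bytes can also be obtained with substr(md5($passphrase, true), 0, 8), since the raw MD5 digest is just the binary form of the hex string.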

Demonstration

Note that I first demo it with the server running on an Amazon AMI image. Appended to the video is a short demo where I run the server on my local Windows PC (just to show how it'd work on Windows). This second part starts when I open my browser with the Google page.

Remote desktop actually works pretty well through the curl proxy, by the way. Establishing the connection is a little slow, like with WinSCP, but once connected it performs pretty well. I couldn't demo it because I don't have a machine to connect to from home.

Sourcecode & downloads

I put it on Bitbucket: https://bitbucket.org/rayburgemeestre/curlproxy and placed it under the MPL 2.0 license, which seemed appropriate. Basically this means that when you distribute it with your own software in some way, you'll have to make your changes to the curlproxy files themselves (improvements, bugfixes) available under the same license. This way the original repository can also benefit, and you're otherwise pretty much unrestricted.