Wednesday, February 23, 2011

Apache2 Reverse Proxy for Nodejs

I noticed a good article on using Nodejs alongside Apache2.

In a typical web application, Apache2 is already being used to serve up PHP and static files. Nodejs is probably being integrated into this existing web application, providing some very awesome functionality, probably through an HTTP interface. Apache2 is already running on port 80; how is the user's browser going to direct requests specifically to the Nodejs process running on the server?

Nodejs is going to have to listen for HTTP requests on a port other than Apache2's default port 80, such as port 8124, as given in the Hello World example. How is the browser going to direct HTTP requests to port 8124 on the server, when the page itself was retrieved from port 80? The temptation is to explicitly include the port when making HTTP requests, like so http://mydomain:8124 .
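For reference, here is a minimal sketch of such a Node server, modelled on the Hello World example mentioned above. The file name hello.js and the --listen flag are assumptions made for this sketch; they are not part of the original example.

```javascript
// hello.js: a minimal sketch of a Node HTTP server on port 8124.
const http = require('http');

const PORT = 8124; // the port used throughout this post

// Answer every request with a plain-text greeting.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}

const server = http.createServer(handleRequest);

// Assumption for this sketch: bind the port only when started as
// `node hello.js --listen`, so the file can be loaded without taking the port.
if (process.argv.includes('--listen')) {
  server.listen(PORT, () => {
    console.log('Nodejs listening on port ' + PORT);
  });
}
```

Started with `node hello.js --listen`, the process answers requests on port 8124 while Apache2 keeps port 80.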

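The browser would be downright miserable with this, since it breaks the same origin policy, which states that a page of a given origin, defined as [protocol, domain name, port], should interact only with resources of the same origin (with exceptions). Because the port differs, http://mydomain:8124 counts as a different origin from http://mydomain, even though the domain name is the same.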
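The origin comparison can be sketched in a few lines of JavaScript. The function name sameOrigin and the example paths are hypothetical; the sketch relies on the standard URL class, which Node provides as a global, and only illustrates the comparison rather than how a browser enforces the policy.

```javascript
// Compare two URLs by origin: protocol, domain name and port.
function sameOrigin(a, b) {
  // URL.origin serialises protocol, host and port (default ports are omitted).
  return new URL(a).origin === new URL(b).origin;
}

// The page came from port 80; a request to port 8124 is a different origin.
console.log(sameOrigin('http://mydomain/index.php', 'http://mydomain:8124/')); // false

// A path on the same protocol, domain name and port is the same origin.
console.log(sameOrigin('http://mydomain/index.php', 'http://mydomain/nodejs/')); // true
```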

Another technique, then, is needed to specify the fact that Nodejs and not Apache2 is the intended destination for the request. By specifying a 'virtual' resource in the request, such as http://mydomain/nodejs/ (there is actually no such subdirectory), Apache2 can internally forward the HTTP request to a given port (in fact, to any resource on any server). To the client browser, the request appears to be sent to and returned from Apache2, on the standard HTTP port 80. Behind the scenes, Apache2 is behaving as a proxy for Nodejs, handing the Node server these requests on the sly, and serving up the responses on behalf of Node. Since the proxy sits on the server side, in front of the application, and not on the client side, as a conventional (forward) proxy does, Apache2 is said to be behaving as a reverse proxy.

We want Apache2 to obey the following rule: When a request is received on port 80 for the nodejs/ subdirectory, ditch the nodejs/ path and forward that request to port 8124 internally, behaving as a proxy.
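To make the rule concrete, here is a rough sketch of the mapping written as a plain JavaScript function. It only illustrates the path translation and is not how Apache2 implements it; the names PREFIX, UPSTREAM and mapToUpstream are hypothetical. The sketch also assumes that everything under nodejs/ is forwarded, which is a broader reading of the rule than an exact match on nodejs/ alone.

```javascript
// Where requests for the virtual subdirectory should end up.
const PREFIX = '/nodejs/';
const UPSTREAM = 'http://mydomain:8124';

// Return the internal URL for a request path, or null when the
// request is not meant for Nodejs and Apache2 should serve it itself.
function mapToUpstream(requestPath) {
  if (!requestPath.startsWith(PREFIX)) {
    return null;
  }
  // Ditch the nodejs/ prefix and keep whatever follows it.
  return UPSTREAM + '/' + requestPath.slice(PREFIX.length);
}

console.log(mapToUpstream('/nodejs/'));       // http://mydomain:8124/
console.log(mapToUpstream('/nodejs/status')); // http://mydomain:8124/status
console.log(mapToUpstream('/index.php'));     // null
```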

Three Apache2 modules are required for this, all of which are included by default on Ubuntu 10.10 but may still need to be enabled: mod_proxy, mod_proxy_http and mod_rewrite. To enable these modules, the following commands can be used:
$ sudo a2enmod proxy
$ sudo a2enmod proxy_http
$ sudo a2enmod rewrite

A notification will be issued if a module was already enabled. mod_rewrite relies on the other two modules to achieve its reverse proxy function. If either of them is missing, Apache2 will answer with an error (for example a 5xx Server Error) instead of proxying the request.

The rewrite rule is then inserted. This should be specified in the Apache2 virtual host configuration for the site in question, in sites-available, within the Directory directive corresponding to the site's DocumentRoot.

(Equivalently, an .htaccess file can be specified, provided that the appropriate AllowOverride permissions are set in the virtual host configuration.)

Open the virtual host configuration for the site in question.
$ sudo vi /etc/apache2/sites-available/mydomain

Include the following within the DocumentRoot's Directory directive:
# Nodejs reverse proxy
RewriteEngine On
RewriteRule ^nodejs/$ http://mydomain:8124/ [P]

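The [P] flag specifies that Apache2 should behave as a reverse proxy. The http://mydomain/nodejs/ resource will be interpreted as an HTTP request intended for Nodejs, listening on port 8124, and Apache2 will forward the HTTP request there behind the scenes. Note that, as written, the pattern ^nodejs/$ matches only the exact path nodejs/; to forward paths beneath it as well, a capturing pattern such as ^nodejs/(.*)$ with the target http://mydomain:8124/$1 could be used instead.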

The web server is then restarted as follows:
$ sudo /etc/init.d/apache2 restart


The resource http://mydomain/nodejs/ now effectively refers to the Nodejs server process listening on port 8124, satisfying the same origin policy.

How is Nodejs configured in your project?


Matt said...

What is better to do, in this case, is use mod_proxy instead of mod_rewrite as recommended by the Apache docs. Here is the Apache configuration you need:

ProxyPass /nodejs/ http://mydomain:8124/
ProxyPassReverse /nodejs/ http://mydomain:8124/
<Proxy *>
Order deny,allow
Allow from all
</Proxy>

Andy said...

To anyone reading Matt's comment, you REALLY, REALLY shouldn't do "Allow from all" and open your server up to the world and let anyone use it (and hide behind it)...

Holy Joe said...

Although Andy is right to be cautious, in this case it is not necessary because you are setting up a reverse proxy (configured with ProxyPass/ProxyPassReverse), not the dangerous forward proxy (configured with ProxyRequests). Reverse proxies can't send traffic anywhere but to the place you specify (in this case, your Nodejs server), as pointed out in the Apache docs (see mod_proxy). In fact, in this case, I don't think you even need the Proxy stanza; just the ProxyPass and ProxyPassReverse lines should do it.

USTASI VAR said...

Is this for server or client side?

Thomas Lyon said...

Nice post. Thanks for sharing this post.