Wednesday, June 12, 2013

Nginx as a reverse proxy for Apache

First of all, let’s look at how the Apache web server handles a request and why putting a reverse proxy in front of it is a good idea:

1. The client initiates a request to the Apache server.

2. The client’s browser connects to the Apache server.

3. The Apache server creates a new thread/process to handle the request.

4. If the client requested dynamic content, the web server spawns a CGI process or runs a dynamic content module (e.g. mod_php) and waits until the request has been processed. When it receives the resulting web page, it sends it back to the client.

5. If the client asked for a static file, the web server sends this file to the client.

6. The client’s browser receives the response, closes the connection to the web server and displays the content.


As we can see, when many requests reach the web server, it has to create many parallel threads/processes and keep them running until each client closes its connection. If a client has a slow connection, the web server process waits a long time and resource consumption grows quickly. At the same time, if the website has lots of static content, Apache will create many parallel processes that quickly eat up memory just to serve static files. If money is not a problem, we can keep buying more RAM and more powerful processors.


However, there is a more efficient solution. We can put a small piece of software (nginx, for example) in front of the Apache web server to handle all the requests for static content and forward only the requests for dynamic content to Apache. This way Apache opens new threads only to process dynamic requests and can release them quickly, since it no longer has to wait for the client to close the connection (that becomes nginx’s job).


But (there’s always a but) this approach brings the following problems:

1. Each time a vhost is added to Apache, we also have to add it to nginx.

2. Because nginx is a reverse proxy layer in front of Apache, Apache will think that all connections originate from the server running nginx. Every entry in the Apache access logs will appear to come from the IP of the nginx server, and securing sessions by checking that a user’s IP address hasn’t changed becomes more difficult.


I believe the first problem isn’t much of a hassle. I’ll show you how to add vhosts to nginx and you’ll see it’s easy.

One could even write a small bash script that reads Apache’s vhosts and generates the nginx ones, but a complete script is not covered in this post.
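Just to illustrate the idea, here is a rough sketch of what such a script could start from. It is only a starting point under several assumptions: the function names list_apache_vhosts and make_nginx_vhost are made up for this example, the Apache parsing only understands one ServerName and one DocumentRoot per vhost, and the list of static extensions is shortened.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Apache or nginx.

# Print "ServerName DocumentRoot" for each <VirtualHost> block in an
# Apache config file given as $1 (ServerAlias is ignored in this sketch).
list_apache_vhosts() {
    awk '
        tolower($1) == "servername"   { name = $2 }
        tolower($1) == "documentroot" { root = $2; gsub(/"/, "", root) }
        tolower($1) == "</virtualhost>" {
            if (name != "") print name, root
            name = ""; root = ""
        }
    ' "$1"
}

# Print an nginx vhost on stdout.
# Arguments: listen address, document root, then one or more server names.
make_nginx_vhost() {
    listen_addr=$1
    docroot=$2
    shift 2
    cat <<EOF
server {
    listen $listen_addr;
    server_name $*;

    location / {
        proxy_pass http://127.0.0.1:80;
        include /etc/nginx/proxy.conf;
    }

    location ~* ^.+\.(jpe?g|gif|png|ico|css|js)\$ {
        expires 1d;
        root $docroot;
    }
}
EOF
}
```

For example, `make_nginx_vhost 86.55.239.2:80 /var/www/htdocs/razvan.ws www.razvan.ws razvan.ws` prints a vhost that could be redirected into a file under /etc/nginx/vhosts/. The listen address still has to be supplied by hand, because Apache only listens on localhost in this setup.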


As for the second problem, we’ll use the Apache mod_rpaf module to set REMOTE_ADDR from a special HTTP header inserted by nginx.


A typical request would work as follows:

- 1.2.3.4 sends an HTTP request to the nginx server;

- nginx determines that it needs to proxy the request to a back-end Apache server (e.g. by looking at the requested file type or virtual host);

- nginx adds an HTTP header “X-Forwarded-For” with the client’s real IP;

- nginx forwards (proxy_pass) the request to the back-end Apache server;

- mod_rpaf in Apache detects that the request is coming from the nginx IP, then substitutes REMOTE_ADDR with the contents of X-Forwarded-For;

- Apache handles the request as normal. Applications do not need to be aware of the reverse proxy.
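The substitution step can be pictured with a small shell function. This is only a model of the behaviour described above, not mod_rpaf’s code: the function name effective_remote_addr is made up, and it assumes that when the header holds several addresses the last one (the one appended by the proxy) is used.

```shell
#!/bin/sh
# Model of the REMOTE_ADDR substitution (hypothetical helper).
# $1 = address of the connecting peer
# $2 = value of the X-Forwarded-For header (may be empty)
# $3 = space-separated list of trusted proxy IPs (cf. RPAFproxy_ips)
effective_remote_addr() {
    peer=$1
    xff=$2
    trusted=$3
    for proxy in $trusted; do
        if [ "$peer" = "$proxy" ] && [ -n "$xff" ]; then
            # Assumption: take the last address in the comma-separated list.
            printf '%s\n' "${xff##*,}" | tr -d ' '
            return 0
        fi
    done
    # Not a trusted proxy (or no header): keep the real peer address.
    printf '%s\n' "$peer"
}
```

Calling `effective_remote_addr 127.0.0.1 "1.2.3.4" "127.0.0.1"` yields the client address, while a request arriving from any address that is not in the trusted list keeps its own address, so a client cannot spoof its IP just by sending the header directly.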


OK, enough talk. Let’s see how to install nginx as a reverse proxy for a back-end Apache server. Of course, we are going to use a Slackware server.


First of all, we have to build mod_rpaf and configure Apache to use it. Go to the mod_rpaf website and get the latest version available. In this tutorial I’m going to use mod_rpaf 0.6.


cd /usr/local/src

wget http://stderr.net/apache/rpaf/download/mod_rpaf-0.6.tar.gz

tar -xzvf mod_rpaf-0.6.tar.gz

cd mod_rpaf-0.6

apxs -i -c -n mod_rpaf-2.0.so mod_rpaf-2.0.c

This installs mod_rpaf-2.0.so in /usr/lib/httpd/modules/.


Now edit /etc/httpd/httpd.conf and add the following:


LoadModule rpaf_module lib/httpd/modules/mod_rpaf-2.0.so


RPAFenable On

RPAFsethostname On

RPAFproxy_ips 127.0.0.1

RPAFheader X-Forwarded-For

We also have to configure Apache to listen only on localhost:


Listen 127.0.0.1:80

Edit /etc/httpd/extra/httpd-vhosts.conf and make sure you use the following directive:


NameVirtualHost *:80

Make sure each of your virtual hosts is defined as:


<VirtualHost *:80>

Now we have to build and configure nginx. If you want, you can use the following nginx Slackware package that I made:


nginx-0763-i486-1.txz

However, if you’re like me and prefer to build things manually:


cd /usr/local/src

wget http://sysoev.ru/nginx/nginx-0.7.63.tar.gz


mkdir /etc/nginx

mkdir /var/log/nginx

mkdir /var/run/nginx


tar -xzvf nginx-0.7.63.tar.gz

cd nginx-0.7.63

./configure --prefix=/usr \
  --conf-path=/etc/nginx/nginx.conf \
  --error-log-path=/var/log/nginx/error.log \
  --pid-path=/var/run/nginx/nginx.pid


make

make install


groupadd nginx

useradd -g nginx -d /var/www/htdocs -s /bin/false nginx

The next step is to edit the /etc/nginx/nginx.conf file. Here is the file I’m using:


user nginx;

worker_processes 2;


error_log /var/log/nginx/error.log;

#error_log logs/error.log notice;

#error_log logs/error.log info;


pid /var/run/nginx.pid;


events {

worker_connections 1024;

}


http {

include mime.types;

default_type application/octet-stream;


log_format main '$remote_addr - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" '
'"$http_user_agent" "$http_x_forwarded_for"';


#access_log logs/access.log main;


sendfile on;

#tcp_nopush on;


#keepalive_timeout 0;

keepalive_timeout 65;


gzip on;


server {

listen 86.55.9.90:80 default;

listen 86.55.239.6:80 default;

server_name _;

access_log /var/log/nginx/default.access.log main;


location / {

proxy_pass http://127.0.0.1:80;

include /etc/nginx/proxy.conf;

}

}

include /etc/nginx/vhosts/*[^~];

}

Create the /etc/nginx/proxy.conf template file, which is going to be included in every vhost:


proxy_redirect off;

proxy_set_header Host $host;

proxy_set_header X-Real-IP $remote_addr;

proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

client_max_body_size 10m;

client_body_buffer_size 128k;

proxy_connect_timeout 90;

proxy_send_timeout 90;

proxy_read_timeout 90;

proxy_buffers 32 4k;
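The $proxy_add_x_forwarded_for variable used above appends the client address to any X-Forwarded-For header the client already sent, or equals the client address if there was none. The following shell function is only an illustration of that rule; the function name and its arguments are made up for this example.

```shell
#!/bin/sh
# Illustration of how nginx builds $proxy_add_x_forwarded_for.
# $1 = X-Forwarded-For header received from the client (may be empty)
# $2 = address of the connecting client ($remote_addr)
proxy_add_x_forwarded_for() {
    incoming=$1
    remote_addr=$2
    if [ -n "$incoming" ]; then
        # Existing header: append the client address after a comma.
        printf '%s, %s\n' "$incoming" "$remote_addr"
    else
        # No header from the client: the value is just the client address.
        printf '%s\n' "$remote_addr"
    fi
}
```

So a plain request from 1.2.3.4 reaches Apache with "X-Forwarded-For: 1.2.3.4", while a request that already carried "X-Forwarded-For: 10.0.0.1" arrives with "10.0.0.1, 1.2.3.4".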

Now you have to set up your virtual hosts. Here is what the /etc/nginx/vhosts/razvan.ws vhost file looks like:


server {

listen 86.55.239.2:80;

server_name www.razvan.ws razvan.ws;

# I'm not going to use nginx to create an access log as this is created by Apache anyway

#access_log /var/log/nginx/razvan.ws.access.log main;


location / {

proxy_pass http://127.0.0.1:80;

include /etc/nginx/proxy.conf;

}


location ~* ^.+\.(jpe?g|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js|swf|avi|mp3)$ {

expires 1d;

root /var/www/htdocs/razvan.ws;

}

}
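To get a feel for which requests nginx answers itself and which go to Apache, the extension pattern from the vhost above can be tried out in the shell. This is only an approximation, using grep -E as a stand-in for nginx’s regex engine; the function names is_static and route are made up, and the argument is the request path without the query string.

```shell
#!/bin/sh
# Approximate the "location ~* ..." match from the vhost (hypothetical helpers).

# Succeeds if the request path ends in one of the static extensions
# (case-insensitive, like the ~* modifier).
is_static() {
    printf '%s\n' "$1" | grep -Eiq '^.+\.(jpe?g|gif|png|ico|css|zip|tgz|gz|rar|bz2|doc|xls|exe|pdf|ppt|txt|tar|mid|midi|wav|bmp|rtf|js|swf|avi|mp3)$'
}

# Print which server would answer the given request path.
route() {
    if is_static "$1"; then
        echo "nginx (static file)"
    else
        echo "apache (proxy_pass)"
    fi
}
```

For instance, `route /images/logo.PNG` reports nginx and `route /index.php` reports Apache. In nginx a matching regular expression location takes precedence over the plain "location /" prefix, which is why the order of the two blocks in the vhost does not matter.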

Let’s create a nice Slackware-style /etc/rc.d/rc.nginx script to start/stop/restart nginx:


#!/bin/sh

#

# /etc/rc.d/rc.nginx

#

# Start/stop/restart the nginx web server.


nginx_start() {

/usr/sbin/nginx -c /etc/nginx/nginx.conf

}


nginx_stop() {

killall nginx

rm -f /var/run/nginx.pid

}


nginx_restart() {

nginx_stop

nginx_start

}


case "$1" in

'start')

nginx_start

;;

'stop')

nginx_stop

;;

'restart')

nginx_restart

;;

*)

echo "Usage: $0 {start|stop|restart}"

;;

esac

Now the only thing left is to restart Apache and start nginx:


apachectl restart

/etc/rc.d/rc.nginx start

If everything went well, we’ll see the following processes:


4263 ? Ss 0:00 nginx: master process /usr/bin/nginx -c /etc/nginx/nginx.conf

4264 ? S 0:00 \_ nginx: worker process

4265 ? S 0:00 \_ nginx: worker process

4836 ? Ss 0:00 /usr/bin/httpd -k start

4837 ? S 0:00 \_ /usr/bin/fcgi- -k start

4896 ? Ss 0:00 | \_ /usr/bin/php-cgi

4897 ? S 0:00 | \_ /usr/bin/php-cgi

4898 ? S 0:00 | \_ /usr/bin/php-cgi

4838 ? Sl 0:00 \_ /usr/bin/httpd -k start

4866 ? Sl 0:00 \_ /usr/bin/httpd -k start

Check your Apache access logs to see if everything is OK. Also, if you want to make sure nginx is serving the static files, enable the nginx access log as well and check that it records the hits.
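One quick way to check the Apache log is to count the hits per client address: if mod_rpaf is working, real client IPs should show up instead of only 127.0.0.1. Below is a small sketch; the function name hits_per_client is made up, and it assumes the client address is the first field of each log line, as in the common and combined log formats.

```shell
#!/bin/sh
# Count requests per client address in an access log (hypothetical helper).
# $1 = path to the log file; output is "count address", busiest first.
hits_per_client() {
    awk '{ count[$1]++ } END { for (ip in count) print count[ip], ip }' "$1" | sort -rn
}
```

Run it as `hits_per_client /path/to/apache/access_log`, replacing the path with wherever your Apache writes its access log. If almost every hit is attributed to 127.0.0.1, the RPAF directives are probably not being applied.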


Now the server can handle more traffic with fewer resources. Everything happens transparently for existing scripts and for users.


