Posts Tagged ‘nginx’

Capistrano + Nginx + Thin deployment on Linode

November 10, 2009

This is a long-lost post I wrote about 8 months ago (converted from wiki to HTML, so pardon any typos).

Terminology

Capistrano is a Ruby gem that helps with remote deployment. Contrary to popular belief, Capistrano can be used for any deployment, not just a Rails app!

Nginx is a lightweight HTTP server and reverse proxy: it receives HTTP requests and passes them on to other applications. It is far lighter than a heavyweight server like Apache! Moreover, nginx is very easy to configure and can host multiple domain names with little effort. It has a built-in load balancer that distributes requests across application instances using its internal load-balancing mechanism.

Thin is the next-generation lean, mean Rails server. It is much faster and lighter on memory than Mongrel. It has an internal event-based mechanism for request processing and much better concurrency than other Rails servers.

Linode is a VPS (a Virtual Private Server) hosted by www.linode.com. As the name suggests ;), it's a “Linux Node”. We are using Ubuntu 8.10 (Tip: to find the Ubuntu release, run: lsb_release -a). NOTE: the linode we had was a raw machine with no packages installed. Please read Linode RoR package installation for details.

Steps

Capistrano Configuration

Follow the basic instructions provided by Capistrano: Capistrano – From The Beginning. Some modifications you may need (as I did for this deployment):

  • Edit the Capfile and add the line below. This ensures that the remote Capistrano deployment does not fork a remote shell using “sh -c”; some hosting servers do not allow remote shells.
  • default_run_options[:shell] = false
  • In addition to the changes mentioned in the Capistrano tutorial, add the following to config/deploy.rb. This ensures that “sudo” is not used (the Capistrano default) and that the user is “root”. Not usually a good practice.. but what the hell!
  • set :use_sudo, false
    set :user, "root"
  • Since Capistrano uses the default script/spin and script/process/reaper, we need to override deploy:start, deploy:stop and deploy:restart to ensure that we can start/stop the thin service and the ferret_server. I know that deploy:restart involves a copy-paste, but I am trying to find out how to invoke a rake task from another rake task (see the sketch after the code block below).
namespace :deploy do
    desc "Custom AceMoney deployment: stop."
    task :stop, :roles => :app do

        invoke_command "cd #{current_path};./script/ferret_server -e production stop"
        invoke_command "service thin stop"
    end

    desc "Custom AceMoney deployment: start."
    task :start, :roles => :app do

        invoke_command "cd #{current_path};./script/ferret_server -e production start"
        invoke_command "service thin start"
    end

    # Need to define this restart ALSO as 'cap deploy' uses it
    # (Gautam) I don't know how to call tasks within tasks – see the sketch below.
    desc "Custom AceMoney deployment: restart."
    task :restart, :roles => :app do

        invoke_command "cd #{current_path};./script/ferret_server -e production stop"
        invoke_command "service thin stop"
        invoke_command "cd #{current_path};./script/ferret_server -e production start"
        invoke_command "service thin start"
    end
end
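
For what it's worth, Capistrano 2 generally lets a task invoke sibling tasks in the same namespace as plain method calls, which would remove the copy-paste in restart. A minimal sketch, assuming the stop and start tasks above are already defined (verify against your Capistrano version):

namespace :deploy do
    desc "Custom AceMoney deployment: restart."
    task :restart, :roles => :app do
        stop    # invokes the deploy:stop task defined above
        start   # invokes the deploy:start task defined above
    end
end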

Thin Configuration

I looked up most of the default configuration of Thin and Nginx on Ubuntu at Nginx + Thin. Some extra configuration and differences are mentioned below.

  • The init scripts for starting thin and nginx at boot are configured during package installation. Leave them as they are.
  • The following command generates /etc/thin/acemoney.yml for 3 servers starting from port 3000. Note that the -c option specifies the BASEDIR of the rails app. Avoid changing settings in this file as far as possible (a sketch of the generated file follows this list).
  • thin config -C /etc/thin/acemoney.yml -c /home/josh/current --servers 3 -e production -p 3000
  • Starting and stopping thin is as simple as
  • service thin start
    service thin stop
  • This reads the acemoney.yml file and spawns the 3 thin processes. I noticed that each thin server took about 31MB of memory to start with and with caching went up to ~70MB. By contrast, a mongrel server (tested earlier) started with 31MB but exceeded 110MB later!
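
For reference, the generated /etc/thin/acemoney.yml should look roughly like this; the exact keys and defaults (timeouts, connection limits, etc.) depend on the Thin version, so treat it as a sketch:

---
chdir: /home/josh/current
environment: production
address: 0.0.0.0
port: 3000
pid: tmp/pids/thin.pid
log: log/thin.log
servers: 3
daemonize: true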

Nginx Configuration

Installation of nginx is simple on Ubuntu ;)

apt-get install nginx

Configure the base /etc/nginx/nginx.conf. The default configuration is fine, but I added/edited a few settings as recommended at Nginx Configuration:

        worker_processes  4;

        gzip_comp_level 2;
        gzip_proxied any;
        gzip_types  text/plain text/html text/css application/x-javascript
                    text/xml application/xml application/xml+rss text/javascript;

According to the configuration above, nginx will spawn 4 worker processes and each worker can handle 1024 connections (the default setting). So nginx can now handle ~4096 (4 × 1024) concurrent HTTP connections !!! See the performance article on Thin at Thin Server.
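
The per-worker connection limit lives in the events block of nginx.conf; to make it explicit (or raise it), the setting would read like this sketch:

        events {
            worker_connections  1024;
        }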

Configure the domain name, in our case acemoney.in. Ensure that the acemoney.in “A record” points to this server! Check this with an nslookup or a ping (see below). In /etc/nginx/sites-available, create a file named after the domain to be hosted – I added /etc/nginx/sites-available/acemoney.in. In /etc/nginx/sites-enabled, create a symbolic link to this file.

ln -s /etc/nginx/sites-available/acemoney.in /etc/nginx/sites-enabled/acemoney.in 
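
To confirm the DNS entry mentioned above resolves to your linode's IP, for example:

nslookup acemoney.in
ping -c 1 acemoney.in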

Now add the contents to /etc/nginx/sites-available/acemoney.in. This is the key configuration that hooks nginx up with thin.

upstream thin {
     server 127.0.0.1:3000;
     server 127.0.0.1:3001;
     server 127.0.0.1:3002;
}

server {
    listen 80;
    server_name acemoney.in;

    root /home/josh/current/public;

    location / {
      proxy_set_header X-Real-IP $remote_addr;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header Host $http_host;
      proxy_redirect false;

      if (-f $request_filename/index.html) {
        rewrite (.*) $1/index.html break;
      }
      if (-f $request_filename.html) {
        rewrite (.*) $1.html break;
      }
      if (!-f $request_filename) {
        proxy_pass http://thin;
        break;
      }
    }

    error_page 500 502 503 504 /50x.html;
    location = /50x.html {
      root html;
    }
}

To analyze this configuration, here are some details:

The following lines tell nginx to listen on port 80 for HTTP requests to acemoney.in. The ‘root’ is the public directory for our rails app deployed at /home/josh/current!

server {
    listen 80;
    server_name acemoney.in;

    root /home/josh/current/public;

Now, nginx will try to serve every HTTP request itself: static HTML it serves directly from the ‘root’. If it cannot find a matching file, it ‘proxy_pass’es the request to thin. “thin” in the code below refers to the ‘upstream’ block that tells nginx where to forward requests it cannot serve directly.

if (!-f $request_filename) {
        proxy_pass http://thin;
        break;
      }

The upstream block is where load balancing comes into play. The following code tells nginx which processes are running on which ports, and nginx forwards each request to one of these servers based on its internal load-balancing algorithm (round-robin by default). The servers can be on different machines (i.e. different IP addresses) if needed – see the sketch after this block. In AceMoney, we have started 3 thin servers on 3 different ports!

upstream thin {
     server 127.0.0.1:3000;
     server 127.0.0.1:3001;
     server 127.0.0.1:3002;
}
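
To illustrate the “different machines” point, an upstream can mix hosts and even weight them; the 10.0.0.x IPs below are hypothetical:

upstream thin {
     server 127.0.0.1:3000;
     server 10.0.0.2:3000 weight=2;   # this box gets twice the requests
     server 10.0.0.3:3000;
}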

Performance Statistics

Nothing is complete without them. Here is what I found for 3 thin servers and 1 ferret_server.

top - 14:06:10 up 7 days, 22:58,  2 users,  load average: 0.00, 0.00, 0.00
Tasks:  84 total,   1 running,  83 sleeping,   0 stopped,   0 zombie
Cpu0  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    553176k total,   530868k used,    22308k free,    16196k buffers
Swap:   524280k total,     2520k used,   521760k free,    87280k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
12424 mysql     18   0  127m  42m 5520 S    0  7.9   0:23.01 mysqld
18338 root      15   0 77572  70m 4392 S    0 13.1   0:06.79 thin
18348 root      15   0 71176  64m 4388 S    0 11.9   0:06.51 thin
18343 root      15   0 68964  62m 4384 S    0 11.5   0:07.20 thin
18375 root      18   0 70912  54m 2660 S    0 10.0   2:34.24 ruby
 8141 www-data  15   0  5176 1736  820 S    0  0.3   0:00.07 nginx
 8142 www-data  15   0  5176 1724  816 S    0  0.3   0:00.01 nginx
 8144 www-data  15   0  5152 1720  816 S    0  0.3   0:00.06 nginx
 8143 www-data  15   0  5156 1656  784 S    0  0.3   0:00.00 nginx

As can be seen:

  • Each thin server takes around 70MB
  • The MySQL server takes 42MB
  • The ruby process (18375 above) is the ferret_server, which takes 54MB
  • The 4 nginx workers take about 1.7MB each in memory.
 Overall (3 thin servers ≈ 196MB + MySQL 42MB + ferret 54MB): ~300MB

Migrating Acemoney onto a different server with nginx+passenger

October 27, 2009

AceMoney is a hosted application built by us, i.e. Josh Software. Currently it's hosted on a linode configured with nginx+thin. The problem is that the 3 thin servers consume a humongous amount of ‘stagnant’ memory. We have decided to migrate to nginx+passenger so that we can control in greater detail the number of instances, the memory and the performance.

Some things in this post are specific to Josh Software and its client, but overall it should give a good, clean idea of migrating to passenger.

1. Check out the branch from the repository (the 2.3.4 version)

2. Ensure Rails 2.3.4 is installed

3. Edit the nginx configuration, /opt/nginx/conf/servers/acemoney.in:

server {
 listen 80;
 server_name acemoney.in;
 passenger_enabled on;
 root <path-to-deployment>/acemoney/public;
}
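
For this per-site file to take effect, the main /opt/nginx/conf/nginx.conf has to pull it in; my guess at the relevant line, inside the http block (adjust to your actual layout):

http {
    # ... rest of the http block ...
    include servers/*;
}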

4. Restart nginx. In case you get an error – something like:

2009/10/27 16:02:07 [error] 32685#0: *9 directory index of
"/home/gautam/deployment/acemoney/public/" is forbidden, client: 121.247.65.47,
server: acemoney.in, request: "GET / HTTP/1.1", host: "acemoney.in"

Check the syntax in the conf file (I had forgotten a ‘;’) OR check the permissions of the root directory to make sure it has r+x permissions.
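
Two quick checks that would have caught either cause (the public path is the one from the error above; parent directories need the execute bit too):

$ sudo /opt/nginx/sbin/nginx -t
$ chmod o+rx /home/gautam/deployment/acemoney/public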

5. The current gem setup was:

$:/opt/nginx/conf/servers$ sudo gem search

*** LOCAL GEMS ***

abstract (1.0.0)
actionmailer (2.3.4, 2.3.3, 2.2.2, 2.1.2)
actionpack (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activerecord (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activeresource (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activesupport (2.3.4, 2.3.3, 2.2.2, 2.1.2)
capistrano (2.5.5)
cgi_multipart_eof_fix (2.5.0)
chronic (0.2.3)
contacts (1.0.13)
daemons (1.0.10)
engineyard-eycap (0.4.7)
erubis (2.6.4)
eventmachine (0.12.6)
fastthread (1.0.7)
gem_plugin (0.2.3)
heywatch (0.0.1)
highline (1.5.0)
hoe (1.12.2)
json (1.1.6)
memcache-client (1.7.2)
mislav-will_paginate (2.3.10)
mongrel (1.1.5)
mongrel_cluster (1.0.5)
mysql (2.7)
net-scp (1.0.2)
net-sftp (2.0.2)
net-ssh (2.0.11)
net-ssh-gateway (1.0.1)
packet (0.1.15)
passenger (2.2.5)
rack (1.0.0, 0.9.1)
rails (2.3.4, 2.3.3, 2.2.2, 2.1.2)
rake (0.8.4)
RedCloth (4.1.9)
right_aws (1.10.0)
right_http_connection (1.2.4)
rubyforge (1.0.3)
rubyist-aasm (2.0.5)
thin (1.0.0)
tidy (1.1.2)
xml-simple (1.0.12)

So, I had to add the following gems:

$ sudo gem install acts_as_reportable

This added the following dependencies:

Successfully installed fastercsv-1.2.3
Successfully installed archive-tar-minitar-0.5.2
Successfully installed color-1.4.0
Successfully installed transaction-simple-1.4.0
Successfully installed pdf-writer-1.1.8
Successfully installed ruport-1.6.1
Successfully installed acts_as_reportable-1.1.1
Successfully installed json_pure-1.1.9
Successfully installed rubyforge-2.0.3
Successfully installed rake-0.8.7

$ sudo gem install prawn
$ sudo gem install ferret
$ sudo gem install acts_as_ferret

6. Then I got the latest database dump from the AceMoney server and configured the database locally. Edit config/database.yml and set the production entry (note: Rails expects the key “username”, not “user”, and the database name matches the grant below):

production:
  adapter: mysql
  database: acemoney
  username: acemoney
  password: <password>
  host: localhost

Create a new user in MySQL and grant ALL privileges on the acemoney database:

mysql> grant all on acemoney.* to 'acemoney'@'localhost' identified by '<password>';

Then create the database and load the contents from the backup:

$ RAILS_ENV=production rake db:create
$ mysql -uacemoney acemoney -p < <backupfile>

Start the ferret_server

$ ./script/ferret_server -eproduction start

Build the index:

$ RAILS_ENV=production rake ace:rebuildFerretIndex

NOW, we are good to go. Add an entry to the /etc/hosts file on your machine to point acemoney.in to the linode's IP address (see below). Then http://acemoney.in should take you to the hosted application on nginx+passenger.
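
The hosts entry would look like this (123.45.67.89 is a placeholder; use your linode's IP):

# /etc/hosts
123.45.67.89    acemoney.in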

Phusion Passenger on Nginx – Internal Overview

September 2, 2009

So, working with nginx and passenger has become really simple. There is an excellent screencast showing exactly how to get it working (http://www.modrails.com/videos/passenger_nginx.mov). What I was really interested in finding out was what happens under the covers.

It turns out that the core of what passenger does is the same irrespective of whether it's running via Apache or Nginx. This was interesting, as I believe this is one step we take for granted!

When passenger_enabled is ‘on’, the Rails (“railz” in Passenger terminology) or Rack processes are not exec’ed but forked! This is HUGE – unlike mongrel or thin. The processes have different process ids but they are child processes, i.e. they share all the parent's file descriptors. NOW, since we can fork on demand, it makes a huge difference when we are trying to balance the load on servers and reap them under lower load.
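A toy illustration of why forking matters – nothing Passenger-specific, just plain Ruby on a Unix machine: the child starts with the parent's already-loaded interpreter state and open file descriptors instead of booting from scratch.

# parent opens a file and loads its code once
log = File.open("/tmp/demo.log", "a")

pid = fork do
  # child inherits the open descriptor and everything already require'd
  log.puts "child #{Process.pid} writes through the parent's descriptor"
end
Process.wait(pid)
log.close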

There are different ways the server can be spawned:

  • conservative

  • smart

  • smart lv2 (default)

The conservative method of spawning is the same as what happens with mongrel and thin – the processes are independent. Not recommended at all, as this kills performance.

Smart and smart-lv2 are the pre-cached spawners. In the smart mode, the framework is spawned once and all applications on that framework are spawned from it. Slightly heavy, considering that we do not usually deploy like this.

Smart-lv2 is the best of the lot and does cached application loading – this reduces load time even when the application is requested for the first time. This is the one we shall dissect.
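
The spawn method is selectable in the nginx config via the passenger_spawn_method directive; since smart-lv2 is the default, the line below (a sketch for the server block) is only needed to override it:

passenger_spawn_method smart-lv2;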

Questions which arose:

  1. Where is the ruby process?

  2. How much memory does it take?

  3. What is the load time?

If it's a smart spawn, I would assume that since one process is forked for every framework application, there should be only one ruby process for rails and one for rack. Need to really investigate this ;)

If it's the smart-lv2 spawner, I assumed there should be at least one ruby process for every application we spawn – however, this is somehow not the case. At all times I see only one ruby process – how? I created 2 applications and tried to start them from a single nginx server – still only one ruby process, consuming just 6MB of resident memory. What's going on here??

Now I created 2 basic scaffolds, one for each application, and invoked the routines – voila, here is my answer! I see 4 ruby processes of 21MB each for the 2 applications. So this explains it – it's not rocket science.

Initially, since no HTTP requests had been processed – just the basic application and the welcome-to-rails page – only a single application got spawned. Hence it was only 6MB. Once some Rails HTTP requests came in, the Rails apps were spawned. To verify this:

  • I killed the nginx process and the ruby child processes disappeared – expected.

  • I restarted nginx and invoked it from the browser. Only the welcome page – no ruby process spawned.

  • I invoked a rails request on 1 of the apps – 2 ruby processes (21MB) spawned.

  • I invoked a rails request on the second app – 2 more ruby processes surfaced. passenger-status shows 2 domains and 1 PID (probably implying 1 process and 1 child worker thread?)
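
For anyone following along, the inspection tools referenced above ship with Passenger:

$ sudo passenger-status          # pool of application processes per domain
$ sudo passenger-memory-stats    # per-process memory breakdown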

Now the question remains – when are the processes reaped? Are there always 2 instances of the app in memory? That is still huge, as each consumes 21MB! I am still trying to find out if and when the instances get reaped – more investigation in progress.

More later!

