Archive

Archive for October, 2009

Delayed_job for background processing in Rails

October 29, 2009 23 comments

The first thing to do obviously is to install DelayedJob. There are plenty of forked versions available on git-hub. I chose collectiveidea beacuse it was recommend on railscasts. I did refer to this site extensively for setting up delayed_job.

$ sudo gem install collectiveidea-delayed_job

Job half done. I followed instructions on the github page and ensure that I have my environment setup properly. I added the following to the environment.rb:

 config.gem 'collectiveidea-delayed_job', :lib => 'delayed_job',
                     :source => 'http://gems.github.com'

and the following to the Rakefile:

begin
      require 'delayed/tasks'
rescue LoadError
      STDERR.puts "Run `rake gems:install` to install delayed_job"
end

Then issue the command:

./script/generate delayed_job
rake db:migrate

To start the delayed_job server in production mode, issue:

$ RAILS_ENV=production ./script/delayed_job start

To start it in development mode, issue:

$ rake jobs:work

Interesting find – obvious but cost me a lot of time: Nothing stops one from running BOTH the above commands. Infact, in production mode I ran the delayed_job as a daemon AND also using rake. Silly me – I forgot that if I change any code I would need to restart both. I wrongly assumed that jobs:work only ‘showed’ the console — it starts the delayed_job server. So, if you do it this way, you will have twice the number of background jobs floating around 😉

Well, anyway, once I had this properly configured as a daemon, I set about changing code. This was the really aweome part of DelayedJob — no dependencies!! I changed the code from:

user = User.find_by_name('xyz')
user.get_daily_call_statistics

to

user = User.find_by_name('xyz')
user.send_later(:get_daily_call_statistics)

AND IT WORKS !! Awesome.  While digging around a little more, I realized the there is not enough scope for debugging in the default way. I looked up github and google for some help and found this useful tip:

1. Create a config/initializers/delayed_job_config.rb and add:

Delayed::Job.destroy_failed_jobs = false
Delayed::Worker.logger = Rails.logger

This logs all delayed job output to the environment log files and ensures that failed jobs are not destroyed. There are other settings to reduce the failure attempts and the time for the delayed job but I was too excited to try them out immediately.

2. Suppose I have some global variables in the helpers, the are not accessible in the model methods called via delayed_job. Maybe a bug in delayed_job – I do plan to dig deeper into this and figure this one out — either way. I had to break my head trying to figure this one out.

To conclude, what I had earlier was:

Processing DashboardController#explicit_refresh_daily_statistics (for 121.247.65.47 at 2009-10-29 13:16:33) [GET]
Parameters: {"action"=>"explicit_refresh_daily_statistics", "controller"=>"dashboard"}
last seen ...........
Redirected to http://acemoney.in/dashboard
Completed in 74414ms (DB: 16912) | 302 Found [http://acemoney.in/dashboard/explicit_refresh_daily_statistics]

Now after adding the send_later, I have:

Processing DashboardController#explicit_refresh_daily_statistics (for 121.247.65.47 at 2009-10-29 15:16:41) [GET]
Parameters: {"action"=>"explicit_refresh_daily_statistics", "controller"=>"dashboard"}
last seen ...........
Redirected to http://acemoney.in/dashboard
Completed in 420ms (DB: 99) | 302 Found [http://acemoney.in/dashboard/explicit_refresh_daily_statistics]

This means my response time fell from from 74 seconds to 0.5 seconds

Now, I already had backgrounDrb tasks configured earlier and want to migrate them ‘somehow’ to DelayedJob with minimal code. Stay tuned, this post will be updated.

Advertisements

Migrating Acemoney onto a different server with nginx+passenger

October 27, 2009 Leave a comment

Acemoney is a hosted application built by us i.e. Josh Software. Currently, its hosted on a linode with nginx+thin configured. The problem here is that there are 3 thin servers which consume humongous ‘stagnant’ memory. We have decided to migrate to nginx+passenger so that we an control in greater detail the number of instances, the memory and the performance.

Some things in the post are specific to Josh Software and its client. Overall, this should give a good clean idea about migrating to passenger.

1. Checkout from the branch from the respository (the 2.3.4 version)

2. Ensure  Rails 2.3.4 is installed

3. Edit the nginx configuration. /opt/nginx/conf/servers/acemoney.in

server {

 listen 80;
 server_name acemoney.in;
 passenger_enabled on;
 root <path-to-deployment>/acemoney/public;
}

4. Restart Nginx. In case you get an error — something like:

2009/10/27 16:02:07 [error] 32685#0: *9 directory index of
"/home/gautam/deployment/acemoney/public/" is forbidden, client: 121.247.65.47,
server: acemoney.in, request: "GET / HTTP/1.1", host: "acemoney.in"

Check syntax in the conf file (I had forgotten a ‘;’) OR check permissions of the root directory to see if you have given r+x permissions

5. The current gem setup was:

$:/opt/nginx/conf/servers$ sudo gem search

*** LOCAL GEMS ***

abstract (1.0.0)
actionmailer (2.3.4, 2.3.3, 2.2.2, 2.1.2)
actionpack (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activerecord (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activeresource (2.3.4, 2.3.3, 2.2.2, 2.1.2)
activesupport (2.3.4, 2.3.3, 2.2.2, 2.1.2)
capistrano (2.5.5)
cgi_multipart_eof_fix (2.5.0)
chronic (0.2.3)
contacts (1.0.13)
daemons (1.0.10)
engineyard-eycap (0.4.7)
erubis (2.6.4)
eventmachine (0.12.6)
fastthread (1.0.7)
gem_plugin (0.2.3)
heywatch (0.0.1)
highline (1.5.0)
hoe (1.12.2)
json (1.1.6)
memcache-client (1.7.2)
mislav-will_paginate (2.3.10)
mongrel (1.1.5)
mongrel_cluster (1.0.5)
mysql (2.7)
net-scp (1.0.2)
net-sftp (2.0.2)
net-ssh (2.0.11)
net-ssh-gateway (1.0.1)
packet (0.1.15)
passenger (2.2.5)
rack (1.0.0, 0.9.1)
rails (2.3.4, 2.3.3, 2.2.2, 2.1.2)
rake (0.8.4)
RedCloth (4.1.9)
right_aws (1.10.0)
right_http_connection (1.2.4)
rubyforge (1.0.3)
rubyist-aasm (2.0.5)
thin (1.0.0)
tidy (1.1.2)
xml-simple (1.0.12)

So, I had to add the following gems:

$ sudo gem install acts_as_reportable

This added the following dependencies:

Successfully installed fastercsv-1.2.3
Successfully installed archive-tar-minitar-0.5.2
Successfully installed color-1.4.0
Successfully installed transaction-simple-1.4.0
Successfully installed pdf-writer-1.1.8
Successfully installed ruport-1.6.1
Successfully installed acts_as_reportable-1.1.1
Successfully installed json_pure-1.1.9
Successfully installed rubyforge-2.0.3
Successfully installed rake-0.8.7

$ sudo gem install prawn
$ sudo gem install ferret
$ sudo gem install acts_as_ferret

6. Then I got the latest database dump from Acemoney server and configured the database locally. Edit the config/database.yml and the following:

production:
  adapter: mysql
  user: acemoney
  password: <password>
  host: localhost

Created a new user in mysql and granted ALL  permissions to acemoney database

mysql> grant all on acemoney.* to 'acemoney'@'localhost' identified by '<password>'

Then create the database and dump the contents from the backup

$ RAILS_ENV=production rake db:create
$ mysql -uacemoney acemoney -p < <backupfile>

Start the ferret_server

$ ./script/ferret_server -eproduction start

Build the index:

$ RAILS_ENV=production rake ace:rebuildFerretIndex

NOW, we are good to go. Make some local changes in /etc/hosts file on your machine to point acemoney.in to the linode IP address. Then http://acemoney.in should take you to the hosted application on nginx+passenger.

Musings on cache-money – Part I

October 10, 2009 5 comments

So, I always wanted to find a way of memcaching via ActiveRecord, without having to re-invent the wheel 😉 My investigations initially took me via CachedModel, cache_fu and finally I settled on cache-money. Seems to be *almost exactly* what I wanted – any lookup goes via memcache, any update /edit goes via ActiveRecord + memcache and then if needed to the database.

This is the first of my stunts:

1. Create a basic rails project with a simple posts controller. Being as lazy as I am, I used the ./script/generate scaffold help contents to create my Posts controller! 😉

2. Install the cache-money gem

3. Install memcachd server and configure the rails project (all from the github README of cache-money)

— config/memcache.yml —

production:
    ttl: 604800
    namespace: 'josh1'
    sessions: false
    debug: false
    servers: localhost:11211

— environments/production.rb —

# Use a different cache store in production
config.cache_store = :mem_cache_store
memcache_options = {
   :c_threshold => 10000,
   :compression => true,
   :debug => false,
   :namespace => 'josh1',
   :readonly => false,
 :urlencode => false
}

# require the new gem, this will load up latest memcache
# instead of using the built in 1.5.0
require 'memcache'

# make a CACHE global to use in your controllers instead of
# Rails.cache, this will use the new memcache-client 1.7.2
CACHE = MemCache.new memcache_options

# connect to your server that you started earlier
CACHE.servers = '127.0.0.1:11211'

# this is where you deal with passenger's forking
begin
 PhusionPassenger.on_event(:starting_worker_process) do |forked|
 if forked
 # We're in smart spawning mode, so...
 # Close duplicated memcached connections - they will open themselves
 CACHE.reset
 end
end

# In case you're not running under Passenger (i.e. devmode with mongrel)
rescue NameError => error
end

— config/initializers/cache_money.rb

require 'cache_money'

config = YAML.load(IO.read(File.join(RAILS_ROOT, "config",
                                     "memcached.yml")))[RAILS_ENV]
$memcache = MemCache.new(config)
$memcache.servers = config['servers']

$local = Cash::Local.new($memcache)
$lock = Cash::Lock.new($memcache)
$cache = Cash::Transactional.new($local, $lock)

class ActiveRecord::Base
 is_cached :repository => $cache
end

4. Benchmarking – Instead of running benchmarking, I wrote a few lines of code myself to create a 1000 posts with random text ranging from 1 to 100000 letters.

Setup: Mac OS 1.5 (Leopard) with nginx + passenger + memcached on a MacBook Pro laptop. I used Ruby 1.8.6 (default Mac OS 1.5 Version) and Rails 2.3.3

[term1] $ memcached -uroot -vv

$ ./script/console production
Loading production environment (Rails 2.3.3)
>>  alphanumerics = [('0'..'9'),('A'..'Z'),('a'..'z')].map {
?>       |range| range.to_a}.flatten
=> ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A", "B", "C", "D",
"E","F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S",
"T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
"x", "y", "z"]
>>  1000.times do |i|
?>    Post.create(:title => i.to_s, :body => (0...100000).map {
?>      alphanumerics[Kernel.rand(alphanumerics.size)] }.join)
>>  end

Results & Analysis

1. Memcache quickly scaled upto 64MB (default setting) which was cool. I could see the memcache screen scrolling fast to update all objects – in reality it will tries an LRU style purging and gets rid of the oldest objects once its actual memory is ‘full’. Memcache did not crash or stall or thrash — which was great!

2. For each object created in cache, we see the following memcache update:

<22 add lock/Post:1/id/15 0 30 6
>22 STORED
<22 set Post:1/id/15 0 86400 280
>22 STORED
<22 delete lock/Post:1/id/15 0
>22 DELETED

A GET for any posts results in:

<23 get Post:1/id/15
>23 sending key Post:1/id/15
>23 END

An UPDATE / CREATE / DELETE for any posts results in:

<23 add lock/Post:1/id/15 0 30 6
>23 STORED
<23 set Post:1/id/15 0 86400 361
>23 STORED
<23 delete lock/Post:1/id/15 0
>23 DELETED

Excellent!!

2. To confirm if we were getting it right, I checked the logs:

First time for GET:

Processing PostsController#show (for 127.0.0.1 at 2009-10-10 16:06:05) [GET]
 Parameters: {"id"=>"15"}
Rendering template within layouts/posts
Rendering posts/show
Completed in 5ms (View: 2, DB: 95) | 200 OK [http://josh1.local/posts/15]

Second Time for the same GET request:

Processing PostsController#edit (for 127.0.0.1 at 2009-10-10 16:06:42) [GET]
 Parameters: {"id"=>"15"}
Rendering template within layouts/posts
Rendering posts/edit
Completed in 6ms (View: 5, DB: 0) | 200 OK [http://josh1.local/posts/15/edit]

Excellent!!

2. Just to push the pedal, I ran another iteration of 1000 posts with random body text and saw that memcache memory stayed put at 62-64mb. I could see expired cache objects hitting the database and get cached and objects already in the cache NOT hitting the database. Exactly what I wished for.

Some caveats:

  1. a. Every ‘find’ request hits the database. Understandable but I wonder if this too can hit via ‘cache’ – its not safe or synch’ed but I wonder.
  2. money-cache still does not support joins or includes or nested attributes. (Time to contribute!! )

Overall: VERY VERY GOOD

Next Steps in Part II:

  • Test money-cache on a live project with about 1 million records.
  • It has currently ~35 Requests Per Minute.
  • Some controllers calls take almost 50% of the time. Gotta reduce that to 5% (hopefully)

Cheers!

Ruby On Rails

October 2, 2009 5 comments