Configure and Deploy Resque

Resque is a Redis-backed Ruby library for queueing background jobs and processing them later. Everything you need to know about coding for it is in the Resque readme.

Important note: the discussion below covers how to run Resque on Engine Yard, while the examples regarding custom Chef are tailored to the v4 stack. The content is still valid for stack v5, but custom Chef examples and specifics for Resque on v5 can be found here.

Anatomy of our Resque configuration

The files used to configure Resque in your application are:

File                                         Description
/data/app_name/shared/config/resque.yml      Points to the Redis database; symlinked into your $RAILS_ROOT/config directory
resque_?.conf                                One per worker on each instance, listing the queues that worker services
/engineyard/bin/resque                       Monit wrapper script provided by Engine Yard
/etc/monit.d/resque_app_name.monitrc         Monit configuration file for Resque

Quick start guide

If you want to get going and work the rest out later, here’s the quick version.

To configure and deploy Resque for an Engine Yard Cloud environment

    1. Be familiar with the Engine Yard docs about custom Chef recipes, utility instances, and deploy hooks.
    2. Boot a utility instance and give it the name "redis". Then enable the following recipes in your custom Chef cookbooks:
      • The "redis" recipe to install Redis on the "redis" utility instance.
      • The "redis-yml" recipe to add a redis.yml file to your app instances.
      • The "resque" recipe to install the Resque gem and the resque_x.conf config files.

        For example:

        include_recipe "redis"
        include_recipe "redis-yml"
        include_recipe "resque"

    3. Modify the redis-yml recipe to point at the correct instance (see the comments at the top of cookbooks/redis-yml/recipes/default.rb).
    4. Add the following to your after_symlink deploy hook to ensure that Resque restarts when you deploy (replace APPNAME with the name of your application):
      if node[:instance_role] == 'util'
        worker_count = 3
        worker_count.times do |count|
          run "sudo monit restart resque_APPNAME_#{count}"
        end
      end
    5. Connect Resque to Redis by adding the following into config/initializers/resque.rb:
      redis_config = YAML.load_file("#{Rails.root}/config/redis.yml") 
      Resque.redis = Redis.new(redis_config[Rails.env])
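
The redis-yml recipe generates config/redis.yml for you, with one block per environment. The exact contents depend on your setup, but a minimal sketch (the hostname and port here are placeholders) looks something like:

production:
  host: <hostname of your redis utility instance>
  port: 6379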

You can now queue up Resque jobs in your application and they will be processed on your utility instance.
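
For example, a minimal job class and enqueue call look like this (ImageConversionJob, the :images queue, and the Image model are illustrative names, not part of the Engine Yard setup):

class ImageConversionJob
  # Any worker whose conf file lists the "images" queue will pick this up.
  @queue = :images

  def self.perform(image_id)
    # This runs in a child forked by the worker on your utility instance.
    Image.find(image_id).convert!
  end
end

# Anywhere in your application:
Resque.enqueue(ImageConversionJob, 42)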

For more information on configuring Resque, see the Resque readme.

Thinking through the configuration

Redis configuration

All our environments run Redis on the master database instance (or on the solo instance, if a single slice is used without a separate database) as part of our cloud application infrastructure. Typically, you use this Redis instance for Resque. However, you can use a custom Chef recipe to put a Redis instance on a utility instance, or on any slice you want, and use that instead.

Queues vs workers

First decide how many workers you need and how you will allocate those workers.

When writing your Resque code, you assign jobs to queues. It is important to note that queues and workers aren't synonymous: a worker can service many queues, and each queue can be serviced by many workers.

You can have as many workers as your resources allow. Each worker in our default setup is monitored by monit, and so each worker has its own stanza in our monit configuration.
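
A stanza for one worker looks roughly like the following sketch (for an app called myapp; the pidfile path in particular is illustrative, so check the generated /etc/monit.d/resque_myapp.monitrc on your instance for the real values):

check process resque_myapp_0
  with pidfile /var/run/engineyard/resque/resque_myapp_0.pid
  group myapp_resque
  start program = "/engineyard/bin/resque myapp start production resque_0.conf"
  stop program = "/engineyard/bin/resque myapp stop production resque_0.conf" with timeout 90 seconds

Note the group name, myapp_resque, which lets you address all of the workers at once with monit's -g flag, as used in the deploy hooks below.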

Each intended worker has a conf file in /data/app_name/shared/config called resque_conf_name.conf.

So, for three workers, in an application called myapp, you might have:

/data/myapp/shared/config/resque_0.conf 
/data/myapp/shared/config/resque_1.conf
/data/myapp/shared/config/resque_2.conf

Each of these has a QUEUE statement as described in the Resque readme. The default is QUEUE=*. However, you may customize it to list the queues you’d like handled by that worker. By choosing how you allocate your queues to your workers, you essentially prioritize the queues.
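
For example, with the three workers above, you could dedicate one worker to a critical queue, share a second between it and a lower-priority queue, and let the third sweep everything (the queue names here are illustrative):

/data/myapp/shared/config/resque_0.conf:
QUEUE=critical

/data/myapp/shared/config/resque_1.conf:
QUEUE=critical,mailers

/data/myapp/shared/config/resque_2.conf:
QUEUE=*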

Each worker, when run, has a memory footprint approximately the size of one of your Unicorn or Passenger workers at startup. Every time it gets a job, it forks a child which is also about that size, and which grows as big as it needs to.

Stopping jobs

At different times, you need to stop or restart your workers: perhaps a job has exceeded its allowed memory, you need to deploy new code, or any number of other reasons apply.

Workers can be asked to stop in one of two ways: with either a SIGTERM or a SIGQUIT (kill -15 or kill -3).

If they receive a SIGQUIT, they allow an already running job to finish before quitting. If they receive a SIGTERM, then, if there is a job running, that job is killed immediately, along with the worker.

So, the two things that need consideration are how long your job will run for, and what the consequences are of a job being terminated during processing.

To TERM or to QUIT

If terminating your job mid-process leaves your databases in a consistent state, doesn't result in half-drawn thumbnails, and doesn't cause other embarrassing mishaps, then SIGTERM is the way forward.

This involves a line in the monit configuration like:

stop program "/engineyard/bin/resque myapp term production resque_0.conf" 

If for any reason the worker doesn't stop, the script checks for and kills its child, and then kills the worker itself with kill -9.
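
To illustrate, a hypothetical job that does all of its database work inside a single transaction is safe to TERM: killing the child mid-run just rolls the transaction back. (Invoice, its account, and the :billing queue are illustrative names.)

class SettleInvoiceJob
  @queue = :billing

  def self.perform(invoice_id)
    # If the worker's child is killed partway through, the uncommitted
    # transaction is rolled back and nothing is left half-applied.
    ActiveRecord::Base.transaction do
      invoice = Invoice.find(invoice_id)
      invoice.update_attribute(:paid, true)
      invoice.account.decrement!(:balance, invoice.amount)
    end
  end
end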

If, however, your job can’t be interrupted, you need to ask it to stop with QUIT. This involves a line in the monit configuration like:

stop program "/engineyard/bin/resque myapp quit production resque_0.conf" 

This allows your script 60 seconds to finish its job before the wrapper script ensures that it has, in fact, died. Note that for the sake of following conventions used in other monit wrapper scripts, quit and stop are synonyms.

Time to die

However, we have customers with jobs that run for 5, 10, and 30 minutes, and even up to 12 hours.

To cater for this, you can set a GRACE_TIME environment variable:

stop program "/bin/env GRACE_TIME=300 /engineyard/bin/resque myapp stop production resque_0.conf" 

This causes the wrapper script to wait 300 seconds before forcing the death of the worker.

Deploy time considerations

It is important that Resque gets restarted when you deploy. Firstly, if you don't, your Resque jobs are carried out with outdated code, possibly against the wrong database schema. Secondly, because only three releases are kept by default, after the third deploy the jobs are running on code that has been deleted from disk. This is likely the case if you are intermittently seeing NameError: uninitialized constant.

The correct way to have Resque restarted on each deploy is to have a line like:

run "monit restart all -g app_name_resque" 

in your after_symlink deploy hook (where app_name is the name of your application).

However, it is also likely that you don't want your deploy to run while jobs are still in action, or Resque to start a new job while the deploy is underway. So, in either your before_symlink or before_migrate deploy hook, code like the following is in order (where app_name is the name of your application):

Case 1. We have monit configured to use SIGQUIT and want the workers to stop when they’ve finished the current job. We also don’t want the deploy to proceed if jobs are running.

run "sudo monit stop all -g fractalresque_resque" 
if %x[ps axo command|grep resque[-]|grep -c Forked].to_i > 0
raise "Resque Workers Working!!"
end

Case 2. Monit is configured to use SIGTERM, but we want the workers to finish the current job rather than have it killed. We don't want the deploy to proceed if jobs are running; however, if none are running, we want the workers stopped.

if %x[ps axo command|grep resque[-]|grep -c Forked].to_i > 0
  raise "Resque Workers Working!!"
else
  run "sudo monit stop all -g app_name_resque"
end

In both cases, make sure to explicitly start Resque after your deploy has finished. Add a before_restart deploy hook similar to this:

run "sudo monit start all -g fractalresque_resque" 

These are suggested starting points; you need to consider what needs to happen in your own situation.
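
Putting the pieces together, here is a minimal sketch of the Case 1 hooks as files in your repository's deploy directory (assuming the monit group is app_name_resque):

# deploy/before_migrate.rb
# Ask the workers to stop (with monit configured for SIGQUIT, the
# current job is allowed to finish), then refuse to deploy while any
# forked child is still working.
run "sudo monit stop all -g app_name_resque"
if %x[ps axo command|grep resque[-]|grep -c Forked].to_i > 0
  raise "Resque Workers Working!!"
end

# deploy/before_restart.rb
# Explicitly bring the workers back up on the newly deployed code.
run "sudo monit start all -g app_name_resque"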

Debugging

Resque logs its activity to: /data/app_name/shared/log/resque_?.log

So, for the worker associated with resque_0.conf, its activity can be seen in /data/app_name/shared/log/resque_0.log
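
To watch a worker in real time, you can tail its log on the instance, for example:

tail -f /data/myapp/shared/log/resque_0.log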

Resque changes the verbosity of its logging when the VERBOSE or VVERBOSE environment variables are set. To set these, your monit config's start line will look like:

start program "/bin/env VERBOSE=1 /engineyard/bin/resque my_app start production resque_0.conf"


On top of that, the monit Resque wrapper script logs its handling of Resque to:

/var/log/syslog

Frequent small jobs

If you have a queue that is servicing frequent small jobs, there is a bottleneck we've experienced that you may need to grapple with.

Class caching

So, you're in production, you've followed the Resque readme and loaded the environment in your Rakefile, and you've got config.cache_classes = true (which is the default).

In case you're not aware, this setting in config/environments/production.rb is why you don't see all those pesky SHOW FIELDS statements (assuming MySQL) in your production.log like you do while you're developing. It's also why you need to restart your application server when you deploy code, unlike in development. In development, the appropriate models are loaded on each request (complete with changes); in production, they're loaded on demand, the first time they're called.

So why is that a problem? Because in Resque the worker doesn't do the work; the child it forks does. For all practical purposes, that child is a copy of the worker, complete with your Rails application. At the time of the fork, no models have been accessed, and this is the state the forked child inherits.

After this child starts processing your job, as each model pertinent to that job is touched, the class code defining that model is run. This involves issuing a SHOW FIELDS query to the database for each model involved, which has locking implications for your database. Further, some fat models may also carry a substantial time cost in Ruby itself. In fact, for a quick job, most of the time could be spent instantiating your models.

A simple solution is to modify your Rakefile (or wherever you set your environment up) to change this line:

task "resque:setup" => :environment 

to something like this:

task "resque:setup" => :environment do 
User.columns
Post.columns
end

Or perhaps as a way to hit all your models at once:

task "resque:setup" => :environment do 
ActiveRecord::Base.send(:subclasses).each { |klass| klass.columns }
end

If you have feedback or questions about this page, add a comment below. If you need help, submit a ticket with Engine Yard Support.


Comments

  • Petteri Räty

    The Frequent small jobs section has not been needed since Resque 1.18; that version started automatically preloading classes. I do, however, recommend using >= 1.19, as you can see from the history that it was patched multiple times.

    https://github.com/defunkt/resque/blob/master/HISTORY.md

  • John Yerhot

    Hi Petteri,

    Great catch!  I'll notify our documentation team of that!

    Best,

    John

  • Petteri Räty

    John: Further investigation shows that one part of the frequent small jobs chapter is still relevant. The information that follows is based on investigating Rails 3.1. Rails' eager loading of the class files does not load the columns, so the part about preloading column information is still accurate. The current solution does have a couple of shortcomings, though. The first is that calling ActiveRecord::Base.send(:subclasses) is likely to come out with an empty array unless your initializers are loading your models; the Resque call to Rails.application.eager_load! comes after running resque:setup. The second problem is that klass.columns is not the only method going to the database to query schema information; the other I saw when looking into what happens during a single worker run is .primary_key.

    In the end, here's the code I put in an initializer to load schema information when Resque calls eager_load!:

    class ActiveRecord::Base
      module SchemaPreload
        def inherited(subclass)
          super subclass
          subclass.primary_key
          subclass.columns
        end
      end

      extend SchemaPreload
    end

  • Dave Bryand

    John, can you please confirm if Petteri's comment is correct or if we can safely ignore the section on Frequent Small Jobs?

  • John Yerhot

    Hi Dave,

    Newer versions of Resque should behave better with small jobs; however, as Petteri pointed out, part of the section is still relevant.

    If you experience issues with class loading, trying the initializer Petteri mentioned may work for you. Another alternative you can try for lots of small jobs is resque-multi-job-forks. We've had good luck with that, but it's not for every app.

    Thanks!

  • Ilya Scharrenbroich

    When I restart my server I occasionally get the following error:

    Redis::InheritedError: Tried to use a connection from a child process without reconnecting. You need to reconnect to Redis after forking.

    [GEM_ROOT]/gems/redis-3.0.3/lib/redis/client.rb:285:in `ensure_connected'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis/client.rb:177:in `block in process'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis/client.rb:256:in `logging'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis/client.rb:176:in `process'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis/client.rb:84:in `call'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis.rb:1159:in `block in sadd'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis.rb:36:in `block in synchronize'
    /usr/lib64/ruby/1.9.1/monitor.rb:211:in `mon_synchronize'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis.rb:36:in `synchronize'
    [GEM_ROOT]/gems/redis-3.0.3/lib/redis.rb:1158:in `sadd'
    [GEM_ROOT]/gems/redis-namespace-1.2.1/lib/redis/namespace.rb:257:in `method_missing'
    [GEM_ROOT]/gems/resque-1.23.1/lib/resque.rb:227:in `watch_queue'
    [GEM_ROOT]/gems/resque-1.23.1/lib/resque.rb:172:in `push'
    [GEM_ROOT]/gems/resque-1.23.1/lib/resque/job.rb:51:in `create'
    [GEM_ROOT]/gems/resque-1.23.1/lib/resque.rb:271:in `enqueue_to'
    [GEM_ROOT]/gems/resque-1.23.1/lib/resque.rb:252:in `enqueue'
    lib/later.rb:37:in `later'
    app/services/metrics_service.rb:11:in `async_record'
    app/services/places_service.rb:8:in `find_places'
    app/controllers/api/general/places_controller.rb:11:in `index'

    Any ideas as to why this is happening? It seems to have started after I installed resque-scheduler and occurs when I try to add a background job from my web app to redis. The failure happens a few times and then there are no more exceptions. This makes me think that there is some kind of issue with the passenger workers when they get forked.

    Thanks!

    • Ilya
  • Don Johnson

    Hi Ilya, 

    Please open a support ticket so that we can take a closer look into this--it definitely sounds like an interesting issue. 

    -Don

  • Brian Hall

    I noticed that the stop commands are either TERM or QUIT, but that the latest recipes have this entered:

    stop program = "/engineyard/bin/resque <%= @app_name %> stop <%= @rails_env %> resque_<%= num %>.conf" with timeout 90 seconds

    Is the TERM/QUIT issue being handled automatically or something now?

  • Tom Hoen

    Hey Brian - According to the docs, STOP is an alias for QUIT.

  • Pawel Mikolajewski

    Can I set up Resque without a Redis utility instance? It seems that redis-server is no longer installed by default on a typical solo/app_master server (it was before).

  • Ralph Bankston

    Hello Pawel,

    You can run Redis via a recipe on the solo instance. We removed the running Redis from the solo instance because, by default, its database wasn't being backed up; we removed it to discourage customers from using the default instances and suffering data loss. Our cloud recipe can be modified to install Redis on the solo instance.

  • Scott Sherwood

    For anyone trying to set this up using the v4 stack and a utility server, I had to make the following changes to the cookbooks, which are not documented above:

    1. redis-yml cookbook: change line 1 to include the util server: 'if ['app_master', 'app', 'util'].include?(node[:instance_role])'

    2. redis cookbook: follow the instructions at the top of the readme file

    Guess it would be nice if both of these were set up as defaults in the cookbooks too.

  • Christopher Haupt

    Any EY-specific recommendations for standing up resque-web in Cloud? Some clients like to have the dashboard, which, at least in one possible configuration, requires setting up resque-web itself and the nginx rules (say, on app-master) to securely access it.

  • Evan Machnic

    Christopher,

    Users who want to configure either resque-web or sidekiq-web typically do so by mounting the app within the Rails application. That way, there doesn't really need to be any Nginx configuration and it should be possible to just hit the URL.

    Evan

  • Dan Moore

    There's a typo that caught me out and meant the Resque workers weren't getting restarted when I deployed.

    run "sudo monit restart resque_APPNAME_{count}"

    is missing a # in front of the count variable. It should be

    run "sudo monit restart resque_APPNAME_#{count}"

  • Joe Heth

    What's the proper parameter order for restarting jobs with monit? I see this line for Resque (above):

    "monit restart all -g app_name_resque"

    and I see this line in the delayed jobs cookbook README:

    "monit -g dj_<app_name> restart all"

    Does parameter order matter, and why is it app_name_resque in one and dj_app_name in the other? Should that be consistent?
