Memory leaks, cached_model and backend jobs

drbrain | Thu, 24 Aug 2006 21:08:36 GMT

One of my long-running problems with Rails (and Ruby in general) is that it’s difficult to debug memory leaks. I’ve had a number of cases where I’ve stuck something into a long-lived array or hash and discovered much later that my Ruby process was eating over 100 MB of RAM. While ps makes it easy to see when Ruby’s using lots of RAM, actually figuring out where it went is a lot harder.

[…] I asked the Seattle Ruby Group for help, and Ryan Davis gave me a quick little memory leak spotter that he uses. I made a few additions to it, and it helped me discover that my Typo development tree was leaking 1-3 strings per hit.

From “Memory leak profiling with Rails”, via scottstuff
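For readers who have not seen that kind of script, the general idea can be sketched with nothing but ObjectSpace. This is not Scott’s or Ryan’s script, only a minimal illustration of snapshotting live objects by class and comparing two snapshots; the method names live_object_counts and growth are made up for the example.

```ruby
# Hypothetical helper: count live objects by class.
def live_object_counts
  GC.start # collect garbage first so mostly reachable objects are counted
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Hypothetical helper: classes whose live count grew between two snapshots,
# largest growth first.
def growth(before, after)
  after.map { |klass, count| [klass, count - before[klass]] }
       .select { |_, delta| delta > 0 }
       .sort_by { |_, delta| -delta }
end

before = live_object_counts
# This array stands in for a long-lived Hash or Array that keeps growing.
retained = Array.new(1_000) { |i| "leaked string #{i}" }
after = live_object_counts

growth(before, after).first(5).each do |klass, delta|
  puts "#{klass}: +#{delta}"
end
```

Taking a snapshot every few hundred requests or job iterations and printing the classes that keep growing is usually enough to point at the structure that is holding on to objects.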

I just used Scott and Ryan’s script to find a memory leak in our backend jobs related to cached_model. We have the local cache enabled for our sites because we reset the cache on every page request. In our backend jobs we don’t clear the local cache, so we end up with a Hash holding every ActiveRecord object we’ve ever retrieved from the database.

So if the warnings in the documentation weren’t enough, here it is again. If you’re using CachedModel in a backend job, be sure to disable the local cache or call cache_reset periodically to allow items to be garbage collected.
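The failure mode and the fix can be sketched without CachedModel itself. In the example below, LocalCache is a stand-in written for illustration and is not the cached_model API; in a real job the reset would be the cache_reset call mentioned above, and the interval of 100 is an arbitrary assumption.

```ruby
# Stand-in for a process-local object cache. NOT the CachedModel API.
class LocalCache
  def initialize
    @store = {}
  end

  # Return the cached value for +key+, computing and storing it on a miss.
  def fetch(key)
    @store[key] ||= yield
  end

  # Drop every cached object so the garbage collector can reclaim it.
  def reset
    @store.clear
  end

  def size
    @store.size
  end
end

RESET_EVERY = 100 # assumed interval; tune for the job

cache = LocalCache.new
peak = 0

1.upto(1_000) do |id|
  cache.fetch(id) { "record #{id}" } # stands in for a database fetch
  peak = [peak, cache.size].max
  cache.reset if (id % RESET_EVERY).zero?
end

puts "peak cache size: #{peak}, final size: #{cache.size}"
```

Without the reset line the cache would end the loop holding all 1,000 records; with it, the cache never holds more than one interval’s worth.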

no comments

Mashup Creepiness

drbrain | Mon, 07 Aug 2006 23:56:46 GMT

We consume Flickr photos as part of 43 Places and use their tags to automatically add them to places. One new user creepily found his own face as his hometown's featured photo.

no comments

Image Serving with WEBrick

drbrain | Tue, 28 Mar 2006 00:44:07 GMT

When we first started 43 Things we placed all our images on every web server. That quickly became annoying to back up (using Amanda), so I moved it all to one machine that the other servers all mounted via NFS. Eventually we ran out of disk and needed something I wouldn’t need to touch for a long time, so I chose MogileFS.

Since there wasn’t a MogileFS library for Ruby I wrote mogilefs-client. Now I needed a way to get the images back out.

FastCGI wasn’t fast enough: copying the image into the process before sending was a big performance hit. Using mod_ruby created httpd processes that were much too large. The best way to go was a web server that used sendfile() and let me easily map URIs to MogileFS keys.

Ordinarily Perlbal is used with MogileFS, but Perlbal disagrees with FreeBSD. Rather than trying to fix Perl modules, I decided to use WEBrick instead. WEBrick already had all the important features of Perlbal with the exception of sendfile(), and I had a minty-fresh MogileFS library for Ruby.

First, I wrote socket_sendfile (but later learned about ruby-sendfile which supports more platforms) and integrated it with WEBrick. Then I wrote a WEBrick servlet to map URIs into MogileFS keys that would then turn around and send the file to the client using sendfile().

With this setup I managed to get about half as fast as Apache serving raw files, but the load was too high for a single-process WEBrick server. To distribute the load I reworked the default WEBrick server to fork multiple processes all listening on the same server socket. We ended up with each of our four servers running eight WEBrick processes.
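The forking arrangement can be sketched with a plain TCPServer. This is not the reworked WEBrick server, only a minimal illustration of several processes accepting on one listening socket; the method name prefork and the canned response are assumptions made for the example.

```ruby
require 'socket'

# Hypothetical helper: fork +count+ workers that all accept on the same
# listening socket. Returns the worker PIDs; the caller stops and reaps them.
def prefork(server, count)
  Array.new(count) do
    fork do
      # Leave at once on TERM, skipping exit handlers inherited from the parent.
      trap('TERM') { exit!(0) }
      loop do
        # The kernel hands each incoming connection to one waiting worker.
        client = server.accept
        # Read the request line and headers up to the blank line.
        loop do
          line = client.gets
          break if line.nil? || line == "\r\n"
        end
        body = "served by pid #{Process.pid}\n"
        client.write("HTTP/1.0 200 OK\r\nContent-Length: #{body.bytesize}\r\n\r\n#{body}")
        client.close
      end
    end
  end
end
```

The parent opens the socket once, for example with TCPServer.new('127.0.0.1', 8080) (the address and port are placeholders), calls prefork(server, 8) to match the eight processes per machine described above, and then waits on the returned PIDs.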

Adding support for If-Modified-Since also improved image serving speeds, but wasn’t quite enough, so I threw in a dirty trick. Instead of going to MogileFS to verify the image on an If-Modified-Since request I just return a 304 immediately. I can trust the web browser to do the right thing and save myself a trip to MogileFS and a stat() since our images won’t disappear unless their links also disappear.
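In code the dirty trick amounts to a single early return. The sketch below is an illustration rather than the servlet itself: respond is a made-up helper, headers is a plain Hash, and fetch is any callable standing in for the MogileFS lookup.

```ruby
# Hypothetical helper: decide how to answer an image request.
# +headers+ is a Hash of request headers; +fetch+ is any callable standing in
# for the MogileFS lookup (an assumption, not the mogilefs-client API).
def respond(headers, fetch)
  # The trick: a conditional request gets 304 with no lookup and no stat().
  return [304, nil] if headers['If-Modified-Since']

  [200, fetch.call]
end

lookups = 0
fetch = lambda do
  lookups += 1
  'image bytes'
end

first  = respond({}, fetch)
repeat = respond({ 'If-Modified-Since' => 'Tue, 28 Mar 2006 00:44:07 GMT' }, fetch)

puts "first: #{first[0]}, repeat: #{repeat[0]}, lookups: #{lookups}"
```

The shortcut is only safe under the assumption stated above: an image at a given URI never changes and never disappears while links to it still exist.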

The last two tricks I used to speed up WEBrick were disabling access logging and giving WEBrick a dedicated IP via the load balancer. Removing the extra work of logging hits resulted in a significant speedup, around twenty percent. Sending requests directly to WEBrick was another twenty percent speedup, since we weren’t running through Apache’s mod_proxy.

I’ve packaged up all my WEBrick speed-ups into the webrick-high-performance gem. Unfortunately the sendfile() code is still FreeBSD-specific. (I don’t have a Linux machine so I can’t test a socket_sendfile written for Linux.)

9 comments

socket_sendfile and socket_accept_filter

drbrain | Fri, 24 Mar 2006 01:19:00 GMT

These are two packages that I use to speed up WEBrick image serving, now freshly released. Unfortunately I haven’t tested them on any platform other than FreeBSD, so please file bugs if they don’t work for you.

$ sudo gem install socket_sendfile
$ sudo gem install socket_accept_filter

socket_sendfile adds sendfile(2) to Socket and forms the cornerstone of our WEBrick image serving.
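For comparison, here is a minimal sketch of the same idea using only the standard library of later Ruby versions. It does not use socket_sendfile, whose API is not shown here; it relies on IO.copy_stream, which on some platforms can hand the copy to sendfile(2) and otherwise falls back to an ordinary read/write loop. The helper name send_file is made up for the example.

```ruby
require 'socket'
require 'tempfile'

# Hypothetical helper: stream a file to a socket without first reading it
# into a Ruby String. Returns the number of bytes copied.
def send_file(path, socket)
  File.open(path, 'rb') { |file| IO.copy_stream(file, socket) }
end

Tempfile.create('image') do |tmp|
  tmp.write('x' * 4096)
  tmp.flush

  reader, writer = UNIXSocket.pair # stands in for a client connection
  sent = send_file(tmp.path, writer)
  writer.close

  puts "sent #{sent} bytes, peer read #{reader.read.bytesize}"
  reader.close
end
```

Either way the point is the same: the image bytes go from the file to the socket without becoming a Ruby object in the serving process.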

socket_accept_filter makes it easy to set the SO_ACCEPTFILTER socket option so you can enable the accf_http(9) and accf_data(9) accept filters. Accept filters delay the return from accept(2) until enough data has arrived on the socket for processing.

Shortly I’ll have enough software released to do a full write-up of high-volume image serving with WEBrick.

2 comments

Mogilefs for Ruby

drbrain | Wed, 22 Mar 2006 21:44:00 GMT

We are using MogileFS to store all of our images and serving them up with WEBrick. To do this, we needed a MogileFS library, and I just released it.

You can install it as a gem:

$ sudo gem install mogilefs-client

Or go download mogilefs from Rubyforge.

I’ve got the mogilefs RDoc up on dev.robotcoop.com which will give you an overview of how to use it.

WARNING! I’ve only been able to test NFS mode in production, so HTTP mode is not proven to work. If you find any bugs in it, please report them at the Rubyforge tracker.

4 comments

Version Control and Sysadmin

drbrain | Tue, 21 Mar 2006 18:19:00 GMT

Every part of the system configuration you change belongs under version control (with a few exceptions). If you’re going to be making changes to your configuration, you might do something wrong and need to roll back. Later, you might wonder who made a change, or why. Version control will perform CYA duties for you.

There are a few things you probably don’t want under version control. /etc/master.passwd shouldn’t be flying across the wire (and Kerberization or similar works much better for distributing passwords). Sudo will get mad if you go and touch /usr/local/etc/sudoers inappropriately, especially if it has the wrong owners.

Configuration files in /etc, /usr/local/etc and /boot, custom rc.d and periodic scripts: anything you’re going to change, add or even break needs to be under version control. (I haven’t figured out a good way of putting crontabs under version control. Ideas?)

For The Robot Co-op, each machine’s configuration is in its own branch in a Subversion repository to allow care-free copying of changes between machines. A change to the httpd.conf on one machine is a commit and a couple of merges away from being accurately changed on all the machines. No typos from multiple manual changes.

9 comments

Robot Co-op Software

drbrain | Mon, 20 Mar 2006 19:28:00 GMT

I’ve seen a lot of comments asking for information on our software setup, so here it is. If you’d like more detail just ask; I’ll fill you in as best I can, either in a comment or in a future post.

UPDATE: Added link to Wikipedia’s MySQL configuration

Read more...

14 comments

Robot Co-op Hardware

drbrain | Thu, 16 Mar 2006 02:12:00 GMT

There’s been interest in the hardware that has driven the sites of The Robot Co-op past 2.5 million requests/day, so here it is:

Quantity | CPU | Memory | Disks | Functions
4 | Dual 3GHz Xeon | 6GB | 70GB RAID 1 | Apache, FastCGI, MogileFS storage node, memcached, image serving
1 | Dual 3GHz Xeon | 2GB | 70GB RAID 1 | Staging, mail, backend jobs
1 | Dual Opteron 246 | 12GB | 5x 73GB in RAID 5 | MySQL

The four web servers are more fluke than planning; we don’t need the capacity they have just yet. We started with two webservers, a database server and a staging/mail/backend server, all dual 3GHz Xeons. We then added a third webserver and after that the Opteron MySQL box. The old database server was recently repurposed as a webserver.

A hardware load balancer of unknown manufacture currently spreads site traffic across all four web boxes, since each box runs all of our sites. Eventually we’ll switch to running 43 Things on a pair of machines and all other sites on the remaining machines.

Images are routed through a separate IP directly to WEBrick running a custom HTTPServlet that interacts with MogileFS to serve and resize images.

29 comments

2.5 million

drbrain | Tue, 07 Mar 2006 06:40:00 GMT

On Saturday, March 4th, the sites of The Robot Co-op handled 2,587,240 requests through Rails. That number includes redirects and error pages but excludes images, CSS and static JavaScript (we dynamically generate some JS for blog posting).
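To put that daily total in per-second terms, the arithmetic is short. This is an average over the whole day only; no peak figure is given here, so it says nothing about peak load.

```ruby
requests_per_day = 2_587_240
seconds_per_day  = 24 * 60 * 60 # 86,400

average_per_second = requests_per_day / seconds_per_day.to_f
puts format('%.1f requests/second on average', average_per_second)
```

That works out to just under 30 Rails requests per second, averaged across all sites and all hours of the day.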


Go WEBrick go!

drbrain | Thu, 16 Feb 2006 07:54:00 GMT

WEBrick is a nifty little HTTP server written in Ruby. Since we’ve been having random-image-fun with Apache’s RewriteMap I pulled out my next-best tool, WEBrick, to serve images.

In order to get some speed back I’m using a sendfile(2) extension written using RubyInline. Due to the double dispatch, images end up being noticeably slower, but at least I won’t have to restart whole webservers every couple of hours to get images back on track.
