mogilefs-client version 1.2.1 has been released!
A Ruby MogileFS client. MogileFS is a distributed filesystem written
by Danga Interactive. This client supports NFS and HTTP modes.
Changes in 1.2.1:
- Switched to Hoe.
- Moved to p4.
- Fixed bug #7273 in HTTP mode of client where data would not get returned. Submitted by Matthew Willson.
When we first started 43 Things we placed all our images on every web server. That quickly became annoying to back up (using Amanda), so I moved them all to one machine that the other servers mounted via NFS. Eventually we ran out of disk and needed something I wouldn’t need to touch for a long time, so I chose MogileFS.
Since there wasn’t a MogileFS library for Ruby I wrote mogilefs-client. Now I needed a way to get the images back out.
FastCGI wasn’t fast enough: copying the image into the process before sending it was a big performance hit. Using mod_ruby created httpd processes that were much too large. The best way to go was a web server that used sendfile() and let me easily map URIs to MogileFS keys.
Ordinarily Perlbal is used with MogileFS, but Perlbal disagrees with FreeBSD. Rather than trying to fix Perl modules I decided to use WEBrick instead. WEBrick already had all the important features of Perlbal with the exception of sendfile(), and I had a minty-fresh MogileFS library for Ruby.
First, I wrote socket_sendfile (but later learned about ruby-sendfile, which supports more platforms) and integrated it with WEBrick. Then I wrote a WEBrick servlet that maps URIs to MogileFS keys and then sends the file to the client using sendfile().
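As a rough illustration of the URI-to-key step, here is a minimal sketch in plain Ruby. The method name mogilefs_key_for, the "/images/" prefix, and the rule that the rest of the path is the key are all assumptions made for the example, not the scheme the servlet actually uses.

```ruby
require "uri"

# Hypothetical mapping: everything after the prefix is treated as the
# MogileFS key. The prefix and key layout are assumptions for illustration.
def mogilefs_key_for(request_uri, prefix: "/images/")
  path = URI.parse(request_uri).path          # drops any query string
  return nil unless path.start_with?(prefix)  # not an image URI

  key = path.delete_prefix(prefix)
  # Refuse empty keys and anything that tries to climb out of the prefix.
  return nil if key.empty? || key.include?("..")

  key
end

puts mogilefs_key_for("/images/42/avatar.jpg?size=small")
```

A servlet would return a 404 when this yields nil, and otherwise look the key up in MogileFS and hand the resulting file to sendfile().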
With this setup I managed to serve images about half as fast as Apache serves raw files, but the load was too high for a single-process WEBrick server. To distribute the load I reworked the default WEBrick server to fork multiple processes all listening on the same server socket. We ended up with each of our four servers running eight WEBrick processes.
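The forking idea can be sketched with nothing but Ruby's socket library. This is not the WEBrick rework itself: prefork and stop_workers are hypothetical helpers, and the one-line reply stands in for real request handling. The point is only that the parent opens the listening socket once and every forked child calls accept on that same socket.

```ruby
require "socket"

# Hypothetical helper: fork `workers` children that all accept on the
# same listening socket. The kernel hands each connection to one child.
def prefork(server, workers)
  Array.new(workers) do
    fork do
      trap("TERM") { exit!(0) }   # leave quietly when the parent says so
      loop do
        client = server.accept
        client.write("served by pid #{Process.pid}\n")
        client.close
      end
    end
  end
end

# Hypothetical helper: stop the children and reap them.
def stop_workers(pids)
  pids.each do |pid|
    Process.kill("TERM", pid)
    Process.wait(pid)
  end
end

server = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick a free port
pids = prefork(server, 8)                # eight workers, as in the text
reply = TCPSocket.open("127.0.0.1", server.addr[1]) { |s| s.gets }
puts reply
stop_workers(pids)
server.close
```

Because the socket is created before the fork, no extra load balancing is needed between the workers on one machine.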
Adding support for If-Modified-Since also improved image serving speeds, but wasn’t quite enough, so I threw in a dirty trick. Instead of going to MogileFS to verify the image on an If-Modified-Since request I just return a 304 immediately. I can trust the web browser to do the right thing and save myself a trip to MogileFS and a stat() since our images won’t disappear unless their links also disappear.
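In code the dirty trick is tiny. The sketch below assumes request headers arrive as a plain Hash; response_plan and its return values are hypothetical names for illustration, not part of WEBrick or the gem.

```ruby
# Hypothetical decision function: headers is a plain Hash of request
# headers. Returns the HTTP status and whether MogileFS must be consulted.
def response_plan(headers)
  if headers.key?("If-Modified-Since")
    # Trust the browser's cached copy: no MogileFS lookup, no stat().
    { status: 304, lookup: false }
  else
    # First request for this image: fetch it and send the body.
    { status: 200, lookup: true }
  end
end

p response_plan("If-Modified-Since" => "Sat, 01 Jan 2005 00:00:00 GMT")
p response_plan({})
```

This is only safe because of the property described above: the images never change or vanish while links to them still exist.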
The last two tricks I used to speed up WEBrick were disabling access logging and giving WEBrick a dedicated IP via the load balancer. Removing the extra work of logging hits resulted in a significant speedup, around twenty percent. Running requests directly to WEBrick was another twenty percent speedup since we weren’t running through Apache’s mod_proxy.
I’ve packaged up all my WEBrick speed-ups into the webrick-high-performance gem. Unfortunately the sendfile() code is still FreeBSD-specific. (I don’t have a Linux machine so I can’t test a socket_sendfile written for Linux.)
You can install it as a gem:
$ sudo gem install mogilefs-client
Or go download mogilefs from Rubyforge.
WARNING! I’ve only been able to test NFS mode in production, so HTTP mode is not proven to work. If you find any bugs in it, please report them at the Rubyforge tracker.
Just to salvage my brain, here’s what I need to do to make it work on 43 Things & co.:
- Make sure www has a homedir (actually, no, fix the code to not need sendfile())
- Make sure mogstored is set up with a umask of 002
- Make sure www is in the mogilefs group for all hosts
- Mount all the nfs shares in the right spots
Right now I just support file operations, and only over NFS since FreeBSD + HTTP mode don’t like each other. I’ll get the admin operations finished next, then I’ll have something shippable.
I couldn’t get HTTP mode working, but I didn’t try very hard. It looks like the components only conditionally include Linux::AIO, but I’m not sure.
I submitted a documentation patch for MogileFS + NFS to the mailing list to help out the next people who try this. It’s really only two extra sentences that you need.