<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="/stylesheets/rss.css"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <channel>
    <title>Segment7: Image Serving with WEBrick</title>
    <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick</link>
    <language>en-us</language>
    <ttl>40</ttl>
    <description>The Blog</description>
    <item>
      <title>Image Serving with WEBrick</title>
      <description>&lt;p&gt;When we first started 43 Things we placed all our images on every web server.  That quickly became annoying to back up (using &lt;a href="http://www.amanda.org/"&gt;Amanda&lt;/a&gt;), so I moved it all to one machine that the other servers all mounted via &lt;span class="caps"&gt;NFS&lt;/span&gt;.  Eventually we ran out of disk and needed something I wouldn&amp;#8217;t need to touch for a long time, so I chose &lt;a href="http://www.danga.com/mogilefs/"&gt;MogileFS&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;Since there wasn&amp;#8217;t a MogileFS library for Ruby I wrote &lt;a href="http://dev.robotcoop.com/Libraries/mogilefs/index.html"&gt;mogilefs-client&lt;/a&gt;.  Now I needed a way to get the images back out.&lt;/p&gt;


	&lt;p&gt;FastCGI wasn&amp;#8217;t fast enough: copying the image into the process before sending it was a big performance hit.  Using mod_ruby created httpd processes that were much too large.  The best way to go was a web server that used sendfile() and let me easily map URIs to MogileFS keys.&lt;/p&gt;


	&lt;p&gt;Ordinarily &lt;a href="http://www.danga.com/perlbal/"&gt;Perlbal&lt;/a&gt; is used with MogileFS, but Perlbal disagrees with FreeBSD.  Rather than trying to fix Perl modules I decided to use WEBrick.  WEBrick already had all the important features of Perlbal with the exception of sendfile() and I had a minty-fresh MogileFS library for Ruby.&lt;/p&gt;


	&lt;p&gt;First, I wrote &lt;a href="http://dev.robotcoop.com/Libraries/socket_sendfile/index.html"&gt;socket_sendfile&lt;/a&gt; (but later learned about &lt;a href="http://rubyforge.org/projects/ruby-sendfile"&gt;ruby-sendfile&lt;/a&gt;, which supports more platforms) and integrated it with WEBrick.  Then I wrote a &lt;a href="http://segment7.net/projects/ruby/WEBrick/servlets.html"&gt;WEBrick servlet&lt;/a&gt; that maps URIs to MogileFS keys and then sends the file to the client using sendfile().&lt;/p&gt;
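

	&lt;p&gt;The URI-to-key step could look roughly like the sketch below.  The key format and the name mogile_key_for are made up for illustration; the real scheme used at 43 Things is not described here.  In a WEBrick servlet something like this would be called from do_GET with the request path.&lt;/p&gt;

```ruby
# Hypothetical mapping: "/images/12/thumb.jpg" becomes "images:12:thumb.jpg".
# The key format is an assumption, not the scheme actually used.
def mogile_key_for(path)
  segments = path.split('/').reject { |segment| segment.empty? }

  # Refuse empty paths and anything that tries to climb out of the namespace.
  return nil if segments.empty? || segments.include?('..')

  segments.join(':')
end
```

	&lt;p&gt;Returning nil for a bad path lets the caller answer with a 404 without asking MogileFS at all.&lt;/p&gt;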


	&lt;p&gt;With this setup WEBrick ran about half as fast as Apache serving raw files, but the load was too high for a single-process WEBrick server.  To distribute the load I reworked the default WEBrick server to fork multiple processes all listening on the same server socket.  We ended up with each of our four servers running eight WEBrick processes.&lt;/p&gt;
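

	&lt;p&gt;The forking idea can be shown without WEBrick at all.  The sketch below uses only the socket library from Ruby's standard library: the parent opens one listening socket, then forks workers that all call accept on it, and the kernel hands each connection to exactly one of them.  The prefork helper, the canned response, and the worker count are assumptions for illustration, not the code in the gem, and fork means this only runs on Unix-like systems.&lt;/p&gt;

```ruby
require 'socket'

# Fork worker_count children that all accept on the same listening socket.
# Returns the child pids so the parent can stop them later.
def prefork(server, worker_count)
  Array.new(worker_count) do
    fork do
      # Leave immediately on TERM without running any inherited exit hooks.
      trap('TERM') { exit!(0) }
      loop do
        client = server.accept
        client.write("HTTP/1.0 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok")
        client.close
      end
    end
  end
end

server = TCPServer.new('127.0.0.1', 0) # port 0 asks the OS for any free port
port = server.addr[1]
pids = prefork(server, 4)

# Whichever worker wins the accept answers this connection.
response = TCPSocket.open('127.0.0.1', port) { |socket| socket.read }

# Stop the workers and reap them so no processes are left behind.
pids.each { |pid| Process.kill('TERM', pid) }
pids.each { |pid| Process.wait(pid) }
server.close
```

	&lt;p&gt;Because the socket is opened before the fork, every child inherits the same file descriptor, so no extra load balancing is needed between processes on one machine.&lt;/p&gt;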


	&lt;p&gt;Adding support for If-Modified-Since also improved image serving speeds, but wasn&amp;#8217;t quite enough, so I threw in a dirty trick.  Instead of going to MogileFS to verify the image on an If-Modified-Since request I just return a 304 immediately.  I can trust the web browser to do the right thing and save myself a trip to MogileFS and a stat() since our images won&amp;#8217;t disappear unless their links also disappear.&lt;/p&gt;
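

	&lt;p&gt;Reduced to its core, the trick is a single check on the request headers.  The sketch below uses a plain Hash in place of a real request object, and the name status_for is hypothetical; it only illustrates the decision, not the servlet code itself.&lt;/p&gt;

```ruby
# Pick the response status for an image request.
# `headers` is a plain Hash standing in for the request's headers (an
# assumption for this sketch; a servlet would read them from its request).
def status_for(headers)
  if headers.key?('If-Modified-Since')
    304 # trust the browser's cached copy: no MogileFS lookup, no stat()
  else
    200 # first fetch: look up the key and send the file
  end
end
```

	&lt;p&gt;This is only safe when stored files never change under the same URI, which is the assumption stated above about images and their links.&lt;/p&gt;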


	&lt;p&gt;The last two tricks I used to speed up WEBrick were disabling access logging and giving WEBrick a dedicated IP via the load balancer.  Removing the extra work of logging hits resulted in a significant speedup, around twenty percent.  Sending requests directly to WEBrick was another twenty percent speedup since we weren&amp;#8217;t running through Apache&amp;#8217;s mod_proxy.&lt;/p&gt;
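

	&lt;p&gt;For reference, WEBrick takes its access-log setup from the :AccessLog option, and an empty list means nothing is written per request.  The options below are a sketch of what could be passed to WEBrick::HTTPServer.new; the port number is an assumption, and the hash is built on its own so the snippet runs without WEBrick installed.&lt;/p&gt;

```ruby
# Options in the shape WEBrick::HTTPServer.new accepts.
server_options = {
  Port: 8080,   # assumed port, pick whatever the load balancer targets
  AccessLog: [] # empty list: no access-log line is written for each hit
}
```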


	&lt;p&gt;I&amp;#8217;ve packaged up all my WEBrick speed-ups into the &lt;a href="http://dev.robotcoop.com/Libraries/webrick-high-performance"&gt;webrick-high-performance&lt;/a&gt; gem.  Unfortunately the sendfile() code is still FreeBSD-specific.  (I don&amp;#8217;t have a Linux machine so I can&amp;#8217;t test a socket_sendfile written for Linux.)&lt;/p&gt;
</description>
      <pubDate>Mon, 27 Mar 2006 16:44:07 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:78d3cf1f-823a-479d-bda6-e5a02b35ef9a</guid>
      <author>drbrain@segment7.net (Eric Hodel)</author>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick</link>
      <category>MogileFS</category>
      <category>Robot Co-op</category>
      <category>WEBrick</category>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by john</title>
      <description>&lt;p&gt;we&amp;#8217;re using Netapps for storing files, but it could be just any ole NFS volumes, really.&lt;/p&gt;


	&lt;p&gt;custom ? yes and no.  when an image does get uploaded, there is a process that runs outside of php to actually write the file, yes.  but the main purpose of the storage process is to just choose an available NFS volume to write the file to.&lt;/p&gt;


	&lt;p&gt;as for serving, requests for images are mod_rewritten to map to the mount points of the volumes, which are unique. apache serves to squid, and squid serves to the people. :)&lt;/p&gt;


	&lt;p&gt;thanks for sharing the details.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 18:14:11 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:ee7b1918-a26e-4e2a-abb7-8180bcffdc45</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-149</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by Eric Hodel</title>
      <description>&lt;p&gt;Are you using custom software or something third-party for your file storage?&lt;/p&gt;


	&lt;p&gt;Supporting If-Modified-Since (especially the cheating way) drastically reduces the load on the WEBrick processes.  When it gets high enough we&amp;#8217;ll add squid or akamai.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 16:44:08 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:3cb14eb2-a168-42f9-82c7-13e9b473d416</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-148</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by john</title>
      <description>&lt;p&gt;I see.&lt;/p&gt;


	&lt;p&gt;Where I work we do actually just simply add more filesystems, (NFS or otherwise) and we store quite a lot of small-ish files with a decently large request rate (even with our caching) and we do have those things: scalability, reliability, redundancy. (volumes are synced and never is an image in less than two places)&lt;/p&gt;


	&lt;p&gt;But it looks like MogileFS works well for you, and a novel approach.  Not being able to cache is a no-go for us.&lt;/p&gt;


	&lt;p&gt;(that said, I assume that you could front WEBrick with a cache (like squid) in reverse-proxy mode, if your request rate gets high enough.)&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 15:08:57 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:340e5540-f0c1-492f-9f14-1a6404469fd7</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-147</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by Eric Hodel</title>
      <description>&lt;p&gt;Adding NFS volumes to distribute images runs into a problem that MogileFS solves (redundancy, scalability and reliability for storing many small files) so I need it.&lt;/p&gt;


	&lt;p&gt;Throwing extra software on top of the minimum necessary to make MogileFS work runs afoul of YAGNI so I&amp;#8217;m going to avoid it until it becomes absolutely necessary.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 14:40:54 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:59baa82f-acc8-4f20-a832-2c4d7af9e4c9</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-146</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by john</title>
      <description>&lt;p&gt;I see.  So you chose WEBrick for (mostly) MogileFS, and chose MogileFS because adding NFS volumes wouldn&amp;#8217;t work for you ?&lt;/p&gt;


	&lt;p&gt;I would think that having MogileFS would count as another piece to have to maintain.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 13:13:56 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:6d8b35fc-6a23-434d-95d1-79485289711e</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-145</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by Eric Hodel</title>
      <description>&lt;p&gt;Squid can&amp;#8217;t look up files in MogileFS.  It&amp;#8217;s also another piece of software I&amp;#8217;ll have to maintain.  WEBrick can do the job all by itself, so that&amp;#8217;s the best solution.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 13:04:26 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:ee78f741-b28d-4bb7-b554-5a676ba4a682</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-144</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by john</title>
      <description>&lt;p&gt;I&amp;#8217;m curious why you didn&amp;#8217;t want to use squid.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 12:35:00 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:e7bc3b71-45c5-4975-b670-700afc072c48</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-143</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by Eric Hodel</title>
      <description>&lt;p&gt;Nope, not coincidence.  My release of socket_sendfile made the author of ruby-sendfile pipe up then Zed Shaw integrated that with Mongrel.&lt;/p&gt;


	&lt;p&gt;Mongrel adds an extra dependency and I&amp;#8217;ve got enough software to track so I&amp;#8217;m unlikely to use it.&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 10:18:36 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:d4fe9dcb-bd38-4eb5-8dfc-2aa31f7a82cc</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-142</link>
    </item>
    <item>
      <title>"Image Serving with WEBrick" by Adam</title>
      <description>&lt;p&gt;Have you looked into working with Zed Shaw on implementing some of this into Mongrel? I noticed his latest release has sendfile support. Coincidence?&lt;/p&gt;</description>
      <pubDate>Tue, 28 Mar 2006 05:37:33 -0800</pubDate>
      <guid isPermaLink="false">urn:uuid:338bef44-e6d1-4001-9a66-41628eb7e38f</guid>
      <link>http://blog.segment7.net/articles/2006/03/27/image-serving-with-webrick#comment-141</link>
    </item>
  </channel>
</rss>
