Net::HTTP is not slow

Eric Hodel | Fri, 07 May 2010 08:56:00 GMT


You're just using it wrong.

Some time back there was a blog post about Net::HTTP being slow, but that's not true anymore, and probably wasn't as true then as it was claimed to be.

The way to make Net::HTTP go fast is to use a persistent connection so you don't have to re-connect to the server every time. Unfortunately the original benchmarks referenced above don't seem to make more than one request per implementation so Net::HTTP couldn't give its best possible showing.

If you're doing a one-off file transfer or only fetching content from one site at a time it's ok to avoid Net::HTTP for another library. If you're requesting data from the same server over and over, like a web service, it's nearly immoral to connect to it over and over.
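As a minimal sketch of the difference, the following uses only the standard library. The throwaway TCPServer on a random loopback port is an assumption made for illustration (it stands in for a real web service) and counts how many TCP connections it accepts. The block form of Net::HTTP.start keeps one socket open for several requests, while Net::HTTP.get_response connects again every time.

```ruby
require 'net/http'
require 'socket'

# Throwaway keep-alive HTTP server on a random loopback port (illustration
# only). It counts how many TCP connections it accepts.
server      = TCPServer.new '127.0.0.1', 0
port        = server.addr[1]
connections = 0

Thread.new do
  loop do
    client = server.accept
    connections += 1
    Thread.new(client) do |c|
      # Answer every request head until the client hangs up
      while (line = c.gets)
        next unless line == "\r\n" # a blank line ends the request head
        c.write "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
      end
      c.close
    end
  end
end

uri = URI("http://127.0.0.1:#{port}/")

# One connection per request: every call pays for a fresh TCP handshake
3.times { Net::HTTP.get_response uri }
per_request = connections

# One persistent connection: the block form of Net::HTTP.start keeps the
# socket open, so all three requests share it
Net::HTTP.start uri.host, uri.port do |http|
  3.times { http.get uri.request_uri }
end
persistent = connections - per_request

puts "per-request: #{per_request} connections, persistent: #{persistent}"
```

Against this toy server the three one-off requests should each open their own connection, and the three requests inside the start block should share a single one.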

In order to help you use Net::HTTP the right way I've released net-http-persistent. It's a thread-safe wrapper for Net::HTTP that performs persistent connections for you. Here's an example:

require 'net/http/persistent'
uri = URI.parse 'http://example.com/awesome/web/service'
http = Net::HTTP::Persistent.new
response = http.request uri # performs a GET

# perform a POST
post_uri = uri + 'create'
post = Net::HTTP::Post.new post_uri.path
post.set_form_data 'some' => 'cool data'
response = http.request post_uri, post # URI is always required

net-http-persistent is incredibly tiny, so maybe you can add some convenience methods to it. I haven't had a need to.

Benchmark

I wrote the following three benchmark blocks to return the same response body for a URL I’m sure will work (it returns 200 OK with a payload). A static file was used to minimize server processing latency.

Each iteration:

  • sends an HTTP request

  • cleans up after itself (to be friendly to the network)

  • extracts the body

Loopback

When running across loopback with all three benchmarks I received the following result with N=20_000 using uri_2k:

ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-darwin10.2.0]
Rehearsal ---------------------------------------------------------
TCPSocket               1.330000   2.130000   3.460000 (  9.601254)
Net::HTTP               8.410000   2.400000  10.810000 ( 17.333671)
Net::HTTP::Persistent   8.110000   0.880000   8.990000 ( 12.190094)
----------------------------------------------- total: 23.260000sec

                            user     system      total        real
TCPSocket               1.340000   2.160000   3.500000 (  9.759389)
Net::HTTP               8.390000   2.370000  10.760000 ( 17.381197)
Net::HTTP::Persistent   8.070000   0.880000   8.950000 ( 11.493741)

With N=50_000 and the Net::HTTP benchmark disabled:

ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-darwin10.2.0]
Rehearsal ---------------------------------------------------------
TCPSocket               3.290000   5.340000   8.630000 ( 24.503025)
Net::HTTP::Persistent  20.090000   2.160000  22.250000 ( 28.822468)
----------------------------------------------- total: 30.880000sec

                            user     system      total        real
TCPSocket               3.290000   5.340000   8.630000 ( 23.874741)
Net::HTTP::Persistent  20.100000   2.150000  22.250000 ( 29.188237)

So raw TCPSocket is about 20% faster than Net::HTTP::Persistent.

This was expected as the initial connection setup and teardown round-trips will be very fast on the loopback interface which gives Net::HTTP::Persistent the worst-possible showing.

Unfortunately you miss out on easy error checking and all that other Net::HTTP and Net::HTTP::Persistent goodness using TCPSocket.
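To make the error-checking point concrete, here is a small sketch under stated assumptions: the local TCPServer that answers every request with 404 is made up for the example, and the /missing path is hypothetical. With a raw socket the status has to be parsed out of the response text by hand, while Net::HTTP hands back a typed response object whose value method raises for anything that is not a 2xx.

```ruby
require 'net/http'
require 'socket'

# Throwaway server that answers every request with 404 (illustration only)
server = TCPServer.new '127.0.0.1', 0
port   = server.addr[1]

Thread.new do
  loop do
    client = server.accept
    # Skip the request head, which ends at the first blank line
    while (line = client.gets)
      break if line == "\r\n"
    end
    client.write "HTTP/1.0 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    client.close
  end
end

# Raw socket: the status line has to be picked apart by hand
socket = TCPSocket.open '127.0.0.1', port
socket.write "GET /missing HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n"
raw = socket.read
socket.close
raw_status = raw[%r{\AHTTP/\d\.\d (\d{3})}, 1].to_i

# Net::HTTP: the response arrives as an object of a status-specific class
response = Net::HTTP.get_response URI("http://127.0.0.1:#{port}/missing")

puts "raw status: #{raw_status}"
puts "Net::HTTP class: #{response.class}"

begin
  response.value # raises unless the response is a 2xx
rescue Net::HTTPExceptions => e
  puts "raised: #{e.class}"
end
```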

Real Internet

Depending upon your link speed, creating a new TCPSocket for every request across the Real Internet may drastically reduce the performance of the TCPSocket benchmark.

This benchmark was run with N=500 from my home internet connection and uri_2k. traceroute shows 16 hops between the client and server. At the time of the benchmark run ping -c 20 showed:

20 packets transmitted, 19 packets received, 5.0% packet loss
round-trip min/avg/max/stddev = 74.564/91.412/147.863/18.092 ms

ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-darwin10.2.0]
Rehearsal ---------------------------------------------------------
TCPSocket               0.180000   0.220000   0.400000 ( 99.048004)
Net::HTTP::Persistent   0.340000   0.120000   0.460000 ( 46.229385)
------------------------------------------------ total: 0.860000sec

                            user     system      total        real
TCPSocket               0.210000   0.280000   0.490000 (112.646966)
Net::HTTP::Persistent   0.340000   0.140000   0.480000 ( 47.381403)

In this case Net::HTTP::Persistent is about 140% faster than TCPSocket.
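The arithmetic behind that figure can be checked with a few lines of Ruby. The times are the "real" column of the second (non-rehearsal) run above and the round-trip time is the ping average. Dividing per-request time by the ping average is only a rough reading, since ping was measured separately from the benchmark.

```ruby
# "Real" (wall-clock) seconds copied from the benchmark output above
requests   = 500
tcp_socket = 112.646966
persistent = 47.381403
avg_rtt_ms = 91.412 # average from the ping summary

speedup        = tcp_socket / persistent
percent_faster = (speedup - 1) * 100

tcp_ms_per_request        = tcp_socket * 1000 / requests
persistent_ms_per_request = persistent * 1000 / requests

puts format('speedup: %.2fx (%.0f%% faster)', speedup, percent_faster)
puts format('TCPSocket:  %.0f ms/request, about %.1f round trips',
            tcp_ms_per_request, tcp_ms_per_request / avg_rtt_ms)
puts format('Persistent: %.0f ms/request, about %.1f round trips',
            persistent_ms_per_request, persistent_ms_per_request / avg_rtt_ms)
```

The persistent connection spends roughly one round trip per request, while the fresh-socket version spends about two and a half. That is consistent with, though not proof of, an extra connection-setup round trip per request plus some retransmission on a link with packet loss.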

Running this benchmark

The data files I used were created by dd:

dd if=/dev/zero of=~/Sites/zeros-1k bs=1024 count=1

If you’re running this benchmark repeatedly, make sure you wait until the sockets fall out of TIME_WAIT before re-running. You should see 0 (or near 0) from:

netstat -an | grep TIME | wc -l
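The same check can be done from Ruby, for example to wait between runs inside a script. This is only a sketch: time_wait_count is a hypothetical helper name, and the sample text below is made up to demonstrate it rather than captured from a real machine.

```ruby
# Count sockets in TIME_WAIT in `netstat -an` style output
def time_wait_count(netstat_output)
  netstat_output.lines.count { |line| line.include? 'TIME_WAIT' }
end

# Made-up sample output, only to demonstrate the helper
sample = <<-NETSTAT
tcp4  0  0  127.0.0.1.80  127.0.0.1.51234  TIME_WAIT
tcp4  0  0  127.0.0.1.80  127.0.0.1.51235  TIME_WAIT
tcp4  0  0  *.80          *.*              LISTEN
NETSTAT

puts time_wait_count(sample)
```

On a real machine you would pass it the live output instead, as in time_wait_count(`netstat -an`), and sleep until the count drops to 0 or close to it.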

TCPSocket and Net::HTTP::Persistent should show similar times on a fast link (like loopback). If TCPSocket ends up vastly slower you’ve probably run out of sockets.

When running this benchmark with high N you may need to increase the ephemeral port range.

With an N of 50_000 and the following configuration I can run the TCPSocket or the Net::HTTP requests along with Net::HTTP::Persistent, but not both.

$ sysctl -a net.inet.ip.portrange
net.inet.ip.portrange.lowfirst: 1023
net.inet.ip.portrange.lowlast: 600
net.inet.ip.portrange.first: 10000
net.inet.ip.portrange.last: 65535
net.inet.ip.portrange.hifirst: 10000
net.inet.ip.portrange.hilast: 65535
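One rough way to read the "but not both" limit is to count ports. The sketch below assumes that every non-persistent request uses its own ephemeral port and that the port stays unavailable in TIME_WAIT for the rest of the run. That is an upper-bound simplification, since some sockets do expire during a long run.

```ruby
# Values from the sysctl output above
first_port = 10_000 # net.inet.ip.portrange.first
last_port  = 65_535 # net.inet.ip.portrange.last
n          = 50_000

ephemeral_ports = last_port - first_port + 1

# Assumption: one ephemeral port per non-persistent request, held in
# TIME_WAIT until the run is over
one_benchmark  = n     # TCPSocket or Net::HTTP alone
two_benchmarks = 2 * n # TCPSocket and Net::HTTP together

puts "ephemeral ports available: #{ephemeral_ports}"
puts "one non-persistent benchmark fits:  #{one_benchmark <= ephemeral_ports}"
puts "two non-persistent benchmarks fit:  #{two_benchmarks <= ephemeral_ports}"
```

Under that assumption a single non-persistent benchmark at N=50_000 fits within the range, but two of them together ask for more ports than exist. Net::HTTP::Persistent barely registers because it reuses one connection.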

What about Curb?

I tried to write a benchmark using curb 0.7.1 but failed to make one that performed even as well as plain Net::HTTP.

I couldn’t get curb to use a persistent connection. curl_easy_perform(3) says that libcurl will create a persistent connection if you call it multiple times on the same handle. I can see this behavior using `strace curl URL URL`.

With curb I see a new socket created per sendto(2)/recvfrom(2) pair. I also see a bunch of calls to close(2) when ruby performs its final GC pass.

I couldn’t see a way to make curb shut down its socket manually. The only way to do this is to wait for the GC to collect the socket. Leaving file descriptors hanging around for the GC is not good. (It also seemed to spend most of the time in the benchmark waiting for sockets to close.)

I started looking through curb to see why it would behave this way, but Curl::Easy::new calls curl_easy_init(3) and doesn’t check the return value, despite the man page saying it may return NULL, so I gave up.

I filed issues 29, 30 and 31 on the curb GitHub tracker for these problems instead.

The Code

require 'rubygems'
require 'benchmark'
require 'net/http'
require 'net/http/persistent'

uri_1k  = URI.parse 'http://localhost/~drbrain/zeros-1k'
uri_2k  = URI.parse 'http://localhost/~drbrain/zeros-2k'
uri_10k = URI.parse 'http://localhost/~drbrain/zeros-10k'

uri = uri_2k

N = 5_000

Benchmark.bmbm do |bm|
  bm.report 'TCPSocket' do
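    # Use HTTP/1.0: HTTP/1.1 would require handling chunked transfer-encoding.
    # The heredoc body is flush left because <<- only strips indentation from
    # the terminator, and the request line must not start with whitespace.
    tcp_request = <<-HTTP
GET #{uri.request_uri} HTTP/1.0\r
Host: #{uri.host}\r
Connection: close\r
\r
    HTTP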

    N.times do
      s = TCPSocket.open uri.host, uri.port
      s.write tcp_request
      data = s.read
      s.close # hopefully reduces TIME_WAIT duration
      data.split("\r\n\r\n", 2).last # get body
    end
  end

  bm.report 'Net::HTTP' do
    N.times do
      response = nil
      Net::HTTP.start uri.host, uri.port do |http|
        # Net::HTTPRequest can't be recycled
        request = Net::HTTP::Get.new uri.request_uri
        response = http.request request
      end
      response.body
    end
  end

  bm.report 'Net::HTTP::Persistent' do
    http_p = Net::HTTP::Persistent.new

    N.times do
      response = http_p.request uri
      response.body
    end
  end
end

5 comments


Yup, Net::HTTP is actually a good library, it’s also quite complete, contrary to almost every other http offering we have in the language.

People bitch about its API all the time, I think that’s down to the “printing” methods and odd top level methods, and the fact that there’s inconsistency between using hashes and precompiled strings in parameters, and the requirement to use objectified requests in order to perform really advanced requests. Personally I think objectified requests are the correct way to go, the parameter inconsistency could be fixed and the printing methods are useless for me personally.

Net::HTTP is reasonably efficient, especially considering it’s implemented in pure ruby. That being said, something that cleanly abstracts away socket handling and parsing at the C level should really be faster. Indeed, for concurrent requests our (incomplete) http client libraries for EventMachine are much faster than Net::HTTP. Indeed linear performance /should/ be slightly better, on the grounds that EventMachine is generically faster than using TCPSocket (on 1.8 particularly) for a number of reasons.

Anyway, good call. I use net/http a lot, and I almost always begin with http = Net::HTTP.new(server, port).start, looks like this new lib can save me boilerplate :-)

raggi said about 1 hour later

Beating TCPSockets so handily over the net connection was curious to me. Most web servers will do gzip out of the box, and in 1.9 Net::HTTP will also request and handle that transparently (see http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/12693). A file full of zeroes will zip up quite nicely and I wonder if you’re comparing apples to apples there.

But your main point was made. Good post.

Ben Lavender said 2 days later

Now that the issues you filed on Github for Curb have been resolved, I’d be interested in seeing an update to the post comparing Net::HTTP::Persistent with Curb.

Doug Ramsay said 3 days later

Apache on OS X 10.6 doesn’t have gzip compression enabled:

$ telnet localhost 80
Trying ::1...
Connected to localhost.
Escape character is '^]'.
GET /~drbrain/zeros-2k HTTP/1.1
Host: localhost
Accept-Enocding: gzip;q=1.0,deflate;q=0.6,identity;q=0.3

HTTP/1.1 200 OK
Date: Mon, 10 May 2010 20:24:21 GMT
Server: Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8l DAV/2
Last-Modified: Thu, 06 May 2010 03:42:10 GMT
ETag: "9583550-800-485e4ba364880" 
Accept-Ranges: bytes
Content-Length: 2048
Content-Type: text/plain

Eric Hodel said 3 days later

Doug, I’m going to wait for the release. I’m still seeing updates to the tickets.

Eric Hodel said 3 days later
