Rubinius' Foreign Function Interface

drbrain | Tue, 15 Jan 2008 09:22:00 GMT

I really, really, really love Rubinius’ Foreign Function Interface (FFI) since it allows you to replace C code with Ruby code. Earlier today I wrote Socket::getaddrinfo in C for Rubinius, and just now I finished a rewrite using FFI and Ruby. I’ve commented the code for clarity.

def self.getaddrinfo(host, service, family = nil, socktype = nil,
                     protocol = nil, flags = nil)
  service = service.to_s

  # MemoryPointer.new is kind of like malloc(3), but understands what's inside
  hints_p = MemoryPointer.new Socket::Foreign::AddrInfo.size

  # Socket::Foreign::AddrInfo is a struct addrinfo wrapper with friendly accessors
  hints = Socket::Foreign::AddrInfo.new hints_p
  hints[:ai_family] = family || 0
  hints[:ai_socktype] = socktype || 0
  hints[:ai_protocol] = protocol || 0
  hints[:ai_flags] = flags || 0

  # getaddrinfo(3) asks for a struct addrinfo **.
  # This creates a pointer to a pointer
  res_p = MemoryPointer.new :pointer

  # call out to C
  err = Socket::Foreign.getaddrinfo host, service, hints_p, res_p

  # check for errors
  raise SocketError, Socket::Foreign.gai_strerror(err) unless err == 0

  # now we read out the pointer that getaddrinfo() passed us, and cast it
  # to a struct addrinfo *
  res = Socket::Foreign::AddrInfo.new res_p.read_pointer

  addrinfos = []

  loop do
    addrinfo = []

    # Extract data
    addrinfo << Socket::Constants::AF_TO_FAMILY[res[:ai_family]]

    ai_sockaddr = res[:ai_addr].read_string res[:ai_addrlen]

    sockaddr = Socket::Foreign::unpack_sa_ip ai_sockaddr, true

    addrinfo << sockaddr.pop # port
    addrinfo.concat sockaddr # hosts
    addrinfo << res[:ai_family]
    addrinfo << res[:ai_socktype]
    addrinfo << res[:ai_protocol]

    addrinfos << addrinfo

    # struct addrinfo is a linked list, so if we've hit the end, stop
    break unless res[:ai_next]

    # otherwise, down the linked-list
    res = Socket::Foreign::AddrInfo.new res[:ai_next]
  end

  return addrinfos
ensure
  # as in C code, we have to free our MemoryPointer objects
  hints_p.free if hints_p

  if res_p then
    # also, we have to do any C-side cleanup
    Socket::Foreign.freeaddrinfo res_p.read_pointer
    res_p.free
  end
end
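
For comparison, the result has the same shape as what MRI's standard-library Socket.getaddrinfo returns. The sketch below runs against MRI's socket library, not Rubinius, and the helper name loopback_addrinfo is made up for illustration. A numeric host and port are used so that nothing has to be resolved over the network; the integer constants in the last three slots vary by platform.

```ruby
require 'socket'

# Hypothetical helper: ask for the loopback address by number, so no DNS
# lookup is involved.
def loopback_addrinfo
  Socket.getaddrinfo '127.0.0.1', 80, Socket::AF_INET, Socket::SOCK_STREAM
end

# Each entry is a flat array, in the same order the code above builds it:
#   [family name, port, hostname, numeric address,
#    ai_family, ai_socktype, ai_protocol]
family_name, port, _hostname, address, ai_family, ai_socktype, _ai_protocol =
  loopback_addrinfo.first

puts family_name
puts port
puts address
puts ai_family == Socket::AF_INET
puts ai_socktype == Socket::SOCK_STREAM
```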


getaddrinfo(3), freeaddrinfo(3) and gai_strerror(3) are wrapped up by FFI like this:

attach_function "gai_strerror", :gai_strerror, [:int], :string

attach_function "getaddrinfo", :getaddrinfo,
                [:string, :string, :pointer, :pointer], :int

attach_function "freeaddrinfo", :freeaddrinfo, [:pointer], :void


The first argument is the C function name, the second is the Ruby method name, the third is the list of argument types, and the fourth is the return type. Currently, FFI can only wrap up C functions with six or fewer args.
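
To make the four-argument shape concrete, here is a toy model of the registration pattern in plain Ruby. It is not how Rubinius implements attach_function: the ToyFFI module, the NATIVE table and the CONVERT table are all invented for this sketch, and Ruby lambdas stand in for the C symbols the real FFI would find in the running process.

```ruby
# Toy model only. Nothing here touches C.
module ToyFFI
  # Stand-ins for native symbols, keyed by their C names (hypothetical).
  NATIVE = {
    'strlen' => ->(s) { s.bytesize },
    'abs'    => ->(i) { i.abs }
  }

  # Hypothetical converters, keyed by the type names used in the post.
  CONVERT = {
    :string => ->(v) { String(v) },
    :int    => ->(v) { Integer(v) }
  }

  # Same argument order as in the post:
  # C name, Ruby name, argument types, return type.
  def attach_function(c_name, ruby_name, arg_types, return_type)
    native  = NATIVE.fetch c_name
    to_ruby = CONVERT.fetch return_type

    define_singleton_method ruby_name do |*args|
      unless args.length == arg_types.length
        raise ArgumentError,
              "wrong number of arguments (#{args.length} for #{arg_types.length})"
      end

      # Convert each argument on the way in, and the result on the way out.
      converted = args.zip(arg_types).map do |arg, type|
        CONVERT.fetch(type).call arg
      end

      to_ruby.call native.call(*converted)
    end
  end
end

module ToyForeign
  extend ToyFFI

  attach_function 'strlen', :string_length, [:string], :int
  attach_function 'abs', :abs, [:int], :int
end

puts ToyForeign.string_length('rubinius')
puts ToyForeign.abs(-3)
```

The point of the sketch is only that the declaration carries enough information (names plus types) for the wrapper to check and convert arguments without any hand-written glue per function.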

The AddrInfo struct is wrapped up like this:

class AddrInfo < FFI::Struct
  config("rbx.platform.addrinfo", :ai_flags, :ai_family, :ai_socktype,
         :ai_protocol, :ai_addrlen, :ai_addr, :ai_canonname, :ai_next)
end


The config method pulls pre-generated struct information out of a Rubinius config file and hooks up accessors to each of the struct’s fields. The accessors know which offset into the struct the data lives at and what type to convert data from and to when working with the struct. The information is collected at Rubinius build time by a small bit of C code.
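
The idea of accessors that know an offset and a type can be sketched in plain Ruby with String#unpack1 and Array#pack. This is an illustration under stated assumptions, not the FFI::Struct implementation: the ToyStruct class is invented, and the offsets and sizes in LAYOUT are made up rather than taken from any platform's struct addrinfo (Rubinius collects the real ones at build time).

```ruby
class ToyStruct
  # Hypothetical layout: field name => [byte offset, byte size, pack directive].
  # 'l' is a 32-bit signed integer in native byte order.
  LAYOUT = {
    :ai_flags    => [0,  4, 'l'],
    :ai_family   => [4,  4, 'l'],
    :ai_socktype => [8,  4, 'l'],
    :ai_protocol => [12, 4, 'l']
  }
  SIZE = 16

  def initialize(bytes = nil)
    # Work on a binary copy so that character indexes equal byte offsets.
    @bytes = (bytes || "\0" * SIZE).b
  end

  # Read: slice the field's bytes out and convert them to a Ruby Integer.
  def [](field)
    offset, size, directive = LAYOUT.fetch field
    @bytes.byteslice(offset, size).unpack1 directive
  end

  # Write: convert the Ruby value to bytes and splice them in at the offset.
  def []=(field, value)
    offset, size, directive = LAYOUT.fetch field
    @bytes[offset, size] = [value].pack directive
  end

  def to_bytes
    @bytes.dup
  end
end

hints = ToyStruct.new
hints[:ai_family] = 2
hints[:ai_socktype] = 1

puts hints[:ai_family]
puts hints[:ai_flags]
puts hints.to_bytes.bytesize
```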

I still find it confusing to choose between passing an FFI::Struct like Socket::Foreign::AddrInfo and passing a MemoryPointer instance (which is what an FFI-wrapped function understands) to an FFI-wrapped function, so we’re going to clean up that part of the API to make it more natural. You’ll be able to initialize an FFI::Struct directly and pass it to the FFI-wrapped function, which will make the code quite a bit cleaner.

Is the C version too long to show?

Bil Kleb said about 2 hours later

So basically there is no need to write C code while implementing ruby wrappers over low level libraries … great!

Dee Zsombor said about 2 hours later

How does the speed of execution of the Ruby code with FFI compare to the C code?

Shin Guey said about 3 hours later

The part I don’t fully understand about the Rubinius FFI is how to set up the C files so they can actually be hosted by it properly. A week ago I tried to hook up to the SHA1 implementation in the shotgun/lib directory but I couldn’t figure out how to do it. After the behind-the-scenes pipes are hooked up, then the FFI works great…

Ryan said about 7 hours later

Aagh, you just reminded me about addrinfo and struct sockaddr and friends…

As long as you never mention strtok.

Laurel Fan said about 8 hours later

The speed question interests me as well

she said about 10 hours later

The Rubinius FFI system ends up generating native functions, so other than the time spent executing the Ruby code you see in the post there, it shouldn’t be slower than writing it in C.

Either way you end up invoking C from Ruby, it’s just a matter of where the Ruby stops and the C starts.

Wilson Bilkovich said about 11 hours later

@Bil: The C version is a bit shorter because it’s less verbose, and things like traversing a linked-list are more natural.

Eric Hodel said about 12 hours later

@Shin: I have no idea how fast it is. Making things work is more important than making things fast right now.

Eric Hodel said about 12 hours later

@Ryan: Before hooking in to a C library, it must be loaded into the process. The easiest way to do that is with subtend, Rubinius’ implementation of the Ruby/C interface. That way you can just require the library, and use FFI to do the rest.

I’ll write up a post on that in the future.

Eric Hodel said about 13 hours later

Are nested pointers supported? Those were really painful with ruby’s Array#pack, String#unpack methods.

Justin

Justin Bailey said about 13 hours later

Just to add a bit to Wilson’s explanation, each #attach_function call directly generates a thin intermediary function or “shim” in native machine code (so no compilation is necessary) and this is wrapped in a Ruby object. When the Ruby-side method is actually called, it unwraps the shim and executes it. The shim handles calling the type conversion functions for parameters and the return value and it of course calls the actual function we are bridging to. The performance should be roughly equivalent to a compiled extension written in C (one additional function call.) This can certainly be further optimized, but right now it just works.

@Ryan: The best thing to do at this time would be to write the C that you need and have it built along with the rest of Rubinius. You also already have access to all functions in a Rubinius process, such as those from libc. Support for linking against arbitrary libraries through dynamic loading is already in, but disabled until the API is more refined, so eventually you can just ask for a particular symbol to be loaded from libpcre, for example.

Eero Saynatkari said about 15 hours later

Justin: You’ll have to instantiate the inner struct from its outer struct pointer, but that’s just an extra line, just like how I walk down the linked-list for each struct addrinfo.

Eric Hodel said about 16 hours later

Interesting. Do you have an example of how callbacks would work?

Dan

Daniel Berger said 2 days later

@Dan: Callbacks aren’t currently supported in the FFI layer, but they are supported through subtend, the C extension compatibility layer. This is because Rubinius is (C) stackless, and so requires a whole new C stack (woo getcontext) in order for C to be able to call back into Ruby land (when it switches back to the original context). This is exactly what happens in subtend, but complicates FFI a bit. In the future, if we’re going to support callbacks from FFI, we’ll need to have a way to hint that some function uses callbacks and so will need a new context. In the meantime, it’s a simplicity and performance optimization, as we’ve made a fairly clear distinction up to this point that FFI calls are one way.

Kevin Clark said 2 days later
