A RubyGems + GitHub proposal

drbrain | Thu, 05 Feb 2009 02:33:09 GMT


I know many people have added GitHub to their RubyGems sources list and find it sub-optimal. For example, Nokogiri is installed with gem install nokogiri from RubyForge but with gem install tenderlove-nokogiri from GitHub. Furthermore, it’s possible to create a username/gem name combination on GitHub that collides with a RubyForge name, which could lead to pain and suffering for GitHub users.

I’ve come up with a potential solution to this problem:

  • Add an alias name attribute to gem specifications that points to the “RubyForge name” for the gem
  • Add an index to the gem server that maps alias names to “RubyForge names”
  • Only signed gems with an alias name will be included in this index
  • When RubyGems looks for a gem to install it considers aliased gems as exact matches for a name, provided they satisfy the user’s trust policy

Using this solution, a user could install a gem that has a dependency on nokogiri. If nokogiri is signed on GitHub and there’s a newer version on GitHub than on RubyForge, the GitHub version would be installed.
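As a rough illustration of the lookup rule in the list above, here is a minimal Ruby sketch. It is not RubyGems code: the Candidate struct, the resolve method, the trusted_signers list, the "mallory" entry and all version numbers are made up for the example, and a plain list of signer names stands in for a real trust policy. Only Gem::Version is an existing RubyGems class.

```ruby
require "rubygems" # for Gem::Version

# Hypothetical index entry; alias_name is the proposed attribute,
# the remaining fields are illustrative only.
Candidate = Struct.new(:name, :alias_name, :version, :source, :signed, :signer)

# Returns the candidate that would be installed for `requested`, or nil.
# `trusted_signers` is a stand-in for the user's trust policy.
def resolve(requested, candidates, trusted_signers)
  matches = candidates.select do |c|
    if c.name == requested
      true # plain name match, as today
    elsif c.alias_name == requested
      # an alias counts as an exact match only when signed and trusted
      c.signed && trusted_signers.include?(c.signer)
    else
      false
    end
  end
  # among all matches, the highest version wins
  matches.max_by { |c| Gem::Version.new(c.version) }
end

candidates = [
  Candidate.new("nokogiri", nil, "1.1.0", "rubyforge", false, nil),
  Candidate.new("tenderlove-nokogiri", "nokogiri", "1.2.0", "github", true, "tenderlove"),
  Candidate.new("mallory-nokogiri", "nokogiri", "9.9.9", "github", true, "mallory"),
]

chosen = resolve("nokogiri", candidates, ["tenderlove"])
puts "#{chosen.name} #{chosen.version} from #{chosen.source}"
```

With only tenderlove trusted, this prints "tenderlove-nokogiri 1.2.0 from github": the newer signed GitHub gem beats the RubyForge one, while the higher-numbered gem from the untrusted signer is ignored.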

Here are some discussion points this solution raises:

  • GitHub currently builds gems for authors, so these gems cannot be signed by the author. To sign them, GitHub would have to store the author’s private key.
  • By default RubyGems sets no security policy, so out of the box this proposal doesn’t address the name overlap problem (this default could be changed)
  • Furthermore, it would not prevent a trusted author from turning rogue
  • Using a trust policy, a user can choose to pull gems from GitHub for specific authors by trusting the author’s public key (e.g. only install signed gems, only install trusted gems)
  • There’s no infrastructure for easily trusting an author’s key (beyond gem cert)
  • It doesn’t give GitHub a central authority for gems, but one could be built through a web of trust

Comments


Awesome ideas, I’m really liking the web of trust concept…especially since there doesn’t seem to be one and that may be a problem.

Something you may want to help out with is a project I’ve started called gemcutter that aims to fix the issues with RubyForge and gem hosting as a whole.

Perhaps these improvements could be realized through this project. Edit on the wiki or let me know if you think so! :)

Nick Quaranto said about 3 hours later

That would be great. For ruby 1.9.1 I had to do a lot of git clone + gem build, simply for having gems with the same name. This would really solve issues like that and clean up my gem list a lot.

Konstantin Haase said about 7 hours later

What are the downsides of having RubyGems “pull” all built gems from a designated github repo (or any other repo)? Wouldn’t this avoid the trust issue by relying on the project owner’s rubyforge credentials? I’m just wondering why this wouldn’t be easier. If the only answer is that it would be hard to add to RubyForge, that can be good enough :)

Also, are you going to link this post on the rubygems dev list?

— Chad

Chad Woolley said about 7 hours later

Why give special status to Rubyforge at all? Why not either A) iterate through all sources looking for a name and, if there’s a clash, prompt the user for the source repository to install from, or B) allow the user to specify an order of preference (by default Rubyforge can be the preferred repository) and, in the case of name clashes, simply choose the version from the preferred repo and display something like “Installing nokogiri 2.0.0 from gems.rubyforge.org”? Perhaps I’m missing something?

Christian Romney said about 15 hours later

Sorry for the second post. It occurred to me you might want to install from your non-preferred repository on a per-gem basis, so something along the lines of

gem install nokogiri --prefer github

I’m kind of shooting from the hip here. It’s not real clear to me that --prefer doesn’t overlap with --source substantially. I do like only having to specify a minimally unique string for the source, though.

Christian Romney said about 15 hours later
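The order-of-preference idea from the two comments above could look roughly like the following Ruby sketch. Everything in it is an assumption made for illustration: the Offer struct, the pick_by_preference method and the version numbers are invented, a --prefer option does not exist in RubyGems, and the sketch supposes the gem carries the same name on both sources.

```ruby
# Hypothetical record: one gem version offered by one source.
Offer = Struct.new(:name, :version, :source)

# Walk the sources in the user's order of preference and return the first
# offer for `name`. Each preference is matched as a substring of the source
# host, so "github" is enough to mean gems.github.com.
def pick_by_preference(name, offers, preferences)
  named = offers.select { |o| o.name == name }
  preferences.each do |pref|
    hit = named.find { |o| o.source.include?(pref) }
    return hit if hit
  end
  nil # no preferred source offers this gem
end

offers = [
  Offer.new("nokogiri", "2.0.0", "gems.rubyforge.org"),
  Offer.new("nokogiri", "2.0.1", "gems.github.com"),
]

# Default order: Rubyforge first.
default_order = ["rubyforge", "github"]
pick = pick_by_preference("nokogiri", offers, default_order)
puts "Installing #{pick.name} #{pick.version} from #{pick.source}"

# A per-install override, in the spirit of the suggested --prefer option,
# just moves one source to the front of the list.
override = ["github"] + default_order
pick = pick_by_preference("nokogiri", offers, override)
puts "Installing #{pick.name} #{pick.version} from #{pick.source}"
```

Note that this picks by source first and version second, which is the opposite of how RubyGems chooses a gem today.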

Github might be the first, but it definitely won’t be the last. We should be decentralizing RubyGems anyway. Even before Github, there were a number of people using SCM besides what Rubyforge had, and then just routing their gem releases through Rubyforge because that was the only way to get them out. You and Ryan do that, right?

I think URIs should be part of the solution somehow ideally. I’d love to be able to type:

sudo gem install http://gems.zenspider.net/flog

Though perhaps this would seriously mess with the current mirrors system, I dunno.

Francis Hwang said 1 day later

I like Francis’s idea, though I’m not sure how it would be implemented.

sanbit said 1 day later

I guess I don’t understand why people are creating gems with different names on github. I’m not so sure we should be catering to this.

Francis, you can already do “sudo gem install flog --source http://gems.zenspider.net”.

Daniel Berger said 1 day later

Daniel: People are doing it because there are now multiple versions of a particular project that you might want to install. Decentralized development requires decentralized packaging.

Wilson Bilkovich said 1 day later

Chad: Having RubyForge pull gems won’t solve the problem since there could be several appropriate gems to pull.

Eric Hodel said 5 days later

Christian: Trusting an author’s key takes care of your preference without having to supply --prefer all the time.

Also, resolving dependencies on nokogiri should happen without being prompted. Usually I don’t want to use nokogiri straight up, instead I’ll install mechanize or something else that uses nokogiri and want to automatically get the latest bug-free version available.

Per-repository preferences don’t help since RubyGems is version based, not repository based. RubyGems always picks the latest version regardless of where it’s coming from. I’m trying to solve a multiple name problem here, not a where-do-I-get-it-from problem.

Eric Hodel said 5 days later
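The point that selection is version based rather than repository based can be shown with a small Ruby sketch. Gem::Version and Gem::Requirement are real RubyGems classes; the Release struct, the source hosts attached to each version, the version numbers and the ">= 1.0.0" requirement are assumptions chosen for the example, and the real resolver does considerably more than this.

```ruby
require "rubygems" # Gem::Version and Gem::Requirement

# Hypothetical listing: one gem name, several versions, several sources.
Release = Struct.new(:version, :source)

releases = [
  Release.new("1.0.0", "gems.rubyforge.org"),
  Release.new("1.2.0", "gems.github.com"),
  Release.new("0.9.0", "gems.example.org"),
]

# Hypothetical requirement, as a dependent gem might declare it.
requirement = Gem::Requirement.new(">= 1.0.0")

# Keep the versions that satisfy the requirement, then take the highest,
# without looking at which source each one came from.
best = releases
  .select { |r| requirement.satisfied_by?(Gem::Version.new(r.version)) }
  .max_by { |r| Gem::Version.new(r.version) }

puts "#{best.version} from #{best.source}"
```

Here the source is carried along only for display; it plays no part in the choice, which is why a per-repository preference does not by itself answer the multiple-name problem.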

Francis: Yes, we develop in Perforce, then release via RubyForge.

I’m not sure that URIs are as user-friendly as names, though.

Eric Hodel said 5 days later

Daniel, Wilson: I believe the GitHub people saw two alternatives when creating gems for GitHub projects: either have a gem repository per user or prefix the gem name with the user’s name. They chose the latter since it was easier to implement.

People aren’t giving gems different names, they’re forced into it. Aaron’s nokogiri gem on GitHub is “tenderlove-nokogiri”. If I forked nokogiri and made a gem, it would be “drbrain-nokogiri”.

Eric Hodel said 5 days later

Eric,

Thanks for helping me better understand the problem. I’m wondering though, what if you and I both fork nokogiri 1.0.0 and each make some meaningful change to the gem that we each then release as version 1.0.1? Without some sort of name-spacing, how could Rubygems know I wanted one version or the other?

Christian Romney said 5 days later

Yeah, URIs are definitely less user-friendly than the current gem names. I guess the question is whether that’s worth the tradeoff. I feel like everywhere I look I see people installing stuff from URIs now, so I feel like things are going that way.

Francis Hwang said 6 days later

one thought i’m having – what if only one source (rubyforge) were ever considered authoritative but an easy mechanism were built to install via proxy. so if one installed

sudo gem install lockfile

that of course installs lockfile from rubyforge. however, my gem itself, via extconf.rb or other proxy.rb or some such, could do

github.install ‘ahoward-lockfile’

the basic concept is that people trust my gem, if i make that gem install from github that should be okay. this also provides a trusted mapping between rubyforge gems and github ones which could be searched.

with something like this we would simply not require people to use --source as a rule.

ara.t.howard said 9 days later

In general I’m not certain that RubyGems is the right place to set any significant trust policies. It’s always been possible to write and distribute a gem that deletes your whole hard drive when you try to install it, right?

I think there’s a difference between trust and security on the one hand, and just ensuring no name collisions on the other. I think the most elegant no-name-collisions idea would be to use URIs, though I’m not going to fool myself about how easy that would be to implement. (Also, since I wouldn’t have much time to contribute myself, my right to have a say in it is obviously limited.) But I think an elegant way to have cross-source dependencies is fairly important, and URIs could solve it in a clean way.

But maybe I’m just trying to sneak urirequire back in the system somehow …

Francis Hwang said 9 days later

i really think the solution is for there to be one master and proxy installs. that way we can solve the issue of trust exactly one time for all other sources to come – not only github. this even solves the domain issue since a proxy install could span domains. basically inverting the problem makes it really easy:

  • rubyforge is the master
  • gems can (as they always have been able to) do whatever they please during install, so simply proxy out to other sources if needed via a super light weight helper

this also solves the ‘what happens when something better than git comes along’ issue by keeping all stubs on rubyforge. when ‘got’ comes out and it’s the latest hotness we can all just change our gemspecs to pull from http://gothub.com/, insulating the community from the tides of scm frothing and foaming

ara.t.howard said 9 days later

Ara, I’m wondering if we need to start thinking about a future where you aren’t pulling from just two or three sources, but hundreds. The idea of Ryan & Eric running their own gem servers off of gems.zenspider.com or gems.segment7.net seems fairly natural to me.

So in that scenario the idea of maintaining mappings so that RubyGems knows that

install zenspider.autotest

pulls a gem from

gems.zenspider.com

seems like an unnecessary wrapping of DNS to me.

Of course, I could just be living in some bikeshed fantasy-land right now. Maybe this feature wouldn’t be that useful if it were built.

Francis Hwang said 9 days later

Wilson: No, that doesn’t follow. To me this is a process issue, not a technical issue.

Daniel Berger said 12 days later

More bikeshedding here: http://fhwang.net/2009/02/18/URIs-in-RubyGems

Francis Hwang said 14 days later
