Mail Filtering
Eric Hodel | Fri, 28 Apr 2006 20:53:22 GMT
I’ve been using IMAPCleanse to clean out my list inboxes and have discovered that I get about 500 emails a day from mailing lists. I read no more than ten to twenty mails out of all those, and respond to maybe two of the mails I’ve read.
I want a Bayesian filter for my mail that tells me what to read. Priming the filter with interesting mail is going to take time, so to accelerate that I have this list:
- Messages I respond to are interesting to me
- Messages I write are interesting to me
- Responses to those messages are interesting to me
- If I unflag an automatically flagged message it should never be re-flagged
I think the next tool I’m going to write is a tool that flags messages I write, messages I’ve responded to and responses to mails I’ve written. This should be easy to figure out from the \Answered flag, so I won’t have to do too much searching for In-Reply-To and References headers. I should also be able to keep track of auto-flagged messages with IMAP keywords.
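The rules above can be sketched as a small decision function. This is only an illustration under stated assumptions, not the eventual tool: the `MailSummary` struct, the `AutoFlagged` keyword name, and both method names are hypothetical, and a real version would fill the struct from Net::IMAP fetch results.

```ruby
# Hypothetical keyword marking messages this tool has flagged before.
AUTO_KEYWORD = 'AutoFlagged'

# Only the fields the decision needs. Flags follow Net::IMAP's convention of
# symbols for system flags (:Answered) and strings for keywords.
MailSummary = Struct.new(:uid, :from, :message_id, :in_reply_to, :flags)

def interesting?(msg, my_address, my_message_ids)
  # Auto-flagged once already: if I unflagged it since, leave it alone.
  return false if msg.flags.include?(AUTO_KEYWORD)

  msg.from == my_address ||                    # I wrote it
    msg.flags.include?(:Answered) ||           # I responded to it
    my_message_ids.include?(msg.in_reply_to)   # it responds to something I wrote
end

def uids_to_flag(messages, my_address)
  # Message-IDs of mail I wrote, used to spot direct replies to me.
  my_ids = messages.select { |m| m.from == my_address }.map(&:message_id)
  messages.select { |m| interesting?(m, my_address, my_ids) }.map(&:uid)
end
```

Each returned UID would then get both \Flagged and the keyword in one store, for example with Net::IMAP's `uid_store(uid, '+FLAGS', [:Flagged, AUTO_KEYWORD])`, provided the server permits custom keywords.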
autotest Sucks
Eric Hodel | Tue, 25 Apr 2006 23:51:58 GMT
You might think this is a strange thing to say, but despite how awesome some people say it is, autotest still sucks. It doesn’t suck because it doesn’t work well, it sucks because its insides still bear the scars of its birth.
My first version of autotest was written at OOPSLA 2005 after seeing Don Roberts and John Brandt show off a Smalltalk class browser that would automatically run tests whenever methods were changed.
At that time I was TDDing a personal Rails project and realized a tool to automatically run my tests as I made changes would probably speed up my development as much as switching to TDD from web browser reloading development did, so I wrote the first version of autotest.
That first version of autotest was probably about 100 lines of code. It was tied directly to my development process and got stuck occasionally, but it worked well for me.
Sometime in January I imported autotest into ZenTest and made it work for more generic ruby code. To do that I took my tiny ruby script and wrapped it up in a class. I pulled the Rails stuff out into a subclass and made a few other cleanups.
This didn’t do anything for the cleanliness of the code, but I did manage to add a bunch of tests that I hadn’t during my OOPSLA coding spree. I still had bugs though, and those took another four months to shake out with a few minor refactorings along the way.
Now autotest is nearly perfect functionally, but the implementation sucks. Some methods do two or three things when they should be doing one. The test running algorithm is scattered across several methods.
autotest needs a major refactoring and I’m hoping to get to it in May. Refactoring will make the code cleaner, more straightforward and more maintainable. I may even do something to better support custom testing styles. I could add features to autotest now, but that would make the code more convoluted and the refactoring harder, so I won’t.
As autotest accumulated more suck it became less fun to work on. I don’t like the sound of software in pain so I’m going to listen and fix its suckage before adding any more fanciness.
Making dircproxy Fast
Eric Hodel | Wed, 12 Apr 2006 13:24:52 GMT
I use dircproxy to maintain my IRC connection to #ruby-lang so I can flip through the scrollback when I rejoin. Unfortunately tons of people come and go from #ruby-lang during the course of a day. All these quits and joins get logged and all of them get sent when I reconnect by default, which can take ten to twenty seconds.
Flipping through the dircproxy man page I found the following configuration settings will stop logging the quits and joins and nearly eliminate reconnect time:
log_events -join,-part,-quit,-nick
So my dircproxy configuration for Freenode looks like this:
server_throttle 10
disconnect_existing_user yes
listen_port ...
log_events -join,-part,-quit,-nick
connection {
  password "..."
  server "irc.freenode.net"
  join "#ruby-lang, #seattle.rb, #ruby2c"
}
Speeding up Test Runs with fork
Eric Hodel | Sun, 09 Apr 2006 03:47:00 GMT
Loading Rails takes a significant portion of your test run time, especially when you want to run only one test file or one test method. On my Powerbook loading Rails takes between four and six seconds. If you're frequently running unit tests this constant overhead can quickly become annoying.
When using autotest I may have to wait as much as ten seconds (five seconds between scans for changes, four seconds to load Rails, one second to run the test) before I know whether my changes fixed a problem. Ten seconds is past the threshold where I can keep paying attention, so my mind wanders. (A wandering mind is no good for productive work.) Also, those extra four seconds of loading Rails per test start to add up. I may load Rails hundreds of times in a day just to run a tiny test.
There's already one way to reduce or eliminate that constant overhead of loading Rails: in development mode Rails reloads files to keep things running without restarting Rails on every change. I prefer to have an environment that is guaranteed to be clean when the tests start, and reloading files removes this option.
Since I want Rails loaded without any application code, I chose to create a process that loads Rails, then opens a server socket and waits for connections. When a connection comes in, the process forks to make a copy of the environment, and that copy can then load the application and run the tests.
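The preload-then-fork idea can be sketched in a few lines of Ruby. This is a minimal illustration, not the actual tool: `PRELOADED_BY`, `serve_once` and `run_remote` are hypothetical names, recording the parent's PID stands in for loading Rails, and the child only echoes the requested file instead of running it. It assumes a platform where `fork` is available.

```ruby
require 'socket'

# Stand-in for the expensive work (loading Rails); done once, in the parent.
PRELOADED_BY = Process.pid

# Accept one connection, fork a child that inherits the preloaded state,
# and let the child answer the request over the socket.
def serve_once(server)
  client = server.accept
  child = fork do
    file = client.gets.to_s.chomp
    # A real child would load the application and run the named test here.
    client.puts "pid=#{Process.pid} preloaded_by=#{PRELOADED_BY} ran=#{file}"
    client.close
    exit!(0)            # leave at once, skipping at_exit hooks inherited from the parent
  end
  client.close          # the parent does not need its copy of the connection
  Process.wait(child)   # reap the child so it does not linger as a zombie
end

# Client side: send one line, read one line back.
def run_remote(port, file)
  TCPSocket.open('127.0.0.1', port) do |sock|
    sock.puts file
    sock.gets.to_s.chomp
  end
end

if __FILE__ == $PROGRAM_NAME
  server = TCPServer.new('127.0.0.1', 0)   # port 0: let the OS pick a free port
  worker = Thread.new { serve_once(server) }
  puts run_remote(server.addr[1], 'test/controllers/route_controller_test.rb')
  worker.join
end
```

Because the fork happens after the preload, every child starts from the same clean, already-loaded state, and whatever the child loads afterwards disappears when it exits.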
A regular test run for just one file runs like this:
$ time ruby test/controllers/route_controller_test.rb
Loaded suite test/controllers/route_controller_test
Started
......................................................
Finished in 13.192465 seconds.
54 tests, 268 assertions, 0 failures, 0 errors
real 0m17.884s
user 0m8.147s
sys 0m1.424s
The difference between the real time and the Test::Unit run time accounts for Rails and app loading overhead, about five seconds.
I've tentatively named the parent process spawner 'ruby_fork' and the client 'ruby_fork_client', so you start up the parent process:
$ RAILS_ENV='test' ruby_fork -r rubygems -e 'require_gem "rails"'
/Users/drbrain/Links/ZT/bin/ruby_fork Running as PID 3570 on 9084
ruby_fork understands -r, -I and -e just like regular ruby so I can just load Rails and none of the rest of my application.
Then I run ruby_fork_client which takes its arguments and passes them across to the child process and then reads from the socket and prints to STDOUT.
$ time ruby_fork_client -r test/controllers/route_controller_test.rb
Loaded suite /Users/drbrain/Links/ZT/bin/ruby_fork
Started
......................................................
Finished in 12.442556 seconds.
54 tests, 268 assertions, 0 failures, 0 errors
real 0m13.947s
user 0m0.077s
sys 0m0.022s
Now that extra time spent loading Rails is gone and I'm left with application loading and Test::Unit overhead, which is minuscule in comparison.
ruby_fork is not Rails specific. The server and client can do anything they like, so this has applications beyond testing Rails (for example, handling incoming mail) or even Rails itself.
I'd like to release ruby_fork and ruby_fork_client as part of ZenTest but I'll be holding them until 3.3.0. Currently ZenTest is almost ready for release and ruby_fork and ruby_fork_client need to act more like a regular invocation of ruby.
Rubyholic and hCal Microformat
Eric Hodel | Fri, 31 Mar 2006 11:17:00 GMT
At SXSW I attended a panel on microformats which are a way of embedding other formats into HTML using semantic classes.
When we first wrote Rubyholic we talked about adding iCal support, but thought it would be a big pain to implement. The hCal microformat along with Technorati’s Events Feed Service made it nearly painless, so Rubyholic groups now have calendars!
You can subscribe to a Rubyholic calendar on any group page from the link at the bottom of the schedule. For example, here’s the calendar for the Seattle Ruby Brigade.
Finder Automator Plug-ins
Eric Hodel | Thu, 30 Mar 2006 22:41:46 GMT
Apple’s Automator allows you to perform drag and drop scripting for OS X applications. So far I’ve only written two Automator workflows: one I call “Import Photos” that loads selected photos into iPhoto, and one I call “Mail Selection” that attaches selected files to a new email.
For the Import Photos workflow grab the “Get Selected Finder Items” action then drop the “Import Photos into iPhoto” action below it. I add my photos to Library. Then select Save As Plug-in from the File menu and it will show up in the Automator item of Finder’s context menu.
The attach as email workflow is practically identical: it consists of “Get Selected Finder Items” followed by “New Mail Message”. Selecting “Mail Selection” from Finder gives me a new mail message with whatever items I had selected attached.
Typo and MarsEdit
Eric Hodel | Wed, 29 Mar 2006 20:31:38 GMT
When posting to Typo using MarsEdit or the admin tools I would end up with a 500 error after submitting the post. Poor MarsEdit didn’t like this very much and would leave the post as a draft which made me unhappy.
While at last night’s Seattle.rb meeting I asked Scott Laird why this might be happening. He said that Typo was probably trying to ping other sites but was taking too long.
So I updated my Apache vhost to set the idle timeout to 120 seconds and added a second process. Now MarsEdit can happily post my blog entries without error!
Thanks Scott!
IMAPCleanse
Eric Hodel | Wed, 29 Mar 2006 06:38:40 GMT
I am a lazy person. I don’t like to delete my mailing list mail. I don’t need any of it, since there’s a copy of all the mailing list mails somewhere on the internet. But since I’m lazy I never clean out my mailing list mailboxes.
So I used Net::IMAP and wrote IMAPCleanse to automatically clean out my mailing list mailboxes of messages older than a threshold that are read and not flagged. (So if I flag a message it’ll stay around forever.)
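The selection rule (older than a threshold, read, not flagged) maps naturally onto IMAP search keys. The helper below is only a sketch of how such keys could be built; `cleanse_search_keys` is a hypothetical name and IMAPCleanse itself may do this differently.

```ruby
require 'date'

# Build IMAP SEARCH keys for messages that are read, not flagged, and
# older than age_days. IMAP dates use the dd-Mon-yyyy form.
def cleanse_search_keys(age_days, today = Date.today)
  cutoff = today - age_days
  ['SEEN', 'UNFLAGGED', 'BEFORE', cutoff.strftime('%d-%b-%Y')]
end

# With a connected, authenticated Net::IMAP object (imap) and a selected
# mailbox, the keys could be used roughly like this:
#   doomed = imap.search(cleanse_search_keys(30))
#   imap.store(doomed, '+FLAGS', [:Deleted]) unless doomed.empty?
#   imap.expunge
```

Flagged messages never match UNFLAGGED, which is what keeps them around forever.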
You can install IMAPCleanse as a gem:
$ sudo gem install IMAPCleanse
And read all about how to use it in the RDoc.
The only problem I had was that Net::IMAP didn’t support PLAIN authentication over SSL, so I added that. (I need to whip up some patches for Net::IMAP and fold them back in.)
Autotest is Better than Ever
Eric Hodel | Wed, 22 Mar 2006 21:27:35 GMT
Last night at Seattle.rb’s weekly hacking night I polished off several bugs and features for autotest in preparation for a release of ZenTest this week, possibly even today!
The most interesting new feature (inspired by comments from Pat Eyler) is a mini continuous-integration mode. When run as autotest -vcs=cvs, autotest will perform a cvs up every 5 minutes in the course of running its tests. Autotest also understands how to update svn and p4 repositories.
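The periodic update boils down to two small decisions: which command to run and whether five minutes have passed. The sketch below is an illustration only and is not taken from autotest's source; the constant and method names are hypothetical, and the svn and p4 commands are my assumption (only cvs up is stated above).

```ruby
# Assumed update command per version control system.
UPDATE_COMMANDS = {
  'cvs' => 'cvs up',
  'svn' => 'svn up',   # assumption
  'p4'  => 'p4 sync'   # assumption
}

UPDATE_INTERVAL = 5 * 60 # seconds

def update_command(vcs)
  UPDATE_COMMANDS.fetch(vcs) { raise ArgumentError, "unknown vcs: #{vcs}" }
end

# True once at least UPDATE_INTERVAL seconds have passed since last_update.
def update_due?(last_update, now = Time.now)
  now - last_update >= UPDATE_INTERVAL
end
```

A test loop would call `update_due?` on each pass and run `system(update_command(vcs))` when it returns true, then record the new update time.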
There is one open report related to testrb not being found when using the One-Click Ruby Installer. Not having a win32 machine, I can’t confirm whether the bug is ours, in the one-click installer, or a user configuration error.
Ruby Obfuscator update
Eric Hodel | Tue, 21 Mar 2006 08:06:00 GMT
Ryan writes about our accomplishments in our most recent Ruby obfuscator hacking session.
I must add that making blocks work will continue to be a pain with the way we’ve implemented obfuscation. The interpreter gets to cheat because it has the AST lying around. We don’t, so we’d either have to rebuild it (too fragile to consider) or build a chunk of AST that calls back to C (still fragile). We went with a simple-to-implement approach that isn’t as forgiving for users but won’t fail when you switch Ruby versions.
