Spider: API changes, setup and teardown, HTTP headers

The newest version of Spider, 0.3.0, is hitting your gem tree Real Soon Now. This release features: Set the headers for an HTTP request. This can be used to set the cookies, user agent, and many other fine things. Setup and teardown handlers. Seems like a good place to set the headers if the headers […]

Spider bug fix release

John Nagro immediately reported errors with the Spider Ruby gem, so I’ve fixed them in 0.2.1. You should upgrade, especially if you want support for URLs without any path component (e.g. http://example.com?s=1), HTTP redirects, and HTTPS. John also had some good ideas, so here is what is in the works: The ability to construct a complete […]

An updated way to spider the Web with Ruby

I’ve released version 0.2.0 of Spider. Everything has changed: Use RSpec to ensure that it mostly works. Use WEBrick to create a small test server for additional testing. Completely re-do the API to prepare for future expansion. Add the ability to apply each URL to a series of custom allowed?-like matchers. BSD license. The new […]

Spider the Web with Ruby

I wrote a Ruby library for crawling the Web. Use it to take down The Man, like so: I used it to get people’s addresses from around the Web. I plan to put them on a map. I like putting things on maps. It once took obscene amounts of memory, until I discovered that Ruby […]
