Monthly Archives: November 2007

Spider bugfix

There were two issues with version 0.4.0 of Spider, both caught by Henri Cook. Both are now fixed in 0.4.1. First, as documented, you load IncludedInMemcached like this: require 'spider/included_in_memcached'. Second, HTTP redirects sometimes assume a base URL; this is now handled.
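To illustrate the redirect issue, here is a minimal sketch of resolving a redirect target against the URL that was originally requested, using only Ruby's standard URI library. The helper name resolve_redirect and the example.com URLs are made up for illustration; this is not Spider's actual code, just one way the idea could work.

```ruby
require 'uri'

# Hypothetical helper (not part of Spider): resolve the value of a
# Location header against the URL that produced the redirect.
def resolve_redirect(request_url, location)
  # URI.join leaves absolute URLs alone and resolves relative ones
  # against the base URL.
  URI.join(request_url, location).to_s
end

absolute_path = resolve_redirect('http://example.com/a/b.html', '/c.html')
relative_path = resolve_redirect('http://example.com/a/b.html', 'c.html')
full_url      = resolve_redirect('http://example.com/a/b.html', 'http://other.example/x')
```

A redirect that already carries a full URL passes through unchanged, so the same helper can be applied to every Location value.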

Spider with memcached

The problem with Spider has been that it can use all your memory. The reason is that the Web is a graph, and to avoid cycles Spider stores each URL it encounters. Since the Web is a really, really, really gigantic graph, you eventually run out of memory. Now you can use memcached to use […]
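As a rough sketch of why memory grows, here is a toy crawl over an in-memory link graph that remembers every URL it has seen in order to avoid cycles. The LINKS hash and the crawl method are invented for this example and do not reflect Spider's internals; a real crawler would fetch pages instead of reading a hash.

```ruby
require 'set'

# Toy link graph with a cycle (a -> b -> a). Purely illustrative.
LINKS = {
  'a' => ['b', 'c'],
  'b' => ['a'],
  'c' => ['b', 'd'],
  'd' => []
}.freeze

# Breadth-first walk that records every URL it has encountered.
def crawl(start, links)
  seen  = Set.new([start]) # grows by one entry per distinct URL
  queue = [start]
  order = []

  until queue.empty?
    url = queue.shift
    order << url
    links.fetch(url, []).each do |next_url|
      next if seen.include?(next_url) # already encountered: skip to avoid cycles
      seen << next_url
      queue << next_url
    end
  end

  order
end

visited = crawl('a', LINKS)
```

The seen set is the part that never shrinks. On a graph the size of the Web it is the structure that eventually exhausts memory, which is why moving it into an external store is attractive.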

Proxied Spider

Aha: if you need to proxy your Spider calls, look no further than the HTTP Configuration gem. I didn’t write this, and have yet to use it, but I think it goes like this: So next up will be a tutorial with stuff like this and other cool stuff, plus a way to use memcached […]
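For comparison, here is a minimal sketch of configuring a proxy with Ruby's standard Net::HTTP rather than the HTTP Configuration gem mentioned above. The host names and port are placeholders, and nothing here shows how that gem or Spider does it; it only demonstrates the standard library's own proxy arguments.

```ruby
require 'net/http'

# Placeholder proxy settings, assumed for illustration only.
PROXY_HOST = 'proxy.example'
PROXY_PORT = 8080

# Net::HTTP.new accepts the proxy address and port after the target
# host and port. No connection is opened until a request is made.
http = Net::HTTP.new('example.com', 80, PROXY_HOST, PROXY_PORT)

uses_proxy = http.proxy?
proxy_addr = http.proxy_address
proxy_port = http.proxy_port
```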

Spider: API changes, setup and teardown, HTTP headers

The newest version of Spider, 0.3.0, is hitting your gem tree Real Soon Now. This release features: Setting the headers of an HTTP request. This can be used to set cookies, the user agent, and many other fine things. Setup and teardown handlers. Seems like a good place to set the headers if the headers […]
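To make the header idea concrete, here is a small sketch using Ruby's standard Net::HTTP to build a request with a custom user agent and cookie. The build_request helper, the header values, and the URL are assumptions for illustration; this is not Spider's header API, only the underlying standard-library mechanism.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build a GET request carrying the given headers.
# The request is only constructed here, not sent.
def build_request(url, headers)
  request = Net::HTTP::Get.new(URI(url))
  headers.each { |name, value| request[name] = value }
  request
end

request = build_request(
  'http://example.com/',
  'User-Agent' => 'ExampleSpider/0.1',
  'Cookie'     => 'session=abc123'
)
```

Header names are case-insensitive in Net::HTTP, so request['user-agent'] and request['User-Agent'] read the same value.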