Trackbacks

It's not a big secret: I'm writing blogging software. This, in a sense, sucks because it's already been written many, many times; while I coded I often (like, every 20 minutes) tried to think of how this could be abstracted into a plugin or five. Since this is a big task I might do it later.

I recently had the Pingback vs. Trackback debate with myself. I'm under a deadline and I want to get as many features in before the deadline as possible, so I had to pick one. Our target market doesn't know blogging, so they won't notice if one of these is missing at first.

I chose to do Pingback:

  • It's simple to discover.
  • It's simple to send.
  • It uses existing standards correctly and intuitively (HTTP, XML-RPC).
  • It has a very clear specification.

So I set in on it and discovered that not all of it is dead-easy from Rails. Sending a Pingback is dead easy: just open an XML-RPC connection:

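require 'uri'
require 'xmlrpc/client'

# Tell the Pingback endpoint at pb_url that self.url links to url.
def send_pingback(pb_url, url)
  uri = URI.parse(pb_url)
  # Pass the port too, so endpoints on a non-default port still work.
  server = XMLRPC::Client.new(uri.host, uri.request_uri, uri.port)
  server.call('pingback.ping', self.url, url)
end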
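
The pb_url has to be found first, which is the "simple to discover" part. As I read the Pingback spec, the endpoint is advertised either in an X-Pingback response header or in a <link rel="pingback"> element, with the header winning. A minimal sketch, assuming you already fetched the page; discover_pingback_url is a made-up helper name, not anything from Rails:

```ruby
require 'cgi'

# Hypothetical helper (not from the original post).
# x_pingback: value of the X-Pingback response header, or nil.
# body: the fetched page source as a string.
def discover_pingback_url(x_pingback, body)
  # The header takes precedence over the <link> element.
  return x_pingback unless x_pingback.nil? || x_pingback.empty?

  # Pattern suggested by the Pingback spec for the <link> element.
  match = /<link rel="pingback" href="([^"]+)" ?\/?>/.match(body.to_s)
  # The href value is HTML-escaped, so decode entities such as &amp;.
  match && CGI.unescapeHTML(match[1])
end
```

With Net::HTTP, the header value would be something like res['X-Pingback'] and the body res.body. A nil result means the page does not advertise Pingback at all.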

Receiving them requires an XML-RPC server, and Action Web Service, and lots of reading about how to do Web services in Rails. Okay, I'll save it for later. On to Trackbacks.

Receiving a Trackback is easy: just have a method in the blog controller that takes the blog post ID and the URL, title, excerpt, and blog name. Save those. Send back a simple XML response according to the spec and bam, done.
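
The response really is that simple: an <error> of 0 for success, or 1 plus a <message> on failure. Here is a rough sketch of building it; trackback_response is a hypothetical helper name, and how you hand the string back from the controller is up to you:

```ruby
require 'cgi'

# Hypothetical helper (not from the original post): build the XML body
# of a Trackback ping response. Pass nil for success, or a message
# string to report a failure.
def trackback_response(error_message = nil)
  lines = ['<?xml version="1.0" encoding="utf-8"?>', '<response>']
  if error_message.nil?
    lines << '<error>0</error>'
  else
    lines << '<error>1</error>'
    # Escape the message so stray < or & cannot break the XML.
    lines << "<message>#{CGI.escapeHTML(error_message)}</message>"
  end
  lines << '</response>'
  lines.join("\n")
end
```

So trackback_response gives the success document, and trackback_response('No URL given') gives the failure one.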

Sending a Trackback is a process with an overly-complex step. So you have the URL, u, that you need to track back; now you need the Trackback URL. This is embedded in the XHTML of u. So you do a GET on u and parse out the RDF. However, since the spec also deals with HTML, the RDF might not actually be embedded in some XHTML; it might be in a comment in an HTML document. So you regexp it out:

page =~ /(<[^><\/: ]*:*RDF *.*<\/[^><\/: ]*:RDF>)/m
rdf = $1
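
To see the comment case concretely, here is a sketch that runs the same pattern over a made-up page. SAMPLE_PAGE, its example.com URLs, and the extract_rdf name are all invented for illustration:

```ruby
# A made-up page with the Trackback RDF tucked inside an HTML comment.
SAMPLE_PAGE = <<~HTML
  <html><body>
  <p>A post.</p>
  <!--
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:dc="http://purl.org/dc/elements/1.1/"
           xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/">
  <rdf:Description
      rdf:about="http://example.com/posts/1"
      dc:identifier="http://example.com/posts/1"
      dc:title="A post"
      trackback:ping="http://example.com/trackback/1" />
  </rdf:RDF>
  -->
  </body></html>
HTML

# Hypothetical wrapper around the pattern above. The /m flag lets "."
# match newlines; returns nil when the page has no RDF element.
def extract_rdf(page)
  page[/(<[^><\/: ]*:*RDF *.*<\/[^><\/: ]*:RDF>)/m, 1]
end

rdf = extract_rdf(SAMPLE_PAGE)
puts rdf
```

The comment markers are left behind and only the RDF element comes out. Note that the .* is greedy: on a page with two RDF blocks it grabs everything from the first opening tag to the last closing tag.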

Now you have a string containing RDF. This might not be the correct chunk of RDF, but the practice of figuring out which RDF is correct didn't match up perfectly with the theory, so I'm just assuming there's only one RDF.

So, now you need to get the value of the ping predicate. I did some Web searching to see how people did it in Rails; they haven't. Okay, let's see the example implementation that the spec shows: oh, it uses a regexp to parse RDF/XML. Good. That's what I like to see. Yeah.
I decided to implement a shortened version of that by using Redland:

model = Redland::Model.new
parser = Redland::Parser.new
parser.parse_string_into_model(model, rdf, url)
predicate = Redland::Node.new(URI.parse('http://madskills.com/public/xml/rss/module/trackback/ping'))
urls = model.find(nil, predicate)

Now you have the URL in urls[0].object.value, unless urls.empty?.

Actually sending the Trackback info at this point is easy: just POST the title, blog_name, excerpt, and URL to the Trackback URL, with the Content-Type 'application/x-www-form-urlencoded':

require 'cgi'
require 'net/http'
require 'uri'

u = URI.parse(tb_url)
res = Net::HTTP.start(u.host, u.port) do |http|
  # Form-encode the fields, escaping each value so '&' and '=' survive.
  body = { :title => self.title,
           :excerpt => self.body.first(254),
           :url => self.url,
           :blog_name => self.user.name }.
    map { |k, v| "#{k}=#{CGI.escape(v.to_s)}" }.join('&')
  http.post(u.request_uri, body,
    { 'Content-Type' => 'application/x-www-form-urlencoded; charset=utf-8' })
end
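
One thing to watch when building that body by hand: a title like "Cats & dogs" will split into two fields unless every value is escaped. On a newer Ruby (1.9.2 or later, as far as I know), URI.encode_www_form does the escaping and the joining in one go. A small sketch; trackback_body is a made-up name and the field values are placeholders:

```ruby
require 'uri'

# Hypothetical helper (not from the original post): turn a hash of
# Trackback fields into an application/x-www-form-urlencoded body.
def trackback_body(fields)
  # encode_www_form percent-escapes each key and value, then joins
  # the pairs with '&'.
  URI.encode_www_form(fields)
end

# Placeholder values; in the blog they would come from the post model.
puts trackback_body(:title => 'Cats & dogs',
                    :url => 'http://example.com/posts/1')
# prints: title=Cats+%26+dogs&url=http%3A%2F%2Fexample.com%2Fposts%2F1
```

The resulting string can be passed to http.post as the request body.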

I might try to package that into a plugin. It affects both the controller (receiving Trackbacks) and the model (sending Trackbacks on after_save). Maybe acts_as_trackbackable or something.



