Tom MacWright

Owning my reading log

Since 2007, I’ve recorded my reading habits with goodreads. It’s always been a nice, simple place where I try to beat my old reading records and save reviews and ratings for future reference. I have few qualms with the site technically, but I’ve been reevaluating my presence on the internet and keeping a much closer eye on how my actions benefit AppAmaGooBookSoft - the Apple Amazon Google Facebook Microsoft oligopoly.

goodreads was acquired by Amazon in 2013, but they haven’t made much of a deal about it: Amazon isn’t mentioned on their front page or their ‘about us’ page - well, pretty much anywhere but the Jobs page, where applicants will probably be surprised to learn that they’re applying to a wing of Amazon.

Free work

I’m making more of a deal about this than it is: goodreads isn’t an evil company - they’re mostly sticking to their mission of cataloging books and connecting with readers. But still - when I read a book, contribute a rating, write a review, I’m essentially working for Amazon. I feel about it the same way I feel when I fill in a ReCAPTCHA - that ‘are you a robot’ form that was acquired by Google and now has you working to improve their machine learning and mapping systems.

Back when I was thinking about OpenStreetMap and ‘crowdsourcing’ more, the academic side referred to crowdsourcing as ‘volunteered geographic information’ - VGI, of course - which always felt weird to me. Perhaps the term was just too accurate: what we see isn’t a commons in which the contributors are also the people getting the most value. Typically people donate their time and energy, or are coerced to (ReCAPTCHA), and the entity deriving the most value is a corporation.

Through this lens, I sometimes wonder whether, instead of just avoiding AppAmaGooBookSoft services for a day or a week, it would be possible to exist on the internet without assisting AppAmaGooBookSoft at all. Is it no longer possible to express a thought or contribute to the world’s knowledge without also training a TensorFlow model somewhere?

Hosting it myself

Which is to say, I didn’t want to participate on goodreads anymore: I wanted to own my own reading history. Which, now, I do. There’s now a ‘books’ item in my header, and the first review I wrote on-site is The Housing Monster.

The technical details are pleasantly dull. I’m using Jekyll for this, like I do for the rest of the site, and I wrote a simple bit of JavaScript that translates between the goodreads CSV export and Jekyll-friendly YAML headers.
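
For a sense of how little is involved, here is a minimal sketch of that kind of conversion. It is not the actual script: the column names (Title, Author, My Rating, Date Read), the slash-separated date format, and the ‘book’ layout name are all assumptions about the export and the site, and the sample row is made up.

```javascript
// Minimal CSV parser: handles quoted fields, doubled quotes,
// and commas or newlines inside quoted fields.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') {
        field += '"'; // doubled quote is an escaped quote
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush a final row that has no trailing newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Turn the header row plus data rows into one object per book.
function csvToRecords(text) {
  const [header, ...rows] = parseCsv(text);
  return rows.map((cells) =>
    Object.fromEntries(header.map((name, i) => [name, cells[i] ?? ""]))
  );
}

// JSON-style double-quoted strings are generally accepted by YAML parsers.
const yamlString = (value) => JSON.stringify(value);

// Build Jekyll front matter for one record (assumed column names).
function toFrontMatter(record) {
  return [
    "---",
    "layout: book", // hypothetical layout name
    `title: ${yamlString(record["Title"])}`,
    `author: ${yamlString(record["Author"])}`,
    `rating: ${Number(record["My Rating"]) || 0}`,
    `date: ${record["Date Read"].replace(/\//g, "-")}`,
    "---",
  ].join("\n") + "\n";
}

// Made-up sample row, only to show the shape of the input.
const sampleCsv =
  'Title,Author,My Rating,Date Read\n"Example Book, Vol. 1",Jane Doe,4,2016/05/14\n';
console.log(toFrontMatter(csvToRecords(sampleCsv)[0]));
```

The hand-rolled parser is there only to keep the sketch dependency-free; a real export with long multi-line reviews is probably better served by an established CSV library.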

Book pages have a custom layout and style that embeds structured metadata. I’m iffy on the value of that metadata - whether annotations have any value. The semantic web has a real religious bent to it that I don’t feel. I want to see the implementations and the full working systems, and I think it’s been long enough since the introduction of RDFa, Microdata, and JSON-LD that we should be seeing practical, real uses of them. And I’m just not seeing those uses.

But it doesn’t hurt to throw some extra attributes on a page. It’ll be easier to scrape for the bots.
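
To make ‘extra attributes’ concrete, here is a rough sketch of what a JSON-LD annotation for a book review might look like, using the schema.org Review, Book, Rating, and Person types. Which of the three formats the book pages actually use isn’t stated here, so treat this as one possible shape: the function names, the input fields, and the 5-point scale are assumptions, and the sample values are placeholders.

```javascript
// Build a schema.org Review object for one book (hypothetical input shape).
function reviewJsonLd({ title, bookAuthor, isbn, rating, reviewer, date }) {
  const book = {
    "@type": "Book",
    name: title,
    author: { "@type": "Person", name: bookAuthor },
  };
  if (isbn) book.isbn = isbn; // eBook editions may not have one

  return {
    "@context": "https://schema.org",
    "@type": "Review",
    itemReviewed: book,
    reviewRating: { "@type": "Rating", ratingValue: rating, bestRating: 5 },
    author: { "@type": "Person", name: reviewer },
    datePublished: date,
  };
}

// Wrap the data in the script element that JSON-LD consumers look for.
function jsonLdScriptTag(data) {
  // Escape "<" so the JSON cannot close the script element early.
  const json = JSON.stringify(data, null, 2).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">\n${json}\n</script>`;
}

// Placeholder values, only to show the output shape.
console.log(
  jsonLdScriptTag(
    reviewJsonLd({
      title: "Example Book",
      bookAuthor: "Jane Doe",
      rating: 4,
      reviewer: "A. Reader",
      date: "2016-05-14",
    })
  )
);
```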


I’d like to eventually support POSSE: Publish (on your) Own Site, Syndicate Elsewhere. I think that’s a really good strategy, and I don’t do it much yet - though I do consider Tapiriik, the service I use to sync my running data, to be a good example.

I’ve also been researching book numbers and writing a ‘universal ID translator’ service to generate all types of IDs - ISBN, EAN, OLID, OCLC, goodreads, ASIN, and so on, given one ID. This is mostly because for some of the books I’ve read, I’ve cited their eBook editions, and those don’t always have ISBN numbers.
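
Most of those translations need a lookup against someone’s database, but one is pure arithmetic: an ISBN-13 is an EAN-13 barcode number, and an ISBN-10 converts to it by prefixing 978 and recomputing the check digit. A minimal sketch of just that step, with function names that are illustrative and not taken from the service itself:

```javascript
// Validate an ISBN-10: weights run 10 down to 1, "X" means 10,
// and the weighted sum must be divisible by 11.
function isValidIsbn10(isbn) {
  if (!/^\d{9}[\dX]$/.test(isbn)) return false;
  let sum = 0;
  for (let i = 0; i < 10; i++) {
    const value = isbn[i] === "X" ? 10 : Number(isbn[i]);
    sum += value * (10 - i);
  }
  return sum % 11 === 0;
}

// EAN-13 check digit: weights alternate 1, 3 over the first 12 digits.
function ean13CheckDigit(first12) {
  let sum = 0;
  for (let i = 0; i < 12; i++) {
    sum += Number(first12[i]) * (i % 2 === 0 ? 1 : 3);
  }
  return (10 - (sum % 10)) % 10;
}

// Convert an ISBN-10 (hyphens and spaces allowed) to ISBN-13 / EAN.
function isbn10ToIsbn13(raw) {
  const isbn = raw.replace(/[-\s]/g, "").toUpperCase();
  if (!isValidIsbn10(isbn)) {
    throw new Error(`Not a valid ISBN-10: ${raw}`);
  }
  const first12 = "978" + isbn.slice(0, 9);
  return first12 + ean13CheckDigit(first12);
}

console.log(isbn10ToIsbn13("0-306-40615-2")); // prints 9780306406157
```

The other identifiers (OLID, OCLC, goodreads, ASIN) are assigned by their respective catalogs, so translating to or from them can’t be computed this way and has to go through a lookup.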