Program or be Programmed

After reading Jürgen’s blog post “The computer as an appliance” on Friday, I followed his link to Douglas Rushkoff’s latest book, “Program or be Programmed”. The great title alone sparked my interest, and the description on the publisher’s page got me to order an e-book copy immediately. A few seconds later I started reading it. (It comes in multiple formats; I chose the EPUB version and read it with Aldiko on my Android phone, which is a great way to read a book!)

So, I just finished it, and I really liked it. It got me thinking a lot and examining my own life in the Digital World. It also gave me a new perspective on anonymity on the net.

Douglas, thanks for your great work! 🙂

How to extract a list of pages containing a string from a MediaWiki XML dump

Here comes one of those “I’ve got to write that down somewhere, and maybe it will be useful for someone else, too” posts:

I needed to get a list of MediaWiki page names of pages that contained a certain string (“needle”) from a MediaWiki XML dump. This is how I got it, using XMLStarlet:

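# The mw namespace URI must match the xmlns of the dump's <mediawiki> root element.
# -v prints the page title, -n then ends the line (one title per matching revision).
xml sel -N mw="http://www.mediawiki.org/xml/export-0.3/" \
 -t \
  -m "/mw:mediawiki/mw:page/mw:revision/mw:text[contains(string(.), 'needle')]" \
  -v "../../mw:title" \
  -n wikiexport.xml \
| xml unesc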
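
If XMLStarlet is not at hand, roughly the same list can be produced with Python's standard library. This is only a sketch of a different technique, not the command above: the function name `pages_containing` and the sample dump are made up for illustration, and the namespace URI is assumed to be the same export-0.3 one. Unlike the XMLStarlet call, it prints each page title only once, even if several revisions match.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: this must match the xmlns of the dump's <mediawiki> root element.
MW_NS = "http://www.mediawiki.org/xml/export-0.3/"


def pages_containing(source, needle, ns=MW_NS):
    """Yield the title of every page with a revision whose text contains needle."""
    page_tag = f"{{{ns}}}page"
    title_path = f"{{{ns}}}title"
    text_path = f"{{{ns}}}revision/{{{ns}}}text"
    # iterparse streams the file, so a large dump is not loaded all at once.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != page_tag:
            continue
        texts = (t.text or "" for t in elem.findall(text_path))
        if any(needle in text for text in texts):
            yield elem.findtext(title_path, default="")
        elem.clear()  # drop the finished page's children to save memory


# Hypothetical miniature dump, only here to make the sketch runnable.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page><title>Haystack</title><revision><text>a needle here</text></revision></page>
  <page><title>Empty barn</title><revision><text>nothing</text></revision></page>
  <page><title>Q &amp; A</title><revision><text>needle again</text></revision></page>
</mediawiki>"""

titles = list(pages_containing(io.BytesIO(SAMPLE), "needle"))
print("\n".join(titles))
```

For a real dump, pass the file name (for example `wikiexport.xml`) instead of the `io.BytesIO` object. The parser already decodes entities such as `&amp;`, so no separate unescaping step is needed.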