How to extract a list of pages containing a string from a MediaWiki XML dump

Here comes one of those “I’ve got to write that down somewhere, and maybe it will be useful for someone else, too” posts:

I needed to get a list of MediaWiki page names of pages that contained a certain string (“needle”) from a MediaWiki XML dump. This is how I got it, using XMLStarlet:

xml sel -N mw=http://www.mediawiki.org/xml/export-0.3/ \
 -t \
  -m "/mw:mediawiki/mw:page/mw:revision/mw:text[contains(string(.), 'needle')]" \
  -n \
  -v "../../mw:title" wikiexport.xml \
| xml unesc
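
The namespace URI in the command above carries the export schema version (0.3 here), and dumps from other MediaWiki versions declare a different one, in which case the XPath matches nothing. The following is a minimal sketch that reads the namespace from the dump instead of hard-coding it. The function names `dump_namespace` and `pages_containing` are made up for this example, and it assumes that the `xmlns` attribute of the root element appears within the first five lines of the file. On some distributions the XMLStarlet binary is called `xmlstarlet` rather than `xml`.

```shell
# Print the default XML namespace declared near the top of a MediaWiki dump.
# Only the first 5 lines are read, so this stays fast on very large files.
# The pattern requires whitespace before xmlns=", so xmlns:xsi= is not matched.
dump_namespace() {
  head -n 5 "$1" \
    | sed -n 's/.*[[:space:]]xmlns="\([^"]*\)".*/\1/p' \
    | head -n 1
}

# Same query as in the post, with the namespace looked up from the file.
# Arguments: dump file, search string (must not contain a single quote,
# because it is pasted into the XPath expression as-is).
pages_containing() {
  ns=$(dump_namespace "$1")
  [ -n "$ns" ] || { echo "no xmlns found in $1" >&2; return 1; }
  xml sel -N "mw=$ns" \
    -t \
    -m "/mw:mediawiki/mw:page/mw:revision/mw:text[contains(string(.), '$2')]" \
    -n \
    -v "../../mw:title" "$1" \
  | xml unesc
}
```

With both functions defined, `pages_containing wikiexport.xml needle` should print the same list as the original command, provided XMLStarlet is installed.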

Creating multi-page PDF files with GIMP and `convert`

Occasionally I have to sign a document (old style, with a pen) and send it back electronically. Sometimes these are multi-page documents. Since sending back multiple image files after scanning is unusual, and multi-page image formats are uncommon as well, I like to send them as a single PDF file. Before I discovered this method, I used to insert the scanned images into OpenOffice Writer and create the PDF from there. This works, but it is cumbersome to tell OpenOffice Writer to maximise the images (eliminating page borders, etc.), especially when there are a lot of pages. It just doesn’t feel like a real solution.

So, here we go:

Prerequisites:

  • GIMP (I’m currently at version 2.6.8, but this will probably work with older versions as well)
  • GraphicsMagick (tested with 1.3.8) or ImageMagick (tested with 6.5.8.8)

Procedure:

  1. Get the scanned pages opened as layers of one image in GIMP. If they are available as files already, you can use File / Open as Layers….
  2. Make sure that the layers are ordered as follows: page 1 must be the bottom layer, and the last page must be the top layer. You can reorder them via the “Layers” dialogue (activate it via the Windows / Dockable Dialogues menu if you don’t see it).
  3. Use Save As… and choose “MNG animation”, or just add “.mng” to the filename. (In case you are wondering, MNG is the animated counterpart to PNG.)
    A dialogue window saying “MNG plug-in can only handle layers as animation frames” will come up – choose “Save as Animation” here and press the Export button. In the next dialogue you don’t need to make any changes to the defaults, just press the Save button.
  4. Now, open a console window and simply enter
    convert document.mng document.pdf

That’s it – you now have your PDF file ready for sending!

Update (2010-02-08):
As chithanh pointed out in comment 1, there is another convenient way to accomplish the same thing. It does not involve GIMP, but instead requires pdftk to concatenate PDF files. Please see comment 2 for details.

Update (2010-03-01):
Yet another way (definitely the most straightforward one if you already have the pages as single image files) was pointed out by goffrie in comment 5.

An equaliser that works with MPD and ALSA

A few months ago I was looking for a way to lower the amplification of the lower frequencies while listening to music on extremely bass-heavy in-ear headphones. The only way to accomplish that seemed to be using JACK. Since that seemed like overkill to me, I gave up.

Today I found a solution that does not require JACK: Charles Eidsness’ ALSAEQUAL. This is how you get it working with MPD on Gentoo:

  1. `emerge alsaequal`
  2. Create an .asoundrc file in the home directory of the user that runs MPD on your system (see `grep "^user" /etc/mpd.conf`) with the following content:

    ctl.equal {
      type equal;
    }
    
    pcm.plugequal {
      type equal;
      slave.pcm "plug:dmix";
    }
    
    pcm.equal {
      # Or if you want the equalizer to be your
      # default soundcard uncomment the following
      # line and comment the above line.
    # pcm.!default {
      type plug;
      slave.pcm plugequal;
    }
    

    (copied from the ALSAEQUAL website – modified so that dmix is being used, which allows playing multiple audio sources simultaneously)

  3. Find the audio_output section in your /etc/mpd.conf where you currently configure MPD to use ALSA’s default device. Change it so that it looks like
    audio_output {
      type    "alsa"
      name    "equal"
      device  "plug:plugequal"
    }
    
  4. `/etc/init.d/alsasound restart && /etc/init.d/mpd restart`
  5. Play a track with MPD, then run `alsamixer -D equal`*. Modify any of the frequency band sliders and observe the effect.

Other applications will not be affected by the equaliser, as long as you don’t change the .asoundrc as described in its comment and don’t configure them to use the ‘equal’ ALSA device.

*) Note that alsamixer needs to be run as the user that runs mpd. See Ian’s comment for details.
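
One way to do that without switching accounts by hand is sketched below. It reads the user name from the MPD configuration (the same line that the `grep` in step 2 shows) and starts alsamixer under that account. The helper name `mpd_user` is made up for this example. It assumes that the `user` setting is written as `user "name"` at the start of a line, and that `sudo` is installed and you are allowed to run commands as that user.

```shell
# Print the value of the "user" setting from an MPD configuration file.
# Argument: path to the config file (defaults to /etc/mpd.conf).
# Commented-out settings are ignored because the line must start with "user".
mpd_user() {
  sed -n 's/^user[[:space:]]*"\([^"]*\)".*/\1/p' "${1:-/etc/mpd.conf}" \
    | head -n 1
}
```

With the function defined, `sudo -H -u "$(mpd_user)" alsamixer -D equal` should open the equaliser controls as the MPD user. The `-H` option makes sudo set HOME to that user's home directory, which is where the .asoundrc from step 2 lives.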

Update (2012-06-16): Having the .asoundrc in place causes Skype to crash after a few seconds (see also this Gentoo Forums post).