Sunday, June 29, 2008

OCaml's Unix module and ARGV

Be warned: the string array argument to Unix.create_process et al. represents the entire argument vector, so the first element should be the command name. I didn't expect this, since create_process takes a separate prog argument, and I ended up with weird behavior* like:


# open Unix;;
# create_process "sleep" [|"10"|] stdin stdout stderr;;
10: missing operand
Try `10 --help' for more information.
- : int = 22513

This can be a bit insidious—in many cases skipping the first argument will only subtly change the behavior of the child process.

Note that the prog argument is what determines which program is actually invoked; the first element of the argument vector is just what the child sees as argv[0]. Hence,

# create_process "gcc" [|"foo";"--version"|] stdin stdout stderr;;
- : int = 24364
foo (GCC) 4.2.3 (Ubuntu 4.2.3-2ubuntu7)
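
So the fix for the original sleep example is simply to repeat the command name as the first element of the vector, for example:

# create_process "sleep" [|"sleep"; "10"|] stdin stdout stderr;;

Now "10" lands where sleep expects its operand, the child really sleeps for ten seconds, and the call returns the child's pid as before.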


* Actually, this "weird behavior" is the test that finally made me realize what was going on. The emergent behavior of my app was much more mysterious...

Tuesday, June 24, 2008

Resetting a Terminal

You tried to cat a binary file and now your terminal displays nothing but gibberish? Just type reset (it may look like ⎼␊⎽␊├).

It has taken me more than 10 years to learn this.

[UPDATE] Interestingly, this doesn't work in my (alas, ancient) Mac OS X 10.3.9 terminal. Any tips? Also, why did curl URL_TO_BINARY hose my terminal in the first place?

Monday, June 23, 2008

Ripping a Muxtape

So Muxtape is a pretty cool site, but a little frustrating. If a friend posts a really cool mixtape (maybe you know somebody who just barely entered the Aughties), it would be nice to be able to download it and save it, just like all those old cassette mixtapes sentimentally rotting underneath your bed.

Enter muxrip. This simple Ruby script takes the name of the mixtape, downloads it, and creates a playlist for you in M3U or iTunes format. (Acknowledgments: the script basically just adds some polish to this previous effort.)

PLEASE: Use this script responsibly. It would be a shame for Muxtape to get shut down.

ALSO: I wouldn't be surprised if this suddenly stopped working. It depends on elements of the page layout and URL scheme that might (almost certainly will) change without notice.

Saturday, June 21, 2008

An Open Letter to eMusic

I regret to inform you I am canceling my eMusic subscription,
effective immediately. Although I admire the fact that you have
provided DRM-free music downloads since the pre-Napster era and try my
best to support small, independent businesses, my dissatisfaction with
your service has been too great for too long, and the convenience and
selection offered by your competitors (e.g., Amazon's MP3 store) are
too good to pass up. It pains me to see big players like Amazon and
Apple push companies like eMusic out of business, but if you are to
survive, you will have to be more innovative and customer-focused than
you have been in the time that I have subscribed. I hope that you will
re-think your business model, increase the value of your product, and
win me back as a customer in the future.

In that spirit, I want to offer some specific advice about how your
service could improve.

- Your site provides almost no information about what albums will be
available when. So far as I can tell, the only information provided
is a small "Coming Soon" box with no more than 8 artists---often
just the names of the artists without release dates---in the bottom
corner of the "New on eMusic" page. Albums that have been released
and are available for download elsewhere are not acknowledged on
the artist page, not even to say "this album will be available
soon." For example, Sloan's "Parallel Play" has been available on
Amazon since June 10. As of June 21, I can find no information on
your site about whether this album will ever be available, even
though you offer all of Sloan's previous albums on the same label.

- If I want to download an album with more tracks than I have in my
monthly subscription, a pop-up asks me if I want to upgrade my
subscription (i.e., to permanently increase my monthly fee and
download allotment). Although there are "Booster Packs" allowing
the one-time download of 10 or 20 tracks, this option is not
presented in the pop-up, nor in the page presented when one clicks
on "More Options"---only a savvy and determined user will find
them. The Booster Packs should not only be made easily available at
this point, there should be an additional option that you do not
provide: to download as many tracks as I have available within my
subscription and queue up the remaining tracks for download when my
account refreshes. This doesn't have to be the first option
presented---I understand the desire to nudge your users towards
spending more money on the site---but it should be available
(and one should not cross the line from nudging your customers to
misleading them and ripping them off).

These two points may seem inconsequential, but they have been a
constant source of annoyance for me. It is small matters like these
that build a customer relationship that survives a spotty selection
and waiting for the latest indie hits.

Best regards,
Chris


Sunday, June 08, 2008

Tweaking an RSS Feed in Python

I've been teaching myself a bit of Python by the just-in-time learning method: start programming, wait for the interpreter to complain, and go check the reference manual; keep the API docs on your hard disk and sift through them when you need a probably-existing function. Recently, I wanted to write a very simple script to manipulate some XML (see below) and I was surprised (though it has been noted before) at the relatively confused state of the art in Python and XML.

First of all, the Python XML API documentation is more or less "go read the W3C standards." Which is fine, but... make the easy stuff easy, people.

Secondly, the supposedly-standard PyXML library has been deprecated in some form or fashion such that some of the examples from the tutorial I was working with have stopped working (in particular, the xml.dom.ext module has gone somewhere. Where, I do not know).

So, in the interest of producing more and better code samples for future lazy programmers, here's how I managed to solve my little problem.

The Problem: Twitter's RSS feeds don't provide clickable links

The Solution: A script suitable for use as a "conversion filter" in Liferea (and maybe other feed readers too, who knows?). The script should:


  1. Read and parse an RSS/Atom feed from the standard input.

  2. Grab the text from the feed items and "linkify" them.

  3. Print the modified feed on the standard output.


Easy, right? Well, yeah. The only tricky bit was using the right namespace references for the Atom feed, but again that's only because I refuse to read and comprehend the W3C specs for something so insignificant. I ended up using the lxml library, because it worked. (The script would be about 50% shorter if I hadn't added a command-line option --strip-username to strip the username from the beginning of items in a single-user feed, and a third shorter than that if it only handled RSS or Atom rather than both.)

Here's the code, in toto. (You can download it here.)

#! /usr/bin/env python

from sys import stdin, stdout
from lxml import etree
from re import sub
from optparse import OptionParser

doc = etree.parse(stdin)

def addlinks(path, namespaces=None):
    for node in doc.xpath(path, namespaces=namespaces):
        # Turn URLs into HREFs
        node.text = sub("((https?|s?ftp|ssh)\:\/\/[^\"\s\<\>]*[^.,;'\">\:\s\<\>\)\]\!])",
                        "<a href=\"\\1\">\\1</a>",
                        node.text)
        # Turn @ refs into links to the user page
        node.text = sub("\B@([_a-z0-9]+)",
                        "@<a href=\"http://twitter.com/\\1\">\\1</a>",
                        node.text)

def stripuser(path, namespaces=None):
    for node in doc.xpath(path, namespaces=namespaces):
        node.text = sub("^[A-Za-z0-9_]+:\s*", "", node.text)

parser = OptionParser(usage = "%prog [options] SITE")
parser.add_option("-s", "--strip-username",
                  action="store_true",
                  dest="strip_username",
                  default=False,
                  help="Strip the username from item title and description")
(opts, args) = parser.parse_args()

# For RSS feeds
addlinks("//rss/channel/item/description")
# For Atom feeds
addlinks("//n:feed/n:entry/n:content",
         {'n': 'http://www.w3.org/2005/Atom'})

if opts.strip_username:
    # RSS title/description
    stripuser("//rss/channel/item/title")
    stripuser("//rss/channel/item/description")
    # Atom title/description
    stripuser("//n:feed/n:entry/n:title",
              namespaces={'n': 'http://www.w3.org/2005/Atom'})
    stripuser("//n:feed/n:entry/n:content",
              namespaces={'n': 'http://www.w3.org/2005/Atom'})

doc.write(stdout)


If there are any Python programmers in the audience and I'm doing something stupid or terribly non-idiomatic, I'd be glad to know.

Thanks in part to Alan H whose Yahoo Pipe was almost good enough (it doesn't handle authenticated feeds, as far as I can tell) and from whom I ripped off the regular expressions.

[UPDATE] Script changed per first commenter.

Top Chef and BSG Catch-Up

I have been remiss in blogging Top Chef and Battlestar Galactica this year. Suffice it to say I'm watching and enjoying, but my ardor for both has somewhat dimmed.

Unlike previous seasons of Top Chef, I don't have a real rooting interest in any of the cheftestants this year. If I were forced to choose I would guess Richard is probably going to win (he's about as well-liked as Stephanie and more consistent). I—along with the rest of the world—loathe Lisa, but she's just kind of a bad trip, not really a boo-hiss, lie-to-your-face villain in the Tiffani/Omarosa mold. An interesting bit of data, for those Lisa-haters who suspect they are suffering from an irrational aversion to her attitude, looks, and posture: she has—by far—the worst record of any cheftestant to appear in a Top Chef finale (1 Elimination win, 1 place, no Quickfire wins; she has been up for elimination or on the losing team in the last seven consecutive episodes (!)). Incidentally, Richard (3 Elimination wins, 5 places, and 2 Quickfire wins) and Stephanie (4 Elimination wins, 5 places, and 1 Quickfire win) have by far the best records of any previous cheftestant, period. (In comparison, the previous three winners (Harold, Ilan, and Hung) had only 4 Elimination wins total.)

On the other side, BSG has been doing a lot of the mythical flim-flam (I don't really care where Earth is or whether they ever find it) and not so much of the intense post-9/11 fractured-mirror business that made the first three seasons so addictive. The characters have been getting pushed around the chessboard willy-nilly without much attention paid to consistency or plausibility (to wit: President Lee Adama), all in service of a presumed "mind-blowing" series finale (to arrive not before calendar year 2009, as I understand it) that I am quite certain will disappoint (I'm not going to be X-Files'ed ever again).

So there's your TV-blogging for the year. Back to work.