silly Python unicode mistake
I'm using the feedparser and twitter modules, and I am trusting them to handle unicode strings without trouble. With most well-written Python modules (and these two are no exception!) methods will return unicode strings as they see fit, and other methods will accept these unicode strings and handle all the nitty-gritty encoding details for me.
A simplified version of my workflow would look like this:
    def post(entry):
        title = entry.title
        print "posting [%s]" % title
        api.PostUpdate(title)  # api is a twitter Api object

    feed = feedparser.parse(config["feed"])
    for e in reversed(feed.entries):
        if not e.id in seen:
            post(e)
The problem is the print statement. All the APIs I’m using have zero trouble with unicode, but print has to
encode its output for your terminal, and when it can’t tell what the terminal’s encoding is, it falls back to ASCII.
My ‘debugging’ output actually broke the program. My workaround is to say title.encode("ascii","replace").
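Here is a minimal sketch of that workaround, wrapped in a helper so it can be reused for any debug output. The name ascii_safe and the sample title are made up for illustration, and the extra decode step is my own addition so that the result stays a text string; with that in place the snippet should behave the same on Python 2 and Python 3.

```python
def ascii_safe(text):
    """Return text with every non-ASCII character replaced by '?'."""
    # encode() does the lossy replacement; decode() turns the bytes back into text
    return text.encode("ascii", "replace").decode("ascii")


title = u"Caf\u00e9 opening"  # hypothetical feed title with one non-ASCII character
print("posting [%s]" % ascii_safe(title))
```

Only the debug output gets dumbed down this way; the original title is still what should be handed to api.PostUpdate.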
Brend on #python pointed out to me that the issue is not, exactly, print. The issue is interpolating title
into a non-unicode string. Depending on the environment, using print on the unicode object might in fact work. For those environments, saying print u"posting [%s]" % title could help. In my case, however, I ran into the issue
from cron with no locale set at all, so dumbing the string down to ASCII is still the right thing to do.
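The cron situation can be approximated without cron by writing to a stream whose encoding is forced to ASCII. This is only a rough stand-in for what print does with stdout, not a reproduction of the original script; can_write and the sample title are hypothetical names, and io.TextIOWrapper is used here simply as a convenient way to pick the encoding by hand.

```python
import io


def can_write(text, encoding):
    """Report whether text can be written to a stream that uses the given encoding."""
    # An in-memory stream standing in for stdout with a known encoding
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False


title = u"Caf\u00e9 opening"  # hypothetical feed title
print(can_write(title, "ascii"))   # no usable locale: encoding is ASCII
print(can_write(title, "utf-8"))   # a terminal with a UTF-8 locale
print(can_write(title.encode("ascii", "replace").decode("ascii"), "ascii"))
```

The first check should fail and the other two should succeed, which matches what I saw: the same string is fine at a UTF-8 terminal, breaks where the encoding is ASCII, and goes through everywhere once it has been reduced to ASCII.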