thoughts, ideas, code and other things...

Friday, February 19, 2010

Minor 'ascii' codec issues when using urllib

Twice now I've come across something like this:
>>> import urllib
>>> d = {'data' : u'\u2013'}
>>> urllib.urlencode(d)
Traceback (most recent call last):
  File "", line 1, in
  File "/usr/lib/python2.6/", line 1268, in urlencode
    v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 0: ordinal not in range(128)

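The traceback shows the cause: urlencode calls str(v) on every value, and in Python 2 str() on a unicode object encodes it with the default 'ascii' codec, which fails on anything outside the 128 ASCII characters. Now that I have a generic solution to this issue (thanks to Jarret Hardie and Grego), I would like to pin it here so that I know where to look for it in future.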

# gets rid of ascii codec shite:
# encode every unicode value to a byte string so urlencode's str() call is safe
def sanitize_codec(fooDict, charset):
    return dict([(k, v.encode(charset)) for k, v in fooDict.items()])

weird_data = sanitize_codec(weird_data, 'utf-8')

Pretty useful when you have to pass some chunk of an RSS feed on to another web application, say from a Gmail feed to the Twitter REST API.
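One caveat with sanitize_codec: it calls .encode() on every value, so a dict that also holds an int or an already-encoded byte string will trip it up. Here is a rough sketch of a more forgiving variant that only encodes text values. The name sanitize_values, the text_type alias and the sample 'count' key are my own inventions for illustration, and the try/except import is just an assumption so the same snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib import urlencode            # Python 2
    text_type = unicode                     # noqa: F821 (only exists on Python 2)
except ImportError:
    from urllib.parse import urlencode      # Python 3
    text_type = str


def sanitize_values(params, charset='utf-8'):
    """Encode only text values; leave ints, byte strings, etc. untouched."""
    return dict(
        (k, v.encode(charset) if isinstance(v, text_type) else v)
        for k, v in params.items()
    )


# Hypothetical sample data: one non-ASCII text value, one integer.
weird_data = {'data': u'\u2013', 'count': 3}
query = urlencode(sanitize_values(weird_data))
print(query)
```

On Python 3 the sanitizing step is not strictly needed, since urllib.parse.urlencode encodes str values as UTF-8 by default, but it does no harm there either.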
