May 20, 2011

Bulkloader raises UnicodeDecodeError (Google App Engine)

I had been wondering why the following loader.py raises a UnicodeDecodeError when uploading a UTF-8 CSV file to Google App Engine with the bulkloader.
class A(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'TEST',
            [("__key", dummy), ("prop", lambda x: x.decode('utf-8'))])

    def generate_key(self, i, values):
        return values[0]


I tried to avoid the UnicodeDecodeError by decoding the property value with
x.decode('utf-8')
but it didn't work.
After looking through datastore.py in the Google App Engine SDK, it turned out that (of course) the key also has to be converted to unicode.
So the correct code is
class A(bulkloader.Loader):
    def __init__(self):
        bulkloader.Loader.__init__(self, 'TEST',
            [("__key", dummy), ("prop", lambda x: x.decode('utf-8'))])

    def generate_key(self, i, values):
        return values[0].decode('utf-8')
It took me a few hours to figure this out ...
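
For the record, here is a minimal sketch of what I think is going on. In Python 2, a byte string that meets unicode is implicitly decoded as ASCII, and that fails for UTF-8 bytes outside the ASCII range. I assume something like this happens to the key name inside the SDK. The sample key value and the helper name implicit_ascii_decode are made up for illustration, and this is not the actual datastore.py code. The snippet is written so that it runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-

# Raw UTF-8 bytes, roughly what a CSV reader hands to generate_key.
# The value itself is a made-up example (a Japanese word followed by "1").
raw_key = u"\u30ad\u30fc1".encode("utf-8")


def implicit_ascii_decode(value):
    """Hypothetical helper: mimic Python 2's implicit str -> unicode
    coercion, which always uses the ASCII codec."""
    return value.decode("ascii")


try:
    implicit_ascii_decode(raw_key)
    failed = False
except UnicodeDecodeError:
    # Non-ASCII bytes cannot be decoded with the ASCII codec.
    failed = True

# Decoding explicitly as UTF-8 before anything else touches the key
# avoids the implicit ASCII step.
decoded_key = raw_key.decode("utf-8")
```

So decoding only the property column is not enough: every value that ends up as text, including the key returned by generate_key, needs an explicit decode('utf-8').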
