What is the holy grail for understanding and solving encoding errors?
I get "UnicodeEncodeError: 'ascii' codec can't encode characters in position ..." or similar in my Python code. What the hell did I do wrong?
To understand the encoding problems you can run into in Python code in your Plone instance (typically in form handlers or something similar), you have to understand one central design decision of Python, introduced in version 2.0:
Python understands 2 kinds of strings:
- unicode strings
- byte strings
Written as literals, these are u'This is a unicode string' (though the u prefix is unnecessary for plain English text...) and 'This is a normal (byte) string'.
Both are objects, and each can be converted to the other: unicode(bytestring, 'encoding') or bytestring.decode('encoding') gives you a unicode string from a byte string, and unicodestring.encode('encoding') gives you a byte string from a unicode string. Note that plain unicode(bytestring) without an encoding assumes ASCII and fails on any non-ASCII byte.
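As a minimal sketch of the round trip described above (the variable names and the sample word are only illustrative): it uses u'' literals and .decode() so that it runs unchanged on Python 2.6+ and on Python 3.3+, where the two types are called str and bytes. In Python 2, unicode(data, 'utf-8') would be an equivalent spelling of the decode step.

```python
# A unicode string holding the German word "Grüße" (5 characters).
# \xfc is u-umlaut, \xdf is sharp s.
text = u'Gr\xfc\xdfe'

# unicode string -> byte string: pick an encoding explicitly.
data = text.encode('utf-8')

# byte string -> unicode string: use the SAME encoding to decode.
back = data.decode('utf-8')

# The unicode string counts characters, the byte string counts bytes:
# the two non-ASCII characters take two bytes each in UTF-8.
print(len(text))     # 5
print(len(data))     # 7
print(back == text)  # True
```

The point to take away is that the encoding has to be named on both sides of the conversion; the byte string itself does not remember which encoding produced it.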
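Some conventions and fundamental principles are important to know:
- Python does not create unicode strings by default: a plain string literal is a byte string (note from the reviewer: but Zope 3 does)
- a unicode string is not the same as a UTF-8 encoded byte string (most important to understand!): UTF-8 is just one of several ways to encode unicode text as bytes
- whenever Python has to convert a unicode string to a byte string implicitly (for example when printing to an output whose encoding it does not know), it uses ASCII by default with the "strict" policy, which raises an encoding error for any character it cannot convert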
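The last two principles can be illustrated with a short sketch (again with illustrative names only, written to run on both Python 2.6+ and Python 3.3+). It calls .encode('ascii') explicitly to reproduce the same kind of error that Python 2 raises when it falls back to ASCII on its own.

```python
# One unicode string, "café"; \xe9 is e-acute.
text = u'caf\xe9'

# The same text gives different byte strings depending on the encoding,
# so "unicode" and "UTF-8 bytes" cannot be the same thing.
utf8_bytes = text.encode('utf-8')      # 5 bytes: c a f \xc3 \xa9
latin1_bytes = text.encode('latin-1')  # 4 bytes: c a f \xe9

# ASCII cannot represent e-acute. The default error policy is "strict",
# so the conversion raises UnicodeEncodeError.
try:
    text.encode('ascii')
    failed = False
except UnicodeEncodeError:
    failed = True

# Other error policies trade correctness for robustness.
replaced = text.encode('ascii', 'replace')  # unknown characters become "?"
ignored = text.encode('ascii', 'ignore')    # unknown characters are dropped

print(failed)                      # True
print(utf8_bytes != latin1_bytes)  # True
```

In most cases the fix is not a looser error policy but an explicit encode to an encoding that can represent your text, such as UTF-8, at the point where the data leaves your code.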
As this is only a FAQ, that should be enough for you to read further and find the appropriate tutorials.