Python Requests UnicodeEncodeError When Encoding to ASCII

The Error Message

While converting Unicode to ASCII on text that comes from Python's requests module, the following exception shows up, seemingly at random.

UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 60: ordinal not in range(128)

How to Recreate

>>> test = u'\u2013'
>>> '{}'.format(test)
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 0: ordinal not in range(128)

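This breaks because '{}' is a byte string in Python 2, so format() implicitly encodes the unicode argument with the default system encoding. My default system encoding is ASCII, and we are trying to convert a Unicode character (U+2013, an en dash) that does not map to ASCII.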

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
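
The implicit conversion can also be triggered explicitly, which makes the failure easier to inspect. The following is only a minimal sketch: the variable name error is an illustration and not part of the original session. It encodes the same character to ASCII and keeps the exception object so its details can be read.

```python
test = u'\u2013'  # EN DASH, outside the 7-bit ASCII range

try:
    # The same conversion Python 2 attempts implicitly in '{}'.format(test)
    test.encode('ascii')
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The exception records which codec failed, where, and why
print('{} {} {}'.format(error.encoding, error.start, error.reason))
```

The start attribute is the position reported in the error message, which helps locate the offending character in a longer string.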

Quick Fix 1

If both the format string and the argument are unicode strings, there are no errors.

>>> u'{}'.format(test)
u'\u2013'

With this solution, you will have to explicitly define each string as unicode.

Quick Fix 2

This is a quick fix for one-offs. If the error happens a lot, a permanent or more global fix should be sought, so you do not have to change each line.

This is nice if the only goal is to generate some type of human-readable text. An example would be logging or error messages.

If the system receiving the text needs the Unicode character to understand the command, this might not be a good solution at all.

  • Find the code where you are converting Unicode to ASCII and tack on the encode method: .encode('ascii', 'ignore')
>>> test = u'\u2013'
>>> '{}'.format(test.encode('ascii', 'ignore'))
''
  • Now the offending character is dropped and you just get something blank, but no exception is raised.
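
Dropping the character is not the only choice. The encode method accepts other standard error handlers that leave a visible marker instead. The sketch below compares three of them. The helper name to_ascii and the sample text are assumptions made for illustration only.

```python
def to_ascii(text, errors='replace'):
    """Encode unicode text to ASCII bytes using the given error handler.

    to_ascii is a hypothetical helper, not something from requests.
    """
    return text.encode('ascii', errors)


test = u'dan \u2013 fred'

dropped = to_ascii(test, 'ignore')            # character is removed
marked = to_ascii(test, 'replace')            # character becomes '?'
escaped = to_ascii(test, 'backslashreplace')  # character becomes \u2013

print(dropped)
print(marked)
print(escaped)
```

For log messages, 'replace' or 'backslashreplace' may be preferable to 'ignore', because the reader can at least see that something was there.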

More Global Fix (But Not Pretty to the Human Eyes)

At the top of your script (or Python file) type or paste:

from __future__ import unicode_literals

Now, when a similar command is run, you get:

>>> 'dan {}'.format(test)
u'dan \u2013'

The benefit of this is that no data is lost. This might be good to start with; then, if you get some ugly output, use the .encode('ascii', 'ignore') option.
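
If the text eventually has to leave the program as bytes and the character must survive, one option is to encode to UTF-8 rather than ASCII. This is a minimal sketch that assumes the receiving system accepts UTF-8, which the original scenario does not state. The names message, payload and restored are illustrative.

```python
# Build the text as unicode so nothing is implicitly encoded to ASCII
message = u'dan {}'.format(u'\u2013')

# Encode explicitly at the boundary; UTF-8 can represent every code point
payload = message.encode('utf-8')

# Decoding the bytes gives back exactly the original text
restored = payload.decode('utf-8')

print(repr(payload))
```

Keeping text as unicode inside the program and encoding only at the edges avoids most implicit ASCII conversions.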

NOTE: The unicode_literals import makes every string literal in the file unicode. Use it carefully, as there could be unforeseen issues.

Here is a good discussion: For unicodeLiterals or Not

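With from __future__ import unicode_literals:

>>> isinstance('', str)
False

Without from __future__ import unicode_literals:

>>> isinstance('', str)
True

This is broken, right? Not quite: with the import, '' is a unicode object, and in Python 2 str is the byte string type, so the check returns False.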
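
One way to write a type check that does not depend on whether unicode_literals is active is to test against a tuple of string types instead of str. The sketch below is an assumption about how that could look. The names text_types and is_text are hypothetical, and the approach resembles what compatibility libraries provide.

```python
import sys

if sys.version_info[0] >= 3:
    text_types = (str,)
else:
    # Python 2 only: basestring is the common parent of str and unicode
    text_types = (basestring,)  # noqa: F821


def is_text(value):
    """Return True for any string type on the running interpreter."""
    return isinstance(value, text_types)


print(is_text(u''))
print(is_text(42))
```

On Python 2 this accepts both '' and u'', so the result of the check no longer changes when the import is added to a file.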

About Daniel Fredrick

Technology enthusiast, Programmer, Network Engineer CCIE# 17094
