In a nutshell, Python 3 raises a TypeError when you pass a string to bytes() or bytearray() without specifying the encoding (Windows-1255, UTF-8, UTF-16 and so forth).
Reproducing the TypeError exception
You have a Python string which you have read from a file, a pandas Series or DataFrame, a web API or a database, and you would like to serialize it to bytes in order to store it on disk or in memory for faster, more efficient processing. Later on, you might still want to convert the bytes back to a string in order to access the string manipulation methods, which are broader than those supported by the bytes object.
You call the bytes() or bytearray() function and pass your string as the only parameter:
my_str = 'This is a Python string.'
my_bytes = bytes(my_str)  # this raises the TypeError exception
my_byte_array = bytearray(my_str)  # this line also raises a TypeError
This leads to the following exception. The message is the same in Jupyter and in any Python IDE such as PyCharm or VS Code:
TypeError: string argument without an encoding
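Because the first failing call stops the script, the second one never runs. As a minimal sketch (the names converter and errors are only illustrative), you can catch the exception for each call and compare the messages:

```python
my_str = 'This is a Python string.'

# Collect the error raised by each constructor instead of stopping at the first one
errors = {}
for converter in (bytes, bytearray):
    try:
        converter(my_str)  # no encoding given, so a TypeError is expected
    except TypeError as err:
        errors[converter.__name__] = str(err)

for name, message in errors.items():
    print(f'{name}(): TypeError: {message}')
```

Both constructors should report the same problem: a string was passed without an encoding.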
Fixing the string without encoding error
To solve this error, pass the encoding format as the second argument to the bytes() or bytearray() function. The same encoding is needed when decoding the bytes back to a string.
my_bytes = bytes(my_str, 'windows-1255')
print(my_bytes)
The code above prints the following bytes object:
b'This is a Python string.'
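The same fix applies to bytearray(), and str.encode() is an equivalent way to get a bytes object. A short sketch, assuming UTF-8 as the encoding (any codec that can represent your text would work the same way):

```python
my_str = 'This is a Python string.'

# bytearray() takes the encoding as its second argument, just like bytes()
my_byte_array = bytearray(my_str, 'utf-8')

# str.encode() returns a bytes object using the given encoding
my_bytes = my_str.encode('utf-8')

print(my_byte_array)  # bytearray(b'This is a Python string.')
print(my_bytes)       # b'This is a Python string.'
```

Since the sample text is plain ASCII, the UTF-8 bytes look identical to the Windows-1255 result.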
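Encoding the same string with UTF-16, on the other hand:
my_bytes = bytes(my_str, 'UTF-16')
print(my_bytes)
prints the following, where the leading \xff\xfe is the byte order mark and every character takes two bytes: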
b'\xff\xfeT\x00h\x00i\x00s\x00 \x00i\x00s\x00 \x00a\x00 \x00P\x00y\x00t\x00h\x00o\x00n\x00 \x00s\x00t\x00r\x00i\x00n\x00g\x00.\x00'
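To get the string back, decode the bytes with the encoding that produced them. The sketch below is only an illustration (the variable names are made up, and errors='replace' is used so the mismatched decode does not raise). It shows that the matching codec restores the text while a different one does not:

```python
my_str = 'This is a Python string.'
encoded = bytes(my_str, 'utf-16')

# Decoding with the same encoding restores the original text
decoded = encoded.decode('utf-16')
print(decoded == my_str)  # True

# Decoding with a different encoding misreads the byte order mark
# and the extra zero bytes, so the text does not round-trip
wrong = encoded.decode('windows-1255', errors='replace')
print(wrong == my_str)  # False
```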
Hope it helps!