
Python bytes() function

Feb. 1, 2021 PYTHON FUNCTION

In Python, we use the bytes() function to work with binary data. Before looking at bytes() itself, let's briefly review bits, bytes, and hexadecimal numbers.

Bits and Bytes

Computers work with binary numbers, in other words, electrical values that can be represented by ones and zeros. In computing, a bit is the smallest unit of information and can take only two values: 0 (zero) or 1 (one). Everything on a computer is stored as bits. A byte is made of 8 bits, so it represents an 8-digit binary number with 256 possible values (0 to 255). Historically, a byte is a computer architecture term for the amount of memory used to store a single character. A stream of bits can represent anything a program knows how to interpret; for example, a program that displays images interprets it as a picture.

Hexadecimal numbers

As we know, decimal and binary numbers use the base-10 and base-2 numeral systems, respectively. Similarly, hexadecimal numbers use the base-16 numeral system. The hexadecimal representation uses 16 symbols: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F}. For example:

The decimal number 0 is the hexadecimal number 0, which is the binary number 00000000.

The decimal number 10 is the hexadecimal number A, which is the binary number 00001010.

The decimal number 255 is the hexadecimal number FF, which is the binary number 11111111.
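
The conversions above can be checked with Python's built-in format() and int() functions. This is only a quick sketch; the variable names are illustrative:

```python
# Convert decimal numbers to hexadecimal and 8-digit binary strings.
for n in (0, 10, 255):
    hex_str = format(n, 'X')      # uppercase hex without the '0x' prefix
    bin_str = format(n, '08b')    # binary, zero-padded to 8 digits
    print(n, hex_str, bin_str)

# Going the other way: parse hex and binary strings back into integers.
from_hex = int('FF', 16)
from_bin = int('00001010', 2)
print(from_hex, from_bin)
```

Passing the base as the second argument to int() is what tells Python how to read the digits.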

Now, let's look at the bytes() function in Python.

What is bytes() in Python?

In Python, there are six commonly used sequence types, namely:
  1. strings
  2. byte sequences
  3. byte arrays
  4. lists
  5. tuples
  6. range objects
Bytes and byte arrays are sequence types that store and represent a sequence of byte values. (In Python 2, bytes was only an alias for str; the separate bytes type described here belongs to Python 3.) The bytes() function creates a bytes object, which is immutable: you cannot change it after creation. A bytes object is similar to a string, but its elements are 8-bit integers in the range 0<=x<256 rather than characters. bytes() is used to convert other objects into bytes objects or to create a zero-filled bytes object of a specified size.

Points To Remember:

  • A bytes object is immutable.
  • Every integer in a bytes object must be in the range 0<=x<256.
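
To make these points concrete, here is a small sketch of how a bytes object behaves as a sequence of integers. The variable names are illustrative only:

```python
data = bytes([67, 111, 100, 101])

first = data[0]        # indexing returns an int, not a 1-character string
chunk = data[1:3]      # slicing returns another bytes object
print(first, chunk, len(data))

# Every element is an integer in the range 0 <= x < 256.
in_range = all(0 <= value < 256 for value in data)
print(in_range)
```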

Syntax of bytes()

bytes(source, encoding, errors)

As seen in the syntax above, the bytes() function accepts 3 parameters:

  • source (optional) - The value used to initialize the bytes object. It can be an integer, a string, or an iterable of integers (list, tuple, etc.).
  • encoding (optional) - The encoding (such as UTF-8 or ASCII) used to convert the string to bytes. It is required when source is a string.
  • errors (optional) - Specifies how to handle characters that cannot be encoded (used only when the source parameter is a string).
All three parameters are optional, except that a string source must be accompanied by an encoding.
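
The sketch below shows how the parameters interact when the source is a string. The sample text and variable names are illustrative only:

```python
text = "CodesDope"

# A str source needs an explicit encoding.
encoded = bytes(text, "utf-8")
print(encoded)

# Omitting the encoding for a str source raises a TypeError.
try:
    bytes(text)
except TypeError as exc:
    message = str(exc)
    print("TypeError:", message)

# Keyword arguments work too; the third parameter is spelled "errors".
lossy = bytes("Ö", encoding="ascii", errors="replace")
print(lossy)
```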

Return Type of bytes()

print(type(bytes()))

<class 'bytes'>

How does bytes() work?

Let's see some examples of the bytes() function:

Example 1: Integer as a Parameter

If the source parameter is an integer, bytes() creates a bytes object of that length, with every byte initialized to zero (the null byte, \x00).

print(bytes())

a = 4
print(bytes(a))

b''

b'\x00\x00\x00\x00'

As seen in the above example:

  1. Without an argument, an empty bytes object (length 0) is returned.
  2. If we pass a single integer argument to the bytes() function, that argument determines how many bytes are created. Every byte has the value 0, written \x00 in byte notation.

The b prefix signifies a bytes literal.

\x introduces a two-digit hexadecimal escape, so \x00 is the byte with value 0.
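
As a quick check of this notation, the following sketch compares bytes(4) with the equivalent literal and uses the bytes.hex() method to view the same data as hexadecimal digits:

```python
zeros = bytes(4)
literal = b'\x00\x00\x00\x00'   # the same value written as a bytes literal
same = zeros == literal
print(same)

# Bytes that correspond to printable ASCII are displayed as characters.
letters = b'\x41\x42'
print(letters)

# hex() shows two hexadecimal digits per byte.
hex_digits = zeros.hex()
print(hex_digits)
```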

Example 2: Iterable as a Parameter

If the source parameter is an iterable of integers, such as a list, bytes() returns a bytes object of the same length whose elements are the iterable's elements (each in the range 0 <= x < 256).

a = [1, 5, 8]
print(bytes(a))

b'\x01\x05\x08'
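
Lists are not the only option. As a sketch, the same call works with a tuple, a range, or a generator expression, as long as every value is an integer in the allowed range:

```python
from_tuple = bytes((1, 5, 8))
from_range = bytes(range(65, 70))             # the values 65..69
from_gen = bytes(n * 2 for n in [1, 5, 8])    # the values 2, 10, 16

print(from_tuple)
print(from_range)

# Converting back to a list recovers the original integers.
as_list = list(from_tuple)
print(as_list)
```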

Example 3: String as a Parameter

Strings are sequences of human-readable characters and must be encoded into bytes before they can be stored on disk. To convert a string into bytes, pass the string as the source (first) argument and the encoding as the second argument. Let's see an example:
s = "CodesDope"

# 'utf-8' encoding
array1 = bytes(s, 'utf-8')

# 'utf-16' encoding
array2 = bytes(s, 'utf-16')

print(array1)
print(array2)

# actual bytes in the string
for byte in array1:
    print(byte, end=' ')
print("")
for byte in array2:
    print(byte, end=' ')

b'CodesDope'

b'\xff\xfeC\x00o\x00d\x00e\x00s\x00D\x00o\x00p\x00e\x00'

67 111 100 101 115 68 111 112 101 

255 254 67 0 111 0 100 0 101 0 115 0 68 0 111 0 112 0 101 0

In the above example, we stored a string in the variable s and printed its UTF-8 and UTF-16 encodings. We also used a for loop on each encoded result to print the actual byte values. The UTF-16 result starts with \xff\xfe, a byte order mark, and here uses two bytes for every character.
Note: array1, which uses the UTF-8 encoding, is displayed as readable text, but the data it contains is still bytes, as the for loop above shows.
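
As a related sketch, str.encode() produces the same result as bytes() with an encoding, and bytes.decode() reverses the conversion:

```python
s = "CodesDope"

utf8 = bytes(s, "utf-8")
same_as_encode = utf8 == s.encode("utf-8")   # both spellings give equal bytes
print(same_as_encode)

# decode() turns the bytes back into the original string.
restored = utf8.decode("utf-8")
print(restored)

# UTF-16 needs more room: a 2-byte byte order mark plus 2 bytes per character here.
utf16 = s.encode("utf-16")
print(len(utf8), len(utf16))
```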
Now, let's look at the errors parameter of the bytes() function.
The errors parameter specifies what to do when a character in the string cannot be encoded. Some of the basic error handlers are:
  • strict - Raises a UnicodeEncodeError when encoding fails. This is the default.
  • ignore - Ignores the unencodable character and encodes the remaining string.
  • replace - Replaces the unencodable character with a '?'.
Let's see an example of each:
1. When encoding = ASCII and errors = strict:
string = 'CÖdesDÖpe'
print("When the error is strict : " +
      str(bytes(string, 'ascii', errors='strict')))

Traceback (most recent call last):
  On line 3, in <module>
    str(bytes(string, 'ascii', errors='strict')))

UnicodeEncodeError: 'ascii' codec can't encode character '\xd6' in position 1: ordinal not in range(128)

2. When encoding = ASCII and errors = ignore:
string = 'CÖdesDÖpe'
print("When the error is ignore : " +
      str(bytes(string, 'ascii', errors='ignore')))

When the error is ignore : b'CdesDpe'

3. When encoding = ASCII and errors = replace:

string = 'CÖdesDÖpe'
print("When the error is replace : " +
      str(bytes(string, 'ascii', errors='replace')))

When the error is replace : b'C?desD?pe'
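
The three handlers can also be compared side by side. The sketch below additionally tries 'backslashreplace' and 'xmlcharrefreplace', two other standard Python error handlers that this article does not otherwise cover; the results dictionary is just an illustrative way to collect the output:

```python
string = 'CÖdesDÖpe'

# Encode the same string with several non-raising error handlers.
results = {}
for handler in ('ignore', 'replace', 'backslashreplace', 'xmlcharrefreplace'):
    results[handler] = bytes(string, 'ascii', errors=handler)
    print(handler, results[handler])

# 'strict' raises instead of returning a value.
try:
    bytes(string, 'ascii', errors='strict')
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print(strict_failed)
```

Unlike 'ignore' and 'replace', the two extra handlers keep enough information to tell which character was unencodable.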

When does bytes() give a TypeError?

1. If we try to modify a bytes object, Python raises a TypeError because bytes objects are immutable. For example:

elements = [1,2,3,4]
byte_obj = bytes(elements)
print(byte_obj)

byte_obj[0] = 5
print(byte_obj)

b'\x01\x02\x03\x04'

Traceback (most recent call last):
  On line 5, in <module>
    byte_obj[0] = 5

TypeError: 'bytes' object does not support item assignment
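
When a changed copy is needed, one common approach is to go through a mutable bytearray, or to build a new bytes object from slices. This is a sketch of both options, not the only way to do it:

```python
byte_obj = bytes([1, 2, 3, 4])

# Option 1: copy into a mutable bytearray, edit it, then convert back.
editable = bytearray(byte_obj)
editable[0] = 5
updated = bytes(editable)

# Option 2: build a new bytes object from a new first byte and a slice.
spliced = bytes([5]) + byte_obj[1:]

print(updated, spliced)
print(byte_obj)   # the original object is unchanged
```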

2. If the source parameter is None, bytes() raises a TypeError, because None cannot be converted to a bytes object. For example:

b = bytes(None)
print(b)

TypeError: cannot convert 'NoneType' object to bytes

When does bytes() give a ValueError?

If we call bytes() on an iterable, such as a list, that contains at least one integer greater than 255 (the largest number representable in 8 bits) or smaller than 0, Python raises a ValueError. We can fix it by ensuring that every number in the iterable falls in the interval 0<=x<256. Each failing call below is wrapped in try/except so that the script keeps running after the first error:
# This will run with no error
values = [0, 2, 255, 7]
print(bytes(values))

# This raises a ValueError since 999 > 255
try:
    print(bytes([999, 5, 40]))
except ValueError as e:
    print("ValueError:", e)

# This raises a ValueError since -10 < 0
try:
    print(bytes([-10, 4, 1]))
except ValueError as e:
    print("ValueError:", e)

b'\x00\x02\xff\x07'

ValueError: bytes must be in range(0, 256) 

ValueError: bytes must be in range(0, 256)

Note: The range doesn't include the number 256. If 256 is one of the integers in the list, the bytes() function raises a ValueError:
l = [1,2,256]
byte_obj = bytes(l)
print(byte_obj)

ValueError: bytes must be in range(0, 256)
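
One way to avoid this error is to validate the values before converting them. The sketch below uses a hypothetical helper, to_bytes_checked (not part of Python), and also shows the built-in int.to_bytes() method, which spreads an integer larger than 255 over several bytes:

```python
def to_bytes_checked(values):
    """Hypothetical helper: report all out-of-range values at once."""
    bad = [v for v in values if not 0 <= v < 256]
    if bad:
        raise ValueError(f"out-of-range values: {bad}")
    return bytes(values)

ok = to_bytes_checked([0, 2, 255, 7])
print(ok)

try:
    to_bytes_checked([1, 2, 256])
except ValueError as exc:
    error_text = str(exc)
    print("ValueError:", error_text)

# A number that does not fit in one byte can be stored in two.
big = (999).to_bytes(2, "big")
print(big, int.from_bytes(big, "big"))
```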

