
Python Generators

Dec. 15, 2020

Today, we are going to learn about generators: what they are, how to use them, and their advantages. Before going further, make sure that you are familiar with iterators.

Python Generator Function


A generator function is a function that returns a generator object. A generator object is an iterator, so we can loop over it or pass it to next(). Writing a generator function is much simpler than the usual way of creating an iterator, i.e., writing a class with __iter__() and __next__() methods.

A regular function in Python runs in one go: it cannot be paused midway and resumed from that point, and any code after an executed return statement never runs. A generator function uses a yield expression instead of return. Each yield hands a value to the caller and pauses the function, keeping its local state so that it can resume later. When we invoke a generator function, it just returns a generator object. Note that the function body does not start to execute at this point.

Let’s understand how a generator function works through an example.

def test_generator():
    i = 0
    print("First execution point")
    yield i
    i += 1
    print("Second execution point")
    yield i
    i += 1
    print("Third execution point")
    yield i


obj = test_generator()
print(obj)

Output

<generator object test_generator at 0x7fad8b858a98>

The above test_generator() function contains a variable i that gets incremented after each yield statement. When we call test_generator(), it returns a generator object, which is an iterator.
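
To see how much a generator function saves us, here is a minimal sketch that writes the same iterator twice: once as a class and once as a generator function. The names CountUpTo and count_up_to are made up for this illustration and do not appear elsewhere in the post.

```python
class CountUpTo:
    """Hand-written iterator (hypothetical name) that yields 0 up to limit - 1."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0  # state must be stored on the instance

    def __iter__(self):
        return self  # an iterator returns itself from __iter__

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration  # signal that there are no more values
        value = self.current
        self.current += 1
        return value


def count_up_to(limit):
    """Generator function with the same behavior; state lives in local variables."""
    current = 0
    while current < limit:
        yield current
        current += 1


class_result = list(CountUpTo(3))
generator_result = list(count_up_to(3))
print(class_result)      # [0, 1, 2]
print(generator_result)  # [0, 1, 2]

gen = count_up_to(3)
print(iter(gen) is gen)  # True: a generator object is its own iterator
```

Both versions produce the same values, but the generator function needs no class, no explicit StopIteration, and no instance attributes.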

Python next()


To start its execution, we can use the built-in next() function, which takes the generator object as an argument. The function then runs until it encounters a yield statement. At that point, it returns the yielded value and hands control back to the caller, but its state is maintained. When we call next() again, the function resumes right after the last yield statement it reached. Let’s see.

print(f"The value of i is {next(obj)}")
print(f"The value of i is {next(obj)}")
print(f"The value of i is {next(obj)}")

Output

First execution point
The value of i is 0
Second execution point
The value of i is 1
Third execution point
The value of i is 2

As you can see, the variable i does not get destroyed, i.e., the value from the previous call is maintained. 

Python StopIteration Exception


What happens if we call next() one more time? As the function has no more yield statements to reach, it runs to the end and we get a StopIteration exception, i.e.,

print(f"The value of i is {next(obj)}")

Output

StopIteration
Traceback (most recent call last)
<ipython-input-13-f06cc408a533> in <module>()
----> 1 print(f"The value of i is {next(obj)}")
StopIteration:
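
If we do call next() by hand, there are two common ways to deal with an exhausted generator: catch StopIteration, or give next() a default value as its second argument. The sketch below shows both. The two_values() function is a made-up example.

```python
def two_values():
    yield "a"
    yield "b"


gen = two_values()
collected = [next(gen), next(gen)]  # consumes both values

try:
    next(gen)  # the generator is exhausted, so this raises StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

fallback = next(gen, "no more values")  # a default suppresses the exception

print(collected)  # ['a', 'b']
print(exhausted)  # True
print(fallback)   # no more values
```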

for loop


Instead of calling next() each time we want the next item, we can use a for loop, which implicitly calls next() until the generator is exhausted. It is simple and short, and it handles the StopIteration exception (mentioned above) for us. Let’s see.

obj = test_generator()

for val in obj:
    print(f"The value of i is {val}")

Output

First execution point
The value of i is 0
Second execution point
The value of i is 1
Third execution point
The value of i is 2

A generator can be iterated over only once. To iterate again, we need to call the generator function again to get a new generator object.
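
The following short sketch shows what happens when we try to consume the same generator object twice. The three_squares() function is a made-up example.

```python
def three_squares():
    for n in range(3):
        yield n * n


gen = three_squares()
first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the same object has nothing left to yield
fresh_pass = list(three_squares())  # a new call gives a new generator

print(first_pass)   # [0, 1, 4]
print(second_pass)  # []
print(fresh_pass)   # [0, 1, 4]
```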

Python Generator Expressions


Generator expressions are an even shorter way of creating iterators. They work like anonymous generator functions. They are written like list comprehensions, except that they require parentheses instead of square brackets. They produce the same values as the equivalent list comprehension but compute them one at a time, on demand, so they use far less memory.

Let’s take an example to understand them.

obj1 = [x * x for x in range(0, 9)]  # list comprehension
print(obj1)
obj2 = (x * x for x in range(0, 9))  # generator expression
print(obj2)

for item in obj2:
    print(item)

Output

[0, 1, 4, 9, 16, 25, 36, 49, 64]
<generator object <genexpr> at 0x7fad83a7e620>
0
1
4
9
16
25
36
49
64
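
Generator expressions are often passed straight to functions that consume an iterable. When the expression is the only argument, the extra pair of parentheses can be dropped. A small sketch with made-up variable names:

```python
total = sum(x * x for x in range(9))          # adds the squares of 0 to 8
has_big = any(x * x > 50 for x in range(9))   # stops at the first match (x = 8)
largest_even_square = max(x * x for x in range(9) if x % 2 == 0)

print(total)                # 204
print(has_big)              # True
print(largest_even_square)  # 64
```

None of these calls builds an intermediate list; each value is produced, used, and discarded.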

Memory Efficient and High Performance


A list comprehension builds the complete list in memory right away. A generator expression only creates a generator object and computes each value when it is requested. Therefore, creating it takes almost no time or memory, however many values it will eventually produce.

Consider the same example as above, except we are calculating the squares of the numbers from 0 to 999,999 here.

import time
import sys

start = time.time()
obj1 = [x * x for x in range(0, 1000000)]  # list comprehension
end = time.time()
print(f"Time Taken: {end-start} seconds")
print(f"Memory: {sys.getsizeof(obj1)} bytes\n")

start = time.time()
obj2 = (x * x for x in range(0, 1000000))  # generator expression
end = time.time()
print(f"Time Taken: {end-start} seconds")
print(f"Memory: {sys.getsizeof(obj2)} bytes")

Output

Time Taken: 0.10663628578186035 seconds
Memory: 8697464 bytes

Time Taken: 8.130073547363281e-05 seconds
Memory: 88 bytes

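See the difference. The list comprehension computes and stores a million squares up front, while the generator expression takes almost no time to create and occupies only 88 bytes, because it has not computed any value yet. Note that sys.getsizeof() reports the size of the container object only, and that the timings measure creation, not iteration.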
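
A fairer comparison consumes both objects. The sketch below uses the standard tracemalloc module to record the peak memory allocated while summing the values. The peak_memory() helper and the smaller N are assumptions made for this illustration, and the exact byte counts will differ between machines and Python versions.

```python
import tracemalloc


def peak_memory(make_iterable):
    """Return (sum of values, peak bytes allocated) while summing the iterable."""
    tracemalloc.start()
    total = sum(make_iterable())
    _, peak = tracemalloc.get_traced_memory()  # returns (current, peak)
    tracemalloc.stop()
    return total, peak


N = 100_000  # smaller than the 1,000,000 used above, to keep the run short

list_total, list_peak = peak_memory(lambda: [x * x for x in range(N)])
gen_total, gen_peak = peak_memory(lambda: (x * x for x in range(N)))

print(list_total == gen_total)  # True: both produce the same values
print(list_peak > gen_peak)     # True: the list holds every square at once
print(f"list peak: {list_peak} bytes, generator peak: {gen_peak} bytes")
```

The generator version still has to compute every square, so the saving here is in memory rather than in total running time.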

Infinite Sequence


As we have a limited amount of memory, we cannot store an infinite sequence in a list or any other container. However, we can produce one with a generator, since it computes one value at a time and never stores the complete sequence.

Consider the following example that generates the square number series.

def infinite_square_number():
    i = 0
    while True:
        yield i * i
        i += 1


sequence = infinite_square_number()
for num in sequence:
    print(num, end=", ")

Output

[Image: output of the infinite sequence. The loop prints square numbers (0, 1, 4, 9, ...) until the program is interrupted.]
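
In practice, we usually want only part of an infinite sequence. One way, sketched below, is itertools.islice() from the standard library. Another is a plain for loop with a break. The variable names are made up for this illustration.

```python
from itertools import islice


def infinite_square_number():
    i = 0
    while True:
        yield i * i
        i += 1


# Take only the first five values of the infinite sequence.
first_five = list(islice(infinite_square_number(), 5))
print(first_five)  # [0, 1, 4, 9, 16]

# A plain loop with a break works as well.
below_50 = []
for num in infinite_square_number():
    if num >= 50:
        break
    below_50.append(num)
print(below_50)  # [0, 1, 4, 9, 16, 25, 36, 49]
```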

Python send()


The send() method sends a value into a generator. When we call send(), the execution of the function resumes, and the value passed as the argument becomes the result of the yield expression at which the function was paused. The function then runs to the next yield, and send() returns the value yielded there. If you use send() to start the generator, you must pass None as the argument, because no yield expression has been reached yet to receive a value; any other value raises a TypeError.

import random


def test_generator(n, recv=0):
    for i in range(n):
        recv = yield i * recv


n = 5
obj = test_generator(n)
print(obj.send(None), end=" ")
for i in range(1, n):
    print(obj.send(random.randint(1, 10)), end=" ")

Output

0 8 18 15 20

The test_generator() function runs the loop n times. Each random integer sent through send() becomes the value of recv, and the next iteration yields the product of its iteration number and that integer. Note that we pass None to send() in the first call on the generator object.
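
Because the example above uses random numbers, its output changes on every run. Here is a deterministic sketch of the same mechanism: a generator that keeps a running average of the numbers sent to it. The running_average() function is a made-up example.

```python
def running_average():
    """Yield the average of all numbers sent in so far (hypothetical example)."""
    total = 0
    count = 0
    average = None
    while True:
        value = yield average  # pauses here; send() supplies `value`
        total += value
        count += 1
        average = total / count


avg = running_average()
first = next(avg)  # same as avg.send(None): runs to the first yield
results = [avg.send(10), avg.send(20), avg.send(30)]
print(first)    # None
print(results)  # [10.0, 15.0, 20.0]

# Sending a non-None value to a generator that has not started raises TypeError.
fresh = running_average()
try:
    fresh.send(5)
    error = None
except TypeError as exc:
    error = exc
print(type(error).__name__)  # TypeError
```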

close()


The close() method stops the generator. It raises a GeneratorExit exception inside the generator at the point where it is paused. If the generator has already exited, either by exhaustion (normal exit) or due to some exception, close() does nothing. Consider the following example, in which the test_generator() function yields the squares of the numbers from 0 to n - 1. However, we stop the generator when the yielded value is 25.

def test_generator(n):
    j = 0
    for i in range(n):
        yield j ** 2
        j += 1


n = 1000
obj = test_generator(n)
for val in obj:
    print(val)
    if val == 25:
        obj.close()  # the next implicit next() raises StopIteration, ending the loop
Output

0
1
4
9
16
25
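
Since close() raises GeneratorExit at the paused yield, a try/finally block inside the generator gets a chance to clean up. The sketch below records the order of events in a list. The squares_with_cleanup() function and the events list are made up for this illustration.

```python
events = []  # records what happens, in order


def squares_with_cleanup(n):
    try:
        for i in range(n):
            yield i ** 2
    finally:
        # Runs when the generator finishes or when close() raises
        # GeneratorExit at the paused yield.
        events.append("cleanup")


gen = squares_with_cleanup(1000)
for val in gen:
    events.append(val)
    if val == 25:
        gen.close()  # raises GeneratorExit inside the generator
        break        # leave the loop explicitly

gen.close()  # closing an already-closed generator does nothing
print(events)  # [0, 1, 4, 9, 16, 25, 'cleanup']
```

This is the usual place to release resources such as open files or connections that the generator holds.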

throw()


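The throw() method raises an exception inside the generator at the point where it is paused. In its original form it takes three arguments: the type of exception, an optional value for the error, and an optional traceback object. Since Python 3.12 this three-argument form is deprecated, and passing a single exception instance, e.g., obj.throw(ValueError("Too Large")), is preferred.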

Consider the same example as above, except it throws ValueError into the generator when the yielded value is greater than or equal to 1000.

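def test_generator(n):
    j = 0
    for i in range(n):
        yield j ** 2
        j += 1


n = 1000
obj = test_generator(n)
for val in obj:
    if val >= 1000:
        # Pass an exception instance (the three-argument form is deprecated
        # since Python 3.12). The generator does not catch it, so it propagates here.
        obj.throw(ValueError("Too Large"))
    print(val, end=" ")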

Output

[Image: output of the throw() example, which ends with a ValueError: Too Large traceback.]
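
A generator can also catch the exception that throw() raises and carry on. In that case, throw() returns the next yielded value, just like next() does. The resilient_counter() function below is a made-up example of this, not part of the example above.

```python
def resilient_counter():
    """Count upward; a ValueError thrown in resets the count (hypothetical example)."""
    count = 0
    while True:
        try:
            yield count
            count += 1
        except ValueError:
            count = 0  # recover and keep going from zero


gen = resilient_counter()
seen = [next(gen), next(gen), next(gen)]
after_throw = gen.throw(ValueError("reset"))  # handled inside; returns the next yield
seen_after = [after_throw, next(gen)]

print(seen)        # [0, 1, 2]
print(seen_after)  # [0, 1]
```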

Creating a Generator Pipeline


Generators can be chained together to create a pipeline, in which the output of one generator becomes the input of the next. Each stage does one small job, which keeps the code readable, and values flow through the stages one at a time without intermediate lists.

Consider the following example, where one generator reads a file. The file contains a sample list of Australian cricket players and their career periods. The names of the players are all in lowercase. We want to capitalize their first and last names. Let’s do it.

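with open("sample.txt") as file:  # the file is closed when the block ends
    lines = (line for line in file)  # lazily yield each line of the file
    words_list = (
        word.capitalize() + " " for line in lines for word in line.split(" ")
    )  # split each line on spaces and capitalize each word
    result = ""
    for w in words_list:  # consuming the pipeline pulls each line through both stages
        result += w

print(result)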

Output

James Pattinson 2011–2015
Pat Cummins 2011–
Mitchell Marsh 2011–
Daniel Christian 2012–2014
Matthew Wade 2012–
Peter Forrest 2012
Nathan Lyon 2012–
George Bailey 2012–2016
Glenn Maxwell 2012–
Aaron Finch 2013–

The first generator expression yields each line of the file. The second generator expression splits each line into a list of words, and then each word is capitalized.
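
The same pipeline can be written with generator functions, which is handy when a stage needs more than one expression. The sketch below works on an in-memory list instead of a file so that it runs on its own. The function names and the two sample rows are assumptions standing in for the contents of sample.txt.

```python
def read_lines(rows):
    """Stage 1: yield each row with surrounding whitespace removed."""
    for row in rows:
        yield row.strip()


def capitalize_words(lines):
    """Stage 2: capitalize every space-separated word in each line."""
    for line in lines:
        yield " ".join(word.capitalize() for word in line.split(" "))


# Hypothetical sample rows standing in for the contents of sample.txt.
sample = ["james pattinson 2011–2015\n", "pat cummins 2011–\n"]

pipeline = capitalize_words(read_lines(sample))
result = list(pipeline)  # nothing runs until the pipeline is consumed
print(result)  # ['James Pattinson 2011–2015', 'Pat Cummins 2011–']
```

Each stage only asks the previous stage for one item at a time, so the approach works for files too large to fit in memory.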

