3.2. Generator Builtin

  • Generator-like objects

  • Behave similarly to generators, but are not generators

  • range()

  • reversed()

  • enumerate()

  • zip()

  • map()

  • filter()

  • import itertools
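
The distinction in the list above can be checked with the abstract base classes from `collections.abc`. The following is a minimal sketch; the names `objects`, `not_generators` and `iterators` are illustrative and not part of the original material. It suggests that most of these builtins return iterators, while `range()` returns a lazy sequence that is iterable but is not itself an iterator.

```python
from collections.abc import Iterable, Iterator
from inspect import isgenerator

data = [1, 2, 3]

# One object produced by each builtin listed above
objects = {
    'range': range(3),
    'reversed': reversed(data),
    'enumerate': enumerate(data),
    'zip': zip(data, data),
    'map': map(float, data),
    'filter': filter(None, data),
}

# None of them is a generator object...
not_generators = [name for name, obj in objects.items() if not isgenerator(obj)]

# ...but all except range are iterators (they support next())
iterators = [name for name, obj in objects.items() if isinstance(obj, Iterator)]

# range is iterable, yet it is not an iterator
range_is_iterable = isinstance(objects['range'], Iterable)
range_is_iterator = isinstance(objects['range'], Iterator)
```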

3.2.1. Range

  • Generate integers from start to stop incrementing by step

  • range([start], <stop>, [step])

  • optional start, inclusive, default: 0

  • required stop, exclusive

  • optional step, default: 1

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> isgeneratorfunction(range)
False
>>>
>>> result = range(0,5)
>>> isgenerator(result)
False
>>> range(0,3)
range(0, 3)
>>> list(range(0,3))
[0, 1, 2]
>>> tuple(range(0,3))
(0, 1, 2)
>>> set(range(0,3))
{0, 1, 2}
>>> list(range(4,11,2))
[4, 6, 8, 10]
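
Because `range()` returns a lazy sequence rather than a generator, it supports more than iteration. A short sketch of this behavior follows; the variable names are illustrative only.

```python
# range stores only start, stop and step; values are computed on demand
numbers = range(4, 11, 2)        # represents 4, 6, 8, 10

length = len(numbers)            # sequences support len()
last = numbers[-1]               # ...indexing
has_eight = 8 in numbers         # ...and membership tests

# A negative step counts down; stop is still exclusive
countdown = list(range(5, 0, -1))

# Unlike an iterator, a range can be iterated more than once
first_pass = list(numbers)
second_pass = list(numbers)
```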

3.2.2. Reversed

  • Return a reverse iterator over the values of the given sequence

  • reversed(sequence, /)

>>> data = (1, 2, 3)
>>> list(reversed(data))
[3, 2, 1]
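
The object returned by `reversed()` is a one-shot iterator: once consumed, it yields nothing more. A small sketch, with illustrative variable names, also shows that other sequences such as `str` and `range` are accepted.

```python
data = (1, 2, 3)

result = reversed(data)          # no reversed copy of data is built here
first = next(result)             # takes the last element of data
rest = list(result)              # the remaining items, in reverse order
empty = list(result)             # the iterator is exhausted by now

# reversed() also accepts other sequences, such as str and range
letters = ''.join(reversed('abc'))
countdown = list(reversed(range(3)))
```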

3.2.3. Enumerate

  • enumerate(iterable, start=0)

  • Return an enumerate object

  • The enumerate object yields pairs containing a count (from start, which defaults to zero) and a value yielded by the iterable argument.

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> isgeneratorfunction(enumerate)
False
>>> result = enumerate(['a', 'b', 'c'])
>>> isgenerator(result)
False
>>> months = ['January', 'February', 'March']
>>> result = enumerate(months)
>>>
>>> next(result)
(0, 'January')
>>> next(result)
(1, 'February')
>>> next(result)
(2, 'March')
>>> next(result)
Traceback (most recent call last):
StopIteration
>>> months = ['January', 'February', 'March']
>>> result = enumerate(months)
>>>
>>> list(result)
[(0, 'January'), (1, 'February'), (2, 'March')]
>>> months = ['January', 'February', 'March']
>>> result = enumerate(months)
>>>
>>> dict(result)
{0: 'January', 1: 'February', 2: 'March'}
>>> months = ['January', 'February', 'March']
>>> result = enumerate(months, start=1)
>>>
>>> dict(result)
{1: 'January', 2: 'February', 3: 'March'}
>>> months = ['January', 'February', 'March']
>>> result = {f'{i:02}':month for i,month in enumerate(months, start=1)}
>>>
>>> print(result)
{'01': 'January', '02': 'February', '03': 'March'}
>>> months = ['January', 'February', 'March']
>>>
>>> for i, month in enumerate(months, start=1):
...     print(f'{i} -> {month}')
1 -> January
2 -> February
3 -> March
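
The examples above create `result` anew before each use. A likely reason is that an enumerate object can be consumed only once, which the following sketch illustrates (the names `first_pass`, `second_pass`, `upper` and `numbered` are illustrative).

```python
months = ['January', 'February', 'March']
result = enumerate(months, start=1)

first_pass = list(result)        # consumes the enumerate object
second_pass = list(result)       # nothing is left to yield

# enumerate accepts any iterable, for example a generator expression
upper = (month.upper() for month in months)
numbered = list(enumerate(upper, start=1))
```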

3.2.4. Zip

  • zip(*iterables, strict=False)

  • Iterate over several iterables in parallel, producing tuples with an item from each one.

The zip object yields n-length tuples, where n is the number of iterables passed as positional arguments to zip(). The i-th element in every tuple comes from the i-th iterable argument to zip(). This continues until the shortest argument is exhausted. [1]

If strict is true and one of the arguments is exhausted before the others, raise a ValueError.

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> isgeneratorfunction(zip)
False
>>>
>>> result = zip(['a','b','c'], [1,2,3])
>>> isgenerator(result)
False
>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> next(result)
('Mark', 'Watney')
>>> next(result)
('Melissa', 'Lewis')
>>> next(result)
('Alex', 'Vogel')
>>> next(result)
Traceback (most recent call last):
StopIteration
>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]
>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> dict(result)
{'Mark': 'Watney', 'Melissa': 'Lewis', 'Alex': 'Vogel'}
>>> roles = ['botanist', 'commander', 'chemist']
>>> names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']
>>> dict(zip(roles, names))  
{'botanist': 'Mark Watney',
 'commander': 'Melissa Lewis',
 'chemist': 'Alex Vogel'}

zip() stops at the shortest iterable:

>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis')]
>>> roles = ['botanist', 'commander', 'chemist']
>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(roles, firstnames, lastnames)
>>>
>>> next(result)
('botanist', 'Mark', 'Watney')
>>> next(result)
('commander', 'Melissa', 'Lewis')
>>> next(result)
('chemist', 'Alex', 'Vogel')
>>> next(result)
Traceback (most recent call last):
StopIteration
>>> roles = ['botanist', 'commander', 'chemist']
>>> names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']
>>>
>>> for role, name in zip(roles, names):
...     print(f'{role} -> {name}')
botanist -> Mark Watney
commander -> Melissa Lewis
chemist -> Alex Vogel

3.2.5. Map

  • map(callable, *iterables)

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> isgeneratorfunction(map)
False
>>>
>>> result = map(float, [1,2,3])
>>> isgenerator(result)
False
>>> data = [1, 2, 3]
>>> result = map(float, data)
>>>
>>> next(result)
1.0
>>> next(result)
2.0
>>> next(result)
3.0
>>> next(result)
Traceback (most recent call last):
StopIteration
>>> data = [1, 2, 3]
>>> result = map(float, data)
>>>
>>> list(result)
[1.0, 2.0, 3.0]
>>> def square(x):
...     return x ** 2
...
>>> data = [1, 2, 3]
>>> result = map(square, data)
>>>
>>> list(result)
[1, 4, 9]
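
The signature `map(callable, *iterables)` allows more than one iterable, although the examples above pass only one. A brief sketch of the multi-iterable form follows; `a`, `b`, `sums` and `powers` are illustrative names.

```python
from operator import add

a = [1, 2, 3]
b = [10, 20, 30]

# With two iterables, the callable receives one item from each
sums = list(map(add, a, b))

# Like zip(), map() stops when the shortest iterable is exhausted
powers = list(map(pow, [2, 3, 4], [3, 2]))
```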

3.2.6. Filter

  • filter(callable, iterable)

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> isgeneratorfunction(filter)
False
>>>
>>> result = filter(even, [1,2,3])
>>> isgenerator(result)
False
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, data)
>>>
>>> next(result)
2
>>> next(result)
4
>>> next(result)
6
>>> next(result)
Traceback (most recent call last):
StopIteration
>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, data)
>>>
>>> list(result)
[2, 4, 6]
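
Two related variations may be useful here: passing `None` instead of a callable, and `itertools.filterfalse()` for the complementary selection. The sketch below reuses `even()` from above; the names `truthy` and `odd` are illustrative.

```python
from itertools import filterfalse


def even(x):
    return x % 2 == 0


data = [0, 1, 2, 3, 4, 5, 6]

# None as the first argument keeps only truthy items (0 is dropped)
truthy = list(filter(None, data))

# filterfalse() keeps the items for which the callable returns False
odd = list(filterfalse(even, data))
```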

Performance:

>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6]
>>> 
... %%timeit -r 1000 -n 1000
... result = [x for x in data if even(x)]
1.11 µs ± 139 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> 
... %%timeit -r 1000 -n 1000
... result = list(filter(even, data))
921 ns ± 112 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)

3.2.7. Performance

>>> def even(x):
...     return x % 2 == 0
>>> 
... %%timeit -r 1000 -n 1000
... result = [float(x) for x in data if even(x)]
1.9 µs ± 206 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
>>> 
... %%timeit -r 1000 -n 1000
... result = list(map(float, filter(even, data)))
1.66 µs ± 175 ns per loop (mean ± std. dev. of 1000 runs, 1,000 loops each)
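
The `%%timeit` magic is available only in IPython and Jupyter. Outside of them, a comparable measurement can be sketched with the standard `timeit` module. The helper names `with_comprehension` and `with_builtins` are illustrative, and the absolute timings will differ between machines and Python versions.

```python
from timeit import timeit


def even(x):
    return x % 2 == 0


data = [1, 2, 3, 4, 5, 6]


def with_comprehension():
    return [float(x) for x in data if even(x)]


def with_builtins():
    return list(map(float, filter(even, data)))


# Both variants should produce the same result before timing them
assert with_comprehension() == with_builtins() == [2.0, 4.0, 6.0]

# Total seconds for `number` calls; values depend on the machine
t_comprehension = timeit(with_comprehension, number=10_000)
t_builtins = timeit(with_builtins, number=10_000)
```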

3.2.8. Use Case - 0x01

  • Increment

>>> def increment(x):
...     return x + 1
>>>
>>>
>>> data = [1, 2, 3, 4]
>>> result = map(increment, data)
>>>
>>> list(result)
[2, 3, 4, 5]

3.2.9. Use Case - 0x02

  • Translate

>>> PL = {'ą': 'a', 'ć': 'c', 'ę': 'e',
...       'ł': 'l', 'ń': 'n', 'ó': 'o',
...       'ś': 's', 'ż': 'z', 'ź': 'z'}
>>>
>>> def translate(letter):
...     return PL.get(letter, letter)
>>>
>>>
>>> text = 'zażółć gęślą jaźń'
>>> result = map(translate, text)
>>> ''.join(result)
'zazolc gesla jazn'
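
For character-by-character replacement, an alternative to `map()` is the built-in `str.translate()` method with a table created by `str.maketrans()`. This is a different technique from the one shown above, sketched here only for comparison; the name `table` is illustrative.

```python
PL = {'ą': 'a', 'ć': 'c', 'ę': 'e',
      'ł': 'l', 'ń': 'n', 'ó': 'o',
      'ś': 's', 'ż': 'z', 'ź': 'z'}

# str.maketrans accepts a dict whose keys are single characters
table = str.maketrans(PL)

text = 'zażółć gęślą jaźń'
result = text.translate(table)
```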

3.2.10. Use Case - 0x03

  • Compare

>>> people = [
...     {'age': 21, 'name': 'Pan Twardowski'},
...     {'age': 25, 'name': 'Mark Watney'},
...     {'age': 18, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def adult(person):
...     return person['age'] >= 21
>>>
>>>
>>> result = filter(adult, people)
>>> list(result)  
[{'age': 21, 'name': 'Pan Twardowski'},
 {'age': 25, 'name': 'Mark Watney'}]

3.2.11. Use Case - 0x04

  • Bool

>>> people = [
...     {'is_astronaut': False, 'name': 'Pan Twardowski'},
...     {'is_astronaut': True, 'name': 'Mark Watney'},
...     {'is_astronaut': True, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def astronaut(person):
...     return person['is_astronaut']
>>>
>>>
>>> result = filter(astronaut, people)
>>> list(result)  
[{'is_astronaut': True, 'name': 'Mark Watney'},
 {'is_astronaut': True, 'name': 'Melissa Lewis'}]

3.2.12. Use Case - 0x05

  • Contains

>>> astronauts = ['Mark Watney', 'Melissa Lewis']
>>> people = ['Mark Watney', 'Melissa Lewis', 'Jimenez']
>>>
>>>
>>> def is_astronaut(person):
...     return person in astronauts
>>>
>>>
>>> result = filter(is_astronaut, people)
>>> list(result)
['Mark Watney', 'Melissa Lewis']
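
When the collection being searched is large, storing it in a `set` may be preferable, because set membership tests take constant time on average while list membership tests scan the list. A sketch of the same use case with that one change:

```python
# A set instead of a list: the `in` operator does not scan every element
astronauts = {'Mark Watney', 'Melissa Lewis'}
people = ['Mark Watney', 'Melissa Lewis', 'Jimenez']


def is_astronaut(person):
    return person in astronauts


# filter() preserves the order of `people`, not of the set
result = list(filter(is_astronaut, people))
```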

3.2.13. Use Case - 0x06

  • Add numbers from stdin

SetUp:

>>> import sys

Definition:

>>> print(sum(map(int, sys.stdin)))  

Usage:

$ cat ~/.profile |grep addnum
alias addnum='python -c"import sys; print(sum(map(int, sys.stdin)))"'
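
The one-liner works because iterating over `sys.stdin` yields one line at a time and `int()` ignores surrounding whitespace. A blank line, however, would raise `ValueError`. The sketch below simulates standard input with `io.StringIO` (an assumption made so the example is self-contained) and drops blank lines before converting; `stream`, `lines` and `total` are illustrative names.

```python
from io import StringIO

# Stand-in for sys.stdin: a text stream that yields one line at a time
stream = StringIO('1\n2\n\n3\n')

# str.strip returns '' (falsy) for blank lines, so filter() drops them;
# the lines that pass through are the original, unstripped strings
lines = filter(str.strip, stream)

# int() tolerates the trailing newline on each remaining line
total = sum(map(int, lines))
```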

3.2.14. Use Case - 0x07

SetUp:

>>> from doctest import testmod as run_tests

Data [2]:

>>> DATA = """150,4,setosa,versicolor,virginica
... 5.1,3.5,1.4,0.2,0
... 7.0,3.2,4.7,1.4,1
... 6.3,3.3,6.0,2.5,2
... 4.9,3.0,1.4,0.2,0
... 6.4,3.2,4.5,1.5,1
... 5.8,2.7,5.1,1.9,2"""

Definition:

>>> def get_labelencoder(header: str) -> dict[int, str]:
...     """
...     >>> get_labelencoder('150,4,setosa,versicolor,virginica')
...     {0: 'setosa', 1: 'versicolor', 2: 'virginica'}
...     """
...     nrows, nfeatures, *class_labels = header.split(',')
...     return dict(enumerate(class_labels))
>>>
>>> run_tests()  
TestResults(failed=0, attempted=1)
>>> header, *lines = DATA.splitlines()
>>> label_encoder = get_labelencoder(header)
>>>
>>> def get_data(line: str) -> tuple:
...     """
...     >>> get_data('5.1,3.5,1.4,0.2,0')
...     (5.1, 3.5, 1.4, 0.2, 'setosa')
...     >>> get_data('7.0,3.2,4.7,1.4,1')
...     (7.0, 3.2, 4.7, 1.4, 'versicolor')
...     >>> get_data('6.3,3.3,6.0,2.5,2')
...     (6.3, 3.3, 6.0, 2.5, 'virginica')
...     """
...     *values, species = line.split(',')
...     values = map(float, values)
...     species = label_encoder[int(species)]
...     return tuple(values) + (species,)
>>>
>>> run_tests()  
TestResults(failed=0, attempted=3)
>>> result = map(get_data, lines)
>>> list(result)  
[(5.1, 3.5, 1.4, 0.2, 'setosa'),
 (7.0, 3.2, 4.7, 1.4, 'versicolor'),
 (6.3, 3.3, 6.0, 2.5, 'virginica'),
 (4.9, 3.0, 1.4, 0.2, 'setosa'),
 (6.4, 3.2, 4.5, 1.5, 'versicolor'),
 (5.8, 2.7, 5.1, 1.9, 'virginica')]
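
The conversion function above reads `label_encoder` from the enclosing scope. One possible variation, sketched below under the assumption that an explicit argument is preferred, binds the mapping with `functools.partial()`. The function `get_row` and the parameter `labels` are hypothetical names introduced for this sketch, and the data is shortened to two rows.

```python
from functools import partial

DATA = """150,4,setosa,versicolor,virginica
5.1,3.5,1.4,0.2,0
7.0,3.2,4.7,1.4,1"""


def get_labelencoder(header):
    nrows, nfeatures, *class_labels = header.split(',')
    return dict(enumerate(class_labels))


def get_row(line, labels):
    # `labels` is passed in explicitly instead of being read from a global
    *values, species = line.split(',')
    return tuple(map(float, values)) + (labels[int(species)],)


header, *lines = DATA.splitlines()
labels = get_labelencoder(header)

# partial() fixes the `labels` argument, leaving a one-argument callable
result = list(map(partial(get_row, labels=labels), lines))
```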

3.2.15. References

[1] Python Core Developers. Built-in Functions. Year: 2022. Retrieved: 2022-06-28. URL: https://docs.python.org/3/library/functions.html#zip

[2] Scikit-learn Contributors. Iris Dataset. Year: 2022. Retrieved: 2022-12-19. URL: https://raw.githubusercontent.com/scikit-learn/scikit-learn/main/sklearn/datasets/data/iris.csv

3.2.16. Assignments

Code 3.32. Solution
"""
* Assignment: Generator Builtin Chain
* Complexity: easy
* Lines of code: 5 lines
* Time: 5 min

English:
    1. Use generator expression to create `result`
    2. Use `range()` to get numbers from 0 (inclusive) to 10 (exclusive)
    3. Use `filter()` to get odd numbers from `result`
       (and assign to `result`)
    4. Use `map()` to cube all numbers in `result`
    5. Create `result: float` with arithmetic mean of `result`
    6. Do not use `lambda` expressions
    7. Note that you are working on a single data stream the whole time
    8. Run doctests - all must succeed

Hints:
    * type cast to `list()` to expand generator before calculating mean
    * `mean = sum(...) / len(...)`
    * TypeError: object of type 'map' has no len()
    * ZeroDivisionError: division by zero

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from inspect import isfunction

    >>> isfunction(odd)
    True
    >>> isfunction(cube)
    True
    >>> type(result) is float
    True
    >>> result
    245.0
"""

def odd(x):
    return x % 2 == 1


def cube(x):
    return x ** 3


# Range from 0 to 10 (exclusive)
# Filter odd numbers
# Cube result
# Calculate mean
# type: float
result = ...