6.5. Idiom Zip

  • Combine two sequences

  • Lazily evaluated iterator (a zip object, not a generator)

  • zip(*iterables, strict=False)

  • *iterables - zero or more iterables (sequences or iterator objects)

  • Iterate over several iterables in parallel, producing tuples with an item from each one.

The zip object yields n-length tuples, where n is the number of iterables passed as positional arguments to zip(). The i-th element in every tuple comes from the i-th iterable argument to zip(). This continues until the shortest argument is exhausted. If strict is true and one of the arguments is exhausted before the others, raise a ValueError. [2]

>>> from inspect import isgeneratorfunction, isgenerator
>>>
>>>
>>> isgeneratorfunction(zip)
False
>>>
>>> result = zip(['a','b','c'], [1,2,3])
>>> isgenerator(result)
False
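
Both checks return False because zip is a built-in class, not a generator function, and its instances are ordinary iterators. A minimal sketch of how this could be verified, assuming `collections.abc.Iterator` and `inspect.isclass` (neither is used elsewhere in this chapter):

```python
from collections.abc import Iterator
from inspect import isclass

result = zip(['a', 'b', 'c'], [1, 2, 3])

# zip is a class; calling it creates a zip object
zip_is_class = isclass(zip)

# the zip object implements the iterator protocol (__iter__ and __next__)
is_iterator = isinstance(result, Iterator)

# an iterator returns itself from iter(), which is why it is single-use
is_own_iterator = iter(result) is result

print(zip_is_class, is_iterator, is_own_iterator)
```

All three checks should come out True, so a zip object can go anywhere an iterator is accepted, even though it is not a generator.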

6.5.1. Problem

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = []
>>> length = min(len(firstnames), len(lastnames))
>>> i = 0
>>>
>>> while i < length:
...     pair = (firstnames[i], lastnames[i])
...     result.append(pair)
...     i += 1
>>>
>>> result
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = []
>>>
>>> for i in range(min(len(firstnames), len(lastnames))):
...     pair = (firstnames[i], lastnames[i])
...     result.append(pair)
>>>
>>> result
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]

6.5.2. Solution

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]

6.5.3. Lazy Evaluation

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> next(result)
('Mark', 'Watney')
>>> next(result)
('Melissa', 'Lewis')
>>> next(result)
('Alex', 'Vogel')
>>> next(result)
Traceback (most recent call last):
StopIteration
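
One consequence of lazy evaluation is that a zip object can be consumed only once. The sketch below reuses the names from the example above; `first_pass`, `second_pass` and `leftover` are illustrative names, not part of the original material:

```python
firstnames = ['Mark', 'Melissa', 'Alex']
lastnames = ['Watney', 'Lewis', 'Vogel']
result = zip(firstnames, lastnames)

# the first pass consumes all three pairs
first_pass = list(result)

# the iterator is now exhausted, so a second pass yields nothing
second_pass = list(result)

# next() with a default value avoids StopIteration on an exhausted iterator
leftover = next(result, None)

print(first_pass)
print(second_pass)
print(leftover)
```

If the pairs are needed more than once, it is usually simplest to store `list(zip(...))` in a variable, or to call zip() again.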

6.5.4. Generate Dict

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> dict(result)
{'Mark': 'Watney', 'Melissa': 'Lewis', 'Alex': 'Vogel'}

>>> roles = ['botanist', 'commander', 'chemist']
>>> names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']
>>>
>>> dict(zip(roles, names))  # doctest: +NORMALIZE_WHITESPACE
{'botanist': 'Mark Watney',
 'commander': 'Melissa Lewis',
 'chemist': 'Alex Vogel'}
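
Dict keys are unique, so a repeated key in the first iterable silently overwrites the earlier value. A small sketch with hypothetical data, in which the role 'botanist' is deliberately duplicated:

```python
roles = ['botanist', 'commander', 'botanist']
names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']

# zip() yields three pairs, but the dict keeps only two keys:
# the later ('botanist', 'Alex Vogel') pair replaces the earlier one
pairs = list(zip(roles, names))
result = dict(pairs)

print(len(pairs), len(result))
print(result)
```

Comparing `len(pairs)` with `len(result)` is one simple way to detect such collisions.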

6.5.5. Adjusts to the Shortest

  • zip() adjusts to the shortest

>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis')]
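
When the arguments are iterators rather than lists, stopping at the shortest has a side effect: zip() pulls items from left to right, so it may already have taken an item from an earlier iterator before it notices that a later one is exhausted. A sketch of this behavior, with `remaining_a` and `remaining_b` as illustrative names:

```python
lastnames = ['Watney', 'Lewis']

# longer iterator first: 'Alex' is fetched and then discarded
firstnames = iter(['Mark', 'Melissa', 'Alex'])
result_a = list(zip(firstnames, lastnames))
remaining_a = next(firstnames, None)

# shorter iterable first: zip() stops before touching 'Alex'
firstnames = iter(['Mark', 'Melissa', 'Alex'])
result_b = list(zip(lastnames, firstnames))
remaining_b = next(firstnames, None)

print(result_a, remaining_a)
print(result_b, remaining_b)
```

With lists this does not matter, since the lists themselves are not modified; it matters only for one-shot iterators such as file objects or generators.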

6.5.6. Adjust to the Longest

  • itertools.zip_longest(iter1 [,iter2 [...]], [fillvalue=None]) --> zip_longest object

>>> from itertools import zip_longest
>>>
>>>
>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>>
>>> list(zip_longest(firstnames, lastnames))
[('Mark', 'Watney'), ('Melissa', 'Lewis'), (None, 'Vogel')]
>>> list(zip_longest(firstnames, lastnames, fillvalue=''))
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('', 'Vogel')]

6.5.7. Three-way merge

>>> roles = ['botanist', 'commander', 'chemist']
>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(roles, firstnames, lastnames)
>>>
>>> next(result)
('botanist', 'Mark', 'Watney')
>>> next(result)
('commander', 'Melissa', 'Lewis')
>>> next(result)
('chemist', 'Alex', 'Vogel')
>>> next(result)
Traceback (most recent call last):
StopIteration

6.5.8. In For Loop

>>> roles = ['botanist', 'commander', 'chemist']
>>> names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']
>>>
>>> for role, name in zip(roles, names):
...     print(f'{role} -> {name}')
botanist -> Mark Watney
commander -> Melissa Lewis
chemist -> Alex Vogel
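
If a running index is needed as well, zip() can be wrapped in enumerate(). Each item is then an (index, tuple) pair, so the loop target needs nested unpacking. A sketch reusing the data above; the `lines` list is only there to collect the output:

```python
roles = ['botanist', 'commander', 'chemist']
names = ['Mark Watney', 'Melissa Lewis', 'Alex Vogel']

lines = []

# enumerate() yields (index, (role, name)); the parentheses unpack the inner tuple
for i, (role, name) in enumerate(zip(roles, names), start=1):
    lines.append(f'{i}. {role} -> {name}')

print('\n'.join(lines))
```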

6.5.9. Unzip

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>>
>>> list(zip(firstnames, lastnames))
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]
>>>
>>> fname, lname = zip(*zip(firstnames, lastnames))
>>>
>>> print(fname)
('Mark', 'Melissa', 'Alex')
>>> print(lname)
('Watney', 'Lewis', 'Vogel')
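
The same `zip(*...)` pattern transposes any sequence of equal-length rows into columns. A short sketch with hypothetical rows; it also shows that unpacking fails when there is nothing to unzip:

```python
rows = [
    (5.8, 2.7, 'virginica'),
    (5.1, 3.5, 'setosa'),
    (5.7, 2.8, 'versicolor'),
]

# the star passes every row as a separate argument: zip(rows[0], rows[1], rows[2])
columns = list(zip(*rows))
print(columns)

# with no rows, zip() receives no arguments and yields nothing,
# so unpacking into two names raises ValueError
try:
    fname, lname = zip(*[])
    unpack_failed = False
except ValueError:
    unpack_failed = True

print(unpack_failed)
```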

6.5.10. Strict

  • zip(*iterables, strict=False)

  • Since Python 3.10: PEP 618 -- Add Optional Length-Checking To zip [1]

  • Source [2]

zip() adjusts to the shortest:

>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis')]

zip() is often used in cases where the iterables are assumed to be of equal length. In such cases, it's recommended to use the strict=True option. When the lengths match, its output is the same as that of regular zip():

>>> firstnames = ['Mark', 'Melissa', 'Alex']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>> result = zip(firstnames, lastnames, strict=True)
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('Alex', 'Vogel')]

Unlike the default behavior, strict=True checks that the lengths of the iterables are identical and raises a ValueError if they aren't. Because zip() is lazy, the error is raised while the zip object is being consumed, not when it is created:

>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']
>>>
>>> list(zip(firstnames, lastnames, strict=True))
Traceback (most recent call last):
ValueError: zip() argument 2 is longer than argument 1

Without the strict=True argument, any bug that results in iterables of different lengths will be silenced, possibly manifesting as a hard-to-find bug in another part of the program.

6.5.11. Zip Longest

SetUp:

>>> from itertools import zip_longest
>>>
>>> firstnames = ['Mark', 'Melissa']
>>> lastnames = ['Watney', 'Lewis', 'Vogel']

Usage:

>>> result = zip_longest(firstnames, lastnames)
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis'), (None, 'Vogel')]

>>> result = zip_longest(firstnames, lastnames, fillvalue='n/a')
>>>
>>> list(result)
[('Mark', 'Watney'), ('Melissa', 'Lewis'), ('n/a', 'Vogel')]
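
The fill value is most useful when it is neutral for the operation that follows. As an illustration with hypothetical numeric data, `fillvalue=0` allows element-wise addition of lists of different lengths:

```python
from itertools import zip_longest

a = [1, 2, 3]
b = [10, 20]

# the missing third element of `b` is replaced with 0, so 3 + 0 == 3
totals = [x + y for x, y in zip_longest(a, b, fillvalue=0)]

print(totals)
```

With the default `fillvalue=None`, the same addition would raise a TypeError on the last pair.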

6.5.12. Use Case - 0x01

>>> for user, address, order in zip(users, addresses, orders):
...     print(f'Get {user} orders... {order}')
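
The loop above relies on `users`, `addresses` and `orders` being defined elsewhere. A self-contained sketch with made-up sample data (all values below are hypothetical and only illustrate the shape of the inputs):

```python
# hypothetical sample data: three parallel lists of equal length
users = ['mwatney', 'mlewis', 'avogel']
addresses = ['Houston, TX', 'Pasadena, CA', 'Cologne, DE']
orders = [['potatoes'], ['disco records'], ['lab glassware', 'sugar']]

lines = []

# each step yields one user together with the matching address and order list
for user, address, order in zip(users, addresses, orders):
    lines.append(f'Get {user} orders... {order}')

print('\n'.join(lines))
```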

6.5.13. References

6.5.14. Assignments

Code 6.21. Solution
"""
* Assignment: Idiom Zip Dict
* Type: class assignment
* Complexity: easy
* Lines of code: 1 line
* Time: 2 min

English:
    1. Define `result: dict`
    2. Assign to `result` a `dict` built from zipped `KEYS` and `VALUES`
    3. Use `zip()`
    4. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: dict`
    2. Przypisz do `result` zzipowane `KEYS` i `VALUES` do `dict`
    3. Użyj `zip()`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `dict()`
    * `zip()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is dict, \
    'Variable `result` has invalid type, should be dict'

    >>> assert all(type(x) is str for x in result.keys()), \
    'All dict keys should be str'

    >>> assert 'sepal_length' in result.keys()
    >>> assert 'sepal_width' in result.keys()
    >>> assert 'petal_length' in result.keys()
    >>> assert 'petal_width' in result.keys()
    >>> assert 'species' in result.keys()

    >>> assert 5.8 in result.values()
    >>> assert 2.7 in result.values()
    >>> assert 5.1 in result.values()
    >>> assert 1.9 in result.values()
    >>> assert 'virginica' in result.values()

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    {'sepal_length': 5.8,
     'sepal_width': 2.7,
     'petal_length': 5.1,
     'petal_width': 1.9,
     'species': 'virginica'}
"""

KEYS = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
VALUES = [5.8, 2.7, 5.1, 1.9, 'virginica']

# Dict with Zipped KEYS and VALUES
# type: dict[str,float|str]
result = ...

Code 6.22. Solution
"""
* Assignment: Idiom Zip List[Dict]
* Complexity: easy
* Lines of code: 2 lines
* Time: 5 min

English:
    1. Define `result: list[dict]`:
    2. Convert `DATA` from `list[tuple]` to `list[dict]`
        a. key - name from the header
        b. value - numerical value or species name
    3. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: list[dict]`:
    2. Przekonwertuj `DATA` z `list[tuple]` do `list[dict]`
        a. klucz - nazwa z nagłówka
        b. wartość - wartość numeryczna lub nazwa gatunku
    3. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * list comprehension
    * `dict()`
    * `zip()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from pprint import pprint

    >>> assert result is not Ellipsis, \
    'Assign result to variable: `result`'
    >>> result = list(result)
    >>> assert type(result) is list, \
    'Result must be a list'
    >>> assert len(result) > 0, \
    'Result cannot be empty'
    >>> assert all(type(element) is dict for element in result), \
    'All elements in result must be a dict'

    >>> pprint(result)
    [{'petal_length': 5.1,
      'petal_width': 1.9,
      'sepal_length': 5.8,
      'sepal_width': 2.7,
      'species': 'virginica'},
     {'petal_length': 1.4,
      'petal_width': 0.2,
      'sepal_length': 5.1,
      'sepal_width': 3.5,
      'species': 'setosa'},
     {'petal_length': 4.1,
      'petal_width': 1.3,
      'sepal_length': 5.7,
      'sepal_width': 2.8,
      'species': 'versicolor'},
     {'petal_length': 5.6,
      'petal_width': 1.8,
      'sepal_length': 6.3,
      'sepal_width': 2.9,
      'species': 'virginica'},
     {'petal_length': 4.5,
      'petal_width': 1.5,
      'sepal_length': 6.4,
      'sepal_width': 3.2,
      'species': 'versicolor'},
     {'petal_length': 1.3,
      'petal_width': 0.2,
      'sepal_length': 4.7,
      'sepal_width': 3.2,
      'species': 'setosa'}]
"""

DATA = [
    ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
]


# Convert DATA from list[tuple] to list[dict]
# type: list[dict[str,float|str]]
result = ...


Code 6.23. Solution
"""
* Assignment: Idiom Zip Dict
* Type: class assignment
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Define `result: zip` with enumerated `DATA`
    2. Recreate `enumerate()` behavior
    3. Use only: `len()`, `range()`, `zip()`
    4. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: zip` z enumerowanym `DATA`
    2. Odtwórz zachowanie `enumerate()`
    3. Użyj tylko: `len()`, `range()`, `zip()`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `zip()`
    * `range()`
    * `len()`

Tests:
    >>> assert type(result) is zip
    >>> next(result)
    (0, 'January')
    >>> next(result)
    (1, 'February')
    >>> next(result)
    (2, 'March')
    >>> next(result)
    (3, 'April')
    >>> next(result)
    Traceback (most recent call last):
    StopIteration
"""

DATA = ['January', 'February', 'March', 'April']

# Define `result: zip` with enumerated `DATA`
# Recreate `enumerate()` behavior
# Use only: `len()`, `range()`, `zip()`
# type: zip
result = ...

Code 6.24. Solution
"""
* Assignment: Idiom Zip Impl
* Complexity: medium
* Lines of code: 8 lines
* Time: 13 min

English:
    1. Write your own implementation of the built-in `zip()` function
    2. Define function `myzip` with parameters:
        a. parameter `a: list | tuple`
        b. parameter `b: list | tuple`
        c. parameter `strict: bool`
    3. Don't validate arguments; assume that:
        a. the user always passes arguments of valid types
        b. each iterable has a length greater than 0
        c. the user passes exactly two iterables: `a`, `b`
    4. Do not use built-in function `zip()`
    5. Run doctests - all must succeed

Polish:
    1. Zaimplementuj własne rozwiązanie wbudowanej funkcji `zip()`
    2. Zdefiniuj funkcję `myzip` z parametrami:
        a. parametr `a: list | tuple`
        b. parametr `b: list | tuple`
        c. parametr `strict: bool`
    3. Nie waliduj argumentów i przyjmij, że użytkownik:
        a. zawsze poda argumenty poprawnych typów
        b. długość iterable będzie większa od 0
        c. użytkownik może podać tylko dwie iterable: `a`, `b`
    4. Nie używaj wbudowanej funkcji `zip()`
    5. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `min()`
    * `len()`
    * `range()`
    * `list.append()`

Tests:
    >>> import sys; sys.tracebacklimit = 0
    >>> from inspect import isfunction
    >>> assert isfunction(myzip)

    >>> list(myzip(['a', 'b', 'c'], [1, 2, 3]))
    [('a', 1), ('b', 2), ('c', 3)]

    >>> dict(myzip(['a', 'b', 'c'], [1, 2, 3]))
    {'a': 1, 'b': 2, 'c': 3}

    >>> dict(myzip(['a', 'b', 'c'], [1, 2, 3, 4]))
    {'a': 1, 'b': 2, 'c': 3}

    >>> dict(myzip(['a', 'b', 'c'], [1, 2, 3], strict=True))
    {'a': 1, 'b': 2, 'c': 3}

    >>> dict(myzip(['a', 'b', 'c'], [1, 2, 3, 4], strict=True))
    Traceback (most recent call last):
    ValueError: zip() argument 2 is longer than argument 1
"""

# Write your own implementation of the built-in `zip()` function
# Define function `myzip` with parameters: `a`, `b`, `strict`
# type: Callable[[Iterable, Iterable, bool], list[tuple]]
def myzip(a, b, strict=False):
    ...