5.7. Filter

5.7.1. Rationale

  • Select the elements of an iterable for which a callable returns a truthy value

  • Returns a lazily evaluated iterator

  • Built-in function

Syntax:

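  • filter(callable, iterable)

  • required callable - function called with each element; elements for which it returns a truthy value are kept (None keeps the truthy elements themselves)

  • required iterable - exactly one sequence or iterator object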

>>> def even(x):
...     return x % 2 == 0
>>>
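>>> # Named function: generator expression vs. filter()
>>> result = (x for x in range(0, 5) if even(x))
>>> result = filter(even, range(0, 5))
>>>
>>> # Inline condition: generator expression vs. filter() with lambda
>>> result = (x for x in range(0, 5) if x % 2 == 0)
>>> result = filter(lambda x: x % 2 == 0, range(0, 5))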
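
A minimal sketch of how the built-in treats its arguments: it takes one function and exactly one iterable, and `None` in place of the function keeps only the truthy elements. The sample `values` list is an illustrative assumption, not data from this chapter.

```python
def even(x):
    return x % 2 == 0


# filter() takes one function and exactly one iterable
result = list(filter(even, range(0, 5)))
print(result)  # [0, 2, 4]

# Passing None as the function keeps only the truthy elements
values = [0, 1, '', 'a', None, [], [0]]  # illustrative sample data
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [0]]

# A second iterable is rejected with TypeError, unlike with map()
error_name = None
try:
    filter(even, [1, 2], [3, 4])
except TypeError as error:
    error_name = type(error).__name__
print(error_name)  # TypeError
```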

5.7.2. Problem

Plain code:

>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = []
>>>
>>> for x in DATA:
...     if even(x):
...         result.append(x)
>>>
>>> print(result)
[2, 4, 6]

Comprehension:

>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = [x for x in DATA if even(x)]
>>>
>>> print(result)
[2, 4, 6]

5.7.3. Solution

>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> list(result)
[2, 4, 6]

5.7.4. Lazy Evaluation

>>> def even(x):
...     return x % 2 == 0
>>>
>>>
>>> DATA = [1, 2, 3, 4, 5, 6]
>>> result = filter(even, DATA)
>>>
>>> next(result)
2
>>> next(result)
4
>>> next(result)
6
>>> next(result)
Traceback (most recent call last):
StopIteration
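
To make the laziness visible, the sketch below records every value the predicate is asked about. The `calls` list is a hypothetical helper added only for this illustration.

```python
calls = []  # hypothetical helper: records every value passed to even()


def even(x):
    calls.append(x)
    return x % 2 == 0


DATA = [1, 2, 3, 4, 5, 6]
result = filter(even, DATA)

# Creating the filter object has not called the predicate yet
calls_at_creation = list(calls)
print(calls_at_creation)  # []

# Asking for the first match checks 1 (rejected) and 2 (accepted)
first = next(result)
calls_after_first = list(calls)
print(first, calls_after_first)  # 2 [1, 2]

# The iterator is single-use: list() returns only the remaining matches
rest = list(result)
print(rest)  # [4, 6]
again = list(result)
print(again)  # []
```

Because the object is exhausted after one pass, convert it to a `list` first if the filtered values are needed more than once.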

5.7.5. Use Cases

>>> people = [
...     {'age': 21, 'name': 'Jan Twardowski'},
...     {'age': 25, 'name': 'Mark Watney'},
...     {'age': 18, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def adult(person):
...     return person['age'] >= 21
>>>
>>>
>>> result = filter(adult, people)
>>> list(result)  # doctest: +NORMALIZE_WHITESPACE
[{'age': 21, 'name': 'Jan Twardowski'},
 {'age': 25, 'name': 'Mark Watney'}]

>>> people = [
...     {'is_astronaut': False, 'name': 'Jan Twardowski'},
...     {'is_astronaut': True, 'name': 'Mark Watney'},
...     {'is_astronaut': True, 'name': 'Melissa Lewis'}]
>>>
>>>
>>> def astronaut(person):
...     return person['is_astronaut']
>>>
>>>
>>> result = filter(astronaut, people)
>>> list(result)  # doctest: +NORMALIZE_WHITESPACE
[{'is_astronaut': True, 'name': 'Mark Watney'},
 {'is_astronaut': True, 'name': 'Melissa Lewis'}]

>>> astronauts = ['Mark Watney', 'Melissa Lewis']
>>>
>>> people = ['Jan Twardowski', 'Mark Watney',
...           'Melissa Lewis', 'Jimenez']
>>>
>>>
>>> def is_astronaut(person):
...     return person in astronauts
>>>
>>>
>>> result = filter(is_astronaut, people)
>>> list(result)
['Mark Watney', 'Melissa Lewis']
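
The membership test can also be written inline with a lambda. In this sketch `astronauts` is stored as a set rather than a list; that is an assumption made here for illustration, as is the name `selected`. The order of the output follows `people`, not the set.

```python
astronauts = {'Mark Watney', 'Melissa Lewis'}  # set instead of list (assumption)

people = ['Jan Twardowski', 'Mark Watney',
          'Melissa Lewis', 'Jimenez']

# Keep only the people whose name appears in astronauts
result = filter(lambda person: person in astronauts, people)
selected = list(result)
print(selected)  # ['Mark Watney', 'Melissa Lewis']
```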

5.7.6. Performance

Both statements below only create a lazy object and evaluate no elements, so the timings measure construction cost, not the cost of filtering:

>>> # %%timeit -r 10 -n 100_000
>>> # result = (x for x in range(0,5) if x%2==0)
>>> # 490 ns ± 44 ns per loop (mean ± std. dev. of 10 runs, 100000 loops each)
>>>
>>> # %%timeit -r 10 -n 100_000
>>> # result = filter(lambda x: x%2==0, range(0,5))
>>> # 384 ns ± 34.2 ns per loop (mean ± std. dev. of 10 runs, 100000 loops each)
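
To compare the two approaches including the cost of consuming the values, one option is the standard-library `timeit` module. This is a rough sketch, not a rigorous benchmark: the helper names `with_genexpr` and `with_filter` are hypothetical, and the results depend on the machine and the Python version, so no timings are quoted here.

```python
from timeit import timeit


def with_genexpr():
    # Build the generator expression and consume it fully
    return list(x for x in range(0, 5) if x % 2 == 0)


def with_filter():
    # Build the filter object and consume it fully
    return list(filter(lambda x: x % 2 == 0, range(0, 5)))


# number=100_000 mirrors the loop count used in the measurements above
genexpr_time = timeit(with_genexpr, number=100_000)
filter_time = timeit(with_filter, number=100_000)

print(f'generator expression: {genexpr_time:.4f} s')
print(f'filter with lambda:   {filter_time:.4f} s')
```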

5.7.7. Assignments

Code 5.58. Solution
"""
* Assignment: Idioms Filter Chain
* Complexity: easy
* Lines of code: 10 lines
* Time: 8 min

English:
    1. Use generator expression to create `result`
    2. In generator use `range()` to get numbers from 1 to 33 (inclusive) divisible by 3
    3. Use `filter()` to get odd numbers from `result`
    4. Use `map()` to cube all numbers in `result`
    5. Set `result` with arithmetic mean of `result`
    6. Compare result with "Tests" section (see below)

Hints:
    * convert to `list()` to evaluate the generator before calculating the mean
    * `mean = sum(...) / len(...)`

Tests:
    >>> result
    11502.0
"""


# Given
def odd(x):
    return x % 2


def cube(x):
    return x ** 3


result: float