# 3. Generators

## 3.1. Lazy evaluation

• Code do not execute instantly

• Sometimes code is not executed at all!

### 3.1.1. Declaring generators

• `range()` requires `int` arguments

Listing 290. This will not execute code!
```range(0, 5)
range(0, 5)
range(0, 5)
```
Listing 291. This will only create generator expression, but not execute it!
```numbers = range(0, 5)

print(numbers)
# range(0, 5)
```

### 3.1.2. Getting values from generator

• Get all values from generator (not very efficient)

```numbers = range(0, 5)

list(numbers)
# [0, 1, 2, 3, 4]
```
• Generator will calculate next number for every loop iteration, forgetting previous number, and not knowing next one

```for i in range(0, 5):
print(i)

# 0
# 1
# 2
# 3
# 4
```
• Will generate only three numbers, and then stop and forget generator

```for i in range(0, 5):
print(i)

if i == 3:
break

# 0
# 1
# 2
# 3
```

## 3.2. Generator expressions vs. Comprehensions

### 3.2.1. Comprehensions

• Executes instantly

```list(x for x in range(0, 5))        # [0, 1, 2, 3, 4]
[x for x in range(0, 5)]            # [0, 1, 2, 3, 4]
```
```set(x for x in range(0, 5))         # {0, 1, 2, 3, 4}
{x for x in range(0, 5)}            # {0, 1, 2, 3, 4}
```
```{x: x for x in range(0, 5)}         # {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
```
```tuple(x for x in range(0, 5))       # (0, 1, 2, 3, 4)
(x for x in range(0, 5))            # <generator object <genexpr> at 0x1197032a0>
```
```all(x for x in range(0, 5))         # False
any(x for x in range(0, 5))         # True
sum(x for x in range(0, 5))         # 10
```

### 3.2.2. Generator Expressions

• Lazy evaluation

```(x for x in range(0, 5))
# <generator object <genexpr> at 0x1197032a0>
```

### 3.2.3. What is the difference?

• Execution and assignment

```numbers = [x for x in range(0, 5)]

print(numbers)
# [0, 1, 2, 3, 4]

print(numbers)
# [0, 1, 2, 3, 4]
```
• Create generator object and assign pointer (do not execute)

```numbers = (x for x in range(0, 5))

print(numbers)
# <generator object <genexpr> at 0x111e7acd0>

print(list(numbers))
# [0, 1, 2, 3, 4]

print(list(numbers))
# []
```

### 3.2.4. Which one is better?

• Comprehensions - Using values more than one

• Generators - Using value one (for example in the loop iterator)

## 3.3. Conditions

```[x for x in range(0, 5) if x % 2 == 0]
# [0, 2, 4]
```
```def is_even(x):
if x % 2 == 0:
return True
else:
return False

[x for x in range(0, 5) if is_even(x)]
# [0, 2, 4]
```

## 3.4. Returning nested objects

Listing 292. Returning nested objects
```def my_function(number):
return number, number+10

[my_function(x) for x in range(0, 5)]
# [
#   (0, 10),
#   (1, 11),
#   (2, 12),
#   (3, 13),
#   (4, 14)
# ]
```
Listing 293. Returning nested objects
```def my_function(number):
if number % 2 == 0:
return {'number': number, 'status': 'even'}
else:
return {'number': number, 'status': 'odd'}

[my_function(x) for x in range(0, 5)]
# [
#    {'number': 0, 'status': 'even'},
#    {'number': 1, 'status': 'odd'},
#    {'number': 2, 'status': 'even'},
#    {'number': 3, 'status': 'odd'},
#    {'number': 4, 'status': 'even'},
# ]
```

### 3.4.1. Nested Comprehensions

```DATA = [
{'last_name': 'Jiménez'},
{'first_name': 'Mark', 'last_name': 'Watney'},
{'first_name': 'Иван'},
{'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
{'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

fieldnames = set()
fieldnames.update(key for record in DATA for key in record.keys())
```
```DATA = [
{'last_name': 'Jiménez'},
{'first_name': 'Mark', 'last_name': 'Watney'},
{'first_name': 'Иван'},
{'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
{'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

fieldnames = set()
fieldnames.update(key
for record in DATA
for key in record.keys()
)
```

## 3.5. `yield` Operator

```# ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
DATA = [
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(5.4, 3.9, 1.7, 0.4, 'setosa'),
(4.6, 3.4, 1.4, 0.3, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(5.7, 2.8, 4.5, 1.3, 'versicolor'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 3.3, 6.0, 2.5, 'virginica'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(4.9, 2.5, 4.5, 1.7, 'virginica'),
]
```
```def get_species(species):
output = []

for record in DATA:
if record[4] == species:
output.append(record)

return output

data = get_species('setosa')

print(data)
# [(5.1, 3.5, 1.4, 0.2, 'setosa'),
#  (4.9, 3.0, 1.4, 0.2, 'setosa'),
#  (5.4, 3.9, 1.7, 0.4, 'setosa'),
#  (4.6, 3.4, 1.4, 0.3, 'setosa')]

for row in data:
print(row)
# (5.1, 3.5, 1.4, 0.2, 'setosa')
# (4.9, 3.0, 1.4, 0.2, 'setosa')
# (5.4, 3.9, 1.7, 0.4, 'setosa')
# (4.6, 3.4, 1.4, 0.3, 'setosa')
```
```def get_species(species):
for record in DATA:
if record[4] == species:
yield record

data = get_species('setosa')

print(data)
# <generator object get_species at 0x11af257c8>

for row in data:
print(row)
# (5.1, 3.5, 1.4, 0.2, 'setosa')
# (4.9, 3.0, 1.4, 0.2, 'setosa')
# (5.4, 3.9, 1.7, 0.4, 'setosa')
# (4.6, 3.4, 1.4, 0.3, 'setosa')
```

## 3.6. Example

### 3.6.1. Filtering `list` items

```DATA = [
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(5.4, 3.9, 1.7, 0.4, 'setosa'),
(4.6, 3.4, 1.4, 0.3, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(5.7, 2.8, 4.5, 1.3, 'versicolor'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 3.3, 6.0, 2.5, 'virginica'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(4.9, 2.5, 4.5, 1.7, 'virginica'),
]

setosa = [row for row in DATA if row[4] == 'setosa']
print(setosa)
```

### 3.6.2. Filtering `dict` items

```DATA = [
{'first_name': 'Иван', 'last_name': 'Иванович', 'agency': 'Roscosmos'},
{'first_name': 'Jose', 'last_name': 'Jimenez', 'agency': 'NASA'},
{'first_name': 'Melissa', 'last_name': 'Lewis', 'agency': 'NASA'},
{'first_name': 'Alex', 'last_name': 'Vogel', 'agency': 'ESA'},
{'first_name': 'Mark', 'last_name': 'Watney', 'agency': 'NASA'},
]

nasa_astronauts = [(x['first_name'], x['last_name'])
for x in DATA if x['agency'] == 'NASA']
# [
#   ('Jose', 'Jimenez'),
#   ('Melissa', 'Lewis'),
#   ('Mark', 'Watney')
# ]
```

### 3.6.3. Reversing `dict` keys with values

```data = {'first_name': 'Jan', 'last_name': 'Twardowski'}

{v: k for k, v in data.items()}
# {'Jan': 'first_name', 'Twardowski': 'last_name'}
```

Listing 294. Clean Code in generator
```DATA = {'username': 'Иван Иванович', 'agency': 'Roscosmos'}

def asd(x):
return x.replace('Иван', 'Ivan')

output = {
value: asd(value)
for key, value in DATA.items()
}
print(output)
# {'Иван Иванович': 'Ivan Ivanоvic'}

output = ['CCCP' if k == 'Roscosmos' else 'USA' for k,v in DATA.items() if k == 'agency']
print(output)
# ['USA']
```
```DATA = [
{'last_name': 'Jiménez'},
{'first_name': 'Mark', 'last_name': 'Watney'},
{'first_name': 'Иван'},
{'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
{'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

[asd(value)

for d in DATA
for key, value in d.items()

]
```
```DATA = [
{'first_name': 'Иван', 'last_name': 'Иванович', 'agency': 'Roscosmos'},
{'first_name': 'Jose', 'last_name': 'Jimenez', 'agency': 'NASA'},
{'first_name': 'Melissa', 'last_name': 'Lewis', 'agency': 'NASA'},
{'first_name': 'Alex', 'last_name': 'Vogel', 'agency': 'ESA'},
{'first_name': 'Mark', 'last_name': 'Watney', 'agency': 'NASA'},
]

nasa_astronauts = [(astronaut['first_name'], astronaut['last_name']) for astronaut in DATA if astronaut['agency'] == 'NASA']
# [
#   ('Jose', 'Jimenez'),
#   ('Melissa', 'Lewis'),
#   ('Mark', 'Watney')
# ]
```

## 3.8. Built-in generators

```header = ['a', 'b', 'c']
data = [1, 2, 3]
output = {}

value = data[i]
output[key] = value

print(output)
# {'a': 1, 'b': 2, 'c': 3}
```

### 3.8.1. `zip()`

```header = ['a', 'b', 'c']
data = [1, 2, 3]

# <zip object at 0x11cf54b90>

# [('a', 1), ('b', 2), ('c', 3)]

# {'a': 1, 'b': 2, 'c': 3}

# (('a', 1), ('b', 2), ('c', 3))
```

### 3.8.2. `map()`

```map(float, [1, 2, 3])
# <map object at 0x11d15a190>

list(map(float, [1, 2, 3]))
# [1.0, 2.0, 3.0]

tuple(map(float, [1, 2, 3]))
# (1.0, 2.0, 3.0)
```
```data = [1, 2, 3]

tuple(map(float, data))
# (1.0, 2.0, 3.0)
```

### 3.8.3. `filter()`

```def czy_parzysty(x):
if x % 2 == 0:
return True
else:
return False

filter(czy_parzysty, data)
# <filter object at 0x11d182990>

list(filter(czy_parzysty, data))
# [2]
```

## 3.9. Assignments

### 3.9.1. Generators vs. Comprehensions - iris

English

Todo

English translation

Polish
1. Zapisz dane `data/iris.csv` do pliku "generator_iris.csv"

2. Zaczytaj dane pomijając nagłówek

3. Napisz funkcję która zwraca wszystkie pomiary dla danego gatunku

4. Gatunek będzie podawany jako `str` do funkcji

5. Zaimplementuj rozwiązanie wykorzystując zwykłą funkcję

6. Zaimplementuj rozwiązanie wykorzystując generator i słówko kluczowe `yield`

7. Porównaj wyniki jednego i drugiego rozwiązania przez użycie `sys.getsizeof()`

The whys and wherefores
• Wykorzystanie generatorów

• Odbieranie danych z lazy evaluation

• Porównanie wielkości struktur danych

• Parsowanie pliku

• Filtrowanie treści w locie

### 3.9.2. Generators vs. Comprehensions - passwd

• Complexity level: medium

• Lines of code to write: 40 lines

• Estimated time of completion: 20 min

English

Todo

English translation

Polish
1. Napisz program, który wczyta plik z danymi wejśiowymi (patrz poniżej)

2. Przefiltruj linie, tak aby nie zawierały komentarzy (zaczynające się od `#`) oraz pustych linii

3. Przefiltruj linie, aby wyciągnąć konta systemowe - użytkowników, którzy mają UID (trzecie pole) mniejsze niż 1000

4. Zwróć listę loginów użytkowników systemowych

5. Zaimplementuj rozwiązanie wykorzystując zwykłą funkcję

6. Zaimplementuj rozwiązanie wykorzystując generator i słówko kluczowe `yield`

7. Porównaj wyniki jednego i drugiego rozwiązania przez użycie `sys.getsizeof()`

8. Dlaczego różnice są tak niewielkie?

9. Co się stanie, gdy ilość danych się zwiększy?

The whys and wherefores
• Wykorzystanie generatorów

• Odbieranie danych z lazy evaluation

• Porównanie wielkości struktur danych

• Parsowanie pliku

• Filtrowanie treści w locie

Input
```##
# User Database
#   - User name
#   - User ID number (UID)
#   - User's group ID number (GID)
#   - Full name of the user (GECOS)
#   - User home directory
##

root:x:0:0:root:/root:/bin/bash