# 1. Generators and Comprehensions

## 1.1. Lazy evaluation

• Code does not execute immediately

• Sometimes code is not executed at all!
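Both points can be observed with a small sketch. The names `square` and `calls` are invented for this illustration; the list only records which arguments were actually processed.

```python
calls = []  # records every argument that square() actually receives


def square(x):
    calls.append(x)
    return x * x


lazy = (square(x) for x in range(0, 3))   # generator expression: nothing runs yet
calls_before = list(calls)                # still empty

first = next(lazy)                        # computes exactly one value
calls_after_first = list(calls)           # only the first argument was processed

remaining = list(lazy)                    # forces the rest of the values
```

If `lazy` were never consumed, `square()` would not be called at all.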

### 1.1.1. Declaring generators

```python
# Nothing is computed here: each call only creates a lazy range object
range(0, 9_999_999)
range(0, 9_999_999)
range(0, 9_999_999)
```
```python
# This only creates a lazy range object; no numbers are produced yet
numbers = range(0, 9_999_999)
print(numbers)
# range(0, 9999999)
```
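Strictly speaking, `range()` returns a lazy sequence rather than a generator, although it is evaluated just as lazily. The sketch below, which assumes CPython, shows that its size in memory does not depend on how many numbers it describes, and that it still knows its length and supports indexing.

```python
import sys

numbers = range(0, 9_999_999)

# A range stores only start, stop and step, so its size does not grow with its length.
small = sys.getsizeof(range(0, 10))
large = sys.getsizeof(numbers)

# Unlike a generator, a range knows its length and supports indexing without iterating.
length = len(numbers)
last = numbers[-1]
```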

### 1.1.2. Getting values from generator

• Get all values at once (not very efficient, because every value is stored in memory)

```python
numbers = range(0, 9_999_999)
list(numbers)
```
• The next number is calculated on every loop iteration; the previous number is forgotten and the following one is not yet known

```python
# range() accepts only integers, so use 10**30 instead of the float 1E30
for i in range(0, 10**30):
    print(i)
```
• Only the first few numbers are produced, not 1,000,000,000,000,000,000,000,000,000,000

```python
for i in range(0, 10**30):
    if i == 3:
        break

    print(i)

# 0
# 1
# 2
```
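Values can also be pulled one at a time without a `for` loop. This is a minimal sketch using the built-in `next()` and `itertools.islice()`; the variable names are chosen only for illustration.

```python
from itertools import islice

numbers = iter(range(0, 10**30))   # iter() gives a one-pass iterator over the range

first = next(numbers)              # request a single value
second = next(numbers)             # the iterator remembers its position

# islice() takes the next three values without touching the rest.
following = list(islice(numbers, 3))
```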

## 1.2. Generator expressions vs. Comprehensions

### 1.2.1. Comprehensions

• Executes immediately

```python
list(x for x in range(0, 5))        # [0, 1, 2, 3, 4]
[x for x in range(0, 5)]            # [0, 1, 2, 3, 4]
```
```python
set(x for x in range(0, 5))         # {0, 1, 2, 3, 4}
{x for x in range(0, 5)}            # {0, 1, 2, 3, 4}
```
```python
{x: x for x in range(0, 5)}         # {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
```
```python
tuple(x for x in range(0, 5))       # (0, 1, 2, 3, 4)
```
```python
all(x for x in range(0, 5))                # False (0 is falsy)
any(x % 5 == 0 for x in range(0, 5))       # True
sum(x*x for x in range(0, 10, 2))          # 120
```

### 1.2.2. Generator Expressions

• Lazy evaluation

```python
(x*x for x in range(0, 30) if x % 2)
# <generator object <genexpr> at 0x1197032a0>
```

### 1.2.3. What is the difference?

• A comprehension executes immediately and the resulting list is assigned

```python
numbers = [x**2 for x in range(0, 30) if x % 2 == 0]

print(numbers)
# [0, 4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784]

print(numbers)
# [0, 4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784]
```
• A generator expression creates a generator object and assigns a reference to it (nothing is executed yet)

```python
numbers = (x**2 for x in range(0, 30) if x % 2 == 0)

print(numbers)
# <generator object <genexpr> at 0x11af5a570>

print(list(numbers))
# [0, 4, 16, 36, 64, 100, 144, 196, 256, 324, 400, 484, 576, 676, 784]

# The generator is exhausted after the first pass
print(list(numbers))
# []
```
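When the values are needed again, one option is to create a fresh generator for each pass. The helper `even_squares()` below is hypothetical and exists only to illustrate this.

```python
def even_squares():
    """Hypothetical helper: returns a fresh generator on every call."""
    return (x**2 for x in range(0, 10) if x % 2 == 0)


numbers = even_squares()
first_pass = list(numbers)         # consumes the generator
second_pass = list(numbers)        # empty, because the generator is already exhausted

fresh_pass = list(even_squares())  # a new generator yields the values again
```

Another option is to store the result in a list once, which trades memory for the ability to iterate repeatedly.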

### 1.2.4. Which one is better?

• Comprehensions - when the values are used more than once

• Generators - when the values are used only once (for example as the iterable of a loop)
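The trade-off can be sketched with `sys.getsizeof()`. Note that for a list it reports only the size of the container, not of the items it holds, so the real difference is larger still. Variable names are illustrative.

```python
import sys

as_list = [x**2 for x in range(0, 100_000)]
as_generator = (x**2 for x in range(0, 100_000))

list_size = sys.getsizeof(as_list)            # grows with the number of items
generator_size = sys.getsizeof(as_generator)  # small, independent of the item count

total = sum(as_generator)   # a single pass, which is all a generator allows
```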

### 1.2.5. Nested Comprehensions

```python
DATA = [
    {'last_name': 'Jiménez'},
    {'first_name': 'Mark', 'last_name': 'Watney'},
    {'first_name': 'Иван'},
    {'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

fieldnames = set()
fieldnames.update(key for record in DATA for key in record.keys())
```
```python
DATA = [
    {'last_name': 'Jiménez'},
    {'first_name': 'Mark', 'last_name': 'Watney'},
    {'first_name': 'Иван'},
    {'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

fieldnames = set()
fieldnames.update(key
                  for record in DATA
                  for key in record.keys()
)
```
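The `for` clauses of a nested comprehension read in the same order as the equivalent nested loops. A sketch for comparison, with a set comprehension as a possible shorter spelling:

```python
DATA = [
    {'last_name': 'Jiménez'},
    {'first_name': 'Mark', 'last_name': 'Watney'},
    {'first_name': 'Иван'},
    {'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

# Equivalent nested loops: the outer loop comes first, the inner loop second.
fieldnames = set()
for record in DATA:
    for key in record:
        fieldnames.add(key)

# The same result as a set comprehension.
same_fieldnames = {key for record in DATA for key in record}
```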

## 1.3. Operator `yield`

```python
# Columns: ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species')
DATA = [
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (5.4, 3.9, 1.7, 0.4, 'setosa'),
    (4.6, 3.4, 1.4, 0.3, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (5.7, 2.8, 4.5, 1.3, 'versicolor'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'virginica'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
]
```
```python
# Regular function: builds the whole list before returning it
def get_species(species):
    output = []

    for record in DATA:
        if record[4] == species:
            output.append(record)

    return output

data = get_species('setosa')

print(data)
# [(5.1, 3.5, 1.4, 0.2, 'setosa'),
#  (4.9, 3.0, 1.4, 0.2, 'setosa'),
#  (5.4, 3.9, 1.7, 0.4, 'setosa'),
#  (4.6, 3.4, 1.4, 0.3, 'setosa')]

for row in data:
    print(row)
# (5.1, 3.5, 1.4, 0.2, 'setosa')
# (4.9, 3.0, 1.4, 0.2, 'setosa')
# (5.4, 3.9, 1.7, 0.4, 'setosa')
# (4.6, 3.4, 1.4, 0.3, 'setosa')
```
```python
# Generator function: yields matching records one at a time
def get_species(species):
    for record in DATA:
        if record[4] == species:
            yield record

data = get_species('setosa')

print(data)
# <generator object get_species at 0x11af257c8>

for row in data:
    print(row)
# (5.1, 3.5, 1.4, 0.2, 'setosa')
# (4.9, 3.0, 1.4, 0.2, 'setosa')
# (5.4, 3.9, 1.7, 0.4, 'setosa')
# (4.6, 3.4, 1.4, 0.3, 'setosa')
```
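What the `for` loop does behind the scenes can be sketched with explicit `next()` calls. The shortened `DATA` below is an assumption made to keep the example small.

```python
# Shortened data set, used only for this illustration
DATA = [
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
]


def get_species(species):
    for record in DATA:
        if record[4] == species:
            yield record


data = get_species('setosa')
first = next(data)     # runs the function body up to the first yield
second = next(data)    # resumes right after that yield

try:
    next(data)         # no more matching records are left
    exhausted = False
except StopIteration:
    exhausted = True   # a for loop catches this exception and simply stops
```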

## 1.4. The whys and wherefores

### 1.4.1. Filtering `list` items

```python
DATA = [
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (5.4, 3.9, 1.7, 0.4, 'setosa'),
    (4.6, 3.4, 1.4, 0.3, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (5.7, 2.8, 4.5, 1.3, 'versicolor'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'virginica'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
]

setosa = [x for x in DATA if x[4] == 'setosa']
print(setosa)
```

### 1.4.2. Filtering `dict` items

```python
DATA = [
    {'first_name': 'Иван', 'last_name': 'Иванович', 'agency': 'Roscosmos'},
    {'first_name': 'Jose', 'last_name': 'Jimenez', 'agency': 'NASA'},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'agency': 'NASA'},
    {'first_name': 'Alex', 'last_name': 'Vogel', 'agency': 'ESA'},
    {'first_name': 'Mark', 'last_name': 'Watney', 'agency': 'NASA'},
]

nasa_astronauts = [(x['first_name'], x['last_name'])
                   for x in DATA if x['agency'] == 'NASA']
# [
#   ('Jose', 'Jimenez'),
#   ('Melissa', 'Lewis'),
#   ('Mark', 'Watney')
# ]
```

### 1.4.3. Reversing `dict` keys with values

```python
data = {'first_name': 'Иван', 'last_name': 'Иванович'}

{v: k for k, v in data.items()}
# {'Иван': 'first_name', 'Иванович': 'last_name'}
```
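Swapping keys and values is lossless only when the values are unique. In the sketch below, the `'nickname'` key is a hypothetical addition that creates a duplicate value; the later key silently replaces the earlier one. Grouping keys in lists is one way around this.

```python
from collections import defaultdict

# 'nickname' is a made-up key that repeats the value of 'first_name'
data = {'first_name': 'Иван', 'nickname': 'Иван', 'last_name': 'Иванович'}

# The later key wins when two keys share the same value.
swapped = {v: k for k, v in data.items()}

# Keeping every key: collect them in a list per value.
grouped = defaultdict(list)
for k, v in data.items():
    grouped[v].append(k)
```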

### 1.4.4. Applying functions

```python
[float(x) for x in range(0, 5) if x % 2 == 0]
# [0.0, 2.0, 4.0]
```
```python
def is_even(x):
    # The comparison already evaluates to True or False
    return x % 2 == 0

[float(x) for x in range(0, 5) if is_even(x)]
# [0.0, 2.0, 4.0]
```

Listing 1.47. Clean Code in generator
```python
DATA = {'username': 'Иван Иванович', 'agency': 'Roscosmos'}

def asd(x):
    return x.replace('Иван', 'Ivan')

out = {
    value: asd(value)
    for key, value in DATA.items()
}

print(out)
# {'Иван Иванович': 'Ivan Ivanович', 'Roscosmos': 'Roscosmos'}

# Compare the value (not the key) with 'Roscosmos'
out = ['CCCP' if value == 'Roscosmos' else 'USA'
       for key, value in DATA.items()
       if key == 'agency']
print(out)
# ['CCCP']
```
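```python
DATA = [
    {'last_name': 'Jiménez'},
    {'first_name': 'Mark', 'last_name': 'Watney'},
    {'first_name': 'Иван'},
    {'first_name': 'Jan', 'last_name': 'Twardowski', 'born': 1961},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'first_step': 1969},
]

# asd() is the helper defined in Listing 1.47; it works only on strings,
# so non-string values such as 1961 are skipped
[asd(value)
 for d in DATA
 for key, value in d.items()
 if isinstance(value, str)]
```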
```python
DATA = [
    {'first_name': 'Иван', 'last_name': 'Иванович', 'agency': 'Roscosmos'},
    {'first_name': 'Jose', 'last_name': 'Jimenez', 'agency': 'NASA'},
    {'first_name': 'Melissa', 'last_name': 'Lewis', 'agency': 'NASA'},
    {'first_name': 'Alex', 'last_name': 'Vogel', 'agency': 'ESA'},
    {'first_name': 'Mark', 'last_name': 'Watney', 'agency': 'NASA'},
]

nasa_astronauts = [(astronaut['first_name'], astronaut['last_name']) for astronaut in DATA if astronaut['agency'] == 'NASA']
# [
#   ('Jose', 'Jimenez'),
#   ('Melissa', 'Lewis'),
#   ('Mark', 'Watney')
# ]
```

## 1.6. Assignments

### 1.6.1. Generators vs. Comprehensions - iris

• Filename: `generator_iris.py`

• Lines of code to write: 40 lines

• Estimated time of completion: 20 min

1. Copy the data to a file named “iris.csv”

2. Read the data, skipping the header

3. Write a function that returns all measurements for a given species

4. The species is passed to the function as a `str`

5. Implement the solution using a regular function

6. Implement the solution using a generator and the `yield` keyword

The whys and wherefores
• Using generators

• Consuming data with lazy evaluation

• Comparing the sizes of data structures

• Parsing a file

• Filtering content on the fly

### 1.6.2. Generators vs. Comprehensions - passwd

• Filename: `generator_passwd.py`

• Lines of code to write: 40 lines

• Estimated time of completion: 20 min

1. Write a program that reads the file from Listing 1.48.

2. Filter the lines so that they contain neither comments (lines starting with `#`) nor empty lines

3. Filter the lines to extract system accounts: users whose UID (third field) is lower than 1000

4. Return a list of the system users' logins

5. Implement the solution using a regular function

6. Implement the solution using a generator and the `yield` keyword

7. Compare the results of both solutions using `sys.getsizeof()`

The whys and wherefores
• Using generators

• Consuming data with lazy evaluation

• Comparing the sizes of data structures

• Parsing a file

• Filtering content on the fly

Listing 1.48. `/etc/passwd` sample file
```##
# User Database
#   - User name
#   - User ID number (UID)
#   - User's group ID number (GID)
#   - Full name of the user (GECOS)
#   - User home directory
##

root:x:0:0:root:/root:/bin/bash