11.5. Loop For Dict
Since Python 3.7: `dict` keeps insertion order
Before Python 3.7: `dict` order is not guaranteed
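A minimal sketch of what keeping order means in practice, assuming Python 3.7 or newer: keys come back in the order they were inserted, and a new key lands at the end. The names `data` and `keys` are illustrative only.

```python
# Keys are returned in insertion order (Python 3.7+)
data = {'species': 'setosa', 'sepal_length': 5.1}
data['petal_width'] = 0.2      # a new key is appended at the end
data['species'] = 'virginica'  # updating a value does not move the key

keys = list(data)
print(keys)
# ['species', 'sepal_length', 'petal_width']
```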
11.5.1. Iterate
By default, iterating over a `dict` yields its keys
Suggested variable name: `key`
>>> DATA = {
... 'sepal_length': 5.1,
... 'sepal_width': 3.5,
... 'petal_length': 1.4,
... 'petal_width': 0.2,
... 'species': 'setosa',
... }
>>>
>>> for obj in DATA:
... print(obj)
sepal_length
sepal_width
petal_length
petal_width
species
11.5.2. Iterate Keys
Suggested variable name: `key`
>>> DATA = {
... 'sepal_length': 5.1,
... 'sepal_width': 3.5,
... 'petal_length': 1.4,
... 'petal_width': 0.2,
... 'species': 'setosa',
... }
>>>
>>> list(DATA.keys())
['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
>>>
>>> for obj in DATA.keys():
... print(obj)
sepal_length
sepal_width
petal_length
petal_width
species
11.5.3. Iterate Values
Suggested variable name: `value`
>>> DATA = {
... 'sepal_length': 5.1,
... 'sepal_width': 3.5,
... 'petal_length': 1.4,
... 'petal_width': 0.2,
... 'species': 'setosa',
... }
>>>
>>> list(DATA.values())
[5.1, 3.5, 1.4, 0.2, 'setosa']
>>>
>>> for obj in DATA.values():
... print(obj)
5.1
3.5
1.4
0.2
setosa
11.5.4. Iterate Key-Value Pairs
Suggested variable names: `key`, `value`
Getting `key`, `value` pairs from `dict.items()`:
>>> DATA = {
... 'sepal_length': 5.1,
... 'sepal_width': 3.5,
... 'petal_length': 1.4,
... 'petal_width': 0.2,
... 'species': 'setosa',
... }
>>>
>>>
>>> list(DATA.items())
[('sepal_length', 5.1),
('sepal_width', 3.5),
('petal_length', 1.4),
('petal_width', 0.2),
('species', 'setosa')]
>>>
>>> for key, value in DATA.items():
... print(key, '->', value)
sepal_length -> 5.1
sepal_width -> 3.5
petal_length -> 1.4
petal_width -> 0.2
species -> setosa
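Having both `key` and `value` in the loop makes it easy to act on one based on the other. The sketch below keeps only the numeric measurements; the name `measurements` and the `isinstance` check are assumptions made for illustration.

```python
DATA = {
    'sepal_length': 5.1,
    'sepal_width': 3.5,
    'petal_length': 1.4,
    'petal_width': 0.2,
    'species': 'setosa',
}

measurements = {}
for key, value in DATA.items():
    # Keep only the float values, skip the species name
    if isinstance(value, float):
        measurements[key] = value

print(measurements)
# {'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4, 'petal_width': 0.2}
```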
11.5.5. List of Dicts
Unpacking a `list` of `dict`:
>>> DATA = [
... {'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4, 'petal_width': 0.2, 'species': 'setosa'},
... {'sepal_length': 5.7, 'sepal_width': 2.8, 'petal_length': 4.1, 'petal_width': 1.3, 'species': 'versicolor'},
... {'sepal_length': 6.3, 'sepal_width': 2.9, 'petal_length': 5.6, 'petal_width': 1.8, 'species': 'virginica'},
... ]
>>>
>>> for row in DATA:
... sepal_length = row['sepal_length']
... species = row['species']
... print(f'{species} -> {sepal_length}')
setosa -> 5.1
versicolor -> 5.7
virginica -> 6.3
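When every field of every row is needed, the two techniques can be combined: an outer loop over the list and an inner loop over `row.items()`. This is only a sketch on a shortened data set; the `lines` list is a hypothetical helper that collects the formatted output.

```python
DATA = [
    {'sepal_length': 5.1, 'species': 'setosa'},
    {'sepal_length': 5.7, 'species': 'versicolor'},
]

lines = []
for row in DATA:                     # each row is a dict
    for key, value in row.items():   # each item is a (key, value) pair
        lines.append(f'{key} -> {value}')

print('\n'.join(lines))
# sepal_length -> 5.1
# species -> setosa
# sepal_length -> 5.7
# species -> versicolor
```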
11.5.6. Generate with Range
`range()`
The Pythonic way is to use `zip()`
Don't use `range(len(...))`: it needs `len()` and indexing, so it does not work with generators
Create a `dict` from two lists:
>>> header = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i in range(len(header)):
... key = header[i]
... value = data[i]
... result[key] = value
>>>
>>> print(result)
{'sepal_length': 5.1,
'sepal_width': 3.5,
'petal_length': 1.4,
'petal_width': 0.2,
'species': 'setosa'}
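The index-based pattern works for lists, but it depends on `len()`, which a generator does not support. The sketch below shows the failure and the `zip()` alternative; the generator `lazy` and the variable `message` are illustrative names, not part of the original example.

```python
header = ['sepal_length', 'sepal_width']
data = [5.1, 3.5]

# A generator has no length, so range(len(...)) fails
lazy = (name for name in header)
message = None
try:
    for i in range(len(lazy)):
        pass
except TypeError as error:
    message = str(error)

print(message)
# object of type 'generator' has no len()

# zip() only needs iteration, so it accepts the generator directly
lazy = (name for name in header)
result = dict(zip(lazy, data))
print(result)
# {'sepal_length': 5.1, 'sepal_width': 3.5}
```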
11.5.7. Generate with Enumerate
`enumerate()`
`_` is a regular variable name (not special syntax)
`_` is used by convention when the variable will not be referenced
Create a `dict` from two lists:
>>> header = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i, key in enumerate(header):
... result[key] = data[i]
>>>
>>> print(result)
{'sepal_length': 5.1,
'sepal_width': 3.5,
'petal_length': 1.4,
'petal_width': 0.2,
'species': 'setosa'}
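The notes above mention `_`, so here is a small sketch of where it might appear: `enumerate()` always yields an index, and when only one half of a pair is needed, the unused half can be bound to `_`. The names `positions` and `values` are assumptions made for illustration.

```python
header = ['sepal_length', 'sepal_width', 'petal_length']

# enumerate() yields (index, element); start=1 begins counting at 1
positions = {}
for i, key in enumerate(header, start=1):
    positions[key] = i

print(positions)
# {'sepal_length': 1, 'sepal_width': 2, 'petal_length': 3}

# The key is not needed here, so by convention it is named `_`
values = []
for _, value in positions.items():
    values.append(value)

print(values)
# [1, 2, 3]
```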
11.5.8. Generate with Zip
`zip()`
The most Pythonic way
>>> header = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for key, value in zip(header, data):
... result[key] = value
>>>
>>> print(result)
{'sepal_length': 5.1,
'sepal_width': 3.5,
'petal_length': 1.4,
'petal_width': 0.2,
'species': 'setosa'}
The same result without an explicit loop:
>>> header = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = dict(zip(header, data))
>>>
>>> print(result)
{'sepal_length': 5.1,
'sepal_width': 3.5,
'petal_length': 1.4,
'petal_width': 0.2,
'species': 'setosa'}
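One thing worth knowing about `zip()`: it stops at the shortest input, so a missing value silently drops a key. The sketch below shows this with a deliberately short `data` list and adds a plain length check as one possible safeguard. On Python 3.10 and newer, `zip(header, data, strict=True)` raises `ValueError` in this situation instead.

```python
header = ['sepal_length', 'sepal_width', 'petal_length']
data = [5.1, 3.5]   # one value is missing on purpose

# zip() stops at the shortest input: 'petal_length' is silently dropped
result = dict(zip(header, data))
print(result)
# {'sepal_length': 5.1, 'sepal_width': 3.5}

# A simple check that detects the mismatch
same_length = len(header) == len(data)
print(same_length)
# False
```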
11.5.9. Assignments
"""
* Assignment: Loop Dict To Dict
* Type: class assignment
* Complexity: easy
* Lines of code: 3 lines
* Time: 8 min
English:
1. Convert to `result: dict[str, int]`
2. Run doctests - all must succeed
Polish:
1. Przekonwertuj do `result: dict[str, int]`
2. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from pprint import pprint
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> pprint(result, sort_dicts=False)
{'Doctorate': 6,
'Prof-school': 6,
'Masters': 5,
'Bachelor': 5,
'Engineer': 5,
'HS-grad': 4,
'Junior High': 3,
'Primary School': 2,
'Kindergarten': 1}
"""
DATA = {
6: ['Doctorate', 'Prof-school'],
5: ['Masters', 'Bachelor', 'Engineer'],
4: ['HS-grad'],
3: ['Junior High'],
2: ['Primary School'],
1: ['Kindergarten'],
}
# Converted DATA. Note keys are str and values are int!
# type: dict[str,int]
result = ...
"""
* Assignment: Loop Dict To List
* Type: class assignment
* Complexity: medium
* Lines of code: 4 lines
* Time: 5 min
English:
1. Define `result: list[dict]`:
a. key - name from the header
b. value - measurement or species
2. Run doctests - all must succeed
Polish:
1. Zdefiniuj `result: list[dict]`:
a. klucz - nazwa z nagłówka
b. wartość - wyniki pomiarów lub gatunek
2. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from pprint import pprint
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is dict for x in result)
>>> pprint(result, width=120, sort_dicts=False)
[{'sepal_length': 5.8, 'sepal_width': 2.7, 'petal_length': 5.1, 'petal_width': 1.9, 'species': 'virginica'},
{'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4, 'petal_width': 0.2, 'species': 'setosa'},
{'sepal_length': 5.7, 'sepal_width': 2.8, 'petal_length': 4.1, 'petal_width': 1.3, 'species': 'versicolor'},
{'sepal_length': 6.3, 'sepal_width': 2.9, 'petal_length': 5.6, 'petal_width': 1.8, 'species': 'virginica'},
{'sepal_length': 6.4, 'sepal_width': 3.2, 'petal_length': 4.5, 'petal_width': 1.5, 'species': 'versicolor'},
{'sepal_length': 4.7, 'sepal_width': 3.2, 'petal_length': 1.3, 'petal_width': 0.2, 'species': 'setosa'}]
"""
DATA = [
('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
]
# Define variable `result` with converted DATA
# type: list[dict]
result = ...
"""
* Assignment: Loop Dict Reverse
* Type: homework
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min
English:
1. Use iteration to reverse dict:
that is: change keys for values and values for keys
2. Run doctests - all must succeed
Polish:
1. Użyj iterowania do odwrócenia dicta:
to jest: zamień klucze z wartościami i wartości z kluczami
2. Uruchom doctesty - wszystkie muszą się powieść
Hints:
* `dict.items()`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> assert all(type(x) is str for x in result.keys())
>>> assert all(type(x) is int for x in result.values())
>>> assert len(result.keys()) == 3
>>> assert 'virginica' in result.keys()
>>> assert 'setosa' in result.keys()
>>> assert 'versicolor' in result.keys()
>>> assert 0 in result.values()
>>> assert 1 in result.values()
>>> assert 2 in result.values()
>>> result
{'virginica': 0, 'setosa': 1, 'versicolor': 2}
"""
DATA = {
0: 'virginica',
1: 'setosa',
2: 'versicolor',
}
# type: dict[str,int]
result = ...
"""
* Assignment: Loop Dict Label Encoder
* Type: homework
* Complexity: hard
* Lines of code: 9 lines
* Time: 13 min
English:
1. Define:
a. `features: list[tuple]` - measurements
b. `labels: list[int]` - species
c. `label_encoder: dict[int, str]`
dictionary with encoded (as numbers) species names
2. Separate header from data
3. To encode and decode `labels` (species) we need:
a. Define `label_encoder: dict[int, str]`
b. key - id (incremented integer value)
c. value - species name
4. `label_encoder` must be generated from `DATA`
5. For each row append to `features`, `labels` and `label_encoder`
6. Run doctests - all must succeed
Polish:
1. Zdefiniuj:
a. `features: list[tuple]` - pomiary
b. `labels: list[int]` - gatunki
c. `label_encoder: dict[int, str]`
słownik zakodowanych (jako cyfry) nazw gatunków
2. Odseparuj nagłówek od danych
3. Aby móc zakodować i odkodować `labels` (gatunki) potrzebujesz:
a. Zdefiniuj `label_encoder: dict[int, str]`:
b. key - identyfikator (kolejna liczba całkowita)
c. value - nazwa gatunku
4. `label_encoder` musi być wygenerowany z `DATA`
5. Dla każdego wiersza dodawaj do `features`, `labels` i `label_encoder`
6. Uruchom doctesty - wszystkie muszą się powieść
Hints:
* Reversed lookup dict
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from pprint import pprint
>>> assert type(features) is list
>>> assert type(labels) is list
>>> assert type(label_encoder) is dict
>>> assert all(type(x) is tuple for x in features)
>>> assert all(type(x) is int for x in labels)
>>> assert all(type(x) is int for x in label_encoder.keys())
>>> assert all(type(x) is str for x in label_encoder.values())
>>> pprint(features)
[(5.8, 2.7, 5.1, 1.9),
(5.1, 3.5, 1.4, 0.2),
(5.7, 2.8, 4.1, 1.3),
(6.3, 2.9, 5.6, 1.8),
(6.4, 3.2, 4.5, 1.5),
(4.7, 3.2, 1.3, 0.2)]
>>> pprint(labels)
[0, 1, 2, 0, 2, 1]
>>> pprint(label_encoder, width=20)
{0: 'virginica',
1: 'setosa',
2: 'versicolor'}
"""
DATA = [
('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
]
# Values from column 0-3 from DATA without header
# type: list[tuple]
features = ...
# Encoded species (ids from `label_encoder`) from column 4 from DATA without header
# type: list[int]
labels = ...
# Lookup dict generated from species names
# type: dict[int,str]
label_encoder = ...