8.9. Loop For Dict
Since Python 3.7: `dict` keeps insertion order
Before Python 3.7: `dict` order is not ensured
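A minimal sketch of what keeping order means in practice, assuming Python 3.7 or newer. The variable names `data` and `order` are illustrative only:

```python
# Assumes Python 3.7+, where dict insertion order is guaranteed.
data = {}
data['Sepal length'] = 5.1
data['Species'] = 'setosa'
data['Petal width'] = 0.2

# Updating an existing key does not move it to the end.
data['Sepal length'] = 5.8

order = list(data)
print(order)
# ['Sepal length', 'Species', 'Petal width']
```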
8.9.1. Iterate
By default `dict` iterates over keys
Suggested variable name: `key`
>>> DATA = {'Sepal length': 5.1,
... 'Sepal width': 3.5,
... 'Petal length': 1.4,
... 'Petal width': 0.2,
... 'Species': 'setosa'}
>>>
>>> for key in DATA:
...     print(key)
Sepal length
Sepal width
Petal length
Petal width
Species
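Because plain iteration yields keys, a value can still be reached by indexing the `dict` with the current key. A short sketch, where the shortened `DATA` and the `result` list are assumptions made for illustration:

```python
DATA = {'Sepal length': 5.1, 'Sepal width': 3.5, 'Species': 'setosa'}

result = []
for key in DATA:
    # Look up the value that belongs to the current key.
    value = DATA[key]
    result.append(f'{key}={value}')

print(result)
# ['Sepal length=5.1', 'Sepal width=3.5', 'Species=setosa']
```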
8.9.2. Iterate Keys
Suggested variable name: `key`
>>> DATA = {'Sepal length': 5.1,
... 'Sepal width': 3.5,
... 'Petal length': 1.4,
... 'Petal width': 0.2,
... 'Species': 'setosa'}
>>>
>>> list(DATA.keys())
['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>>
>>> for key in DATA.keys():
...     print(key)
Sepal length
Sepal width
Petal length
Petal width
Species
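The object returned by `.keys()` is a view, not a copy: it reflects later changes to the `dict` and supports set-like operations. A small sketch, with a shortened `DATA` and a made-up `'Color'` key used only for comparison:

```python
DATA = {'Sepal length': 5.1, 'Species': 'setosa'}

keys = DATA.keys()

# The view follows changes made to the dict after it was created.
DATA['Petal width'] = 0.2
print(list(keys))
# ['Sepal length', 'Species', 'Petal width']

# Key views support set operations such as intersection.
common = keys & {'Species', 'Color'}
print(common)
# {'Species'}
```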
8.9.3. Iterate Values
Suggested variable name: `value`
>>> DATA = {'Sepal length': 5.1,
... 'Sepal width': 3.5,
... 'Petal length': 1.4,
... 'Petal width': 0.2,
... 'Species': 'setosa'}
>>>
>>> list(DATA.values())
[5.1, 3.5, 1.4, 0.2, 'setosa']
>>>
>>> for value in DATA.values():
...     print(value)
5.1
3.5
1.4
0.2
setosa
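Values can be of mixed types, so a loop over `.values()` often needs a type check before doing arithmetic. A sketch that sums only the `float` measurements (the names `measurements` and `total` are illustrative):

```python
DATA = {'Sepal length': 5.1,
        'Sepal width': 3.5,
        'Petal length': 1.4,
        'Petal width': 0.2,
        'Species': 'setosa'}

measurements = []
total = 0.0
for value in DATA.values():
    # Skip the species name, which is a str.
    if isinstance(value, float):
        measurements.append(value)
        total += value

print(measurements)
# [5.1, 3.5, 1.4, 0.2]
```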
8.9.4. Iterate Key-Value Pairs
Suggested variable names: `key`, `value`
Getting `key`, `value` pairs from `dict` items:
>>> DATA = {'Sepal length': 5.1,
... 'Sepal width': 3.5,
... 'Petal length': 1.4,
... 'Petal width': 0.2,
... 'Species': 'setosa'}
>>>
>>>
>>> list(DATA.items())
[('Sepal length', 5.1),
('Sepal width', 3.5),
('Petal length', 1.4),
('Petal width', 0.2),
('Species', 'setosa')]
>>>
>>> for key, value in DATA.items():
...     print(key, '->', value)
Sepal length -> 5.1
Sepal width -> 3.5
Petal length -> 1.4
Petal width -> 0.2
Species -> setosa
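A common slip is to unpack two names while iterating over the `dict` itself instead of `.items()`. The sketch below (shortened `DATA`, illustrative names `message` and `pairs`) shows why that fails and how `.items()` fixes it:

```python
DATA = {'Sepal length': 5.1, 'Species': 'setosa'}

# Iterating the dict itself yields only keys, so Python tries to
# unpack each key (a str) into two names and raises ValueError.
try:
    for key, value in DATA:
        pass
except ValueError as error:
    message = str(error)

# With .items() every element is a (key, value) tuple.
pairs = []
for key, value in DATA.items():
    pairs.append((key, value))

print(pairs)
# [('Sepal length', 5.1), ('Species', 'setosa')]
```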
8.9.5. List of Dicts
Unpacking a `list` of `dict`:
>>> DATA = [{'Sepal length': 5.1, 'Sepal width': 3.5, 'Petal length': 1.4, 'Petal width': 0.2, 'Species': 'setosa'},
... {'Sepal length': 5.7, 'Sepal width': 2.8, 'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
... {'Sepal length': 6.3, 'Sepal width': 2.9, 'Petal length': 5.6, 'Petal width': 1.8, 'Species': 'virginica'}]
>>>
>>> for row in DATA:
...     sepal_length = row['Sepal length']
...     species = row['Species']
...     print(f'{species} -> {sepal_length}')
setosa -> 5.1
versicolor -> 5.7
virginica -> 6.3
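When every field of every row is needed, the two techniques combine into a nested loop: the outer loop walks the `list`, the inner loop walks each row's `.items()`. A sketch with a shortened dataset (the `lines` list is an illustrative name):

```python
DATA = [
    {'Sepal length': 5.1, 'Species': 'setosa'},
    {'Sepal length': 5.7, 'Species': 'versicolor'},
]

lines = []
for row in DATA:
    # Each row is a dict, so .items() gives its key-value pairs.
    for key, value in row.items():
        lines.append(f'{key}: {value}')

print(lines)
# ['Sepal length: 5.1', 'Species: setosa', 'Sepal length: 5.7', 'Species: versicolor']
```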
8.9.6. Generate with Range
`range()`
The Pythonic way is to use `zip()`
Avoid `range(len(...))`: `len()` needs a fully evaluated sequence, so it does not work with generators
Create `dict` from two `list`:
>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i in range(len(header)):
...     key = header[i]
...     value = data[i]
...     result[key] = value
>>>
>>> print(result)
{'Sepal length': 5.1,
'Sepal width': 3.5,
'Petal length': 1.4,
'Petal width': 0.2,
'Species': 'setosa'}
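The index-based approach depends on `len()`, which is one reason it is discouraged. A sketch of what happens when the header is a generator rather than a `list` (the generator expression is an assumption made for illustration):

```python
# A generator produces items lazily and has no length.
header = (name for name in ['Sepal length', 'Sepal width'])

try:
    len(header)
except TypeError as error:
    message = str(error)

print(message)
# object of type 'generator' has no len()
```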
8.9.7. Generate with Enumerate
`enumerate()`
`_` is a regular variable name (not special syntax)
`_` by convention is used when the variable will not be referenced
Create `dict` from two `list`:
>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i, key in enumerate(header):
...     result[key] = data[i]
>>>
>>> print(result)
{'Sepal length': 5.1,
'Sepal width': 3.5,
'Petal length': 1.4,
'Petal width': 0.2,
'Species': 'setosa'}
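Two related details, sketched under the assumption of a shortened `header`: `enumerate()` accepts a `start` argument, and `_` can stand in for a loop variable that is never read. The names `numbered` and `count` are illustrative:

```python
header = ['Sepal length', 'Sepal width', 'Petal length']

# Number the columns starting from 1 instead of the default 0.
numbered = {}
for i, key in enumerate(header, start=1):
    numbered[i] = key

print(numbered)
# {1: 'Sepal length', 2: 'Sepal width', 3: 'Petal length'}

# `_` signals that the loop variable itself is not used.
count = 0
for _ in header:
    count += 1

print(count)
# 3
```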
8.9.8. Generate with Zip
`zip()`
The most Pythonic way
>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for key, value in zip(header, data):
...     result[key] = value
>>>
>>> print(result)
{'Sepal length': 5.1,
'Sepal width': 3.5,
'Petal length': 1.4,
'Petal width': 0.2,
'Species': 'setosa'}
>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = dict(zip(header, data))
>>>
>>> print(result)
{'Sepal length': 5.1,
'Sepal width': 3.5,
'Petal length': 1.4,
'Petal width': 0.2,
'Species': 'setosa'}
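One thing to keep in mind: `zip()` stops at the shortest input, so a missing value silently drops a key. A sketch with a deliberately short `data` list; the explicit length check is just one possible safeguard, not the only one:

```python
header = ['Sepal length', 'Sepal width', 'Species']
data = [5.1, 3.5]  # one value is missing on purpose

# zip() pairs items until the shorter list runs out.
result = dict(zip(header, data))
print(result)
# {'Sepal length': 5.1, 'Sepal width': 3.5}

# A simple guard: compare lengths before zipping.
same_length = len(header) == len(data)
print(same_length)
# False
```

On Python 3.10 and newer, `zip(header, data, strict=True)` raises `ValueError` instead of truncating.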
8.9.9. Assignments
"""
* Assignment: Loop Dict To Dict
* Required: yes
* Complexity: easy
* Lines of code: 3 lines
* Time: 8 min
English:
1. Convert to `result: dict[str, int]`
2. Run doctests - all must succeed
Polish:
1. Przekonwertuj do `result: dict[str, int]`
2. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> result # doctest: +NORMALIZE_WHITESPACE
{'Doctorate': 6,
'Prof-school': 6,
'Masters': 5,
'Bachelor': 5,
'Engineer': 5,
'HS-grad': 4,
'Junior High': 3,
'Primary School': 2,
'Kindergarten': 1}
"""
DATA = {
6: ['Doctorate', 'Prof-school'],
5: ['Masters', 'Bachelor', 'Engineer'],
4: ['HS-grad'],
3: ['Junior High'],
2: ['Primary School'],
1: ['Kindergarten'],
}
# Converted DATA. Note keys are str and values are int!
# type: dict[str,int]
result = ...
"""
* Assignment: Loop Dict To List
* Required: yes
* Complexity: medium
* Lines of code: 4 lines
* Time: 5 min
English:
1. Define `result: list[dict]`:
a. key - name from the header
b. value - measurement or species
2. Run doctests - all must succeed
Polish:
1. Zdefiniuj `result: list[dict]`:
a. klucz - nazwa z nagłówka
b. wartość - wyniki pomiarów lub gatunek
2. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is dict for x in result)
>>> result # doctest: +NORMALIZE_WHITESPACE
[{'Sepal length': 5.8, 'Sepal width': 2.7, 'Petal length': 5.1, 'Petal width': 1.9, 'Species': 'virginica'},
{'Sepal length': 5.1, 'Sepal width': 3.5, 'Petal length': 1.4, 'Petal width': 0.2, 'Species': 'setosa'},
{'Sepal length': 5.7, 'Sepal width': 2.8, 'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
{'Sepal length': 6.3, 'Sepal width': 2.9, 'Petal length': 5.6, 'Petal width': 1.8, 'Species': 'virginica'},
{'Sepal length': 6.4, 'Sepal width': 3.2, 'Petal length': 4.5, 'Petal width': 1.5, 'Species': 'versicolor'},
{'Sepal length': 4.7, 'Sepal width': 3.2, 'Petal length': 1.3, 'Petal width': 0.2, 'Species': 'setosa'}]
"""
DATA = [
('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
]
# Define variable `result` with converted DATA
# type: list[dict]
result = ...
"""
* Assignment: Loop Dict Reverse
* Required: no
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min
English:
1. Use iteration to reverse dict:
that is: swap keys with values, so each value becomes a key
2. Run doctests - all must succeed
Polish:
1. Użyj iterowania do odwrócenia dicta:
to jest: zamień klucze z wartościami i wartości z kluczami
2. Uruchom doctesty - wszystkie muszą się powieść
Hints:
* `dict.items()`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is dict, \
'Variable `result` has invalid type, should be dict'
>>> assert all(type(x) is str for x in result.keys())
>>> assert all(type(x) is int for x in result.values())
>>> assert len(result.keys()) == 3
>>> assert 'virginica' in result.keys()
>>> assert 'setosa' in result.keys()
>>> assert 'versicolor' in result.keys()
>>> assert 0 in result.values()
>>> assert 1 in result.values()
>>> assert 2 in result.values()
>>> result
{'virginica': 0, 'setosa': 1, 'versicolor': 2}
"""
DATA = {
0: 'virginica',
1: 'setosa',
2: 'versicolor'}
# type: dict[str,int]
result = ...
"""
* Assignment: Loop Dict Label Encoder
* Required: no
* Complexity: hard
* Lines of code: 9 lines
* Time: 13 min
English:
1. Define:
a. `features: list[tuple]` - measurements
b. `labels: list[int]` - species
c. `label_encoder: dict[int, str]`
dictionary with encoded (as numbers) species names
2. Separate header from data
3. To encode and decode `labels` (species) we need:
a. Define `label_encoder: dict[int, str]`
b. key - id (incremented integer value)
c. value - species name
4. `label_encoder` must be generated from `DATA`
5. For each row append to `features`, `labels` and `label_encoder`
6. Run doctests - all must succeed
Polish:
1. Zdefiniuj:
a. `features: list[tuple]` - pomiary
b. `labels: list[int]` - gatunki
c. `label_encoder: dict[int, str]`
słownik zakodowanych (jako cyfry) nazw gatunków
2. Odseparuj nagłówek od danych
3. Aby móc zakodować i odkodować `labels` (gatunki) potrzebujesz:
a. Zdefiniuj `label_encoder: dict[int, str]`:
b. key - identyfikator (kolejna liczba całkowita)
c. value - nazwa gatunku
4. `label_encoder` musi być wygenerowany z `DATA`
5. Dla każdego wiersza dodawaj do `features`, `labels` i `label_encoder`
6. Uruchom doctesty - wszystkie muszą się powieść
Hints:
* Reversed lookup dict
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert type(features) is list
>>> assert type(labels) is list
>>> assert type(label_encoder) is dict
>>> assert all(type(x) is tuple for x in features)
>>> assert all(type(x) is int for x in labels)
>>> assert all(type(x) is int for x in label_encoder.keys())
>>> assert all(type(x) is str for x in label_encoder.values())
>>> features # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9),
(5.1, 3.5, 1.4, 0.2),
(5.7, 2.8, 4.1, 1.3),
(6.3, 2.9, 5.6, 1.8),
(6.4, 3.2, 4.5, 1.5),
(4.7, 3.2, 1.3, 0.2)]
>>> labels
[0, 1, 2, 0, 2, 1]
>>> label_encoder # doctest: +NORMALIZE_WHITESPACE
{0: 'virginica',
1: 'setosa',
2: 'versicolor'}
"""
DATA = [
('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
]
# Values from column 0-3 from DATA without header
# type: list[tuple]
features = ...
# Species from column 4 from DATA without header, encoded as int
# type: list[int]
labels = ...
# Lookup dict generated from species names
# type: dict[int,str]
label_encoder = ...