8.9. Loop For Dict

  • Since Python 3.7: dict preserves insertion order (guaranteed by the language)

  • Before Python 3.7: dict order is not guaranteed (in CPython 3.6 it is only an implementation detail)
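
As a minimal sketch of what insertion order means in practice (assuming Python 3.7 or newer; the variable names `data` and `order` are illustrative and not from the original): keys come back in the order they were first inserted, and updating an existing key does not move it.

```python
# Illustrative names; assumes Python 3.7+ where insertion order is guaranteed
data = {}
data['Species'] = 'setosa'
data['Sepal length'] = 5.1
data['Petal width'] = 0.2

# Updating an existing key keeps its original position
data['Species'] = 'virginica'

# Iterating over a dict (here via list()) yields keys in insertion order
order = list(data)
print(order)
```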

8.9.1. Iterate

  • By default, iterating over a dict yields its keys

  • Suggested variable name: key

>>> DATA = {'Sepal length': 5.1,
...         'Sepal width': 3.5,
...         'Petal length': 1.4,
...         'Petal width': 0.2,
...         'Species': 'setosa'}
>>>
>>> for key in DATA:
...     print(key)
Sepal length
Sepal width
Petal length
Petal width
Species
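
Because plain iteration yields keys, the matching value can be looked up inside the loop. The following is only a sketch of that idea; the list `lines` is a hypothetical helper used to collect the output.

```python
DATA = {'Sepal length': 5.1,
        'Sepal width': 3.5,
        'Petal length': 1.4,
        'Petal width': 0.2,
        'Species': 'setosa'}

lines = []  # hypothetical helper collecting formatted rows
for key in DATA:
    value = DATA[key]  # look up the value for the current key
    lines.append(f'{key}: {value}')

print('\n'.join(lines))
```

When both key and value are needed, iterating over `DATA.items()` avoids the extra lookup.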

8.9.2. Iterate Keys

  • Suggested variable name: key

>>> DATA = {'Sepal length': 5.1,
...         'Sepal width': 3.5,
...         'Petal length': 1.4,
...         'Petal width': 0.2,
...         'Species': 'setosa'}
>>>
>>> list(DATA.keys())
['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>>
>>> for key in DATA.keys():
...     print(key)
Sepal length
Sepal width
Petal length
Petal width
Species

8.9.3. Iterate Values

  • Suggested variable name: value

>>> DATA = {'Sepal length': 5.1,
...         'Sepal width': 3.5,
...         'Petal length': 1.4,
...         'Petal width': 0.2,
...         'Species': 'setosa'}
>>>
>>> list(DATA.values())
[5.1, 3.5, 1.4, 0.2, 'setosa']
>>>
>>> for value in DATA.values():
...     print(value)
5.1
3.5
1.4
0.2
setosa
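
Iterating over values is handy when the keys do not matter, for example when aggregating. A possible sketch, assuming only the `float` entries are measurements (the names `measurements` and `total` are illustrative):

```python
DATA = {'Sepal length': 5.1,
        'Sepal width': 3.5,
        'Petal length': 1.4,
        'Petal width': 0.2,
        'Species': 'setosa'}

# Keep only numeric measurements; 'setosa' is a str and is skipped
measurements = []
for value in DATA.values():
    if isinstance(value, float):
        measurements.append(value)

total = sum(measurements)
print(measurements)
print(round(total, 1))  # rounded to hide floating-point noise
```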

8.9.4. Iterate Key-Value Pairs

  • Suggested variable names: key, value

Getting (key, value) pairs from dict items:

>>> DATA = {'Sepal length': 5.1,
...         'Sepal width': 3.5,
...         'Petal length': 1.4,
...         'Petal width': 0.2,
...         'Species': 'setosa'}
>>>
>>> list(DATA.items())  # doctest: +NORMALIZE_WHITESPACE
[('Sepal length', 5.1),
 ('Sepal width', 3.5),
 ('Petal length', 1.4),
 ('Petal width', 0.2),
 ('Species', 'setosa')]
>>>
>>> for key, value in DATA.items():
...     print(key, '->', value)
Sepal length -> 5.1
Sepal width -> 3.5
Petal length -> 1.4
Petal width -> 0.2
Species -> setosa
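
Having both key and value in the loop makes it easy to filter one dict into another. A small sketch under the assumption that only the petal measurements are of interest (`petal` is an illustrative name):

```python
DATA = {'Sepal length': 5.1,
        'Sepal width': 3.5,
        'Petal length': 1.4,
        'Petal width': 0.2,
        'Species': 'setosa'}

petal = {}
for key, value in DATA.items():
    # Copy only the entries whose key starts with 'Petal'
    if key.startswith('Petal'):
        petal[key] = value

print(petal)
```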

8.9.5. List of Dicts

Iterating over a list of dicts:

>>> DATA = [{'Sepal length': 5.1, 'Sepal width': 3.5, 'Petal length': 1.4, 'Petal width': 0.2, 'Species': 'setosa'},
...         {'Sepal length': 5.7, 'Sepal width': 2.8, 'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
...         {'Sepal length': 6.3, 'Sepal width': 2.9, 'Petal length': 5.6, 'Petal width': 1.8, 'Species': 'virginica'}]
>>>
>>> for row in DATA:
...     sepal_length = row['Sepal length']
...     species = row['Species']
...     print(f'{species} -> {sepal_length}')
setosa -> 5.1
versicolor -> 5.7
virginica -> 6.3
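
The same loop can also build a new dict from the rows. The sketch below maps each species to its sepal length; `by_species` is a hypothetical name, and the approach assumes each species appears only once (a repeated species would overwrite the earlier entry).

```python
DATA = [{'Sepal length': 5.1, 'Sepal width': 3.5, 'Petal length': 1.4, 'Petal width': 0.2, 'Species': 'setosa'},
        {'Sepal length': 5.7, 'Sepal width': 2.8, 'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
        {'Sepal length': 6.3, 'Sepal width': 2.9, 'Petal length': 5.6, 'Petal width': 1.8, 'Species': 'virginica'}]

by_species = {}
for row in DATA:
    species = row['Species']
    # A later row with the same species would replace this value
    by_species[species] = row['Sepal length']

print(by_species)
```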

8.9.6. Generate with Range

  • range()

  • The Pythonic way is to use zip()

  • Avoid range(len(...)): it works only for sequences with a known length, not for generators

Create a dict from two lists:

>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i in range(len(header)):
...     key = header[i]
...     value = data[i]
...     result[key] = value
>>>
>>> print(result)  # doctest: +NORMALIZE_WHITESPACE
{'Sepal length': 5.1,
 'Sepal width': 3.5,
 'Petal length': 1.4,
 'Petal width': 0.2,
 'Species': 'setosa'}
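
One weakness of index-based looping is that it breaks when the two lists differ in length. A minimal sketch, assuming a hypothetical malformed row that is one value short:

```python
header = ['Sepal length', 'Sepal width', 'Petal length']
data = [5.1, 3.5]  # hypothetical malformed row: one value is missing

result = {}
error = None
try:
    for i in range(len(header)):
        result[header[i]] = data[i]  # fails at i == 2
except IndexError as exc:
    error = exc

print(result)                # entries stored before the failure
print(type(error).__name__)  # name of the raised exception
```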

8.9.7. Generate with Enumerate

  • enumerate()

  • _ is a regular variable name (not special syntax)

  • By convention, _ is used when the variable will not be referenced

Create a dict from two lists:

>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for i, key in enumerate(header):
...     result[key] = data[i]
>>>
>>> print(result)  # doctest: +NORMALIZE_WHITESPACE
{'Sepal length': 5.1,
 'Sepal width': 3.5,
 'Petal length': 1.4,
 'Petal width': 0.2,
 'Species': 'setosa'}
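
Two related details can be sketched briefly: `enumerate()` accepts a `start` argument, and `_` is a conventional name for a loop variable that is never read. The names `columns` and `repeats` are illustrative only.

```python
header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']

# enumerate() accepts a start value; here columns are numbered from 1
columns = {}
for number, key in enumerate(header, start=1):
    columns[key] = number

# `_` signals that the loop variable is intentionally unused
repeats = 0
for _ in range(3):
    repeats += 1

print(columns)
print(repeats)
```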

8.9.8. Generate with Zip

  • zip()

  • The most Pythonic way

>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = {}
>>>
>>> for key, value in zip(header, data):
...     result[key] = value
>>>
>>> print(result)  # doctest: +NORMALIZE_WHITESPACE
{'Sepal length': 5.1,
 'Sepal width': 3.5,
 'Petal length': 1.4,
 'Petal width': 0.2,
 'Species': 'setosa'}

The same result can be obtained by passing zip() directly to dict():

>>> header = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species']
>>> data = [5.1, 3.5, 1.4, 0.2, 'setosa']
>>> result = dict(zip(header, data))
>>>
>>> print(result)  # doctest: +NORMALIZE_WHITESPACE
{'Sepal length': 5.1,
 'Sepal width': 3.5,
 'Petal length': 1.4,
 'Petal width': 0.2,
 'Species': 'setosa'}
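
One thing to keep in mind is that zip() stops at the shortest input, so a missing value is dropped silently. The sketch below contrasts this with `itertools.zip_longest`, which pads instead; the shortened `data` list is a hypothetical malformed row.

```python
from itertools import zip_longest

header = ['Sepal length', 'Sepal width', 'Petal length']
data = [5.1, 3.5]  # hypothetical malformed row: one value is missing

# zip() stops at the shortest input, so 'Petal length' is dropped
truncated = dict(zip(header, data))

# zip_longest() pads the shorter input with fillvalue instead
padded = dict(zip_longest(header, data, fillvalue=None))

print(truncated)
print(padded)
```

Since Python 3.10, `zip(header, data, strict=True)` raises `ValueError` when the lengths differ, which may be preferable when silent truncation would hide a data error.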

8.9.9. Assignments

Code 8.20. Solution
"""
* Assignment: Loop Dict To Dict
* Required: yes
* Complexity: easy
* Lines of code: 3 lines
* Time: 8 min

English:
    1. Convert to `result: dict[str, int]`
    2. Run doctests - all must succeed

Polish:
    1. Przekonwertuj do `result: dict[str, int]`
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is dict, \
    'Variable `result` has invalid type, should be dict'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    {'Doctorate': 6,
     'Prof-school': 6,
     'Masters': 5,
     'Bachelor': 5,
     'Engineer': 5,
     'HS-grad': 4,
     'Junior High': 3,
     'Primary School': 2,
     'Kindergarten': 1}
"""

DATA = {
    6: ['Doctorate', 'Prof-school'],
    5: ['Masters', 'Bachelor', 'Engineer'],
    4: ['HS-grad'],
    3: ['Junior High'],
    2: ['Primary School'],
    1: ['Kindergarten'],
}

# Converted DATA: education name -> level. Note values are int, not str!
# type: dict[str,int]
result = ...

Code 8.21. Solution
"""
* Assignment: Loop Dict To List
* Required: yes
* Complexity: medium
* Lines of code: 4 lines
* Time: 5 min

English:
    1. Define `result: list[dict]`:
        a. key - name from the header
        b. value - measurement or species
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: list[dict]`:
        a. klucz - nazwa z nagłówka
        b. wartość - wyniki pomiarów lub gatunek
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is list, \
    'Variable `result` has invalid type, should be list'

    >>> assert all(type(x) is dict for x in result)

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [{'Sepal length': 5.8, 'Sepal width': 2.7, 'Petal length': 5.1, 'Petal width': 1.9, 'Species': 'virginica'},
     {'Sepal length': 5.1, 'Sepal width': 3.5, 'Petal length': 1.4, 'Petal width': 0.2, 'Species': 'setosa'},
     {'Sepal length': 5.7, 'Sepal width': 2.8, 'Petal length': 4.1, 'Petal width': 1.3, 'Species': 'versicolor'},
     {'Sepal length': 6.3, 'Sepal width': 2.9, 'Petal length': 5.6, 'Petal width': 1.8, 'Species': 'virginica'},
     {'Sepal length': 6.4, 'Sepal width': 3.2, 'Petal length': 4.5, 'Petal width': 1.5, 'Species': 'versicolor'},
     {'Sepal length': 4.7, 'Sepal width': 3.2, 'Petal length': 1.3, 'Petal width': 0.2, 'Species': 'setosa'}]
"""

DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
]

# Define variable `result` with converted DATA
# type: list[dict]
result = ...

Code 8.22. Solution
"""
* Assignment: Loop Dict Reverse
* Required: no
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min

English:
    1. Use iteration to reverse the dict:
       that is: swap keys with values
    2. Run doctests - all must succeed

Polish:
    1. Użyj iterowania do odwrócenia dicta:
       to jest: zamień klucze z wartościami i wartości z kluczami
    2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `dict.items()`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert result is not Ellipsis, \
    'Assign your result to variable `result`'
    >>> assert type(result) is dict, \
    'Variable `result` has invalid type, should be dict'

    >>> assert all(type(x) is str for x in result.keys())
    >>> assert all(type(x) is int for x in result.values())
    >>> assert len(result.keys()) == 3

    >>> assert 'virginica' in result.keys()
    >>> assert 'setosa' in result.keys()
    >>> assert 'versicolor' in result.keys()

    >>> assert 0 in result.values()
    >>> assert 1 in result.values()
    >>> assert 2 in result.values()

    >>> result
    {'virginica': 0, 'setosa': 1, 'versicolor': 2}
"""

DATA = {
    0: 'virginica',
    1: 'setosa',
    2: 'versicolor'}

# Reversed DATA: species name -> id
# type: dict[str,int]
result = ...

Code 8.23. Solution
"""
* Assignment: Loop Dict Label Encoder
* Required: no
* Complexity: hard
* Lines of code: 9 lines
* Time: 13 min

English:
    1. Define:
        a. `features: list[tuple]` - measurements
        b. `labels: list[int]` - species
        c. `label_encoder: dict[int, str]`
            dictionary with encoded (as numbers) species names
    2. Separate header from data
    3. To encode and decode `labels` (species) we need:
        a. Define `label_encoder: dict[int, str]`
        b. key - id (incremented integer value)
        c. value - species name
    4. `label_encoder` must be generated from `DATA`
    5. For each row append to `features`, `labels` and `label_encoder`
    6. Run doctests - all must succeed

Polish:
    1. Zdefiniuj:
        a. `features: list[tuple]` - pomiary
        b. `labels: list[int]` - gatunki
        c. `label_encoder: dict[int, str]`
            słownik zakodowanych (jako cyfry) nazw gatunków
    2. Odseparuj nagłówek od danych
    3. Aby móc zakodować i odkodować `labels` (gatunki) potrzebujesz:
        a. Zdefiniuj `label_encoder: dict[int, str]`:
        b. key - identyfikator (kolejna liczba całkowita)
        c. value - nazwa gatunku
    4. `label_encoder` musi być wygenerowany z `DATA`
    5. Dla każdego wiersza dodawaj do `features`, `labels` i `label_encoder`
    6. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * Reversed lookup dict

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(features) is list
    >>> assert type(labels) is list
    >>> assert type(label_encoder) is dict
    >>> assert all(type(x) is tuple for x in features)
    >>> assert all(type(x) is int for x in labels)
    >>> assert all(type(x) is int for x in label_encoder.keys())
    >>> assert all(type(x) is str for x in label_encoder.values())

    >>> features  # doctest: +NORMALIZE_WHITESPACE
    [(5.8, 2.7, 5.1, 1.9),
     (5.1, 3.5, 1.4, 0.2),
     (5.7, 2.8, 4.1, 1.3),
     (6.3, 2.9, 5.6, 1.8),
     (6.4, 3.2, 4.5, 1.5),
     (4.7, 3.2, 1.3, 0.2)]
    >>> labels
    [0, 1, 2, 0, 2, 1]
    >>> label_encoder  # doctest: +NORMALIZE_WHITESPACE
    {0: 'virginica',
     1: 'setosa',
     2: 'versicolor'}
"""

DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
]

# Values from columns 0-3 from DATA without header
# type: list[tuple]
features = ...

# Species ids (names from column 4 encoded as int) from DATA without header
# type: list[int]
labels = ...

# Lookup dict generated from species names
# type: dict[int,str]
label_encoder = ...