14.1. Stdlib Exam

14.1.1. Label encoder

English
  1. Use the input data (see below)

  2. Define:

    • features: List[tuple] - measurements

    • labels: List[int] - species

    • label_encoder: Dict[int, str] - dictionary for substituting species names

  3. Separate header from data

  4. To encode and decode labels (species) we need label_encoder: Dict[int, str]:

    • key - id (incremented integer value)

    • value - species name

  5. label_encoder must be generated from INPUT

  6. For each row add appropriate data to features, labels and label_encoder

  7. Print features, labels and label_encoder

  8. Output must be identical to output data (see below)

Polish
  1. Dla danych wejściowych (patrz sekcja input)

  2. Zdefiniuj:

    • features: List[tuple] - pomiary

    • labels: List[int] - gatunki

    • label_encoder: Dict[int, str] - słownik podmiany nazw gatunków

  3. Odseparuj nagłówek od danych

  4. Aby móc zakodować i odkodować labels (gatunki) potrzebny jest label_encoder: Dict[int, str]:

    • key - identyfikator (kolejna liczba całkowita)

    • value - nazwa gatunku

  5. label_encoder musi być wygenerowany z INPUT

  6. Dla każdego wiersza dodawaj odpowiednie dane do features, labels i label_encoder

  7. Wypisz features, labels i label_encoder

  8. Wynik ma być identyczny z danymi wyjściowymi (patrz sekcja output)

Input
INPUT = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
    (7.1, 3.0, 5.9, 2.1, 'virginica'),
    (4.6, 3.4, 1.4, 0.3, 'setosa'),
    (5.4, 3.9, 1.7, 0.4, 'setosa'),
    (5.7, 2.8, 4.5, 1.3, 'versicolor'),
    (5.0, 3.6, 1.4, 0.3, 'setosa'),
    (5.5, 2.3, 4.0, 1.3, 'versicolor'),
    (6.5, 3.0, 5.8, 2.2, 'virginica'),
    (6.5, 2.8, 4.6, 1.5, 'versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'virginica'),
    (6.9, 3.1, 4.9, 1.5, 'versicolor'),
    (4.6, 3.1, 1.5, 0.2, 'setosa'),
]
Output
from typing import List, Dict


features: List[tuple] = [
    (5.8, 2.7, 5.1, 1.9),
    (5.1, 3.5, 1.4, 0.2),
    (5.7, 2.8, 4.1, 1.3),
    (6.3, 2.9, 5.6, 1.8),
    (6.4, 3.2, 4.5, 1.5),
    (4.7, 3.2, 1.3, 0.2), ...]

labels: List[int] = [0, 1, 2, 0, 2, 1, ...]

label_encoder: Dict[int, str] = {
    0: 'virginica',
    1: 'setosa',
    2: 'versicolor'}
The whys and wherefores
  • dict lookups

  • Dynamic dict generating

  • dict reversal
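
The three techniques listed above can be combined in one pass over the data. The following is a minimal sketch, not the reference solution: it assumes a shortened copy of INPUT (the header plus the first four data rows), and the helper dict name_to_id is a hypothetical name introduced here for illustration.

```python
from typing import Dict, List

# Shortened copy of INPUT: the header plus the first four data rows.
INPUT = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
]

# Separate the header from the data rows.
header, *data = INPUT

features: List[tuple] = []
labels: List[int] = []
name_to_id: Dict[str, int] = {}  # hypothetical helper: species name -> id

for *measurements, species in data:
    if species not in name_to_id:
        # Dynamic dict generation: a new species gets the next free integer id.
        name_to_id[species] = len(name_to_id)
    features.append(tuple(measurements))
    labels.append(name_to_id[species])  # dict lookup

# Dict reversal: swap keys and values to get id -> species name.
label_encoder: Dict[int, str] = {i: name for name, i in name_to_id.items()}

print(features)
print(labels)
print(label_encoder)
```

Looking names up in name_to_id avoids scanning the values of label_encoder for every row. The reversed dict is built once at the end.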

14.1.2. Relational Files Database

  • Complexity level: medium

  • Lines of code to write: 100-150 lines

  • Estimated time of completion: 60 min

  • Solution: solution/exam.py

English
  1. Save the content of the listings (see below) to files

  2. Also copy the comments and empty lines

  3. Parse the files and convert them to output: List[dict]

  4. Return the list of users with UID greater than 1000

  5. Each user dict should contain data collected from all files

Polish
  1. Zapisz treści listingów do plików:

  2. Skopiuj również komentarze i puste linie

  3. Sparsuj pliki i przedstaw je w formacie output: List[dict]

  4. Zwróć listę użytkowników, których UID jest większy niż 1000

  5. Dict użytkownika powinien zawierać dane z wszystkich plików

Input
Listing 430. Sample contents of the /etc/passwd file
##
# ``/etc/passwd`` structure:
#   - Username
#   - Password: ``x`` indicates that shadow passwords are used
#   - UID: User ID number
#   - GID: User's group ID number
#   - GECOS: Full name of the user
#   - Home directory
#   - Login shell
##

root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin
daemon:x:2:2:daemon:/sbin:/sbin/nologin
adm:x:3:4:adm:/var/adm:/sbin/nologin
shutdown:x:6:0:shutdown:/sbin:/sbin/shutdown
halt:x:7:0:halt:/sbin:/sbin/halt
nobody:x:99:99:Nobody:/:/sbin/nologin
sshd:x:74:74:Privilege-separated SSH:/var/empty/sshd:/sbin/nologin
watney:x:1000:1000:Mark Watney:/home/peck:/bin/bash
twardowski:x:1001:1001:Jan Twardowski:/home/twardowski:/bin/bash
ivanovic:x:1002:1002:Ivan Ivanovic:/home/ivanovic:/bin/bash
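
A possible way to parse this format is sketched below. It is an illustration, not the reference solution: the text is held in a string rather than read from a file so that the example stays self-contained, and the function name parse_passwd is a hypothetical choice.

```python
from typing import Dict, List

# A comment, an empty line and two entries copied from the listing above.
PASSWD = """
# comment lines and empty lines must be skipped
root:x:0:0:root:/root:/bin/bash

twardowski:x:1001:1001:Jan Twardowski:/home/twardowski:/bin/bash
"""


def parse_passwd(text: str) -> List[Dict]:
    """Turn passwd-style text into a list of user dicts."""
    users = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip empty lines and comments
        # Seven colon-separated fields, in the order given in the header comment.
        username, _password, uid, gid, _gecos, home, shell = line.split(':')
        users.append({
            'username': username,
            'uid': int(uid),
            'gid': int(gid),
            'home': home,
            'shell': shell,
        })
    return users


# Keep only the users with UID greater than 1000.
users = [user for user in parse_passwd(PASSWD) if user['uid'] > 1000]
print(users)
```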
Listing 431. Sample contents of the /etc/shadow file
##
# ``/etc/shadow`` structure
#   - Username: from ``/etc/passwd``
#   - Password
#   - Last Password Change: Days since 1970-01-01
#   - Minimum days between password changes: 0 - changed at any time
#   - Password validity: Days after which password must be changed, 99999 - many, many years
#   - Warning threshold: Days to warn user of an expiring password, 7 - full week
#   - Account inactive: Days after password expires and account is disabled
#   - Time since account is disabled: Days since 1970-01-01
#   - A reserved field for possible future use
#
# Password field (split by ``$``):
#   - algorithm
#   - salt
#   - password hash
#
# Password algorithms:
#   - ``1`` - MD5
#   - ``2a`` - Blowfish
#   - ``2y`` - Blowfish
#   - ``5`` - SHA-256
#   - ``6`` - SHA-512
#
# Password special chars:
#   - `` `` (blank entry) - password is not required to log in
#   - ``*`` (asterisk) - account is disabled, cannot be unlocked, no password has ever been set
#   - ``!`` (exclamation mark) - account is locked, can be unlocked, no password has ever been set
#   - ``!<password_hash>`` - account is locked, can be unlocked, but password is set
#   - ``!!`` (two exclamation marks) - account created, waiting for initial password to be set by admin
##

root:$6$Ke02nYgo.9v0SF4p$hjztYvo/M4buqO4oBX8KZTftjCn6fE4cV5o/I95QPekeQpITwFTRbDUBYBLIUx2mhorQoj9bLN8v.w6btE9xy1:16431:0:99999:7:::
adm:$6$5H0QpwprRiJQR19Y$bXGOh7dIfOWpUb/Tuqr7yQVCqL3UkrJns9.7msfvMg4ZO/PsFC5Tbt32PXAw9qRFEBs1254aLimFeNM8YsYOv.:16431:0:99999:7:::
watney:!!:16550::::::
twardowski:$6$P9zn0KwR$tgfvvFWJJ5FKmoXiP5rXWOjwoEBOEoAuBi3EphRbJqqjWYvhEM2wa67L9XgQ7W591FxUNklkDIQsk4kijuhE50:16632:0:99999:7:::
ivanovic:$1$.QKDPc5E$SWlkjRWexrXYgc98F.:12825:0:90:5:30:13096:
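
The password field and the day counter can be decoded as sketched below. This is a hedged illustration rather than the reference solution. It assumes that an account counts as locked when its password field starts with an exclamation mark; the names parse_shadow_line, ALGORITHMS and EPOCH are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict

# Algorithm codes as listed in the header comment of the listing.
ALGORITHMS = {
    '1': 'MD5',
    '2a': 'Blowfish',
    '2y': 'Blowfish',
    '5': 'SHA-256',
    '6': 'SHA-512',
}
EPOCH = date(1970, 1, 1)


def parse_shadow_line(line: str) -> Dict:
    """Decode the first three fields of one shadow entry."""
    username, password, last_changed, *_ = line.strip().split(':')
    # Assumption: a leading '!' marks a locked account.
    locked = password.startswith('!')
    algorithm = salt = password_hash = None
    # A hashed password looks like '$<algorithm>$<salt>$<hash>'.
    fields = password.lstrip('!').split('$')
    if len(fields) == 4:
        _, code, salt, password_hash = fields
        algorithm = ALGORITHMS.get(code)
    # The third field counts days since 1970-01-01.
    changed = EPOCH + timedelta(days=int(last_changed)) if last_changed else None
    return {
        'username': username,
        'algorithm': algorithm,
        'salt': salt,
        'password': password_hash,
        'last_changed': changed,
        'locked': locked,
    }


# Two entries copied from the listing above.
twardowski = parse_shadow_line(
    'twardowski:$6$P9zn0KwR$tgfvvFWJJ5FKmoXiP5rXWOjwoEBOEoAuBi3EphRbJqqjWY'
    'vhEM2wa67L9XgQ7W591FxUNklkDIQsk4kijuhE50:16632:0:99999:7:::'
)
watney = parse_shadow_line('watney:!!:16550::::::')
print(twardowski)
print(watney)
```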
Listing 432. Sample contents of the /etc/group file
##
# ``/etc/group`` structure
#   - Group Name: from ``/etc/passwd``
#   - Group Password: ``x`` indicates that shadow passwords are used
#   - GID: Group ID
#   - Members: usernames from ``/etc/passwd``
##

root::0:root
other::1:
bin::2:root,bin,daemon
sys::3:root,bin,sys,adm
adm::4:root,adm,daemon
mail::6:root
astronauts::10:twardowski,watney,ivanovic
daemon::12:root,daemon
sysadmin::14:twardowski,ivanovic
mars::1000:watney
moon::1001:twardowski
nobody::60001:
noaccess::60002:
nogroup::65534:
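
Since the output stores groups per user, the group file has to be inverted from group -> members to user -> groups. The sketch below shows one way to do that. It assumes the text is held in a string containing only a few lines copied from the listing, and the function name parse_groups is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set

# A few entries copied from the listing above.
GROUP = """
astronauts::10:twardowski,watney,ivanovic
sysadmin::14:twardowski,ivanovic
mars::1000:watney
nobody::60001:
"""


def parse_groups(text: str) -> Dict[str, Set[str]]:
    """Map each username to the set of groups that list it as a member."""
    membership = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip empty lines and comments
        group_name, _password, _gid, members = line.split(':')
        # An empty member list yields [''], which filter(None, ...) drops.
        for username in filter(None, members.split(',')):
            membership[username].add(group_name)
    return dict(membership)


groups = parse_groups(GROUP)
print(groups)
```

A set is used for the values because the expected output shows groups as a set and membership order carries no meaning.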
Output
output: List[dict] = [
    {
        'username': 'twardowski',
        'uid': 1001,
        'gid': 1001,
        'home': '/home/twardowski',
        'shell': '/bin/bash',
        'algorithm': 'SHA-512',
        'password': 'tgfvvFWJJ5...k4kijuhE50',
        'salt': 'P9zn0KwR',
        'groups': {'astronauts', 'sysadmin'},
        'last_changed': datetime.date(2015, 7, 16),
        'locked': False,
    },
    ...
]
The whys and wherefores