8.2. Control Flow Files

8.2.1. Path

8.2.1.1. Absolute path

  • FILE as constant (never hardcode paths)

  • FILE as a raw string r'...'

Listing 70. Windows paths
FILE = 'C:\\Temp\\iris.csv'
FILE = r'C:\Temp\iris.csv'
Listing 71. Linux, macOS, BSD
FILE = '/tmp/iris.csv'
FILE = r'/tmp/iris.csv'
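
The raw-string prefix matters because a backslash in a regular string literal starts an escape sequence. A minimal sketch of the difference; the Windows path below is a made-up example chosen because it contains \t and \n, not a file used elsewhere in this chapter.

```python
# Hypothetical path, picked only because it contains \t and \n
plain = 'C:\temp\new.csv'        # \t becomes a tab, \n becomes a newline
raw = r'C:\temp\new.csv'         # backslashes are kept literally
escaped = 'C:\\temp\\new.csv'    # doubled backslashes, same text as raw

print(raw == escaped)            # True
print(plain == raw)              # False
print(len(plain), len(raw))      # the plain string is shorter
```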

8.2.1.2. Relative path

  • FILE as constant (never hardcode paths)

  • FILE as a raw string r'...'

Listing 72. File in the same directory
FILE = r'iris.csv'
Listing 73. File in parent directory
FILE = r'../iris.csv'
FILE = r'../data/iris.csv'
Listing 74. File in the same directory as script
from os.path import dirname, join

FILE = join(dirname(__file__), 'iris.csv')
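
The same path can also be built with pathlib from the standard library. This is a sketch under assumptions: the helper name sibling_file and the example script location are hypothetical, and in a real script you would pass __file__ instead.

```python
from pathlib import Path


def sibling_file(script_path, filename):
    """Return the path of `filename` located next to `script_path`."""
    return Path(script_path).parent / filename


# In a real script: FILE = sibling_file(__file__, 'iris.csv')
FILE = sibling_file('/opt/project/main.py', 'iris.csv')
print(FILE.name)
```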

8.2.2. Read from file

  • Works with both relative and absolute path

  • Fails when directory with file cannot be accessed

  • Fails when file cannot be accessed

  • Uses context manager

  • mode parameter to open() function is optional (defaults to mode='r')

  • Reading access modes:

    • mode='r' - read in text mode (default)

    • mode='rt' - read in text mode

    • mode='rb' - read in binary mode
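
A short sketch of what the text and binary read modes return. It assumes a temporary file named sample.csv (hypothetical) so that it runs without /tmp/iris.csv being present.

```python
import os
import tempfile

# Temporary file, so the sketch does not depend on any existing data
path = os.path.join(tempfile.mkdtemp(), 'sample.csv')

with open(path, mode='w', encoding='utf-8') as file:
    file.write('5.1,3.5,setosa\n')

with open(path, mode='rt', encoding='utf-8') as file:
    text = file.read()      # text mode returns str

with open(path, mode='rb') as file:
    data = file.read()      # binary mode returns bytes

print(type(text).__name__, type(data).__name__)
```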

8.2.2.1. Reading file line by line

Listing 75. file can be iterated line by line
with open(r'/tmp/iris.csv') as file:
    for line in file:
        print(line)
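
Each line yielded by the file object keeps its trailing newline, so print(line) emits an extra empty line after every row. One possible way to deal with it, sketched on a temporary file (the file name and sample rows are assumptions):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'sample.csv')

with open(path, mode='w', encoding='utf-8') as file:
    file.write('5.1,3.5,setosa\n4.9,3.0,setosa\n')

result = []
with open(path, encoding='utf-8') as file:
    for line in file:
        result.append(line.rstrip('\n'))   # drop the trailing newline

print(result)
```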

8.2.2.2. Reading whole file content

Listing 76. Read the whole file as text into the content variable
with open(r'/tmp/iris.csv') as file:
    content = file.read()

8.2.2.3. Reading file as list with lines

Listing 77. Read the file into a list of lines
with open(r'/tmp/iris.csv') as file:
    lines = file.readlines()

8.2.2.4. Read selected lines from file

Listing 78. Read the file into a list of lines, keep indexes 1 to 29 (the 2nd through 30th line)
with open(r'/tmp/iris.csv') as file:
    lines = file.readlines()[1:30]
Listing 79. Iterate over lines with indexes 1 to 29 (the 2nd through 30th line)
with open(r'/tmp/iris.csv') as file:
    for line in file.readlines()[1:30]:
        print(line)
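
readlines() loads the whole file into memory before slicing. For large files, itertools.islice can make the same selection lazily. A sketch, assuming a temporary file numbers.txt (hypothetical) with one number per line:

```python
import os
import tempfile
from itertools import islice

path = os.path.join(tempfile.mkdtemp(), 'numbers.txt')

with open(path, mode='w', encoding='utf-8') as file:
    for number in range(100):
        file.write(f'{number}\n')

with open(path, encoding='utf-8') as file:
    # Same selection as readlines()[1:30], but read lazily
    selected = list(islice(file, 1, 30))

print(len(selected), selected[0].strip(), selected[-1].strip())
```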

8.2.3. Writing

8.2.3.1. Writing to file

  • Works with both relative and absolute path

  • Fails when directory with file cannot be accessed

  • Creates the file if it does not exist

  • Truncates the file before writing

  • mode parameter to open() function is required

  • Writing modes:

    • mode='w' - write in text mode

    • mode='wt' - write in text mode

    • mode='wb' - write in binary mode

Listing 80. Writing to file
with open(r'/tmp/iris.csv', mode='w') as file:
    file.write('hello')
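
write() does not add a newline on its own, so rows need an explicit '\n'. A minimal sketch; the file name output.txt and the sample rows are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')
rows = ['5.1,3.5,setosa', '4.9,3.0,setosa']

with open(path, mode='w', encoding='utf-8') as file:
    for row in rows:
        file.write(row + '\n')   # newline must be added explicitly

with open(path, encoding='utf-8') as file:
    content = file.read()

print(content)
```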

8.2.3.2. Appending to file

  • Works with both relative and absolute path

  • Fails when directory with file cannot be accessed

  • Creates the file if it does not exist

  • Appends to the end of the file

  • mode parameter to open() function is required

  • Appending modes:

    • mode='a' - append in text mode

    • mode='at' - append in text mode

    • mode='ab' - append in binary mode

Listing 81. Appending to file
with open(r'/tmp/iris.csv', mode='a') as file:
    file.write('hello')
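
The difference between mode='a' and mode='w' is easiest to see side by side. A sketch on a temporary file (log.txt is a hypothetical name):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'log.txt')

with open(path, mode='w', encoding='utf-8') as file:
    file.write('first\n')

with open(path, mode='a', encoding='utf-8') as file:
    file.write('second\n')     # added after the existing content

with open(path, encoding='utf-8') as file:
    after_append = file.read()

with open(path, mode='w', encoding='utf-8') as file:
    file.write('third\n')      # previous content is discarded

with open(path, encoding='utf-8') as file:
    after_write = file.read()

print(repr(after_append))
print(repr(after_write))
```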

8.2.4. Exception handling

Listing 82. Exception handling while accessing files
try:
    with open(r'/tmp/iris.csv') as file:
        for line in file:
            print(line)

except FileNotFoundError:
    print('File does not exist')

except PermissionError:
    print('Permission denied')
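
The same handling can be wrapped in a function so the caller receives either the content or None. This is only a sketch: read_or_none is a hypothetical helper name, and the missing path is generated in a fresh temporary directory. Both exceptions are subclasses of OSError, so a single except OSError clause would also catch them.

```python
import os
import tempfile


def read_or_none(path):
    """Return file content, or None when the file cannot be opened."""
    try:
        with open(path, encoding='utf-8') as file:
            return file.read()
    except FileNotFoundError:
        print('File does not exist')
    except PermissionError:
        print('Permission denied')
    return None


# A path that cannot exist: the directory was just created and is empty
missing = os.path.join(tempfile.mkdtemp(), 'no-such-file.csv')
result = read_or_none(missing)
```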

8.2.5. Good Engineering Practices

  • FILE as a raw string r'...' constant

  • encoding='utf-8'

  • Use context manager - with keyword
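
The practices above combined in one sketch. Assumptions: FILE is built from a temporary directory here so the example runs anywhere (in your own code it would typically be a raw-string constant), and the sample text is a Polish pangram used only because it contains non-ASCII characters.

```python
import os
import tempfile

# FILE is defined once, as a constant, and reused below
FILE = os.path.join(tempfile.mkdtemp(), 'names.txt')
TEXT = 'zażółć gęślą jaźń'

# Explicit encoding avoids depending on the platform default
with open(FILE, mode='w', encoding='utf-8') as file:
    file.write(TEXT + '\n')

# The context manager closes the file even if an exception is raised
with open(FILE, encoding='utf-8') as file:
    name = file.read().strip()

print(name == TEXT)
```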

8.2.6. Assignments

8.2.6.1. Content of a requested file

English
  1. Using input() ask user for a file path

  2. Print file content

  3. Handle the exception for a non-existent file

  4. Handle the exception for insufficient permissions


8.2.6.2. Parsing simple CSV file

English
  1. Download data/iris.csv and save it as iris.csv

  2. Define:

    • features - list of measurements (each row is a tuple)

    • labels - list of species names

  3. For each line in file:

    1. Remove leading and trailing whitespace

    2. Split line by comma ,

    3. Append measurements to features

    4. Append species name to labels

  4. Print features and labels


The whys and wherefores
  • Reading file

  • Iterating over lines in file

  • String methods

  • Working with nested sequences

8.2.6.3. /etc/hosts - parsing to dict

English
  1. Copy input data from listing below and save to file hosts.txt

  2. For each line in file:

    1. Remove leading and trailing whitespace

    2. Split line by whitespace

    3. Separate IP address and hostnames

    4. Append IP address and hostnames to OUTPUT

  3. Merge hostnames for the same IP


Input
127.0.0.1       localhost
127.0.0.1       astromatt
10.13.37.1      nasa.gov esa.int roscosmos.ru
255.255.255.255 broadcasthost
::1             localhost
Output
OUTPUT: Dict[str, List[str]] = {
    '127.0.0.1': ['localhost', 'astromatt'],
    '10.13.37.1': ['nasa.gov', 'esa.int', 'roscosmos.ru'],
    '255.255.255.255': ['broadcasthost'],
    '::1': ['localhost'],
}
The whys and wherefores
  • Reading file

  • Iterating over lines in file

  • String methods

  • Working with nested sequences
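
One possible way to merge hostnames for a repeated IP is dict.setdefault(). The sketch below works on two lines from the input above, kept in a list instead of a file; it illustrates the merging step only and is not a complete solution.

```python
# Two lines from the input above, kept in memory instead of hosts.txt
lines = [
    '127.0.0.1       localhost\n',
    '127.0.0.1       astromatt\n',
]

output = {}
for line in lines:
    # First element is the IP address, the rest are hostnames
    ip, *hostnames = line.strip().split()
    # setdefault() creates an empty list the first time an IP is seen
    output.setdefault(ip, []).extend(hostnames)

print(output)
```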

8.2.6.4. /etc/hosts - parsing to List[dict]

English
  1. Copy input data from listing below and save to file hosts.txt

  2. Also copy the comments and empty lines

  3. For each line in file:

    1. Skip the line if it is empty, contains only whitespace or starts with a comment #

    2. Remove leading and trailing whitespace

    3. Split line by whitespace

    4. Separate IP address and hostnames

    5. Use a one-line if (conditional expression) to check whether a dot . is in the IP address

    6. If a dot is present then the protocol is IPv4, otherwise IPv6

    7. Append IP address and hosts names to OUTPUT

  4. Merge hostnames for the same IP

  5. OUTPUT must be list of dicts (List[dict])


Input
##
# ``/etc/hosts`` structure:
#   - IPv4 or IPv6
#   - Hostnames
##

127.0.0.1       localhost
127.0.0.1       astromatt
10.13.37.1      nasa.gov esa.int roscosmos.ru
255.255.255.255 broadcasthost
::1             localhost
Output
OUTPUT: List[Dict[str, Union[str, Set[str]]]] = [
    {'ip': '127.0.0.1', 'protocol': 'ipv4', 'hostnames': {'localhost', 'astromatt'}},
    {'ip': '10.13.37.1', 'protocol': 'ipv4', 'hostnames': {'nasa.gov', 'esa.int', 'roscosmos.ru'}},
    {'ip': '255.255.255.255', 'protocol': 'ipv4', 'hostnames': {'broadcasthost'}},
    {'ip': '::1', 'protocol': 'ipv6', 'hostnames': {'localhost'}}
]
The whys and wherefores
  • Reading and parsing a file

  • Irregular configuration files (structure may change)

  • Filtering elements

  • Using loops and conditional statements

  • Parsing strings

  • Working with paths in the operating system

Hints
  • str.isspace()

  • value = True if ... else False
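
A brief illustration of both hints. The sample values are assumptions chosen to match the assignment; the lowercase protocol names follow the expected output above.

```python
line = '   \n'
print(line.isspace())        # True: only whitespace characters
print(''.isspace())          # False: an empty string has no characters

# Conditional expression: value_if_true if condition else value_if_false
ip = '::1'
protocol = 'ipv4' if '.' in ip else 'ipv6'
print(protocol)              # ipv6
```

Because ''.isspace() is False, an empty line needs its own check, for example `if not line or line.isspace()`.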