8.5. CSV Reader
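csv.reader() reads a CSV file row by row; list(reader) gives list[list[str]]
By default every value is returned as str
csv.reader() takes an already opened text file; the encoding is set in open()
The default encoding of open() can depend on the platform locale, so pass it explicitly:
encoding='utf-8'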
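A minimal sketch of passing the encoding explicitly, assuming a hypothetical two-row sample with a non-ASCII value and an embedded newline. The file name `example.csv` and the sample rows are illustrative only. The `newline=''` argument is what the `csv` module documentation recommends for files passed to `csv.reader()` and `csv.writer()`:

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample: a non-ASCII value and a field with an embedded newline
rows = [['name', 'note'], ['Zażółć', 'line one\nline two']]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'example.csv'

    # Explicit encoding; newline='' leaves line endings to the csv module
    with open(path, mode='w', encoding='utf-8', newline='') as file:
        csv.writer(file).writerows(rows)

    # Read back with the same encoding and newline=''
    with open(path, mode='r', encoding='utf-8', newline='') as file:
        result = list(csv.reader(file))

print(result == rows)
```

Reading with the same encoding that was used for writing keeps non-ASCII text intact regardless of the platform locale.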
8.5.1. SetUp
>>> import csv
>>> from pprint import pprint
>>> from pathlib import Path
8.5.2. Minimal
Default mode of open() is
mode='r'
Data:
sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor
SetUp:
>>> DATA = """sepal_length,sepal_width,petal_length,petal_width,species
... 5.8,2.7,5.1,1.9,virginica
... 5.1,3.5,1.4,0.2,setosa
... 5.7,2.8,4.1,1.3,versicolor
... """
>>>
>>> _ = Path('/tmp/myfile.csv').write_text(DATA)
Usage:
>>> with open('/tmp/myfile.csv') as file:
...     reader = csv.reader(file)
...     result = list(reader)
>>>
>>>
>>> pprint(result)
[['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'],
 ['5.8', '2.7', '5.1', '1.9', 'virginica'],
 ['5.1', '3.5', '1.4', '0.2', 'setosa'],
 ['5.7', '2.8', '4.1', '1.3', 'versicolor']]
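Because `csv.reader()` returns an iterator, the header can be taken with `next()` before the remaining rows are collected. The sketch below is one possible way to do this. `io.StringIO` stands in for an opened file, which is an assumption made to keep the example self-contained:

```python
import csv
import io

DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor
"""

# io.StringIO behaves like a text file opened with newline=''
with io.StringIO(DATA, newline='') as file:
    reader = csv.reader(file)
    header = next(reader)   # first row: column names
    rows = list(reader)     # remaining rows; every value is still a str

print(header)
print(rows)
```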
8.5.3. Parametrized
Data:
"sepal_length";"sepal_width";"petal_length";"petal_width";"species"
"5.8";"2.7";"5.1";"1.9";"virginica"
"5.1";"3.5";"1.4";"0.2";"setosa"
"5.7";"2.8";"4.1";"1.3";"versicolor"
SetUp:
>>> DATA = '''"sepal_length";"sepal_width";"petal_length";"petal_width";"species"
... "5.8";"2.7";"5.1";"1.9";"virginica"
... "5.1";"3.5";"1.4";"0.2";"setosa"
... "5.7";"2.8";"4.1";"1.3";"versicolor"
... '''
>>>
>>> _ = Path('/tmp/myfile.csv').write_text(DATA)
Usage:
>>> with open('/tmp/myfile.csv', mode='r', encoding='utf-8') as file:
...     reader = csv.reader(file, quotechar='"', delimiter=';', quoting=csv.QUOTE_ALL)
...     result = list(reader)
>>>
>>>
>>> pprint(result)
[['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'],
 ['5.8', '2.7', '5.1', '1.9', 'virginica'],
 ['5.1', '3.5', '1.4', '0.2', 'setosa'],
 ['5.7', '2.8', '4.1', '1.3', 'versicolor']]
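The effect of `quotechar` is easiest to see when a field contains the delimiter itself. The sketch below uses a hypothetical two-column sample (not the iris data) and compares the default quote handling with `quoting=csv.QUOTE_NONE`, which tells the reader to treat quote characters as ordinary text:

```python
import csv
import io

# Hypothetical sample: the second field of the data row contains ';'
DATA = '"species";"comment"\n"setosa";"small; blue"\n'

# Quotes are honoured: the ';' inside the quoted field is not a separator
with io.StringIO(DATA, newline='') as file:
    quoted = list(csv.reader(file, delimiter=';', quotechar='"'))

# Quotes are ignored: every ';' splits, and the quote characters stay in the values
with io.StringIO(DATA, newline='') as file:
    unquoted = list(csv.reader(file, delimiter=';', quoting=csv.QUOTE_NONE))

print(quoted[1])
print(unquoted[1])
```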
8.5.4. Assignments
"""
* Assignment: CSV Reader Syntax
* Complexity: easy
* Lines of code: 4 lines
* Time: 5 min
English:
1. Using `csv.reader()` read data from `FILE`
2. Define `result: list[tuple]` with converted data
3. Use Unix `\n` line terminator
4. Run doctests - all must succeed
Polish:
1. Za pomocą `csv.reader()` wczytaj dane z `FILE`
2. Zdefiniuj `result: list[tuple]` z przekonwertowanymi danymi
3. Użyj zakończenia linii Unix `\n`
4. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in result), \
'All rows in `result` should be tuple'
>>> result # doctest: +NORMALIZE_WHITESPACE
[('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
('5.8', '2.7', '5.1', '1.9', 'virginica'),
('5.1', '3.5', '1.4', '0.2', 'setosa'),
('5.7', '2.8', '4.1', '1.3', 'versicolor')]
>>> remove(FILE)
"""
import csv
FILE = r'_temporary.csv'
DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""
with open(FILE, mode='w') as file:
    file.write(DATA)
# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
"""
* Assignment: CSV Reader Substitute
* Complexity: easy
* Lines of code: 6 lines
* Time: 5 min
English:
1. Using `csv.reader()` read data from `FILE`
2. Define `result: list[tuple]` with converted data
3. Lookup species name in `SPECIES` dictionary
4. Use Unix `\n` line terminator
5. Run doctests - all must succeed
Polish:
1. Za pomocą `csv.reader()` wczytaj dane z `FILE`
2. Zdefiniuj `result: list[tuple]` z przekonwertowanymi danymi
3. Nazwę gatunku wyszukaj w słowniku `SPECIES`
4. Użyj zakończenia linii Unix `\n`
5. Uruchom doctesty - wszystkie muszą się powieść
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in result), \
'All rows in `result` should be tuple'
>>> result # doctest: +NORMALIZE_WHITESPACE
[('5.8', '2.7', '5.1', '1.9', 'virginica'),
('5.1', '3.5', '1.4', '0.2', 'setosa'),
('5.7', '2.8', '4.1', '1.3', 'versicolor')]
>>> remove(FILE)
"""
import csv
FILE = r'_temporary.csv'
DATA = """5.8,2.7,5.1,1.9,1
5.1,3.5,1.4,0.2,0
5.7,2.8,4.1,1.3,2"""
SPECIES = {
    0: 'setosa',
    1: 'virginica',
    2: 'versicolor'}
with open(FILE, mode='w') as file:
    file.write(DATA)
# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
"""
* Assignment: CSV Reader Enumerate
* Complexity: medium
* Lines of code: 8 lines
* Time: 8 min
English:
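1. Using `csv.reader()` read data from `FILE`
2. Decode the species index in each data row using the species names from the header row (the fields after the first two), for example with `enumerate()`
3. Define `result: list[tuple]` with converted data (without the header row)
4. Use Unix `\n` line terminator
5. Run doctests - all must succeed
Polish:
1. Za pomocą `csv.reader()` wczytaj dane z `FILE`
2. Zdekoduj indeks gatunku w każdym wierszu danych, używając nazw gatunków z wiersza nagłówkowego (pola po dwóch pierwszych), na przykład za pomocą `enumerate()`
3. Zdefiniuj `result: list[tuple]` z przekonwertowanymi danymi (bez wiersza nagłówkowego)
4. Użyj zakończenia linii Unix `\n`
5. Uruchom doctesty - wszystkie muszą się powieść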
Hint:
* For Python before 3.8: `dict(OrderedDict)`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in result), \
'All rows in `result` should be tuple'
>>> result # doctest: +NORMALIZE_WHITESPACE
[('5.8', '2.7', '5.1', '1.9', 'virginica'),
('5.1', '3.5', '1.4', '0.2', 'setosa'),
('5.7', '2.8', '4.1', '1.3', 'versicolor')]
>>> remove(FILE)
"""
import csv
FILE = r'_temporary.csv'
DATA = """3,4,setosa,virginica,versicolor
5.8,2.7,5.1,1.9,1
5.1,3.5,1.4,0.2,0
5.7,2.8,4.1,1.3,2"""
with open(FILE, mode='w') as file:
    file.write(DATA)
# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...
"""
* Assignment: CSV Reader TypeCast
* Complexity: medium
* Lines of code: 8 lines
* Time: 8 min
English:
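1. Using `csv.reader()` read data from `FILE`
2. Convert numeric values to `float`; keep the header row and species names as `str`
3. Define `result: list[tuple]` with converted data
4. Use Unix `\n` line terminator
5. Run doctests - all must succeed
Polish:
1. Za pomocą `csv.reader()` wczytaj dane z `FILE`
2. Przekonwertuj wartości numeryczne do `float`; wiersz nagłówkowy i nazwy gatunków pozostaw jako `str`
3. Zdefiniuj `result: list[tuple]` z przekonwertowanymi danymi
4. Użyj zakończenia linii Unix `\n`
5. Uruchom doctesty - wszystkie muszą się powieść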
Hint:
* For Python before 3.8: `dict(OrderedDict)`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from os import remove
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is list, \
'Variable `result` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in result), \
'All rows in `result` should be tuple'
>>> result # doctest: +NORMALIZE_WHITESPACE
[('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor')]
>>> remove(FILE)
"""
import csv
FILE = r'_temporary.csv'
DATA = """sepal_length,sepal_width,petal_length,petal_width,species
5.8,2.7,5.1,1.9,virginica
5.1,3.5,1.4,0.2,setosa
5.7,2.8,4.1,1.3,versicolor"""
with open(FILE, mode='w') as file:
    file.write(DATA)
# data from file (note the list[tuple] format!)
# type: list[tuple]
result = ...