5.2. Unpack Slice
Slice argument must be int (positive, negative or zero)
Positive index starts with 0
Negative index starts with -1
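A minimal sketch of how the two numbering schemes line up, using a sample string; the variable names are illustrative only:

```python
data = 'abcde'

# Positive indexes count from the left, starting at 0
first = data[0]

# Negative indexes count from the right, starting at -1
last = data[-1]

# A valid negative index i points at the same element as len(data) + i
same_element = data[-2] == data[len(data) - 2]
```

Both schemes address the same elements, so the choice depends only on which end of the sequence is more convenient to count from.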
5.2.1. Slice Forwards
sequence[start:stop]
>>> data = 'abcde'
>>> data[0:3]
'abc'
>>> data = 'abcde'
>>> data[2:5]
'cde'
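The examples above rely on start being inclusive and stop being exclusive. A short sketch of that rule, with illustrative variable names:

```python
data = 'abcde'
start, stop = 1, 4

# Elements at indexes 1, 2 and 3; the element at index 4 is not included
fragment = data[start:stop]

# When both bounds are within range, the slice has stop - start elements
length = len(fragment)

# The same indexes, spelled out with range()
indexes = list(range(start, stop))
```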
5.2.2. Slice Defaults
sequence[start:stop]
start defaults to 0
stop defaults to len(sequence)
>>> data = 'abcde'
>>> data[:3]
'abc'
>>> data = 'abcde'
>>> data[3:]
'de'
>>> data = 'abcde'
>>> data[:]
'abcde'
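One consequence of these defaults, sketched below on a list with illustrative names: splitting at any point n and joining the parts gives back the original, and a bare `[:]` produces a shallow copy.

```python
items = ['a', 'b', 'c', 'd', 'e']
n = 2

head = items[:n]   # same as items[0:n]
tail = items[n:]   # same as items[n:len(items)]

# The two parts never overlap and never skip an element
rejoined = head + tail

# Omitting both bounds copies the whole list into a new object
clone = items[:]
```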
5.2.3. Slice Backwards
A negative index counts from the end of the sequence, going right to left
>>> data = 'abcde'
>>> data[-3:-1]
'cd'
>>> data = 'abcde'
>>> data[-3:]
'cde'
>>> data = 'abcde'
>>> data[0:-3]
'ab'
>>> data = 'abcde'
>>> data[:-3]
'ab'
>>> data = 'abcde'
>>> data[-3:0]
''
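The empty result of `data[-3:0]` is easier to see once negative indexes are translated to positive ones. A small sketch, with illustrative variable names:

```python
data = 'abcde'
size = len(data)

# A negative index i refers to position len(data) + i
translated = data[size - 3:size - 1]   # same elements as data[-3:-1]

# data[-3:0] means data[2:0]: start is already past stop, so nothing is selected
empty = data[size - 3:0]

# When the stop is computed, None (not 0) means "up to the end"
stop = None
to_end = data[-3:stop]
```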
5.2.4. Step Forward
Every n-th element
sequence[start:stop:step]
start defaults to 0
stop defaults to len(sequence)
step defaults to 1
>>> data = 'abcde'
>>> data[::1]
'abcde'
>>> data = 'abcde'
>>> data[::2]
'ace'
>>> data = 'abcde'
>>> data[::3]
'ad'
>>> data = 'abcde'
>>> data[1:4:2]
'bd'
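For in-range bounds, a stepped slice selects the same indexes that `range(start, stop, step)` generates. The sketch below illustrates this; the variable names are not part of the original material:

```python
data = 'abcde'
start, stop, step = 1, 4, 2

# Indexes visited by the slice data[1:4:2]
picked_indexes = list(range(start, stop, step))

# Collecting those elements by hand gives the same result as the slice
picked = ''.join(data[i] for i in picked_indexes)
```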
5.2.5. Step Backward
Every n-th element, going right to left
sequence[start:stop:step]
step must be negative
start defaults to the last element
stop defaults to the position before the first element
>>> data = 'abcde'
>>> data[::-1]
'edcba'
>>> data = 'abcde'
>>> data[::-2]
'eca'
>>> data = 'abcde'
>>> data[::-3]
'eb'
>>> data = 'abcde'
>>> data[4:1:-2]
'ec'
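The defaults change with the sign of the step. One way to inspect them is the built-in `slice.indices()` method, which resolves omitted values for a sequence of a given length; the variable names below are illustrative:

```python
data = 'abcde'

# Resolved (start, stop, step) for a forward and a backward slice
forward = slice(None, None, 1).indices(len(data))
backward = slice(None, None, -1).indices(len(data))

# Walking the resolved backward range by hand reverses the string
reversed_text = ''.join(data[i] for i in range(*backward))
```

Here `forward` resolves to `(0, 5, 1)` and `backward` to `(4, -1, -1)`, where the stop of -1 is a range bound meaning "one position before index 0", not the last element.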
5.2.6. Slice Errors
>>> data = 'abcde'
>>> data[::0]
Traceback (most recent call last):
ValueError: slice step cannot be zero
>>> data = 'abcde'
>>> data[::1.0]
Traceback (most recent call last):
TypeError: slice indices must be integers or None or have an __index__ method
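The error message mentions the `__index__` method. As a sketch of what that means, the hypothetical `Position` class below (not part of the original material) can be used as a slice bound, while a float cannot:

```python
class Position:
    """Hypothetical wrapper around an integer position."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called by Python whenever the object is used as an index
        return self.value


data = 'abcde'

# Objects with __index__ are accepted as slice bounds
result = data[Position(1):Position(4)]

# A float has no __index__, so it is rejected even when its value is whole
try:
    data[::1.0]
except TypeError as error:
    message = str(error)
```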
5.2.7. Out of Range
>>> data = 'abcde'
>>> data[:100]
'abcde'
>>> data = 'abcde'
>>> data[100:]
''
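Slicing quietly clamps out-of-range bounds, while indexing a single element does not. A short sketch of the difference, with illustrative variable names:

```python
data = 'abcde'

# Slice bounds beyond the sequence are clamped to its length
clamped = data[:100]
empty = data[100:]

# Indexing a single element beyond the sequence raises IndexError
try:
    data[100]
except IndexError:
    index_failed = True
else:
    index_failed = False
```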
5.2.8. Slice str
>>> data = 'abcde'
>>>
>>>
>>> data[0:3]
'abc'
>>> data[3:5]
'de'
>>> data[:3]
'abc'
>>> data[3:]
'de'
>>> data[::1]
'abcde'
>>> data[::-1]
'edcba'
>>> data[::2]
'ace'
>>> data[::-2]
'eca'
>>> data[1::2]
'bd'
>>> data[1:4:2]
'bd'
5.2.9. Slice tuple
>>> data = ('a', 'b', 'c', 'd', 'e')
>>>
>>>
>>> data[0:3]
('a', 'b', 'c')
>>> data[3:5]
('d', 'e')
>>> data[:3]
('a', 'b', 'c')
>>> data[3:]
('d', 'e')
>>> data[::2]
('a', 'c', 'e')
>>> data[::-1]
('e', 'd', 'c', 'b', 'a')
>>> data[1::2]
('b', 'd')
>>> data[1:4:2]
('b', 'd')
5.2.10. Slice list
>>> data = ['a', 'b', 'c', 'd', 'e']
>>>
>>>
>>> data[0:3]
['a', 'b', 'c']
>>> data[3:5]
['d', 'e']
>>> data[:3]
['a', 'b', 'c']
>>> data[3:]
['d', 'e']
>>> data[::2]
['a', 'c', 'e']
>>> data[::-1]
['e', 'd', 'c', 'b', 'a']
>>> data[1::2]
['b', 'd']
>>> data[1:4:2]
['b', 'd']
5.2.11. Slice set
Slicing a set is not possible, because a set is unordered and does not support indexing:
>>> data = {'a', 'b', 'c', 'd', 'e'}
>>>
>>> data[:3]
Traceback (most recent call last):
TypeError: 'set' object is not subscriptable
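One possible workaround, assuming the elements can be sorted and that sorted order is acceptable, is to convert the set to a list first:

```python
data = {'a', 'b', 'c', 'd', 'e'}

# sorted() returns a list, which has a defined order and can be sliced
first_three = sorted(data)[:3]
```

Calling `list(data)[:3]` also works, but the order of elements in that list is not guaranteed.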
5.2.12. Nested Sequences
>>> DATA = [
... ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
... (5.8, 2.7, 5.1, 1.9, 'virginica'),
... (5.1, 3.5, 1.4, 0.2, 'setosa'),
... (5.7, 2.8, 4.1, 1.3, 'versicolor'),
... (6.3, 2.9, 5.6, 1.8, 'virginica'),
... (6.4, 3.2, 4.5, 1.5, 'versicolor'),
... (4.7, 3.2, 1.3, 0.2, 'setosa'),
... ]
>>>
>>>
>>> DATA[1:]
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> DATA[-3:]
[(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
5.2.13. Column Selection
Column selection unfortunately does not work on a list:
>>> data = [[1, 2, 3],
... [4, 5, 6],
... [7, 8, 9]]
...
>>> data[:]
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>>
>>> data[:, 1]
Traceback (most recent call last):
TypeError: list indices must be integers or slices, not tuple
>>>
>>> data[:][1]
[4, 5, 6]
However, this syntax is valid for NumPy arrays, and pandas offers it through the .iloc accessor.
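In the example above, `data[:][1]` returns the second row because `data[:]` is just a copy of the whole list. With plain lists, a column can be selected by visiting each row; a minimal sketch using a list comprehension:

```python
data = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]

# Take the element at index 1 from every row
column = [row[1] for row in data]

# Slices work per row as well, here the last two columns
columns = [row[1:] for row in data]
```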
5.2.14. Index Arithmetic
>>> text = 'We choose to go to the Moon!'
>>> first = 23
>>> last = 28
>>> step = 2
>>>
>>> text[first:last]
'Moon!'
>>> text[first:last-1]
'Moon'
>>> text[first:last:step]
'Mo!'
>>> text[first:last-1:step]
'Mo'
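Index arithmetic is also handy for cutting a sequence into equal parts. The sketch below uses an arbitrarily chosen chunk size of 7; the variable names are illustrative:

```python
text = 'We choose to go to the Moon!'
size = 7

# Each chunk starts at i and ends just before i + size
chunks = [text[i:i + size] for i in range(0, len(text), size)]
```

Because out-of-range bounds are clamped, the last chunk would simply be shorter if the length were not a multiple of the chunk size.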
5.2.15. Slice Function
Every n-th element
sequence[start:stop:step]
start defaults to 0
stop defaults to len(sequence)
step defaults to 1
>>> text = 'We choose to go to the Moon!'
>>>
>>> q = slice(23, 27)
>>> text[q]
'Moon'
>>>
>>> q = slice(None, 9)
>>> text[q]
'We choose'
>>>
>>> q = slice(23, None)
>>> text[q]
'Moon!'
>>>
>>> q = slice(23, None, 2)
>>> text[q]
'Mo!'
>>>
>>> q = slice(None, None, 2)
>>> text[q]
'W hoet ot h on'
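Because a slice is an ordinary object, it can be given a name and reused across sequences. A small sketch; the names `WORD` and `EVERY_OTHER` are illustrative:

```python
text = 'We choose to go to the Moon!'

WORD = slice(23, 27)
EVERY_OTHER = slice(None, None, 2)

word = text[WORD]

# The bounds stay available as attributes; an omitted value is stored as None
word_attrs = (WORD.start, WORD.stop, WORD.step)

# The same slice object works on any sequence type
letters = list(text)[WORD]
```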
5.2.16. Use Case - 0x01
>>> from pprint import pprint
>>>
>>>
>>> DATA = [
... ('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
... (5.8, 2.7, 5.1, 1.9, 'virginica'),
... (5.1, 3.5, 1.4, 0.2, 'setosa'),
... (5.7, 2.8, 4.1, 1.3, 'versicolor'),
... (6.3, 2.9, 5.6, 1.8, 'virginica'),
... (6.4, 3.2, 4.5, 1.5, 'versicolor'),
... (4.7, 3.2, 1.3, 0.2, 'setosa'),
... ]
>>>
>>>
>>> pprint(DATA[1:])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>>
>>> pprint(DATA[1::2])
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.4, 3.2, 4.5, 1.5, 'versicolor')]
>>>
>>> pprint(DATA[1::-2])
[(5.8, 2.7, 5.1, 1.9, 'virginica')]
>>>
>>> pprint(DATA[:1:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa')]
>>>
>>> pprint(DATA[:-5:-2])
[(4.7, 3.2, 1.3, 0.2, 'setosa'), (6.3, 2.9, 5.6, 1.8, 'virginica')]
>>>
>>> pprint(DATA[1:-5:-2])
[]
5.2.17. Use Case - 0x02
>>> data = [[1, 2, 3],
... [4, 5, 6],
... [7, 8, 9]]
...
>>> data[::2]
[[1, 2, 3], [7, 8, 9]]
>>>
>>> data[::2][1]
[7, 8, 9]
>>>
>>> data[::2][:1]
[[1, 2, 3]]
>>>
>>> data[::2][1][1:]
[8, 9]
5.2.18. Use Case - 0x03
>>> text = 'We choose to go to the Moon!'
>>> word = 'Moon'
>>>
>>>
>>> start = text.find(word)
>>> stop = start + len(word)
>>>
>>> text[start:stop]
'Moon'
>>>
>>> text[:start]
'We choose to go to the '
>>>
>>> text[stop:]
'!'
>>>
>>> text[:start] + text[stop:]
'We choose to go to the !'
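The use case above assumes the word is present. `str.find()` returns -1 when it is not, and slicing with -1 would silently cut the wrong part of the text. A guarded sketch, where `remove_fragment` is a hypothetical helper name:

```python
def remove_fragment(text: str, fragment: str) -> str:
    """Return text without the first occurrence of fragment."""
    start = text.find(fragment)
    if start == -1:
        # Not found: return the text unchanged instead of slicing with -1
        return text
    stop = start + len(fragment)
    return text[:start] + text[stop:]


text = 'We choose to go to the Moon!'
without_moon = remove_fragment(text, 'Moon')
unchanged = remove_fragment(text, 'Mars')
```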
5.2.19. Assignments
"""
* Assignment: Iterable Slice Text
* Required: yes
* Complexity: easy
* Lines of code: 8 lines
* Time: 8 min
English:
1. Remove the academic title and military rank from each variable
2. Also remove whitespace at the beginning and end of the text
3. Use only `slice` to clean the text
4. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert a is not Ellipsis, \
'Assign your result to variable `a`'
>>> assert b is not Ellipsis, \
'Assign your result to variable `b`'
>>> assert c is not Ellipsis, \
'Assign your result to variable `c`'
>>> assert d is not Ellipsis, \
'Assign your result to variable `d`'
>>> assert e is not Ellipsis, \
'Assign your result to variable `e`'
>>> assert f is not Ellipsis, \
'Assign your result to variable `f`'
>>> assert g is not Ellipsis, \
'Assign your result to variable `g`'
>>> assert type(a) is str, \
'Variable `a` has invalid type, should be str'
>>> assert type(b) is str, \
'Variable `b` has invalid type, should be str'
>>> assert type(c) is str, \
'Variable `c` has invalid type, should be str'
>>> assert type(d) is str, \
'Variable `d` has invalid type, should be str'
>>> assert type(e) is str, \
'Variable `e` has invalid type, should be str'
>>> assert type(f) is str, \
'Variable `f` has invalid type, should be str'
>>> assert type(g) is str, \
'Variable `g` has invalid type, should be str'
>>> example
'Mark Watney'
>>> a
'Pan Twardowski'
>>> b
'Pan Twardowski'
>>> c
'Mark Watney'
>>> d
'Melissa Lewis'
>>> e
'Ryan Stone'
>>> f
'Ryan Stone'
>>> g
'Pan Twardowski'
"""
EXAMPLE = 'lt. Mark Watney, PhD'
A = 'dr hab. inż. Pan Twardowski, prof. AATC'
B = 'gen. pil. Pan Twardowski'
C = 'Mark Watney, PhD'
D = 'lt. col. ret. Melissa Lewis'
E = 'dr n. med. Ryan Stone'
F = 'Ryan Stone, MD-PhD'
G = 'lt. col. Pan Twardowski\t'
example = EXAMPLE[4:-5]
# String with: 'Pan Twardowski'
# type: str
a = ...
# String with: 'Pan Twardowski'
# type: str
b = ...
# String with: 'Mark Watney'
# type: str
c = ...
# String with: 'Melissa Lewis'
# type: str
d = ...
# String with: 'Ryan Stone'
# type: str
e = ...
# String with: 'Ryan Stone'
# type: str
f = ...
# String with: 'Pan Twardowski'
# type: str
g = ...
"""
* Assignment: Iterable Slice Substr
* Required: yes
* Complexity: easy
* Lines of code: 3 lines
* Time: 5 min
English:
1. Use `str.find()` and slicing
2. Define `result` with `TEXT` without the fragment given in `REMOVE`
3. Result should be: 'We choose the Moon!'
4. Do not use `str.replace()`
5. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is str, \
'Variable `result` has invalid type, should be str'
>>> result
'We choose the Moon!'
"""
TEXT = 'We choose to go to the Moon!'
REMOVE = 'to go to '
# String with TEXT without REMOVE part
# type: str
result = ...
"""
* Assignment: Iterable Slice Sequence
* Required: yes
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min
English:
1. Create set `result` with every second element from `a` and `b`
2. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign your result to variable `result`'
>>> assert type(result) is set, \
'Variable `result` has invalid type, should be set'
>>> result
{0, 2, 4}
"""
a = (0, 1, 2, 3)
b = [2, 3, 4, 5]
# Set with every second element from `a` and `b`
# type: set[int]
result = ...
"""
* Assignment: Iterable Slice Header/Rows
* Required: yes
* Complexity: easy
* Lines of code: 2 lines
* Time: 3 min
English:
1. Separate header (first line) from rows:
a. Define `header: tuple[str]` with header
b. Define `rows: list[tuple]` with other rows
2. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert header is not Ellipsis, \
'Assign your result to variable `header`'
>>> assert rows is not Ellipsis, \
'Assign your result to variable `rows`'
>>> assert type(header) is tuple, \
'Variable `header` has invalid type, should be tuple'
>>> assert all(type(x) is tuple for x in rows), \
'All elements in `rows` should be tuple'
>>> assert header not in rows, \
'Header should not be in `rows`'
>>> header
('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species')
>>> rows # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
"""
DATA = [
('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica'),
]
# Tuple with row at index 0 from DATA
# type: tuple[str]
header = ...
# List with rows at all the other indexes from DATA
# type: list[tuple]
rows = ...
"""
* Assignment: Iterable Slice Train/Test
* Required: yes
* Complexity: easy
* Lines of code: 4 lines
* Time: 8 min
English:
1. Divide `rows` into two lists:
a. `train`: 60% - training data
b. `test`: 40% - testing data
2. To do that, calculate the split point:
a. Multiply the length of `rows` by the percentage and convert the result to `int`
b. From `rows` slice training data from start to split
c. From `rows` slice test data from split to end
3. Run doctests - all must succeed
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert split is not Ellipsis, \
'Assign your result to variable `split`'
>>> assert train is not Ellipsis, \
'Assign your result to variable `train`'
>>> assert test is not Ellipsis, \
'Assign your result to variable `test`'
>>> assert type(split) is int, \
'Variable `split` has invalid type, should be int'
>>> assert type(train) is list, \
'Variable `train` has invalid type, should be list'
>>> assert type(test) is list, \
'Variable `test` has invalid type, should be list'
>>> assert all(type(x) is tuple for x in train), \
'All elements in `train` should be tuple'
>>> assert all(type(x) is tuple for x in test), \
'All elements in `test` should be tuple'
>>> split
6
>>> train # doctest: +NORMALIZE_WHITESPACE
[(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa')]
>>> test # doctest: +NORMALIZE_WHITESPACE
[(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica')]
"""
DATA = [
('sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'),
(5.8, 2.7, 5.1, 1.9, 'virginica'),
(5.1, 3.5, 1.4, 0.2, 'setosa'),
(5.7, 2.8, 4.1, 1.3, 'versicolor'),
(6.3, 2.9, 5.6, 1.8, 'virginica'),
(6.4, 3.2, 4.5, 1.5, 'versicolor'),
(4.7, 3.2, 1.3, 0.2, 'setosa'),
(7.0, 3.2, 4.7, 1.4, 'versicolor'),
(7.6, 3.0, 6.6, 2.1, 'virginica'),
(4.9, 3.0, 1.4, 0.2, 'setosa'),
(4.9, 2.5, 4.5, 1.7, 'virginica'),
]
header = DATA[0]
rows = DATA[1:]
# Result of `rows` length multiplied by percent
# type: int
split = ...
# List with first 60% from rows
# type: list[tuple]
train = ...
# List with last 40% from rows
# type: list[tuple]
test = ...