3.3. Sequence Set
3.3.1. Rationale
Only unique values
Mutable - can add and remove items
Can store elements of any hashable type
Set is an unordered data structure: it does not record element position or insertion order
Does not support getitem or slice
Hashable (Immutable):
int
float
bool
NoneType
str
tuple
frozenset
Non-hashable (Mutable):
list
set
dict
"Hashable types are also immutable" is true for builtin types, but it is not a universal truth. More information in OOP Hash and in OOP Object Identity.
3.3.2. Definition
An empty set can be defined only with set() - there is no short syntax:
>>> data = set()
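The reason there is no literal for an empty set is that empty braces are already taken by dict. A minimal check, using illustrative variable names that do not appear elsewhere in this chapter:

```python
# Empty braces create a dict, not a set
empty_braces = {}
empty_set = set()

braces_type = type(empty_braces)   # dict
set_type = type(empty_set)         # set

# Braces create a set only when they contain at least one element
one_element = {1}
one_element_type = type(one_element)   # set
```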
The comma after the last element of a one-element set is optional. Braces are required:
>>> data = {1}
>>> data = {1, 2, 3}
>>> data = {1.1, 2.2, 3.3}
>>> data = {True, False}
>>> data = {'a', 'b', 'c'}
>>> data = {'a', 1, 2.2, True, None}
Stores only unique values:
>>> {1, 2, 1}
{1, 2}
Compares by values, not types:
>>> {1}
{1}
>>> {1.0}
{1.0}
>>> {1, 1.0}
{1}
>>> {1.0, 1}
{1.0}
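The results above follow from how sets decide that two elements are the same: equal values with equal hashes count as one element, and the first one inserted is kept. A small sketch of that behavior (variable names are illustrative only):

```python
# int 1, float 1.0 and bool True are equal and share the same hash
all_equal = (1 == 1.0 == True)
same_hash = (hash(1) == hash(1.0) == hash(True))

# So a set treats them as a single element and keeps the first one inserted
mixed = {1, 1.0, True}
kept = next(iter(mixed))

# A str is never equal to a number, so '1' is a separate element
with_str = {1, '1'}
```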
Can store elements of any hashable type:
>>> data = {1, 2, 'a'}
>>> data = {1, 2, (3, 4)}
>>>
>>> data = {1, 2, [3, 4]}
Traceback (most recent call last):
TypeError: unhashable type: 'list'
>>>
>>> data = {1, 2, {3, 4}}
Traceback (most recent call last):
TypeError: unhashable type: 'set'
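The errors above come from hashing: a set hashes every element when it is inserted. The sketch below checks hashability directly with `hash()`, shows immutable stand-ins for the rejected containers, and uses a hypothetical `Point` class (an assumption for illustration, not part of this chapter) to show that a user-defined object can be mutable and still hashable.

```python
# Equal tuples produce equal hashes, so tuples can be set elements
tuple_hash_ok = hash((3, 4)) == hash((3, 4))

# Lists cannot be hashed; this is the same TypeError a set literal raises
try:
    hash([3, 4])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# Immutable stand-ins: tuple instead of list, frozenset instead of set
data = {1, 2, tuple([3, 4]), frozenset({3, 4})}


class Point:
    """Hypothetical class for illustration: mutable, yet hashable."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


point = Point(1, 2)
points = {point}        # allowed: the default hash is based on object identity
point.x = 100           # changing an attribute does not change that hash
still_member = point in points
```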
3.3.3. Type Casting
set() converts its argument to a set:
>>> data = 'abcd'
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = ['a', 'b', 'c', 'd']
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = ('a', 'b', 'c', 'd')
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = {'a', 'b', 'c', 'd'}
>>> set(data) == {'a', 'b', 'c', 'd'}
True
>>> data = frozenset({'a', 'b', 'c', 'd'})
>>> set(data) == {'a', 'b', 'c', 'd'}
True
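Casting also works in the other direction, which is the usual workaround for the lack of getitem and slice on sets. A minimal sketch, assuming the elements are mutually comparable so that `sorted()` can order them:

```python
data = {'c', 'a', 'b'}

# Indexing a set directly raises TypeError
try:
    data[0]
    subscriptable = True
except TypeError:
    subscriptable = False

# Cast to a sorted list to get a stable order, then index or slice it
ordered = sorted(data)      # ['a', 'b', 'c']
first = ordered[0]          # 'a'
first_two = ordered[:2]     # ['a', 'b']
```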
3.3.4. Deduplicate
Works with str, list, tuple, frozenset:
>>> data = [1, 2, 3, 1, 1, 2, 4]
>>> set(data)
{1, 2, 3, 4}
Converting to set deduplicates items:
>>> data = ['Twardowski',
... 'Twardowski',
... 'Watney',
... 'Twardowski']
...
>>> set(data) == {'Twardowski', 'Watney'}
True
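Deduplicating through set() discards the original order. When the order of first occurrence matters, one common alternative (not part of the set API, shown here only for comparison) is `dict.fromkeys()`, which relies on dict keys being unique and insertion-ordered:

```python
data = ['Twardowski', 'Twardowski', 'Watney', 'Twardowski']

# Unique values, order not guaranteed
unique_unordered = set(data)

# Unique values in order of first occurrence
unique_ordered = list(dict.fromkeys(data))   # ['Twardowski', 'Watney']
```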
3.3.5. Add
>>> data = {1, 2}
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(3)
>>> data == {1, 2, 3}
True
>>>
>>> data.add(4)
>>> data == {1, 2, 3, 4}
True
>>> data = {1, 2}
>>> data.add([3, 4])
Traceback (most recent call last):
TypeError: unhashable type: 'list'
>>> data = {1, 2}
>>> data.add((3, 4))
>>> data == {1, 2, (3, 4)}
True
>>> data = {1, 2}
>>> data.add({3, 4})
Traceback (most recent call last):
TypeError: unhashable type: 'set'
>>> data = {1, 2}
>>> data.add(frozenset({3,4}))
>>> data == {1, 2, frozenset({3, 4})}
True
3.3.6. Update
>>> data = {1, 2}
>>> data.update({3, 4})
>>> data == {1, 2, 3, 4}
True
>>> data.update([5, 6])
>>> data == {1, 2, 3, 4, 5, 6}
True
>>> data.update((7, 8))
>>> data == {1, 2, 3, 4, 5, 6, 7, 8}
True
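The difference between `.add()` and `.update()` is easiest to see when the argument is itself a collection: `.add()` inserts the argument as one element, while `.update()` iterates over it and inserts each item. A short sketch:

```python
data = {1, 2}
data.add((3, 4))        # the tuple becomes a single element
data.update((3, 4))     # each item of the tuple is added separately
# data is now {1, 2, 3, 4, (3, 4)}

# The same applies to str, which is an iterable of characters
letters = set()
letters.update('ab')    # adds 'a' and 'b'
letters.add('ab')       # adds the whole string 'ab'
```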
3.3.7. Pop
Gets and removes an arbitrary item:
>>> data = {1, 2, 3}
>>> value = data.pop()
>>> value in [1, 2, 3]
True
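Because a set is unordered, `.pop()` gives no guarantee about which item is returned, only that the item is removed. The sketch below drains a set and then shows that popping from an empty set raises `KeyError`:

```python
data = {1, 2, 3}
popped = []

# Each call removes one arbitrary item until the set is empty
while data:
    popped.append(data.pop())

# Popping from an empty set raises KeyError
try:
    data.pop()
    empty_pop_failed = False
except KeyError:
    empty_pop_failed = True
```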
3.3.8. Membership
Is Disjoint?:
True - if there are no common elements in data and x
False - if any x element is in data
>>> data = {1,2}
>>>
>>> data.isdisjoint({1,2})
False
>>> data.isdisjoint({1,3})
False
>>> data.isdisjoint({3,4})
True
Is Subset?:
True - if x has all elements from data
False - if x is missing any element from data
>>> data = {1,2}
>>>
>>> data.issubset({1})
False
>>> data.issubset({1,2})
True
>>> data.issubset({1,2,3})
True
>>> data.issubset({1,3,4})
False
>>> {1,2} < {3,4}
False
>>> {1,2} < {1,2}
False
>>> {1,2} < {1,2,3}
True
>>> {1,2,3} < {1,2}
False
>>> {1,2} <= {3,4}
False
>>> {1,2} <= {1,2}
True
>>> {1,2} <= {1,2,3}
True
>>> {1,2,3} <= {1,2}
False
Is Superset?:
True - if data has all elements from x
False - if data is missing any element from x
>>> data = {1,2}
>>>
>>> data.issuperset({1})
True
>>> data.issuperset({1,2})
True
>>> data.issuperset({1,2,3})
False
>>> data.issuperset({1,3})
False
>>> data.issuperset({2,1})
True
>>> {1,2} > {1,2}
False
>>> {1,2} > {1,2,3}
False
>>> {1,2,3} > {1,2}
True
>>> {1,2} >= {1,2}
True
>>> {1,2} >= {1,2,3}
False
>>> {1,2,3} >= {1,2}
True
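Two details complement the checks above. Membership of a single element is tested with the `in` operator, and the named methods accept any iterable, whereas the comparison operators require a set (or frozenset) on both sides. A minimal sketch:

```python
data = {1, 2}

# Single-element membership
has_one = 1 in data          # True
has_three = 3 in data        # False

# The method accepts any iterable, here a list
method_result = data.issubset([1, 2, 3])   # True

# The operator form does not accept a list
try:
    data <= [1, 2, 3]
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
```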
3.3.9. Basic Operations
Union (returns all elements that are in data or in x):
>>> data = {1,2}
>>>
>>> data.union({1,2})
{1, 2}
>>> data.union({1,2,3})
{1, 2, 3}
>>> data.union({1,2,4})
{1, 2, 4}
>>> data.union({1,3}, {2,4})
{1, 2, 3, 4}
>>> {1,2} | {1,2}
{1, 2}
>>> {1,2,3} | {1,2}
{1, 2, 3}
>>> {1,2,3} | {1,2,4}
{1, 2, 3, 4}
>>> {1,2} | {1,3} | {2,4}
{1, 2, 3, 4}
Difference (returns elements from data which are not in x):
>>> data = {1,2}
>>>
>>> data.difference({1,2})
set()
>>> data.difference({1,2,3})
set()
>>> data.difference({1,4})
{2}
>>> data.difference({1,3}, {2,4})
set()
>>> data.difference({3,4})
{1, 2}
>>> {1,2} - {2,3}
{1}
>>> {1,2} - {2,3} - {3}
{1}
>>> {1,2} - {1,2,3}
set()
Symmetric Difference (returns elements from data and x, but without the common ones):
>>> data = {1,2}
>>>
>>> data.symmetric_difference({1,2})
set()
>>> data.symmetric_difference({1,2,3})
{3}
>>> data.symmetric_difference({1,4})
{2, 4}
>>> data.symmetric_difference({1,3}, {2,4})
Traceback (most recent call last):
TypeError: symmetric_difference() takes exactly one argument (2 given)
>>> data.symmetric_difference({3,4})
{1, 2, 3, 4}
>>> {1,2} ^ {1,2}
set()
>>> {1,2} ^ {2,3}
{1, 3}
>>> {1,2} ^ {1,3}
{2, 3}
Intersection (returns common elements from data and x):
>>> data = {1,2}
>>>
>>> data.intersection({1,2})
{1, 2}
>>> data.intersection({1,2,3})
{1, 2}
>>> data.intersection({1,4})
{1}
>>> data.intersection({1,3}, {2,4})
set()
>>> data.intersection({1,3}, {1,4})
{1}
>>> data.intersection({3,4})
set()
>>> {1,2} & {2,3}
{2}
>>> {1,2} & {2,3} & {2,4}
{2}
>>> {1,2} & {2,3} & {3}
set()
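The four operations can be seen side by side on one pair of sets. The sketch below uses made-up crew names (an assumption for illustration) and also shows that the operators return new sets, leaving the operands unchanged, while the augmented form `|=` modifies the set in place:

```python
first = {'Watney', 'Lewis', 'Martinez'}
second = {'Lewis', 'Twardowski'}

either = first | second        # union: in first or in second
both = first & second          # intersection: in first and in second
only_first = first - second    # difference: in first, not in second
exactly_one = first ^ second   # symmetric difference: in one, not in both

# Symmetric difference equals union minus intersection
identity_holds = exactly_one == either - both

# Operators return new sets; in-place variants change the left operand
crew = set(first)
crew |= second
```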
3.3.10. Cardinality
>>> data = {1, 2, 3}
>>> len(data)
3
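Since a set keeps only unique values, `len()` of a set built from another collection counts its distinct items. A short sketch reusing the list from the Deduplicate section:

```python
data = [1, 2, 3, 1, 1, 2, 4]

total = len(data)                # 7 items in the list
distinct = len(set(data))        # 4 distinct values
duplicates = total - distinct    # 3 items were repeats

empty = len(set())               # an empty set has cardinality 0
```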
3.3.11. Assignments
"""
* Assignment: Sequence Set Create
* Complexity: easy
* Lines of code: 1 line
* Time: 2 min
English:
1. Create set `result` with elements:
a. `'a'`
b. `1`
c. `2.2`
2. Compare result with "Tests" section (see below)
Tests:
>>> import sys
>>> sys.tracebacklimit = 0
>>> assert result is not Ellipsis, 'Assignment solution must be in `result` instead of ... (Ellipsis)'
>>> assert type(result) is set, 'Variable `result` has invalid type, should be set'
>>> assert len(result) == 3, 'Variable `result` length should be 3'
>>> 'a' in result
True
>>> 1 in result
True
>>> 2.2 in result
True
"""
# Given
result = ... # set with 'a' and 1 and 2.2
"""
* Assignment: Sequence Set Many
* Complexity: easy
* Lines of code: 9 lines
* Time: 8 min
English:
1. Use data from "Given" section (see below)
2. Non-functional requirements:
a. Assignment verifies creation of `set()` and usage of methods `.add()` and `.update()`
b. For simplicity, type numerical values as `float`, not `str`
c. Example: instead of '5.8' just type 5.8
d. Do not use `str.split()`, `slice`, `getitem`, `for`, `while` or any other control-flow statement
3. Create set `result` representing row with index 1
4. Values from row at index 2 add to `result` using `.add()` (five calls)
5. From row at index 3 create `set` and add it to `result` using `.update()` (one call)
6. From row at index 4 create `tuple` and add it to `result` using `.update()` (one call)
7. From row at index 5 create `list` and add it to `result` using `.update()` (one call)
8. Compare result with "Tests" section (see below)
Tests:
>>> import sys
>>> sys.tracebacklimit = 0
>>> assert result is not Ellipsis, 'Assignment solution must be in `result` instead of ... (Ellipsis)'
>>> assert type(result) is set, 'Variable `result` has invalid type, should be set'
>>> assert len(result) == 22, 'Variable `result` length should be 22'
>>> ('sepal_length' not in result
... and 'sepal_width' not in result
... and 'petal_length' not in result
... and 'petal_width' not in result
... and 'species' not in result)
True
>>> result >= {5.8, 2.7, 5.1, 1.9, 'virginica'}
True
>>> result >= {5.1, 3.5, 1.4, 0.2, 'setosa'}
True
>>> result >= {5.7, 2.8, 4.1, 1.3, 'versicolor'}
True
>>> result >= {6.3, 2.9, 5.6, 1.8, 'virginica'}
True
>>> result >= {6.4, 3.2, 4.5, 1.5, 'versicolor'}
True
"""
# Given
DATA = [
'sepal_length,sepal_width,petal_length,petal_width,species',
'5.8,2.7,5.1,1.9,virginica',
'5.1,3.5,1.4,0.2,setosa',
'5.7,2.8,4.1,1.3,versicolor',
'6.3,2.9,5.6,1.8,virginica',
'6.4,3.2,4.5,1.5,versicolor',
]
result = ... # with row at DATA[1] (manually converted to float and str)
result # add to result float 5.1
result # add to result float 3.5
result # add to result float 1.4
result # add to result float 0.2
result # add to result str setosa
result # update result with set 5.7, 2.8, 4.1, 1.3, 'versicolor'
result # update result with tuple 6.3, 2.9, 5.6, 1.8, 'virginica'
result # update result with list 6.4, 3.2, 4.5, 1.5, 'versicolor'