10. Comprehensions

10.1. Simple usage

10.1.1. Traditional

Listing 36. Iterative approach to applying function to elements
numbers = []

for x in range(0, 5):
    numbers.append(x+10)

print(numbers)
# [10, 11, 12, 13, 14]

10.1.2. List Comprehension

Listing 37. list Comprehension approach to applying function to elements
numbers = [x+10 for x in range(0, 5)]

print(numbers)
# [10, 11, 12, 13, 14]

10.1.3. Set Comprehension

Listing 38. set Comprehension approach to applying function to elements
numbers = {x+10 for x in range(0, 5)}
# {10, 11, 12, 13, 14}
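
Because a set keeps only unique values, a set comprehension can return fewer elements than its input has. A minimal sketch of that effect (the names remainders and as_list are illustrative and do not come from the listings above):

```python
# Ten inputs, but only three distinct remainders survive in the set.
remainders = {x % 3 for x in range(0, 10)}
print(sorted(remainders))
# [0, 1, 2]

# The equivalent list comprehension keeps every result, duplicates included.
as_list = [x % 3 for x in range(0, 10)]
print(len(as_list), len(remainders))
# 10 3
```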

10.1.4. Dict Comprehension

Listing 39. dict Comprehension approach to applying function to values
numbers = {x: x+10 for x in range(0, 5)}
# {0: 10, 1: 11, 2: 12, 3: 13, 4: 14}

Listing 40. dict Comprehension approach to applying function to keys
numbers = {x+10: x for x in range(0, 5)}
# {10: 0, 11: 1, 12: 2, 13: 3, 14: 4}

Listing 41. dict Comprehension approach to applying function to keys and values
numbers = {x+10: x+10 for x in range(0, 5)}
# {10: 10, 11: 11, 12: 12, 13: 13, 14: 14}

10.1.5. Tuple Comprehension?!

  • There is no tuple comprehension: parentheses create a Generator Expression

  • More in chapter Generators

Listing 42. Generator Expression approach to applying function to elements
numbers = (x+10 for x in range(0, 5))
# <generator object <genexpr> at 0x11eaef570>
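
The generator object holds no values until something iterates over it. A short sketch of consuming it, reusing the expression from the listing (first_pass and second_pass are illustrative names):

```python
numbers = (x+10 for x in range(0, 5))

# Values are produced only when something iterates over the generator.
first_pass = list(numbers)
print(first_pass)
# [10, 11, 12, 13, 14]

# The generator is now exhausted, so a second pass yields nothing.
second_pass = list(numbers)
print(second_pass)
# []
```

If the values are needed more than once, a list comprehension is usually the simpler choice.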

10.2. Generator expressions vs. Comprehensions

10.2.1. Comprehensions

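  • Evaluated eagerly: the whole result is built as soon as the expression runs

  • A generator expression passed to list(), set(), tuple(), all(), any() or sum() is consumed immediately as well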

list(x for x in range(0, 5))        # [0, 1, 2, 3, 4]
[x for x in range(0, 5)]            # [0, 1, 2, 3, 4]
set(x for x in range(0, 5))         # {0, 1, 2, 3, 4}
{x for x in range(0, 5)}            # {0, 1, 2, 3, 4}
{x: x for x in range(0, 5)}         # {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
tuple(x for x in range(0, 5))       # (0, 1, 2, 3, 4)
(x for x in range(0, 5))            # <generator object <genexpr> at 0x1197032a0>
all(x for x in range(0, 5))         # False
any(x for x in range(0, 5))         # True
sum(x for x in range(0, 5))         # 10

10.2.2. Generator Expressions

  • Lazy evaluation: values are computed one at a time, only when requested

(x*x for x in range(0, 30) if x % 2)
# <generator object <genexpr> at 0x1197032a0>
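
Lazy evaluation can be observed by pulling values one at a time with the built-in next(). A small sketch using the same expression (the variable names are illustrative):

```python
squares = (x*x for x in range(0, 30) if x % 2)

# Nothing has been computed yet; next() pulls one value at a time.
first = next(squares)    # square of 1
second = next(squares)   # square of 3
print(first, second)
# 1 9

# The remaining values are computed only when they are consumed.
rest = list(squares)
print(rest[:3], len(rest))
# [25, 49, 81] 13
```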

10.3. Conditional Comprehension

10.3.1. Traditional

Listing 43. Iterative approach to applying function to selected elements
even_numbers = []

for x in range(0, 10):
    if x % 2 == 0:
        even_numbers.append(x)

print(even_numbers)
# [0, 2, 4, 6, 8]

10.3.2. Comprehensions

Listing 44. list Comprehensions approach to applying function to selected elements
even_numbers = [x for x in range(0, 10) if x % 2 == 0]

print(even_numbers)
# [0, 2, 4, 6, 8]
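
The position of the condition matters. A trailing if filters elements out, while a conditional expression placed before for keeps every element and only changes its value. A brief sketch of the difference (filtered and labelled are illustrative names):

```python
# Trailing `if` filters: odd numbers are dropped from the result.
filtered = [x for x in range(0, 10) if x % 2 == 0]
print(filtered)
# [0, 2, 4, 6, 8]

# Conditional expression before `for` maps: every element is kept.
labelled = ['even' if x % 2 == 0 else 'odd' for x in range(0, 4)]
print(labelled)
# ['even', 'odd', 'even', 'odd']
```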

10.4. Examples

10.4.1. Applying function to each element

Listing 45. Applying function to each output element
[float(x) for x in range(0, 5)]
# [0.0, 1.0, 2.0, 3.0, 4.0]

[float(x) for x in range(0, 5) if x % 2 == 0]
# [0.0, 2.0, 4.0]

Listing 46. Applying function to each output element
[pow(x, 2) for x in range(0, 5)]
# [0, 1, 4, 9, 16]

[pow(x, 2) for x in range(0, 5) if x % 2 == 0]
# [0, 4, 16]

10.4.2. Filtering results

Listing 47. Using list comprehension for result filtering
DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
]

setosa = [m for *m,s in DATA if s == 'setosa']
# [
#   [5.1, 3.5, 1.4, 0.2],
#   [4.7, 3.2, 1.3, 0.2],
# ]

10.4.3. Filtering with complex expressions

Listing 48. Using list comprehension for result filtering with more complex expression
DATA = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
]


def is_setosa(species):
    # The comparison already evaluates to True or False
    return species == 'setosa'


measurements = [m for *m, s in DATA if is_setosa(s)]
# [
#   [5.1, 3.5, 1.4, 0.2],
#   [4.7, 3.2, 1.3, 0.2],
# ]

10.4.4. Quick parsing of lines

Listing 49. Parsing lines with a loop
DATA = [
    '5.8,2.7,5.1,1.9,virginica',
    '5.1,3.5,1.4,0.2,setosa',
    '5.7,2.8,4.1,1.3,versicolor',
]

output = []

for row in DATA:
    row = row.split(',')
    output.append(row)


print(output)
# [
#   ['5.8', '2.7', '5.1', '1.9', 'virginica'],
#   ['5.1', '3.5', '1.4', '0.2', 'setosa'],
#   ['5.7', '2.8', '4.1', '1.3', 'versicolor']
# ]

Listing 50. Parsing lines with a list comprehension
DATA = [
    '5.8,2.7,5.1,1.9,virginica',
    '5.1,3.5,1.4,0.2,setosa',
    '5.7,2.8,4.1,1.3,versicolor',
]

output = [row.split(',') for row in DATA]

print(output)
# [
#   ['5.8', '2.7', '5.1', '1.9', 'virginica'],
#   ['5.1', '3.5', '1.4', '0.2', 'setosa'],
#   ['5.7', '2.8', '4.1', '1.3', 'versicolor']
# ]
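
After split() every field is still a str. A possible next step is casting the numeric fields to float inside the same comprehension. The sketch below assumes that the last field of each line is always the species and that all other fields are valid numbers:

```python
DATA = [
    '5.8,2.7,5.1,1.9,virginica',
    '5.1,3.5,1.4,0.2,setosa',
    '5.7,2.8,4.1,1.3,versicolor',
]

# Assumption: the last field is the species, the rest are numeric.
output = [
    (*[float(value) for value in values], species)
    for *values, species in (row.split(',') for row in DATA)
]

print(output[0])
# (5.8, 2.7, 5.1, 1.9, 'virginica')
```

The inner generator expression splits each line lazily, so no intermediate list of split rows is built.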

10.4.5. Reversing dict keys with values

Listing 51. Reversing dict keys with values
DATA = {'a': 1, 'b': 2}

DATA.items()
# dict_items([('a', 1), ('b', 2)])

Listing 52. Reversing dict keys with values
DATA = {'a': 1, 'b': 2}

{v: k for k, v in DATA.items()}
# {1: 'a', 2: 'b'}

Listing 53. Value collision while reversing dict
DATA = {'a': 1, 'b': 2, 'c': 2}

# Both 'b' and 'c' map to 2; the later key overwrites the earlier one
{v: k for k, v in DATA.items()}
# {1: 'a', 2: 'c'}
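
When losing keys on collision is not acceptable, one option is to collect every key that shares a value into a list. This is only a sketch of one possible approach (grouped is an illustrative name), and it rescans the dict for every value, so it suits small inputs:

```python
DATA = {'a': 1, 'b': 2, 'c': 2}

# For each value, gather all keys that map to it instead of overwriting.
grouped = {
    value: [k for k, v in DATA.items() if v == value]
    for value in DATA.values()
}

print(grouped)
# {1: ['a'], 2: ['b', 'c']}
```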

10.5. Assignments

10.5.1. Split train/test

English
  1. Given the data structure INPUT: List[tuple] (see below)

  2. Separate the header from the data

  3. Calculate the pivot point: length of the data times the given percentage

  4. Using List Comprehension, split the data into:

    • features: List[Tuple[float]] - list of measurements (each measurement row is a tuple)

    • labels: List[str] - list of species names

  5. Split those data structures in the following proportions:

    • features_train: List[Tuple[float]] - features for training - 60%

    • features_test: List[Tuple[float]] - features for testing - 40%

    • labels_train: List[str] - labels for training - 60%

    • labels_test: List[str] - labels for testing - 40%

  6. Create result: Tuple[list, list, list, list] with features (training and test) and labels (training and test)

  7. Print result

Input
INPUT = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
    (6.4, 3.2, 4.5, 1.5, 'versicolor'),
    (4.7, 3.2, 1.3, 0.2, 'setosa'),
    (7.0, 3.2, 4.7, 1.4, 'versicolor'),
    (7.6, 3.0, 6.6, 2.1, 'virginica'),
    (4.9, 3.0, 1.4, 0.2, 'setosa'),
    (4.9, 2.5, 4.5, 1.7, 'virginica'),
    (7.1, 3.0, 5.9, 2.1, 'virginica'),
    (4.6, 3.4, 1.4, 0.3, 'setosa'),
    (5.4, 3.9, 1.7, 0.4, 'setosa'),
    (5.7, 2.8, 4.5, 1.3, 'versicolor'),
    (5.0, 3.6, 1.4, 0.3, 'setosa'),
    (5.5, 2.3, 4.0, 1.3, 'versicolor'),
    (6.5, 3.0, 5.8, 2.2, 'virginica'),
    (6.5, 2.8, 4.6, 1.5, 'versicolor'),
    (6.3, 3.3, 6.0, 2.5, 'virginica'),
    (6.9, 3.1, 4.9, 1.5, 'versicolor'),
    (4.6, 3.1, 1.5, 0.2, 'setosa'),
]

The whys and wherefores
  • Iterating over nested data structures

  • Using slices

  • Type casting

  • List comprehension

  • Magic Number
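
The header separation, pivot calculation and slicing from the assignment steps can be sketched on a small sample. SAMPLE and RATIO are hypothetical names introduced for illustration, and 0.5 is used instead of the assignment's 60% so the result is easy to check by hand. This is a sketch of the technique under those assumptions, not a reference solution.

```python
# Hypothetical sample with the same shape as INPUT (header row + data rows).
SAMPLE = [
    ('Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Species'),
    (5.8, 2.7, 5.1, 1.9, 'virginica'),
    (5.1, 3.5, 1.4, 0.2, 'setosa'),
    (5.7, 2.8, 4.1, 1.3, 'versicolor'),
    (6.3, 2.9, 5.6, 1.8, 'virginica'),
]
RATIO = 0.5  # assumption for this sketch; a named constant avoids a magic number

header, *data = SAMPLE            # separate the header from the data rows
pivot = int(len(data) * RATIO)    # slice indexes must be int, hence the cast

labels = [species for *_, species in data]
labels_train = labels[:pivot]
labels_test = labels[pivot:]

print(pivot)
# 2
print(labels_train, labels_test)
# ['virginica', 'setosa'] ['versicolor', 'virginica']
```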