5.19. DataFrame Sample

5.19.1. Sample n elements

df.sample()
#                      A         B         C         D
# 1970-01-05   0.589973  0.748417 -1.680741  0.510512
df.sample(2)
#                      A         B         C         D
# 1970-01-04  -0.974425  1.327082 -0.435516  1.328745
# 1970-01-01   0.131926 -1.825204 -1.909562  1.274718
df.sample(n=2, replace=True)
#                      A         B         C         D
# 1970-01-05   0.589973  0.748417 -1.680741  0.510512
# 1970-01-05   0.589973  0.748417 -1.680741  0.510512
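
The calls above can be tried on any DataFrame. The following is a minimal sketch, assuming a small hypothetical frame with dated rows and columns A to D (the values are random and are not the ones shown above). It also uses random_state, a parameter of sample() not used elsewhere in this section, to make a draw repeatable.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the df used above: 6 dated rows, columns A-D.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.standard_normal((6, 4)),
    index=pd.date_range("1970-01-01", periods=6),
    columns=list("ABCD"),
)

one = df.sample()                    # n defaults to 1
two = df.sample(n=2)                 # two distinct rows
dup = df.sample(n=10, replace=True)  # more rows than df holds, so rows repeat

# The same random_state gives the same draw on every call.
first = df.sample(n=2, random_state=42)
second = df.sample(n=2, random_state=42)
```

Without replace=True, asking for more rows than the frame contains raises a ValueError.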

5.19.2. Sample a fraction of elements

 • 0.05 is 5%

 • 1.0 is 100%

df.sample(frac=0.05)
#      Sepal length  Sepal width  Petal length  Petal width     Species
# 146           5.9          3.0           4.2          1.5  Versicolor
# 135           4.7          3.2           1.3          0.2      Setosa
# 15            6.6          3.0           4.4          1.4  Versicolor
# 68            5.0          3.6           1.4          0.2      Setosa
# 42            6.2          2.8           4.8          1.8   Virginica
# 10            6.5          3.0           5.2          2.0   Virginica
# 17            5.8          2.7           5.1          1.9   Virginica
# 66            5.4          3.4           1.7          0.2      Setosa
df.sample(frac=0.05).reset_index(drop=True)
#    Sepal length  Sepal width  Petal length  Petal width     Species
# 0           5.9          3.0           4.2          1.5  Versicolor
# 1           4.7          3.2           1.3          0.2      Setosa
# 2           6.6          3.0           4.4          1.4  Versicolor
# 3           5.0          3.6           1.4          0.2      Setosa
# 4           6.2          2.8           4.8          1.8   Virginica
# 5           6.5          3.0           5.2          2.0   Virginica
# 6           5.8          2.7           5.1          1.9   Virginica
# 7           5.4          3.4           1.7          0.2      Setosa
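
To see what frac and reset_index do to the row count and the index, the sketch below uses a hypothetical 150-row frame with a single column named value in place of the Iris data. The fixed random_state is there only to make the run repeatable.

```python
import pandas as pd

# Hypothetical 150-row frame standing in for the Iris data.
df = pd.DataFrame({"value": range(150)})

# 5% of 150 rows is 7.5, which sample() rounds to 8 rows here.
sampled = df.sample(frac=0.05, random_state=0)

# drop=True discards the old row labels and numbers the rows 0..7.
dropped = sampled.reset_index(drop=True)

# Without drop=True the old labels are kept in a new column named "index".
kept = sampled.reset_index()
```

The sampled frame keeps the original row labels, which is why the first output above shows labels such as 146 and 135 while the second starts again from 0.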

5.19.2.1. Assignments

5.19.3. Iris Clean

 • Complexity level: easy

 • Lines of code to write: 5 lines

 • Estimated time of completion: 10 min

 • Filename: solution/df_select.py

English
 1. Download the Iris Dataset data/iris.csv

 2. Using Pandas, read it into a pd.DataFrame

 3. Put all rows in random order

 4. Reset the index without keeping a backup copy of the old one
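
One possible approach to the steps above is sketched below. It is not the reference solution. Because data/iris.csv is not reproduced here, the sketch reads a few hypothetical rows from an in-memory string instead. The column names are assumptions and may differ from the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/iris.csv; replace with the real path.
CSV = io.StringIO(
    "sepal_length,sepal_width,petal_length,petal_width,species\n"
    "5.1,3.5,1.4,0.2,setosa\n"
    "7.0,3.2,4.7,1.4,versicolor\n"
    "6.3,3.3,6.0,2.5,virginica\n"
    "4.9,3.0,1.4,0.2,setosa\n"
    "6.4,3.2,4.5,1.5,versicolor\n"
)

df = pd.read_csv(CSV)

# frac=1.0 samples 100% of the rows, which shuffles them.
# drop=True resets the index without keeping the old one as a column.
shuffled = df.sample(frac=1.0).reset_index(drop=True)
```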