5.12. Series Statistics

import pandas as pd
import numpy as np

s = pd.Series(
    data=[1.0, 2.0, 3.0, np.nan, 5.0],
    index=['a', 'b', 'c', 'd', 'e'])

s
# a    1.0
# b    2.0
# c    3.0
# d    NaN
# e    5.0
# dtype: float64

5.12.1. Count

s.count()
# 4
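
count() ignores missing values, which is why it returns 4 for a five-element Series. A minimal sketch comparing it with the total size, using the same data as above (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

total = s.size            # all entries, NaN included
non_null = s.count()      # entries that are not NaN
missing = s.isna().sum()  # entries that are NaN

print(total, non_null, missing)
# 5 4 1
```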

5.12.2. Sum

s.sum()
# 11.0
s.cumsum()
# a     1.0
# b     3.0
# c     6.0
# d     NaN
# e    11.0
# dtype: float64
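
By default these reductions skip NaN. The sketch below, which assumes the same data as above, shows how the skipna and min_count parameters change that behaviour:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

# skipna=False lets a single NaN propagate into the result
strict_total = s.sum(skipna=False)

# min_count=5 requires five non-NaN values; only four are present
guarded_total = s.sum(min_count=5)

# once a NaN is reached, every later cumulative value is NaN as well
strict_cumsum = s.cumsum(skipna=False)

print(strict_total, guarded_total)
print(strict_cumsum)
```

The same two parameters are accepted by prod(), and skipna by cumprod().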

5.12.3. Product

s.prod()
# 30.0
s.cumprod()
# a     1.0
# b     2.0
# c     6.0
# d     NaN
# e    30.0
# dtype: float64

5.12.4. Extremes

5.12.4.1. Minimum

s.min()
# 1.0
s.idxmin()
# 'a'

5.12.4.2. Maximum

s.max()
# 5.0
s.idxmax()
# 'e'

5.12.5. Average

5.12.5.1. Mean

s.mean()
# 2.75

5.12.5.2. Median

s.median()
# 2.5

5.12.5.3. Standard Deviation

s.std()
# 1.707825127659933
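
The value above is the sample standard deviation (divisor n - 1). A short sketch of the ddof parameter, assuming the same data as above, shows how to get the population figure and why NumPy may report a different number:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

sample_std = s.std()            # ddof=1 (pandas default): divide by n - 1
population_std = s.std(ddof=0)  # ddof=0: divide by n

# NumPy defaults to ddof=0, so it matches the population figure
numpy_std = np.std(s.dropna().to_numpy())

print(sample_std, population_std, numpy_std)
```

var() accepts the same ddof parameter.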

5.12.6. Distribution

5.12.6.1. Quantile

  • Also known as a percentile; q is given as a fraction between 0 and 1, so .3 is the 30th percentile

s.quantile(.3)
# 1.9

s.quantile([.25, .5, .75])
# 0.25    1.75
# 0.50    2.50
# 0.75    3.50
# dtype: float64
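
The results above come from linear interpolation between neighbouring values, which is the default. As a sketch under the same data, the interpolation parameter selects other rules:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

# the 30% point falls between the sorted values 1.0 and 2.0
linear = s.quantile(.3)                              # interpolate between them
lower = s.quantile(.3, interpolation='lower')        # take the smaller neighbour
higher = s.quantile(.3, interpolation='higher')      # take the larger neighbour
midpoint = s.quantile(.3, interpolation='midpoint')  # average the two neighbours

print(linear, lower, higher, midpoint)
```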

5.12.6.2. Variance

s.var()
# 2.9166666666666665

5.12.6.3. Correlation Coefficient

s.corr(s)
# 1.0
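
Correlating a Series with itself is the trivial case. A slightly more useful sketch, with two made-up Series t and u that share the index of s, shows the default Pearson coefficient; pairs where either value is NaN are left out:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

# t is an exact decreasing linear function of s (illustrative data)
t = 10 - 2 * s
r_linear = s.corr(t)  # close to -1.0

# u only roughly follows s (illustrative data)
u = pd.Series([2.0, 1.0, 4.0, 3.0, 6.0], index=s.index)
r_rough = s.corr(u)   # strictly between 0 and 1

print(r_linear, r_rough)
```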

5.12.7. Describe

s.describe()
# count    4.000000
# mean     2.750000
# std      1.707825
# min      1.000000
# 25%      1.750000
# 50%      2.500000
# 75%      3.500000
# max      5.000000
# dtype: float64
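
The quartiles in the summary can be swapped for other percentiles. A minimal sketch, assuming the same data as above, using the percentiles parameter (the median is always kept):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

# request the 10th and 90th percentiles instead of the quartiles
summary = s.describe(percentiles=[.1, .9])

print(summary)
```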

5.12.8. Other Methods

Table 5.4. Descriptive statistics

Function   Description
---------  ---------------------------------------------------
count      Number of non-null observations
sum        Sum of values
mean       Mean of values
mad        Mean absolute deviation (removed in pandas 2.0)
median     Arithmetic median of values
min        Minimum
max        Maximum
mode       Mode
abs        Absolute value (element-wise)
prod       Product of values
std        Unbiased standard deviation
var        Unbiased variance
sem        Unbiased standard error of the mean
skew       Unbiased skewness (3rd moment)
kurt       Unbiased kurtosis (4th moment)
quantile   Sample quantile (value at %)
cumsum     Cumulative sum
cumprod    Cumulative product
cummax     Cumulative maximum
cummin     Cumulative minimum
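
Several of the listed methods are not demonstrated earlier in this section. The sketch below runs them on the same data; the manual mean absolute deviation is an assumption about how one might replace mad(), which is no longer available in pandas 2.0 and later:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, np.nan, 5.0], index=['a', 'b', 'c', 'd', 'e'])

modes = s.mode().tolist()  # every value occurs once, so all four are modes
sem = s.sem()              # standard error of the mean: std / sqrt(count)
skewness = s.skew()
kurtosis = s.kurt()
running_max = s.cummax()   # NaN stays NaN at position 'd'
running_min = s.cummin()

# hand-written replacement for the removed mad() method
mean_abs_dev = (s - s.mean()).abs().mean()

print(modes, sem, skewness, kurtosis, mean_abs_dev)
print(running_max)
print(running_min)
```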

5.12.9. Assignments

Todo

Create Assignments