4.5. Datetime Parse

4.5.1. Rationale

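  • Parsing - analyzing a string (originally: a sentence) into its parts and describing their syntactic roles. For dates, this means converting text such as '1961-04-12 06:07' into a `datetime` object.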

4.5.2. Parsing dates

Parsing a datetime from a string with `datetime.strptime()`:

>>> from datetime import datetime
>>>
>>>
>>> sputnik = '4 October 1957, 19:28:34 [UTC]'
>>> datetime.strptime(sputnik, '%d %B %Y, %H:%M:%S [%Z]')
datetime.datetime(1957, 10, 4, 19, 28, 34)
>>>
>>> gagarin = '1961-04-12 06:07'
>>> datetime.strptime(gagarin, '%Y-%m-%d %H:%M')
datetime.datetime(1961, 4, 12, 6, 7)
>>>
>>> armstrong = 'Jul 21, 69 2:56:15'
>>> datetime.strptime(armstrong, '%b %d, %y %I:%M:%S')
datetime.datetime(1969, 7, 21, 2, 56, 15)
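When the string does not match the format, `datetime.strptime()` raises `ValueError`. The following is a minimal sketch of handling that case, reusing the `gagarin` string from above; the helper name `parse_or_none` is hypothetical and not part of the `datetime` module:

```python
from datetime import datetime


def parse_or_none(text, fmt):
    """Return the parsed datetime, or None if `text` does not match `fmt`."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # strptime raises ValueError when the string and format disagree
        return None


gagarin = '1961-04-12 06:07'

ok = parse_or_none(gagarin, '%Y-%m-%d %H:%M')
bad = parse_or_none(gagarin, '%Y-%m-%d %H:%M:%S')  # string has no seconds

print(ok)   # 1961-04-12 06:07:00
print(bad)  # None
```

Returning `None` is only one possible design; re-raising with a clearer message may suit stricter code better.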

4.5.3. Use Case

>>> from datetime import datetime
>>>
>>>
>>> line = '1969-07-21T02:56:15.123 [WARNING] First step on the Moon'
>>>
>>> dt, lvl, msg = line.split(maxsplit=2)
>>> dt = datetime.strptime(dt, '%Y-%m-%dT%H:%M:%S.%f')
>>> lvl = lvl.strip('[]')
>>>
>>> print(dt)
1969-07-21 02:56:15.123000
>>>
>>> print(lvl)
WARNING
>>>
>>> print(msg)
First step on the Moon
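The same steps can be wrapped in a function so that every log line is parsed the same way. This is a sketch under the assumption that each line has the shape `TIMESTAMP [LEVEL] MESSAGE`; the function name `parse_log_line` and the dict keys are illustrative choices:

```python
from datetime import datetime


def parse_log_line(line):
    """Split 'TIMESTAMP [LEVEL] MESSAGE' into a dict (hypothetical helper)."""
    # maxsplit=2 keeps spaces inside the message intact
    when, level, message = line.split(maxsplit=2)
    return {
        'when': datetime.strptime(when, '%Y-%m-%dT%H:%M:%S.%f'),
        'level': level.strip('[]'),
        'message': message,
    }


line = '1969-07-21T02:56:15.123 [WARNING] First step on the Moon'
entry = parse_log_line(line)

print(entry['when'].isoformat())  # 1969-07-21T02:56:15.123000
print(entry['level'])             # WARNING
print(entry['message'])           # First step on the Moon
```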

4.5.4. Parsing Parameters

Table 4.3. Date and time parsing parameters

Directive  Example                                   Meaning
---------  ----------------------------------------  -------
%a         Sun, Mon, …, Sat                          Weekday as locale's abbreviated name
%A         Sunday, Monday, …, Saturday (en_US)       Weekday as locale's full name
%w         0, 1, …, 6                                Weekday as a decimal number, where 0 is Sunday and 6 is Saturday
%d         01, 02, …, 31                             Day of the month as a zero-padded decimal number
%b         Jan, Feb, …, Dec (en_US)                  Month as locale's abbreviated name
%B         January, February, …, December (en_US)    Month as locale's full name
%m         01, 02, …, 12                             Month as a zero-padded decimal number
%y         00, 01, …, 99                             Year without century as a zero-padded decimal number
%Y         0001, 0002, …, 2013, 2014, …, 9998, 9999  Year with century as a decimal number
%H         00, 01, …, 23                             Hour (24-hour clock) as a zero-padded decimal number
%I         01, 02, …, 12                             Hour (12-hour clock) as a zero-padded decimal number
%p         AM, PM (en_US)                            Locale's equivalent of either AM or PM
%M         00, 01, …, 59                             Minute as a zero-padded decimal number
%S         00, 01, …, 59                             Second as a zero-padded decimal number
%f         000000, 000001, …, 999999                 Microsecond as a decimal number, zero-padded on the left
%z         (empty), +0000, -0400, +1030              UTC offset in the form +HHMM or -HHMM (empty string if the object is naive)
%Z         (empty), UTC, EST, CST                    Time zone name (empty string if the object is naive)
%j         001, 002, …, 366                          Day of the year as a zero-padded decimal number
%U         00, 01, …, 53                             Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0
%W         00, 01, …, 53                             Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0
%c         Tue Aug 16 21:30:00 1988 (en_US)          Locale's appropriate date and time representation
%x         08/16/1988 (en_US); 16.08.1988 (de_DE)    Locale's appropriate date representation
%X         21:30:00                                  Locale's appropriate time representation
%%         %                                         A literal % character
%G         0001, 0002, …, 2013, 2014, …, 9998, 9999  ISO 8601 year with century representing the year that contains the greater part of the ISO week (%V)
%u         1, 2, …, 7                                ISO 8601 weekday as a decimal number where 1 is Monday
%V         01, 02, …, 53                             ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4
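A few of the less common directives in action. This is a short sketch; the input strings are made up for illustration and all describe the same day, 21 July 1969:

```python
from datetime import datetime

# %z parses a UTC offset and yields a timezone-aware datetime
aware = datetime.strptime('1969-07-21 02:56:15 +0000', '%Y-%m-%d %H:%M:%S %z')
print(aware.utcoffset())  # 0:00:00

# %j is the day of the year: day 202 of 1969 is July 21
by_day = datetime.strptime('1969 202', '%Y %j')
print(by_day)  # 1969-07-21 00:00:00

# %G and %V are used together with a weekday directive such as %u:
# ISO year 1969, ISO week 30, weekday 1 (Monday)
by_week = datetime.strptime('1969 30 1', '%G %V %u')
print(by_week)  # 1969-07-21 00:00:00
```

Note that only the `%z` example returns an aware object; the other two results are naive datetimes with the time set to midnight.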

4.5.5. Assignments

Code 4.13. Solution
"""
* Assignment: Datetime Parse US
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Define `result: datetime` with parsed date `DATA`
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: datetime` ze sparsowaną datą `DATA`
    2. Uruchom doctesty - wszystkie muszą się powieść

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is datetime, \
    'Variable `result` has invalid type, must be a datetime'

    >>> result
    datetime.datetime(1969, 7, 21, 2, 56, 15)
"""

from datetime import datetime


DATA = 'July 21, 1969 2:56:15 AM'

# datetime: DATA from US long format
result = ...

Code 4.14. Solution
"""
* Assignment: Datetime Parse Ordinals
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min

English:
    1. Define `result: datetime` with parsed date `DATA`
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: datetime` ze sparsowaną datą `DATA`
    2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `%dst`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is datetime, \
    'Variable `result` has invalid type, must be a datetime'

    >>> result
    datetime.datetime(1969, 7, 21, 2, 56, 15)
"""

from datetime import datetime


DATA = 'July 21st, 1969 2:56:15 AM'

# datetime: DATA from long US format with ordinals
result = ...

Code 4.15. Solution
"""
* Assignment: Datetime Parse List
* Complexity: medium
* Lines of code: 8 lines
* Time: 5 min

English:
    1. Define `result: list[datetime]` with parsed `DATA` dates
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: list[datetime]` ze sparsowanymi datami `DATA`
    2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `for ... in`
    * `try ... except`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, must be a list'

    >>> assert all(type(element) is datetime for element in result), \
    'All elements in `result` must be a datetime'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [datetime.datetime(1961, 4, 12, 6, 7),
     datetime.datetime(1961, 4, 12, 6, 7)]
"""

from datetime import datetime


DATA = ['1961-04-12 06:07',
        '1961-04-12 06:07:00']

# list[datetime]: parsed DATA
result = ...

Code 4.16. Solution
"""
* Assignment: Datetime Parse Many
* Complexity: medium
* Lines of code: 12 lines
* Time: 5 min

English:
    1. Define `result: list[datetime]` with parsed `DATA` dates
    2. Run doctests - all must succeed

Polish:
    1. Zdefiniuj `result: list[datetime]` ze sparsowanymi datami `DATA`
    2. Uruchom doctesty - wszystkie muszą się powieść

Hints:
    * `for ... in`
    * nested `try ... except`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, must be a list'

    >>> assert all(type(element) is datetime for element in result), \
    'All elements in `result` must be a datetime'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [datetime.datetime(1957, 10, 4, 19, 28, 34),
     datetime.datetime(1961, 4, 12, 6, 7),
     datetime.datetime(1969, 7, 21, 2, 56, 15)]
"""

from datetime import datetime


DATA = [
    '1957-10-04 19:28:34',  # Sputnik launch (first satellite in space)
    '1961-04-12 06:07',  # Gagarin launch (first human in space)
    '1969-07-21T02:56:15',  # Armstrong first step on the Moon
]

# list[datetime]: DATA elements in datetime format
result = ...

Code 4.17. Solution
"""
* Assignment: Datetime Parse Logs
* Complexity: medium
* Lines of code: 13 lines
* Time: 8 min

English:
    1. Iterate over `DATA` with Apollo 11 timeline [1]
    2. From each line extract date, time, level and message
    3. Collect data to `result: list[dict]`
    4. Run doctests - all must succeed

Polish:
    1. Iteruj po `DATA` z harmonogramem Apollo 11 [1]
    2. Dla każdej linii wyciągnij datę, czas, poziom logowania oraz wiadomość
    3. Zbierz dane do `result: list[dict]`
    4. Uruchom doctesty - wszystkie muszą się powieść

Hint:
    * Note that the last time has no seconds
    * This is not a bug; the time without seconds appears in the NASA history records [1]

References:
    [1] National Aeronautics and Space Administration.
        Apollo 11 timeline.
        Year: 1969. Retrieved: 2021-03-25.
        URL: https://history.nasa.gov/SP-4029/Apollo_11i_Timeline.htm

Hints:
    * `str.splitlines()`
    * `str.split(', ', maxsplit=3)`
    * `try ... except`

Tests:
    >>> import sys; sys.tracebacklimit = 0

    >>> assert type(result) is list, \
    'Variable `result` has invalid type, must be a list'

    >>> assert all(type(row) is dict for row in result), \
    'All elements in result must be dict'

    >>> result  # doctest: +NORMALIZE_WHITESPACE
    [{'when': datetime.datetime(1969, 7, 14, 21, 0),
      'level': 'INFO',
      'message': 'Terminal countdown started'},
     {'when': datetime.datetime(1969, 7, 16, 13, 31, 53),
      'level': 'WARNING',
      'message': 'S-IC engine ignition (#5)'},
     {'when': datetime.datetime(1969, 7, 16, 13, 33, 23),
      'level': 'DEBUG',
      'message': 'Maximum dynamic pressure (735.17 lb/ft^2)'},
     {'when': datetime.datetime(1969, 7, 16, 13, 34, 44),
      'level': 'WARNING',
      'message': 'S-II ignition'},
     {'when': datetime.datetime(1969, 7, 16, 13, 35, 17),
      'level': 'DEBUG',
      'message': 'Launch escape tower jettisoned'},
     {'when': datetime.datetime(1969, 7, 16, 13, 39, 40),
      'level': 'DEBUG',
      'message': 'S-II center engine cutoff'},
     {'when': datetime.datetime(1969, 7, 16, 16, 22, 13),
      'level': 'INFO',
      'message': 'Translunar injection'},
     {'when': datetime.datetime(1969, 7, 16, 16, 56, 3),
      'level': 'INFO',
      'message': 'CSM docked with LM/S-IVB'},
     {'when': datetime.datetime(1969, 7, 16, 17, 21, 50),
      'level': 'INFO',
      'message': 'Lunar orbit insertion ignition'},
     {'when': datetime.datetime(1969, 7, 16, 21, 43, 36),
      'level': 'INFO',
      'message': 'Lunar orbit circularization ignition'},
     {'when': datetime.datetime(1969, 7, 20, 17, 44),
      'level': 'INFO',
      'message': 'CSM/LM undocked'},
     {'when': datetime.datetime(1969, 7, 20, 20, 5, 5),
      'level': 'WARNING',
      'message': 'LM powered descent engine ignition'},
     {'when': datetime.datetime(1969, 7, 20, 20, 10, 22),
      'level': 'ERROR',
      'message': 'LM 1202 alarm'},
     {'when': datetime.datetime(1969, 7, 20, 20, 14, 18),
      'level': 'ERROR',
      'message': 'LM 1201 alarm'},
     {'when': datetime.datetime(1969, 7, 20, 20, 17, 39),
      'level': 'WARNING',
      'message': 'LM lunar landing'},
     {'when': datetime.datetime(1969, 7, 21, 2, 39, 33),
      'level': 'DEBUG',
      'message': 'EVA started (hatch open)'},
     {'when': datetime.datetime(1969, 7, 21, 2, 56, 15),
      'level': 'WARNING',
      'message': '1st step taken lunar surface (CDR)'},
     {'when': datetime.datetime(1969, 7, 21, 2, 56, 15),
      'level': 'WARNING',
      'message': 'Neil Armstrong first words on the Moon'},
     {'when': datetime.datetime(1969, 7, 21, 3, 5, 58),
      'level': 'DEBUG',
      'message': 'Contingency sample collection started (CDR)'},
     {'when': datetime.datetime(1969, 7, 21, 3, 15, 16),
      'level': 'INFO',
      'message': 'LMP on lunar surface'},
     {'when': datetime.datetime(1969, 7, 21, 5, 11, 13),
      'level': 'DEBUG',
      'message': 'EVA ended (hatch closed)'},
     {'when': datetime.datetime(1969, 7, 21, 17, 54),
      'level': 'WARNING',
      'message': 'LM lunar liftoff ignition (LM APS)'},
     {'when': datetime.datetime(1969, 7, 21, 21, 35),
      'level': 'INFO',
      'message': 'CSM/LM docked'},
     {'when': datetime.datetime(1969, 7, 22, 4, 55, 42),
      'level': 'WARNING',
      'message': 'Transearth injection ignition (SPS)'},
     {'when': datetime.datetime(1969, 7, 24, 16, 21, 12),
      'level': 'INFO',
      'message': 'CM/SM separation'},
     {'when': datetime.datetime(1969, 7, 24, 16, 35, 5),
      'level': 'WARNING',
      'message': 'Entry'},
     {'when': datetime.datetime(1969, 7, 24, 16, 50, 35),
      'level': 'WARNING',
      'message': 'Splashdown (went to apex-down)'},
     {'when': datetime.datetime(1969, 7, 24, 17, 29),
      'level': 'INFO',
      'message': 'Crew egress'}]
"""
from datetime import date, datetime, time


DATA = """1969-07-14, 21:00:00, INFO, Terminal countdown started
1969-07-16, 13:31:53, WARNING, S-IC engine ignition (#5)
1969-07-16, 13:33:23, DEBUG, Maximum dynamic pressure (735.17 lb/ft^2)
1969-07-16, 13:34:44, WARNING, S-II ignition
1969-07-16, 13:35:17, DEBUG, Launch escape tower jettisoned
1969-07-16, 13:39:40, DEBUG, S-II center engine cutoff
1969-07-16, 16:22:13, INFO, Translunar injection
1969-07-16, 16:56:03, INFO, CSM docked with LM/S-IVB
1969-07-16, 17:21:50, INFO, Lunar orbit insertion ignition
1969-07-16, 21:43:36, INFO, Lunar orbit circularization ignition
1969-07-20, 17:44:00, INFO, CSM/LM undocked
1969-07-20, 20:05:05, WARNING, LM powered descent engine ignition
1969-07-20, 20:10:22, ERROR, LM 1202 alarm
1969-07-20, 20:14:18, ERROR, LM 1201 alarm
1969-07-20, 20:17:39, WARNING, LM lunar landing
1969-07-21, 02:39:33, DEBUG, EVA started (hatch open)
1969-07-21, 02:56:15, WARNING, 1st step taken lunar surface (CDR)
1969-07-21, 02:56:15, WARNING, Neil Armstrong first words on the Moon
1969-07-21, 03:05:58, DEBUG, Contingency sample collection started (CDR)
1969-07-21, 03:15:16, INFO, LMP on lunar surface
1969-07-21, 05:11:13, DEBUG, EVA ended (hatch closed)
1969-07-21, 17:54:00, WARNING, LM lunar liftoff ignition (LM APS)
1969-07-21, 21:35:00, INFO, CSM/LM docked
1969-07-22, 04:55:42, WARNING, Transearth injection ignition (SPS)
1969-07-24, 16:21:12, INFO, CM/SM separation
1969-07-24, 16:35:05, WARNING, Entry
1969-07-24, 16:50:35, WARNING, Splashdown (went to apex-down)
1969-07-24, 17:29, INFO, Crew egress"""

# list[dict]: representation of DATA; dict keys: when, level, message
result = ...