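11.5. Datetime Parse
Parsing: analyzing a string (originally, a sentence) into its parts and describing their syntactic roles. In this chapter, parsing means converting a textual date into a datetime object.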
11.5.1. Parsing Dates
>>> from datetime import datetime
Parsing a datetime from a string:
>>> x = '1961-04-12 06:07'
>>>
>>> datetime.strptime(x, '%Y-%m-%d %H:%M')
datetime.datetime(1961, 4, 12, 6, 7)
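As a small illustration (a sketch, not one of the original examples), fields that the format string does not mention fall back to defaults: the date becomes 1900-01-01 and the time becomes midnight. The variable names below are made up for the example.

```python
from datetime import datetime

# Only a date: the time part defaults to 00:00:00
only_date = datetime.strptime('1961-04-12', '%Y-%m-%d')
print(only_date)            # 1961-04-12 00:00:00

# Only a time: the date part defaults to 1900-01-01
only_time = datetime.strptime('06:07', '%H:%M')
print(only_time)            # 1900-01-01 06:07:00

# Use .date() or .time() to keep just the part that was parsed
print(only_date.date())     # 1961-04-12
print(only_time.time())     # 06:07:00
```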
11.5.2. Leading Zero
Note that when parsing dates without a leading zero, you do not use %#H or %-H as you would for formatting. Simply use %H to capture the hour:
>>> x = '1961-04-12 6:07'
>>>
>>> datetime.strptime(x, '%Y-%m-%d %H:%M')
datetime.datetime(1961, 4, 12, 6, 7)
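A short sketch, with hypothetical variable names, suggesting that the same holds for the other numeric directives used here: when parsing, `%m`, `%d`, `%H` and `%M` accept values both with and without a leading zero.

```python
from datetime import datetime

FORMAT = '%Y-%m-%d %H:%M'

padded = datetime.strptime('1961-04-12 06:07', FORMAT)
unpadded = datetime.strptime('1961-4-12 6:07', FORMAT)

# Both spellings produce the same datetime object
print(padded == unpadded)   # True
print(unpadded)             # 1961-04-12 06:07:00
```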
11.5.3. String Fitting
If the string contains any other characters, such as commas, brackets, spaces, colons, or dashes, they must be reflected in the format string.
>>> x = 'Apr 12th, 1961 6:07 am'
>>>
>>> datetime.strptime(x, '%b %dth, %Y %I:%M %p')
datetime.datetime(1961, 4, 12, 6, 7)
>>> x = '12 April 1961 at 6:07 am'
>>>
>>> datetime.strptime(x, '%d %B %Y at %I:%M %p')
datetime.datetime(1961, 4, 12, 6, 7)
Omitting any of those characters from the format string results in an error:
>>> x = '12 April 1961 at 6:07 am'
>>>
>>> datetime.strptime(x, '%d %B %Y %I:%M %p')
Traceback (most recent call last):
ValueError: time data '12 April 1961 at 6:07 am' does not match format '%d %B %Y %I:%M %p'
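Because a mismatch raises ValueError, input of uncertain shape is often parsed inside `try ... except`. The helper below is a minimal sketch of that idea; `parse_or_none` is a hypothetical name, not a function from the standard library.

```python
from datetime import datetime


def parse_or_none(text, fmt):
    """Return the parsed datetime, or None when text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


x = '12 April 1961 at 6:07 am'

good = parse_or_none(x, '%d %B %Y at %I:%M %p')
bad = parse_or_none(x, '%d %B %Y %I:%M %p')     # literal "at" is missing

print(good)     # 1961-04-12 06:07:00
print(bad)      # None
```

Returning None is only one possible design; re-raising with a clearer message or logging the offending string may suit other programs better.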
11.5.4. Time Zone
More information in Datetime Timezone.
>>> x = '12 April 1961 6:07 UTC'
>>>
>>> datetime.strptime(x, '%d %B %Y %H:%M %Z')
datetime.datetime(1961, 4, 12, 6, 7)
>>> x = '1961-04-12 6:07 local'
>>>
>>> datetime.strptime(x, '%Y-%m-%d %H:%M')
Traceback (most recent call last):
ValueError: unconverted data remains: local
>>> x = '1961-04-12 6:07 local'
>>>
>>> datetime.strptime(x, '%Y-%m-%d %H:%M %Z')
Traceback (most recent call last):
ValueError: time data '1961-04-12 6:07 local' does not match format '%Y-%m-%d %H:%M %Z'
>>> x = '1961-04-12 6:07 local'
>>>
>>> datetime.strptime(x, '%Y-%m-%d %H:%M local')
datetime.datetime(1961, 4, 12, 6, 7)
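Note that the result of parsing with %Z above is still a naive datetime. The following sketch shows two ways to obtain a timezone-aware object with the standard library: parsing a numeric offset with `%z`, or attaching a `tzinfo` afterwards with `.replace()`. The sample string with `+0000` is an assumption made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# %Z accepted the name, but the result carries no tzinfo
naive = datetime.strptime('12 April 1961 6:07 UTC', '%d %B %Y %H:%M %Z')
print(naive.tzinfo)         # None

# %z parses a numeric UTC offset and returns an aware datetime
aware = datetime.strptime('1961-04-12 6:07 +0000', '%Y-%m-%d %H:%M %z')
print(aware.utcoffset())    # 0:00:00

# Alternatively, attach the timezone explicitly after parsing
attached = naive.replace(tzinfo=timezone.utc)
print(attached == aware)    # True
```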
11.5.5. Parsing Parameters
Directive | Example | Meaning |
---|---|---|
%a | Sun, Mon, ..., Sat | Weekday as locale's abbreviated name |
%A | Sunday, Monday, ..., Saturday (en_US) | Weekday as locale's full name |
%w | 0, 1, ..., 6 | Weekday as a decimal number, where 0 is Sunday and 6 is Saturday |
%d | 01, 02, ..., 31 | Day of the month as a zero-padded decimal number |
%b | Jan, Feb, ..., Dec (en_US) | Month as locale's abbreviated name |
%B | January, February, ..., December (en_US) | Month as locale's full name |
%m | 01, 02, ..., 12 | Month as a zero-padded decimal number |
%y | 00, 01, ..., 99 | Year without century as a zero-padded decimal number |
%Y | 0001, 0002, ..., 2013, 2014, ..., 9998, 9999 | Year with century as a decimal number |
%H | 00, 01, ..., 23 | Hour (24-hour clock) as a zero-padded decimal number |
%I | 01, 02, ..., 12 | Hour (12-hour clock) as a zero-padded decimal number |
%p | AM, PM (en_US) | Locale's equivalent of either AM or PM |
%M | 00, 01, ..., 59 | Minute as a zero-padded decimal number |
%S | 00, 01, ..., 59 | Second as a zero-padded decimal number |
%f | 000000, 000001, ..., 999999 | Microsecond as a decimal number, zero-padded on the left |
%z | (empty), +0000, -0400, +1030 | UTC offset in the form +HHMM or -HHMM (empty string if the object is naive) |
%Z | (empty), UTC, EST, CST | Time zone name (empty string if the object is naive) |
%j | 001, 002, ..., 366 | Day of the year as a zero-padded decimal number |
%U | 00, 01, ..., 53 | Week number of the year (Sunday as the first day of the week) as a zero-padded decimal number. All days in a new year preceding the first Sunday are considered to be in week 0 |
%W | 00, 01, ..., 53 | Week number of the year (Monday as the first day of the week) as a decimal number. All days in a new year preceding the first Monday are considered to be in week 0 |
%c | Tue Aug 16 21:30:00 1988 (en_US) | Locale's appropriate date and time representation |
%x | 08/16/1988 (en_US); 16.08.1988 (de_DE) | Locale's appropriate date representation |
%X | 21:30:00 | Locale's appropriate time representation |
%% | % | A literal '%' character |
%G | 0001, 0002, ..., 2013, 2014, ..., 9998, 9999 | ISO 8601 year with century representing the year that contains the greater part of the ISO week (%V) |
%u | 1, 2, ..., 7 | ISO 8601 weekday as a decimal number where 1 is Monday |
%V | 01, 02, ..., 53 | ISO 8601 week as a decimal number with Monday as the first day of the week. Week 01 is the week containing Jan 4 |
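A few of the less common directives can be sketched on the same date used throughout this chapter (12 April 1961, a Wednesday in ISO week 15). The input strings below are assumptions made up for illustration.

```python
from datetime import datetime

# %j: day of the year (12 April is day 102 of 1961)
by_day_of_year = datetime.strptime('1961-102', '%Y-%j')
print(by_day_of_year)               # 1961-04-12 00:00:00

# %f: microseconds; fewer than six digits are padded on the right
with_fraction = datetime.strptime('06:07:00.25', '%H:%M:%S.%f')
print(with_fraction.microsecond)    # 250000

# %G, %V and %u: ISO 8601 year, week and weekday, used together
by_iso_week = datetime.strptime('1961-W15-3', '%G-W%V-%u')
print(by_iso_week)                  # 1961-04-12 00:00:00

# %%: a literal percent sign in the input
with_percent = datetime.strptime('12.04.1961 100%', '%d.%m.%Y 100%%')
print(with_percent)                 # 1961-04-12 00:00:00
```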
11.5.6. Assignments
"""
* Assignment: Datetime Parse US
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min
English:
1. Define `result: datetime` with parsed date `DATA`
2. Run doctests - all must succeed
Hints:
* 12-hour clock
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is datetime, \
'Variable `result` has invalid type, must be a datetime'
>>> result
datetime.datetime(1969, 7, 21, 2, 56, 15)
"""
from datetime import datetime
DATA = 'July 21, 1969 2:56:15 AM'
# DATA from US long format
# type: datetime
result = ...
"""
* Assignment: Datetime Parse Ordinals
* Complexity: easy
* Lines of code: 1 line
* Time: 3 min
English:
1. Define `result: datetime` with parsed date `DATA`
2. Run doctests - all must succeed
Hints:
* 12-hour clock
* `%dst`
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> assert type(result) is datetime, \
'Variable `result` has invalid type, must be a datetime'
>>> result
datetime.datetime(1969, 7, 21, 2, 56, 15)
"""
from datetime import datetime
DATA = 'July 21st, 1969 2:56:15 AM'
# DATA from long US format with ordinals
# type: datetime
result = ...
"""
* Assignment: Datetime Parse List
* Complexity: medium
* Lines of code: 8 lines
* Time: 3 min
English:
1. Define `result: list[date]` with parsed `DATA` dates
2. Run doctests - all must succeed
Hints:
* `for ... in`
* `try ... except`
* `dt.strptime()`
* `dt.date()`
* `list.append()`
* 24-hour clock
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> from pprint import pprint
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> result = list(result)
>>> assert type(result) is list, \
'Variable `result` has invalid type, must be a list'
>>> assert all(type(element) is date for element in result), \
'All elements in `result` must be a date'
>>> pprint(result, width=30)
[datetime.date(1957, 10, 4),
 datetime.date(1961, 4, 12)]
"""
from datetime import date, datetime
DATA = [
    'October 4, 1957',  # Sputnik launch (first satellite in space)
    'Apr 12, 1961',  # Gagarin launch (first human in space)
]
# parsed DATA
# type: list[date]
result = ...
"""
* Assignment: Datetime Parse Many
* Complexity: medium
* Lines of code: 12 lines
* Time: 5 min
English:
1. Define `result: list[datetime]` with parsed `DATA` dates
2. Run doctests - all must succeed
Hints:
* `for ... in`
* nested `try ... except`
* `FORMATS = []`
* `for fmt in FORMATS`
* helper function
* 24-hour clock
Tests:
>>> import sys; sys.tracebacklimit = 0
>>> assert result is not Ellipsis, \
'Assign result to variable: `result`'
>>> result = list(result)
>>> assert type(result) is list, \
'Variable `result` has invalid type, must be a list'
>>> assert all(type(element) is datetime for element in result), \
'All elements in `result` must be a datetime'
>>> result # doctest: +NORMALIZE_WHITESPACE
[datetime.datetime(1957, 10, 4, 19, 28, 34),
datetime.datetime(1961, 4, 12, 6, 7),
datetime.datetime(1969, 7, 21, 2, 56, 15)]
"""
from datetime import datetime
DATA = [
    'Oct 4, 1957 19:28:34',  # Sputnik launch (first satellite in space)
    'April 12, 1961 6:07',  # Gagarin launch (first human in space)
    'July 21, 1969 2:56:15',  # Armstrong first step on the Moon
]
FORMATS = [
    '%b %d, %Y %H:%M:%S',
    '%B %d, %Y %H:%M',
    '%B %d, %Y %H:%M:%S',
]
# DATA elements in datetime format
# type: list[datetime]
result = ...