3.5. Type str

3.5.1. Type Definition

Listing 24. str Type Definition
my_str = ''
my_str = 'Jan Twardowski'
my_str = "Jan Twardowski"
my_str = '''Jan Twardowski'''
my_str = """Jan Twardowski"""
Listing 25. Multiline str
my_str = """First line
Second line
Third line"""
# 'First line\nSecond line\nThird line'
Listing 26. Multiline str
my_str = """
    First line
    Second line
    Third line
"""
# '\n    First line\n    Second line\n    Third line\n'
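
The indentation in Listing 26 becomes part of the value. One possible way to keep the source indented without keeping those spaces in the str is `textwrap.dedent` from the standard library. The sketch below is an illustration added here, not one of the original listings; the names `indented` and `cleaned` are arbitrary.

```python
from textwrap import dedent

indented = """
    First line
    Second line
    Third line
"""

# dedent() removes the whitespace common to all lines;
# strip() then drops the leading and trailing newline.
cleaned = dedent(indented).strip()

print(repr(indented))
print(repr(cleaned))
```

Whether to dedent depends on the use case: for text shown to a user the extra spaces usually matter, for a throwaway value they may not.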
Listing 27. A triple-quoted str assigned to a variable is a multiline str; as the first statement in a module, function, or class it is a docstring.
"""
We choose to go to the Moon!
We choose to go to the Moon in this decade and do the other things,
not because they are easy, but because they are hard;
because that goal will serve to organize and measure the best of our energies and skills,
because that challenge is one that we are willing to accept, one we are unwilling to postpone,
and one we intend to win, and the others, too.
"""

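The difference between a docstring and a multiline str can be checked at runtime. A minimal sketch, assuming a hypothetical function `moon_speech` (not part of the original material): the literal placed as the first statement of the body is stored in `__doc__`, while the literal assigned to a name is an ordinary str.

```python
def moon_speech():
    """We choose to go to the Moon!"""


# First statement of the function body: stored as the docstring.
print(moon_speech.__doc__)

# Assigned to a name: an ordinary multiline str.
quote = """We choose
to go to the Moon!"""
print(quote.count('\n'))
```
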
3.5.2. Type Casting

Listing 28. str() converts argument to str
str('hello')        # 'hello'
str(1969)           # '1969'
str(13.37)          # '13.37'

3.5.3. Single or Double Quotes?

  • " and ' work the same

  • Choose one and use it consistently throughout the code

  • The Python console displays str values using '

  • It matters for doctest, which compares expected and actual output character by character

  • For multiline str always use double quote characters (""") to be consistent with the docstring convention in PEP 257
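
The points above can be checked with a short sketch. It only illustrates how `repr()` (the form the console echoes) chooses its quotes; the variable names are arbitrary.

```python
single = 'Jan Twardowski'
double = "Jan Twardowski"

# Both literals create the same value.
print(single == double)     # True

# repr() prefers single quotes...
print(repr(double))         # 'Jan Twardowski'

# ...and switches to double quotes when the text contains
# an apostrophe and no double quote.
print(repr('It\'s'))        # "It's"
```

This is why a doctest written with the "wrong" quote style in its expected output fails even though the values are equal.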

Listing 29. When to use double quotes?
my_str = 'It\'s Twardowski\'s Moon.'
my_str = "It's Twardowski's Moon."
  • HTML and XML use double quotes

Listing 30. When to use single quotes?
my_str = '<a href="http://python.astrotech.io">Python and Machine Learning</a>'
Listing 31. For multiline str always use double quote characters to be consistent with the docstring convention PEP 257
# A " directly before the closing """ must be escaped
my_str = """My name's "José Jiménez\""""
my_str = '''My name\'s "José Jiménez"'''

3.5.4. Escape Characters

  • \r\n - is used on Windows

  • \n - is used everywhere else

[Figure: type-machine.jpg]

Figure 17. Why do we have '\r\n' on Windows?

Table 6. Frequently used escape characters

Sequence    Description
\n          New line (LF - Linefeed)
\r          Carriage Return (CR)
\t          Horizontal Tab (TAB)
\'          Single quote '
\"          Double quote "
\\          Backslash \

Table 7. Less frequently used escape characters

Sequence      Description
\a            Bell (BEL)
\b            Backspace (BS)
\f            New page (FF - Form Feed)
\v            Vertical Tab (VT)
\uF680        Character with 16-bit (2 bytes) hex value F680
\U0001F680    Character with 32-bit (4 bytes) hex value 0001F680
\101          Character with octal value 101 (one to three octal digits, no letter o)
\x41          Character with hex value 41 (exactly two hex digits)

print('\U0001F680')     # 🚀
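
A short sketch of how escape sequences behave once the literal is evaluated. The sample texts are made up for illustration.

```python
# Each escape sequence is a single character in the resulting str.
print(len('\n'))            # 1
print(len('\t'))            # 1
print(len('\\'))            # 1
print(len('\U0001F680'))    # 1

# Octal and hex escapes name a character by its code.
print('\101' == 'A')        # True (octal 101 is decimal 65)
print('\x41' == 'A')        # True (hex 41 is decimal 65)

# splitlines() treats both '\n' and '\r\n' as line endings.
unix_text = 'First line\nSecond line'
windows_text = 'First line\r\nSecond line'
print(unix_text.splitlines() == windows_text.splitlines())   # True
```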

3.5.5. String Literals

3.5.5.1. Format String

  • String interpolation (variable substitution)

  • Since Python 3.6

  • Used for str concatenation

name = 'José Jiménez'

print(f'My name... {name}')
# My name... José Jiménez
first_name = 'Jan'
last_name = 'Twardowski'

name = f'{first_name} {last_name}'
# Jan Twardowski
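
Beyond plain variable names, the braces of an f-string accept expressions, format specifiers and conversions. A minimal sketch with made-up values:

```python
name = 'Jan Twardowski'
year = 1969
value = 13.37

print(f'{name.upper()}')        # JAN TWARDOWSKI
print(f'{year + 1}')            # 1970
print(f'{value:.1f}')           # 13.4
print(f'{name!r}')              # 'Jan Twardowski'
print(f'{{literal braces}}')    # {literal braces}
```

Doubling the braces is the way to get a literal { or } into the result.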

3.5.5.2. Unicode Literals

  • In Python 3 str is Unicode

  • In Python 2 str is Bytes

  • In Python 3 u'...' is only for compatibility with Python 2

u'zażółć gęślą jaźń'

3.5.5.3. Bytes Literals

  • Used while reading from low level devices and drivers

  • Used in sockets and HTTP connections

  • bytes is a sequence of octets (integers between 0 and 255)

  • bytes.decode() converts bytes to str

  • str.encode() converts str to bytes

b'this is bytes literals'
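
The round trip between str and bytes can be sketched as follows. UTF-8 is assumed as the encoding here; other encodings would produce different bytes.

```python
text = 'zażółć gęślą jaźń'

data = text.encode('utf-8')         # str -> bytes
restored = data.decode('utf-8')     # bytes -> str

print(type(data))                   # <class 'bytes'>
print(restored == text)             # True

# Non-ASCII characters take more than one byte in UTF-8,
# so the bytes object is longer than the str.
print(len(text), len(data))

# Indexing bytes yields integers between 0 and 255.
print(b'abc'[0])                    # 97
```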

3.5.5.4. Raw String

  • Escape sequences are not interpreted; a backslash stays a literal backslash

Listing 32. In Regular Expressions
r'[a-z0-9]\n'
print(r'C:\Users\Admin\file.txt')
# C:\Users\Admin\file.txt
print('C:\Users\Admin\file.txt')
# SyntaxError: (unicode error) 'unicodeescape'
#   codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape
  • Problem: \Users

  • After \U Python expects a Unicode code point written as eight hexadecimal digits

  • s is not a valid hexadecimal digit
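
The effect of the r prefix can be verified directly. A minimal sketch; the sample sentence passed to `re.findall` is made up for illustration.

```python
import re

# In a normal literal '\n' is one character; in a raw literal it is two.
print(len('\n'))     # 1
print(len(r'\n'))    # 2

# A raw literal equals the normal literal with doubled backslashes.
print(r'C:\Users\Admin\file.txt' == 'C:\\Users\\Admin\\file.txt')   # True

# Raw literals keep regular expressions readable.
numbers = re.findall(r'\d+', 'Apollo 11 landed in 1969')
print(numbers)       # ['11', '1969']
```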

3.5.6. Reading User Input

  • input() returns str

  • Good practice: add space at the end of prompt

name = input('What is your name: ')
# What is your name: Jan Twardowski

print(name)     # Jan Twardowski
type(name)      # <class 'str'>
age = input('What is your age: ')
# What is your age: 42

print(age)      # 42
type(age)       # <class 'str'>
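
Because input() always returns str, numeric input has to be converted before calculating. In the sketch below a fixed value stands in for the user's answer so that it runs without a keyboard; the name `age_number` is arbitrary.

```python
# Stand-in for: age = input('What is your age: ')
age = '42'

print(age + '1')        # 421 (str concatenation, not arithmetic)

age_number = int(age)   # convert explicitly before calculating
print(age_number + 1)   # 43
```

int() raises ValueError when the text is not a valid integer, so real programs typically validate the input first.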

3.5.7. Assignments

3.5.7.1. Example

  • Complexity level: easy

  • Lines of code to write: 3 lines

  • Estimated time of completion: 3 min

  • Solution: solution/str_example.py

English
  • Ask user to input text

  • Print number of characters

Solution
text = input('Type text: ')
length = len(text)

print(length)

3.5.7.2. Emoticon Print

  • Complexity level: easy

  • Lines of code to write: 3 lines

  • Estimated time of completion: 3 min

  • Solution: solution/str_emoticon.py

English
  1. Ask user to input name

  2. Print hello NAME EMOTICON, where:

    • NAME is a name read from user

    • EMOTICON is Unicode Codepoint "U+1F642"

  3. Print the length of the name read from the user

The whys and wherefores
  • Variable declaration

  • Print formatting

  • Reading input data from user

3.5.7.3. Variables and Types

  • Complexity level: easy

  • Lines of code to write: 3 lines

  • Estimated time of completion: 10 min

  • Solution: solution/str_input.py

English
  1. Ask user to input name

  2. Print text identical to the output shown in the Output section below

  3. To print use f-string formatting

  4. Note that the second line starts with a tab

  5. The value in double quotes is the name read from the user (in the output, the user typed José Jiménez)

  6. Mind the different quotes, apostrophes, tabs and newlines

  7. Do not use space or Enter - use \n and \t

  8. Do not use string addition (str + str)

Output
'''My name... "José Jiménez".
    I'm an """astronaut!"""'''
The whys and wherefores
  • Variable declaration

  • Print formatting

  • Reading input data from user