Strings in Python

The string is one of the most important data types in programming.

>>> dir(str)
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__',
 '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__',
 '__getnewargs__', '__gt__', '__hash__', '__init__', '__iter__', '__le__',
 '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__',
 '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__',
 '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold',
 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format',
 'format_map', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit',
 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle',
 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition',
 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip',
 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title',
 'translate', 'upper', 'zfill']
>>>

This lists all the methods of str, and there are a lot of them. We will transform some Polish strings to see how these methods work.

The following string transformation functions create a new string object from an existing string.

>>> s = "witamy Państwa na naszym przyjęciu"

str.capitalize() → string

>>> s.capitalize()
'Witamy państwa na naszym przyjęciu'
>>>

Create a copy of the string with its first character capitalized and the remaining characters lowercased.

str.center(width) → string

>>> s.center(100)
'                                 witamy Państwa na naszym przyjęciu                                 '
>>>

Create a copy of the string centered in a string of length width. Padding is done using spaces.

str.encode([encoding[, errors]]) → bytes

>>> s.encode("ascii")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character '\u0144' in position 9: ordinal not in range(128)
>>>

>>> s.encode("ISO 8859-2")
b'witamy Pa\xf1stwa na naszym przyj\xeaciu'
>>>

>>> s.encode("Windows-1250")
b'witamy Pa\xf1stwa na naszym przyj\xeaciu'
>>>

>>> s.encode("UTF-8")
b'witamy Pa\xc5\x84stwa na naszym przyj\xc4\x99ciu'
>>>

>>> s.encode("Unicode")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
LookupError: unknown encoding: Unicode
>>>

These examples show how the string looks in various encodings. The default source encoding in Python 3 is UTF-8, whereas in Python 2 it was ASCII. That is why we can type Polish letters in string literals without declaring an encoding. It is a sensible default, because UTF-8 is the standard encoding for editors today.

Return an encoded version of the string as a bytes object. The default encoding is 'utf-8'. errors may be given to set a different error handling scheme. The default is 'strict', meaning that encoding errors raise a UnicodeError (a subclass of ValueError). Other possible values include 'ignore' and 'replace'.
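
As a minimal sketch of the error handlers described above, the snippet below encodes one Polish word to ASCII with different values of errors. The variable names are only illustrative; the handler names are the standard ones accepted by str.encode().

```python
word = "Państwa"

# 'ignore' silently drops characters that ASCII cannot represent.
ignored = word.encode("ascii", errors="ignore")

# 'replace' substitutes a question mark for each unencodable character.
replaced = word.encode("ascii", errors="replace")

# 'backslashreplace' keeps the information as a Python escape sequence.
escaped = word.encode("ascii", errors="backslashreplace")

print(ignored)   # b'Pastwa'
print(replaced)  # b'Pa?stwa'
print(escaped)   # b'Pa\\u0144stwa'

# Decoding with the same codec that was used for encoding restores the text.
round_trip = word.encode("utf-8").decode("utf-8")
print(round_trip == word)  # True
```

Note that 'ignore' and 'replace' lose information, so the original text cannot be recovered from their output.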

str.expandtabs([tabsize]) → string

We have the string with tabs:

>>> t = "\twitamy Państwa na przyjęciu\t\t"
>>> t
'\twitamy Państwa na przyjęciu\t\t'
>>>

Now, let's see how the method works:

>>> t.expandtabs()
'        witamy Państwa na przyjęciu             '
>>> t.expandtabs(10)
'          witamy Państwa na przyjęciu             '
>>> t.expandtabs(100) == " " * 100 + "witamy Państwa na przyjęciu" + " " * 173
True
>>>

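Return a copy of string where all tab characters are expanded using spaces. Each tab is replaced by as many spaces as are needed to reach the next tab stop, that is, the next column that is a multiple of tabsize. If tabsize is not given, a tab size of 8 characters is assumed.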

str.join(sequence) → string

>>> s.join("***")
'*witamy Państwa na naszym przyjęciu*witamy Państwa na naszym przyjęciu*'
>>>

>>> s.join("@@@")
'@witamy Państwa na naszym przyjęciu@witamy Państwa na naszym przyjęciu@'
>>>

>>> s.join("   ")
' witamy Państwa na naszym przyjęciu witamy Państwa na naszym przyjęciu '
>>>

Return a string which is the concatenation of the strings in the sequence, with a copy of the string the method is called on inserted between each pair of elements. In the examples above the sequence is a three-character string, so s is placed between its individual characters.
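
In everyday code the roles are usually the other way round from the examples above: a short separator is the object the method is called on, and a list of words is the argument. A small sketch with made-up variable names:

```python
words = ["witamy", "Państwa", "na", "przyjęciu"]

# The separator is the string the method is called on;
# the argument supplies the pieces to glue together.
sentence = " ".join(words)
csv_row = ";".join(words)

# join() only accepts strings, so other types must be converted first.
numbers = ", ".join(str(n) for n in [1, 2, 3])

print(sentence)  # witamy Państwa na przyjęciu
print(csv_row)   # witamy;Państwa;na;przyjęciu
print(numbers)   # 1, 2, 3
```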

str.ljust(width) → string

>>> s.ljust(10)
'witamy Państwa na naszym przyjęciu'
>>> s.ljust(100)
'witamy Państwa na naszym przyjęciu                                                                  '
>>> s.ljust(50)
'witamy Państwa na naszym przyjęciu                '
>>>

Return a copy of string left justified in a string of length width. Padding is done using spaces.

str.lower() → string

>>> s.lower()
'witamy państwa na naszym przyjęciu'
>>>

Return a copy of string converted to lowercase.

str.lstrip() → string

>>> s.lstrip()
'witamy Państwa na naszym przyjęciu'
>>> t.lstrip()
'witamy Państwa na przyjęciu\t\t'
>>>

Return a copy of string with leading whitespace removed.

str.replace(old, new[, count]) → string

>>> s.replace("Państwa", "Was")
'witamy Was na naszym przyjęciu'
>>>

>>> v = "witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu"

>>> v.replace("Państwa", "Was", 0)
'witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu'
>>> v.replace("Państwa", "Was", 1)
'witamy Was. witamy Państwa. witamy Państwa na przyjęciu'
>>> v.replace("Państwa", "Was", 2)
'witamy Was. witamy Was. witamy Państwa na przyjęciu'
>>> v.replace("Państwa", "Was", 3)
'witamy Was. witamy Was. witamy Was na przyjęciu'
>>>

Return a copy of string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.

str.rjust(width) → string

>>> s.rjust(10)
'witamy Państwa na naszym przyjęciu'
>>> s.rjust(50)
'                witamy Państwa na naszym przyjęciu'
>>>

Return a copy of string right justified in a string of length width. Padding is done using spaces.
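
The justification methods center(), ljust() and rjust() also accept an optional second argument, the fill character. The sketch below uses it to lay out a small price list; the row data and variable names are invented for the example.

```python
rows = [("kawa", 7), ("herbata", 12), ("sok", 105)]

lines = []
for name, price in rows:
    # Left-justify the name with dots, right-justify the price with spaces.
    lines.append(name.ljust(10, ".") + str(price).rjust(5))

# Center the heading in a field of 15 characters filled with asterisks.
title = "MENU".center(15, "*")

print(title)
print("\n".join(lines))
```

Each line is exactly 15 characters wide, so the prices line up in a column under the heading.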

str.rstrip() → string

>>> w = "witamy Państwa na naszym przyjęciu             "
>>> w.rstrip()
'witamy Państwa na naszym przyjęciu'
>>>

Return a copy of string with trailing whitespace removed.

str.strip() → string

>>> z = "       witamy Państwa na naszym przyjęciu              "

>>> z.strip()
'witamy Państwa na naszym przyjęciu'
>>>

Return a copy of string with leading and trailing whitespace removed.
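
strip(), lstrip() and rstrip() also take an optional argument naming the characters to remove instead of whitespace. A short sketch (the sample value is invented):

```python
raw = "--==witamy Państwa==--"

# The argument is a set of characters, not a prefix or suffix:
# any combination of '-' and '=' is removed from the chosen end(s).
both = raw.strip("-=")
left_only = raw.lstrip("-=")
right_only = raw.rstrip("-=")

print(both)        # witamy Państwa
print(left_only)   # witamy Państwa==--
print(right_only)  # --==witamy Państwa
```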

str.swapcase() → string

>>> s.swapcase()
'WITAMY pAŃSTWA NA NASZYM PRZYJĘCIU'
>>>

Return a copy of string with uppercase characters converted to lowercase and vice versa.

str.title() → string

>>> s.title()
'Witamy Państwa Na Naszym Przyjęciu'
>>>

Return a copy of string with words starting with uppercase characters, all remaining characters in lowercase.

str.translate(table) → string

>>> s.translate(s.maketrans("witamy", "żegnam"))
'żegnam Pnńsgżn nn nnszma przmjęceu'
>>>

Return a copy of the string in which each character has been mapped through the given translation table. The table is a mapping from Unicode code points (integers) to code points, strings, or None; characters mapped to None are deleted.

Translation tables are built with the static method str.maketrans().
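
For illustration, here is a sketch of two other ways to build a table with str.maketrans(): from a dictionary, and with a third argument that lists characters to delete. The variable names are hypothetical.

```python
s = "witamy Państwa na naszym przyjęciu"

# A dict maps single characters to replacement strings (or to None).
ascii_table = str.maketrans({"ń": "n", "ę": "e"})
plain = s.translate(ascii_table)

# The third argument of maketrans() lists characters to delete.
no_spaces = s.translate(str.maketrans("", "", " "))

print(plain)      # witamy Panstwa na naszym przyjeciu
print(no_spaces)  # witamyPaństwananaszymprzyjęciu
```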

str.upper() → string

>>> s.upper()
'WITAMY PAŃSTWA NA NASZYM PRZYJĘCIU'
>>>

Return a copy of string converted to uppercase.

The following accessor methods provide information about a string.

str.count(sub[, start[, end]]) → integer

>>> v
'witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu'
>>>

>>> v.count("witamy")
3
>>>

>>> v.count("żegnamy")
0
>>>

>>> v.count("witamy", 0, 2)
0
>>> v.count("witamy", 0, 10)
1
>>> v.count("witamy", 0, 20)
1
>>> v.count("witamy", 0, 50)
3
>>> v.count("witamy", 0, 35)
2
>>>

Return the number of occurrences of substring sub in string. If start or end are present, these have the same meanings as a slice string[start:end].
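
Two details of count() are easy to check in a short sketch: the start and end arguments behave like counting inside a slice, and occurrences are counted without overlapping. The variable names are only illustrative.

```python
v = "witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu"

# Counting with start/end gives the same result as counting in the slice.
in_range = v.count("witamy", 0, 35)
in_slice = v[0:35].count("witamy")
print(in_range == in_slice)  # True

# Occurrences are counted without overlapping: "aaaa" holds two
# non-overlapping copies of "aa", not three.
non_overlapping = "aaaa".count("aa")
print(non_overlapping)  # 2
```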

str.endswith(suffix[, start[, end]]) → boolean

>>> s
'witamy Państwa na naszym przyjęciu'
>>>

>>> s.endswith("przyjęciu")
True
>>> s.endswith("bankiecie")
False
>>>

>>> s.endswith("przyjęciu", 0, 10)
False
>>> s.endswith("przyjęciu", 0, 35)
True
>>>

Return True if string ends with the specified suffix, otherwise return False. The suffix can be a single string or a tuple of strings to try. If start or end are present, these have the same meanings as a slice string[start:end].
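
Passing a tuple of suffixes is handy for checks such as filtering by file extension. A minimal sketch, using made-up file names:

```python
filenames = ["raport.txt", "zdjęcie.png", "notatki.md", "archiwum.tar.gz"]

# A tuple of suffixes matches if the string ends with any one of them.
text_files = [name for name in filenames if name.endswith((".txt", ".md"))]

print(text_files)  # ['raport.txt', 'notatki.md']
```

The argument has to be a tuple; passing a list of suffixes raises a TypeError.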

str.find(sub[, start[, end]]) → integer

>>> v
'witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu'
>>>

>>> v.find("witamy")
0
>>> v.find("żegnamy")
-1
>>>

>>> v.find("witamy", 0, 10)
0
>>> v.find("witamy", 0, 35)
0
>>> v.find("witamy", 0, 50)
0

>>> v.find("Państwa")
7

>>> v.find("Państwa", 0, 10)
-1
>>> v.find("Państwa", 0, 35)
7
>>> v.find("Państwa", 0, 50)
7
>>>

Return the lowest index in string where substring sub is found. Return -1 if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

str.index(sub[, start[, end]]) → integer

>>> v.index("witamy")
0
>>> v.index("witamy", 0, 10)
0
>>> v.index("witamy", 0, 35)
0
>>> v.index("witamy", 0, 50)
0
>>> v.index("żegnamy")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
>>> v.index("żegnamy", 0, 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
>>>

Return the lowest index in string where substring sub is found. Raise ValueError if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].
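
The difference between find() and index() matters most in loops and error handling. The sketch below uses find() with a moving start position to collect every occurrence; find_all is a hypothetical helper written for this example, not a built-in method.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError("sub must be a non-empty string")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Continue searching just past the match that was found.
        start = text.find(sub, start + len(sub))
    return positions


v = "witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu"
print(find_all(v, "witamy"))   # [0, 16, 32]
print(find_all(v, "żegnamy"))  # []

# index() signals a missing substring with an exception instead of -1.
try:
    v.index("żegnamy")
    found = True
except ValueError:
    found = False
print(found)  # False
```

The -1 returned by find() suits a loop condition, while index() is the safer choice when a missing substring should stop the program rather than be silently used as an index.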

str.isalnum() → boolean

>>> v.isalnum()
False
>>>

Return True if all characters in string are alphanumeric and there is at least one character in string; False otherwise.

str.isalpha() → boolean

>>> v.isalpha()
False
>>>

Return True if all characters in string are alphabetic and there is at least one character in string; False otherwise.

str.isdigit() → boolean

>>> v.isdigit()
False
>>>

Return True if all characters in string are digits and there is at least one character in string; False otherwise.

str.islower() → boolean

>>> v.islower()
False
>>>

Return True if all characters in string are lowercase and there is at least one cased character in string; False otherwise.

str.isspace() → boolean

>>> v.isspace()
False
>>>

Return True if all characters in string are whitespace and there is at least one character in string, False otherwise.

str.istitle() → boolean

>>> v.istitle()
False
>>>

Return True if string is titlecased and contains at least one cased character, False otherwise. In a titlecased string, uppercase characters may only follow uncased characters (whitespace, punctuation, etc.) and lowercase characters may only follow cased ones.

str.isupper() → boolean

>>> v.isupper()
False
>>>

Return True if all characters in string are uppercase and there is at least one cased character in string; False otherwise.
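
Every check above returns False for v, because v mixes letters, spaces and full stops. As a complement, here is a sketch with sample strings (chosen for this example) for which each predicate is True:

```python
# Each predicate must hold for every relevant character in the string.
results = {
    "isalpha": "Państwa".isalpha(),          # Polish letters are alphabetic
    "isalnum": "przyjęcie2024".isalnum(),    # letters and digits only
    "isdigit": "2024".isdigit(),
    "islower": "witamy państwa".islower(),   # the space is uncased, so it is ignored
    "isupper": "WITAMY PAŃSTWA".isupper(),
    "isspace": " \t\n".isspace(),
    "istitle": "Witamy Państwa".istitle(),
}

for name, value in results.items():
    print(name, value)  # every entry here is True

# An empty string fails these checks.
empty_result = "".isalpha()
print(empty_result)  # False
```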

str.rfind(sub[, start[, end]]) → integer

>>> v.rfind("witamy")
32
>>> v.rfind("witamy", 0, 10)
0
>>> v.rfind("witamy", 0, 35)
16
>>> v.rfind("witamy", 0, 50)
32
>>>

Return the highest index in string where substring sub is found. Return -1 if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

str.rindex(sub[, start[, end]]) → integer

>>> v.index("witamy")
0
>>> v.rindex("witamy")
32
>>> v.rindex("witamy", 0, 10)
0
>>> v.rindex("witamy", 0, 35)
16
>>> v.rindex("witamy", 0, 50)
32
>>>

Return the highest index in string where substring sub is found. Raise ValueError if the substring is not found. If start or end are present, these have the same meanings as a slice string[start:end].

str.startswith(prefix[, start[, end]]) → boolean

>>> v.startswith("witamy")
True
>>> v.startswith("żegnamy")
False
>>>

>>> v.startswith("witamy", 0, 10)
True
>>> v.startswith("witamy", 0, 35)
True
>>> v.startswith("witamy", 0, 50)
True
>>>

Return True if string starts with the specified prefix, otherwise return False. The prefix can be a single string or a tuple of strings to try. If start or end are present, these have the same meanings as a slice string[start:end].

The following methods create another kind of object, usually a sequence, from a string.

str.partition(separator) → 3-tuple

>>> v
'witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu'
>>>

>>> v.partition("amy")
('wit', 'amy', ' Państwa. witamy Państwa. witamy Państwa na przyjęciu')
>>>

>>> v.partition("na")
('witamy Państwa. witamy Państwa. witamy Państwa ', 'na', ' przyjęciu')
>>>

Return a 3-tuple: the text before the first occurrence of separator in string, the separator itself, and the text after it. If the separator does not occur, the whole input string is the first element of the 3-tuple and the other two elements are empty strings.
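
Because partition() always returns exactly three parts, its result can be unpacked directly. The sketch below parses a "key = value" setting and also shows rpartition(), which splits at the last occurrence instead; the sample strings are invented.

```python
setting = "encoding = UTF-8"

# partition() always returns exactly three parts, so unpacking is safe.
key, sep, value = setting.partition("=")
key, value = key.strip(), value.strip()

# When the separator is missing, sep and the tail are empty strings.
head, missing_sep, tail = "no separator here".partition("=")

# rpartition() splits at the last occurrence instead of the first.
stem, dot, extension = "archiwum.tar.gz".rpartition(".")

print(key, value)         # encoding UTF-8
print(repr(missing_sep))  # ''
print(stem, extension)    # archiwum.tar gz
```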

str.split([separator[, maxsplit]]) → sequence

>>> v.split()
['witamy', 'Państwa.', 'witamy', 'Państwa.', 'witamy', 'Państwa', 'na', 'przyjęciu']
>>>

>>> v.split(".")
['witamy Państwa', ' witamy Państwa', ' witamy Państwa na przyjęciu']
>>>

>>> v.split("m")
['wita', 'y Państwa. wita', 'y Państwa. wita', 'y Państwa na przyjęciu']
>>>

Return a list of the words in the string, using separator as the delimiter. If maxsplit is given, at most maxsplit splits are done. If separator is not specified, any run of consecutive whitespace characters is treated as a single separator.
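
The examples above do not use maxsplit, and the difference between split() and split(" ") is easy to miss. A brief sketch of both, with illustrative variable names:

```python
v = "witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu"

# maxsplit limits the number of cuts; the remainder stays in the last item.
first_cut = v.split(". ", 1)

# Without an argument, runs of whitespace count as one separator
# and leading/trailing whitespace is dropped.
loose = "  a   b  ".split()

# With an explicit separator, every occurrence produces a cut,
# so empty strings appear between adjacent separators.
strict = "  a   b  ".split(" ")

print(first_cut)  # ['witamy Państwa', 'witamy Państwa. witamy Państwa na przyjęciu']
print(loose)      # ['a', 'b']
print(strict)     # ['', '', 'a', '', '', 'b', '', '']
```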

str.splitlines([keepends]) → sequence

>>> v.splitlines()
['witamy Państwa. witamy Państwa. witamy Państwa na przyjęciu']
>>>

>>> y = """
... the first line
... the second line
... the third line
... """

>>> y.splitlines()
['', 'the first line', 'the second line', 'the third line']
>>>

Return a list of the lines in string, breaking at line boundaries. Line breaks are not included in the resulting list unless keepends is given and set to True.
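
To see what keepends changes, the following sketch splits a string with mixed line endings both ways. The sample text is made up for the example.

```python
y = "the first line\nthe second line\r\nthe third line\n"

# By default the line endings are stripped from each item.
without_ends = y.splitlines()

# keepends=True preserves whatever terminator each line had.
with_ends = y.splitlines(keepends=True)

print(without_ends)  # ['the first line', 'the second line', 'the third line']
print(with_ends)     # ['the first line\n', 'the second line\r\n', 'the third line\n']

# Joining the pieces that kept their endings reproduces the original text.
rebuilt = "".join(with_ends)
print(rebuilt == y)  # True
```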

Indexing:

>>> s = "spam"
>>> s[len(s)-1]
'm'

is the same as:

>>> s[-1]
'm'
>>>

Slicing:

>>> s[:]
'spam'
>>> s[1:]
'pam'
>>> s[1:len(s)]
'pam'
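
A few more slice forms, sketched on the same string: omitted bounds, negative indices, and the optional third value (the step).

```python
s = "spam"

head = s[:2]          # everything before index 2
tail = s[-2:]         # the last two characters
every_other = s[::2]  # every second character
backwards = s[::-1]   # a negative step walks backwards

# Slices never raise IndexError; out-of-range bounds are clipped.
clipped = s[1:100]

print(head)         # sp
print(tail)         # am
print(every_other)  # sa
print(backwards)    # maps
print(clipped)      # pam
```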

String formatting:

>>> "I speak %s, %s, and %s" % ("Polish", "English", "Russian")
'I speak Polish, English, and Russian'
>>> "I speak {0}, {1}, and {2}".format("Polish", "English", "Russian")
'I speak Polish, English, and Russian'
>>>
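
Python 3.6 and later also offer formatted string literals (f-strings), which evaluate expressions written inside the braces. A minimal sketch, with variable names chosen for the example:

```python
languages = ("Polish", "English", "Russian")

# f-strings evaluate the expressions inside the braces.
sentence = f"I speak {languages[0]}, {languages[1]}, and {languages[2]}"

# A format specification follows a colon: here the name is left-aligned
# in 10 columns and the price right-aligned in 8 with two decimals.
row = f"{'kawa':<10}{7.5:>8.2f}"

print(sentence)  # I speak Polish, English, and Russian
print(row)
```

The same format specifications work in str.format(), for example "{:<10}{:>8.2f}".format("kawa", 7.5).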
