Strings#
Overview#
A string is a fundamental data type used to represent text. It is a sequence of characters (letters, numbers, symbols, and special characters) enclosed in either single quotes (') or double quotes ("). Strings store textual information such as names, file paths, and more.
Examples:
name = "earth"
message = 'ABE for Earth Science'
address = "130 Creelman street"
path2dir = 'C:\\projects\\myproject'
Strings are immutable (i.e., they cannot be changed once they are created), but several operations are available, including concatenation, slicing, formatting, and more.
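As a minimal sketch of what immutability means in practice (the variable names here are illustrative only): assigning to a single character fails, while operations such as concatenation return a new string and leave the original untouched.

```python
word = "Earth"

# Assigning to one character is not allowed and raises TypeError
try:
    word[0] = "e"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Concatenation builds a new string; the original is left unchanged
phrase = word + " Science"

print(assignment_worked)  # False
print(phrase)             # Earth Science
print(word)               # Earth
```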
Concatenation
word1 = "Earth"
word2 = "Science"
concatenate = word1 + " " + word2
print(concatenate)
Earth Science
Slicing: This is useful for extracting parts of a string.
# Slicing
text = "Earth Science!"
substring = text[6:13]
print(substring)
Science
Note
Slicing uses the following syntax:
sequence[start:end:step]
sequence is your variable name
start and end are the indices where the slice starts and ends, respectively. Python counts from zero instead of one, and the character at index end is not included in the slice.
step is optional and sets the increment between indices (the default is 1)
text = "Earth Science!"
print(text[0:5])
Earth
# omitting the first index makes the slice start from zero
print(text[:5])
Earth
# step of 2: take every second character
print(text[0:5:2])
Erh
# drop the last five characters. Note the negative sign to count backwards from the end
print(text[:-5])
Earth Sci
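A few more slices on the same text, as a short sketch of how negative indices and a negative step behave (the variable names are illustrative):

```python
text = "Earth Science!"

last_five = text[-5:]       # start five characters from the end
without_last = text[:-1]    # everything except the final character
reversed_text = text[::-1]  # a step of -1 walks the string backwards

print(last_five)      # ence!
print(without_last)   # Earth Science
print(reversed_text)  # !ecneicS htraE
```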
Formatting: This is a powerful way to build strings such as system paths and filenames. The "f" prefix before the string lets you embed expressions inside curly braces {} directly within the string. Each {expression} is replaced with its value.
name = "Vitor"
language = 'Python'
info = f"My name is {name} and I'm teaching {language}"
print(info)
My name is Vitor and I'm teaching Python
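Since f-strings are often used for filenames, here is a small sketch of that use. The station name, year, and month are hypothetical values chosen only for illustration.

```python
# Hypothetical values, used only for illustration
station = "station01"
year = 2023
month = 7

# {month:02d} pads the month with a leading zero to two digits
filename = f"{station}_{year}_{month:02d}.csv"
print(filename)  # station01_2023_07.csv

# Any expression can go inside the braces, not only variable names
next_year = f"Next year is {year + 1}"
print(next_year)  # Next year is 2024
```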
String Methods#
A string method is like a function that acts on the variable it is called on. If the variable my_name is a string, then the code my_name.lower() runs the lower() method on that string and returns the result (this idea is the foundation of Object-Oriented Programming).
See the most common string methods:
String methods | Descriptions |
---|---|
my_name.lower(), my_name.upper() | returns the lowercase or uppercase version of the string |
my_name.strip() | returns a string with whitespace removed from the start and end |
my_name.isalpha(), my_name.isdigit(), my_name.isspace(), ... | verifies whether all the characters in the string belong to the given character class |
my_name.startswith('other'), my_name.endswith('other') | verifies whether the string starts or ends with the given other string |
my_name.find('other') | searches for 'other' within my_name, and returns the index where it begins or -1 if not found |
my_name.replace('old', 'new') | returns a string where all occurrences of 'old' have been replaced by 'new' |
my_name.split('delim') | returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc'] |
my_name.join(list) | opposite of split(), joins the elements in the given list together using the string as the delimiter, e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> aaa---bbb---ccc |
Let’s see the applications:
upper()
- Converts a string to uppercase
my_string = "hello, world!"
print(my_string.upper()) # "HELLO, WORLD!"
HELLO, WORLD!
lower()
- Converts a string to lowercase
my_string = "HELLO, WORLD!"
print(my_string.lower()) # "hello, world!"
hello, world!
capitalize()
- Capitalizes the first character of a string and converts the rest to lowercase
my_string = "hello, world!"
print(my_string.capitalize()) # "Hello, world!"
Hello, world!
count(substring)
- Returns the number of occurrences of a substring in a string
my_string = "hello, hello, world!"
print(my_string.count("hello")) # 2
2
startswith(prefix)
- Checks if a string starts with a specific prefix
my_string = "Hello, world!"
print(my_string.startswith("Hello")) # True
True
endswith(suffix)
- Checks if a string ends with a specific suffix
my_string = "Hello, world!"
print(my_string.endswith("world!")) # True
True
split(separator)
- Splits a string into a list of substrings based on a separator
my_string = "Hello, world!"
print(my_string.split(", ")) # ["Hello", "world!"]
['Hello', 'world!']
replace(old, new)
- Replaces occurrences of a substring with a new value
my_string = "Hello, world!"
print(my_string.replace("world", "Python")) # "Hello, Python!"
Hello, Python!
strip()
- Removes leading and trailing whitespace from a string
my_string = " hello, world! "
print(my_string.strip()) # "hello, world!"
hello, world!
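The find(), join(), and character-class methods can be tried in the same way. The following is a short sketch with illustrative values:

```python
my_string = "Hello, world!"

# find() returns the index of the first match, or -1 if there is none
print(my_string.find("world"))   # 7
print(my_string.find("Python"))  # -1

# join() is the inverse of split(): the string it is called on is the delimiter
parts = ["2023", "07", "15"]
date = "-".join(parts)
print(date)  # 2023-07-15

# isdigit() and isalpha() check the class of every character in the string
print("2023".isdigit())           # True
print("Earth".isalpha())          # True
print("Earth Science".isalpha())  # False, because of the space
```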
Escaping characters#
Escape sequences let you include characters in a string that would otherwise be interpreted differently or are hard to type, such as quotes, newlines, tabs, and backslashes.
Most common escape sequences:
\" : Double quote
\' : Single quote
\\ : Backslash
\n : Newline
\t : Tab
\b : Backspace
\r : Carriage return
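A brief sketch of a few of these sequences in use (the strings are illustrative only):

```python
# \n starts a new line and \t inserts a tab between the label and the value
table = "Name:\tEarth\nField:\tScience"
print(table)

# \\ produces a single backslash in the resulting string
folder = "data\\raw"
print(folder)       # data\raw
print(len(folder))  # 8
```

The length is 8 rather than 9 because the two characters typed as \\ are stored as one backslash.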
Let’s try:
my_string = 'It's a brilliant future!'
File "C:\Users\vsm71\AppData\Local\Temp\ipykernel_13092\1979304785.py", line 1
my_string = 'It's a brilliant future!'
^
SyntaxError: invalid syntax
my_string = 'It\'s a brilliant future!'
print(my_string)
Alternatively, you can use double quotes if your string contains a single quote.
my_string = "It's a brilliant future!"
print(my_string)
Note
If a string contains both single and double quotes, you can use triple quotes to ensure that the string is treated correctly. By enclosing the string within triple quotes (either triple single quotes ''' or triple double quotes """), you can include both types of quotes without the need for escape sequences.
latitude = '''30° 18' 07.2992" N'''
longitude = '''88° 39' 18.6692" W'''
print(latitude, longitude)
Operating System#
Important
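Backslashes pose another problem when working with paths on Windows: Python reads a backslash followed by certain characters as an escape sequence, so some paths raise an error. In the example below, \U is read as the start of a Unicode escape, which raises a SyntaxError.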
path = 'C:\Users\udir'
To solve the issue, you can prefix the string with r (a raw string) or double each backslash (\\):
path = r'C:\Users\udir'
path = 'C:\\Users\\udir'
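As a quick check, the sketch below shows that both spellings produce exactly the same string. The variable names raw_path and escaped_path are illustrative only.

```python
raw_path = r'C:\Users\udir'        # raw string: backslashes are kept as typed
escaped_path = 'C:\\Users\\udir'   # each \\ is stored as a single backslash

print(raw_path == escaped_path)  # True
print(raw_path)                  # C:\Users\udir
print(len(raw_path))             # 13
```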