Strings#

Overview#

A string is a fundamental data type used to represent text. It is a sequence of characters (letters, digits, symbols, and whitespace) enclosed in either single quotes (') or double quotes ("). Strings store textual information such as names, file paths, and more.

Examples:

name = "earth"
message = 'ABE for Earth Science'
address = "130 Creelman street"
path2dir = 'C:\\projects\\myproject'

Strings are immutable (i.e., they cannot be changed once they are created), but several operations are available, including concatenation, slicing, formatting, and more.
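
As a minimal sketch of what immutability means in practice (the variable names are illustrative only): assigning to a single character fails, while an operation such as concatenation returns a new string and leaves the original untouched.

```python
word = "earth"

# Item assignment is not allowed on a string and raises a TypeError
try:
    word[0] = "E"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Operations build a new string instead of modifying the existing one
phrase = word + " science"

print(changed_in_place)  # False
print(phrase)            # earth science
print(word)              # earth
```

To "change" a string, assign the result of the operation to a variable, either a new one or the same name.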

  • Concatenation

word1 = "Earth"
word2 = "Science"
concatenate = word1 + " " + word2
print(concatenate)
Earth Science
  • Slicing: This is useful for extracting parts of a string

# Slicing
text = "Earth Science!"
substring = text[6:13]
print(substring)
Science

Note

Slicing uses the following syntax:

sequence[start:end:step]

  • Sequence is your variable name

  • Start and end are the indices where the slice starts and ends, respectively. Python counts from zero instead of one, and the character at the end index is not included.

  • Step is an optional increment between indices (the default is 1)

text = "Earth Science!"
print(text[0:5])
Earth
# omit the start index and the slice begins at zero
print(text[:5])
Earth
# take every second character (step of 2)
print(text[0:5:2])
Erh
# drop the last five characters. A negative index counts backwards from the end
print(text[:-5])
Earth Sci
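
Negative indices can appear in any position of the slice. The short sketch below, which reuses the same text variable, contrasts keeping the last five characters with dropping them, and shows a negative step:

```python
text = "Earth Science!"

last_five = text[-5:]       # start five characters before the end
without_last = text[:-5]    # stop five characters before the end
reversed_text = text[::-1]  # a step of -1 walks through the string backwards

print(last_five)      # ence!
print(without_last)   # Earth Sci
print(reversed_text)  # !ecneicS htraE
```
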
  • Formatting: This is a powerful way to build strings such as system paths and filenames. The "f" prefix before the string lets you embed expressions inside curly braces {} directly within the string. Each {expression} is replaced with its value.

name = "Vitor"
language = 'Python'
info = f"My name is {name} and I'm teaching {language}"
print(info)
My name is Vitor and I'm teaching Python
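
Since formatting is described above as a way to build paths and filenames, here is a small sketch of that use. The directory, station name, and dates are hypothetical values chosen only for illustration.

```python
# Hypothetical values, used only for illustration
data_dir = "data"
station = "station01"
year = 2023
month = 7

# {month:02d} pads the month with a leading zero to two digits
filename = f"{station}_{year}_{month:02d}.csv"
filepath = f"{data_dir}/{filename}"

print(filename)  # station01_2023_07.csv
print(filepath)  # data/station01_2023_07.csv
```

The part after the colon inside the braces is a format specification. It is optional and controls how the value is displayed, not the value itself.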

String Methods#

A string method is a function that belongs to a string and acts on it. If the variable my_name is a string, then the code my_name.lower() runs the lower() method on that string and returns the result (this idea is the foundation of Object-Oriented Programming). Because strings are immutable, a method never modifies the original string; it returns a new value.

See the most common string methods:

| String methods | Descriptions |
| --- | --- |
| my_name.lower(), my_name.upper() | returns the lowercase or uppercase version of the string |
| my_name.strip() | returns a string with whitespace removed from the start and end |
| my_name.isalpha(), my_name.isdigit(), my_name.isspace() | verifies if all the string chars are in the various character classes |
| my_name.startswith('other'), my_name.endswith('other') | verifies if the string starts or ends with the given other string |
| my_name.find('other') | searches for 'other' within my_name, and returns the index where it begins or -1 if not found |
| my_name.replace('old', 'new') | returns a string where all occurrences of 'old' have been replaced by 'new' |
| my_name.split('delim') | returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc'] |
| my_name.join(list) | opposite of split(), joins the elements in the given list together using the string as the delimiter, e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> aaa---bbb---ccc |

Let's see some of these methods in action:

upper() - Converts a string to uppercase

my_string = "hello, world!"
print(my_string.upper())  # "HELLO, WORLD!"
HELLO, WORLD!

lower() - Converts a string to lowercase

my_string = "HELLO, WORLD!"
print(my_string.lower())  # "hello, world!"
hello, world!

capitalize() - Capitalizes the first character of a string and converts the rest to lowercase

my_string = "hello, world!"
print(my_string.capitalize())  # "Hello, world!"
Hello, world!

count(substring) - Returns the number of occurrences of a substring in a string

my_string = "hello, hello, world!"
print(my_string.count("hello"))  # 2
2

startswith(prefix) - Checks if a string starts with a specific prefix

my_string = "Hello, world!"
print(my_string.startswith("Hello"))  # True
True

endswith(suffix) - Checks if a string ends with a specific suffix

my_string = "Hello, world!"
print(my_string.endswith("world!"))  # True
True

split(separator) - Splits a string into a list of substrings based on a separator

my_string = "Hello, world!"
print(my_string.split(", "))  # ["Hello", "world!"]
['Hello', 'world!']

replace(old, new) - Replaces occurrences of a substring with a new value

my_string = "Hello, world!"
print(my_string.replace("world", "Python"))  # "Hello, Python!"
Hello, Python!

strip() - Removes leading and trailing whitespace from a string

my_string = "   hello, world!   "
print(my_string.strip())  # "hello, world!"
hello, world!
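
The methods find(), join(), isdigit(), and isalpha() were listed earlier but not demonstrated. A brief sketch of each follows; the filename and list contents are made-up values for illustration.

```python
filename = "precip_2023.csv"  # hypothetical filename

# find() returns the index of the first match, or -1 if there is none
year_index = filename.find("2023")
missing_index = filename.find("temp")
print(year_index)     # 7
print(missing_index)  # -1

# join() glues a list of strings together, using the calling string as separator
parts = ["precip", "2023", "daily"]
joined = "_".join(parts)
print(joined)  # precip_2023_daily

# isdigit() and isalpha() test every character in the string
print("2023".isdigit())         # True
print("precip".isalpha())       # True
print("precip_2023".isalpha())  # False
```

Note that join() is called on the separator, not on the list.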

Escaping characters#

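Escaping means prefixing a character with a backslash (\) so that Python interprets it differently. It lets you include characters that would otherwise have a special meaning inside a string, such as quotes and backslashes, and represent characters that cannot be typed directly, such as newlines and tabs.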

Most common escape sequences:

  • \": Double quote

  • \': Single quote

  • \\: Backslash

  • \n: Newline

  • \t: Tab

  • \b: Backspace

  • \r: Carriage return
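
A short sketch of a few of these sequences in use (the variable names and string contents are illustrative only):

```python
header = "station\ttemp_c"        # \t inserts a tab between the two words
two_lines = "line one\nline two"  # \n starts a new line
quoted = "She said \"hello\""     # \" keeps the double quote inside the string
backslash = "C:\\data"            # \\ produces a single backslash

print(header)
print(two_lines)
print(quoted)
print(backslash)

# Each escape sequence is stored as a single character
print(len("\n"))  # 1
```

The backslash is part of the notation in the source code, not part of the stored string, which is why "\n" has a length of 1.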

Let’s try:

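my_string = 'It's a brilliant future!'
  File "C:\Users\vsm71\AppData\Local\Temp\ipykernel_13092\1979304785.py", line 1
    my_string = 'It's a brilliant future!'
                    ^
SyntaxError: invalid syntax

The apostrophe in It's ends the string early, so Python cannot parse the rest of the line. Escaping the apostrophe with a backslash fixes the error:

my_string = 'It\'s a brilliant future!'
print(my_string)
It's a brilliant future!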

Alternatively, you can enclose the string in double quotes if it contains a single quote.

my_string = "It's a brilliant future!"
print(my_string)
It's a brilliant future!

Note

If a string contains both single and double quotes, you can use triple quotes to ensure that the string is treated correctly. By enclosing the string within triple quotes (either triple single quotes ''' or triple double quotes """), you can include both types of quotes without the need for escape sequences.

latitude = '''30° 18' 07.2992" N'''
longitude = '''88° 39' 18.6692" W'''
print(latitude, longitude)
30° 18' 07.2992" N 88° 39' 18.6692" W

Operating System#

Important

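Backslashes pose another problem for Windows paths: Python treats a backslash followed by certain characters as an escape sequence, so some paths raise an error or are silently altered. For example, the \U in the path below starts a Unicode escape, so Python raises a SyntaxError: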

path = 'C:\Users\udir'

To solve the issue, prefix the string with r to make it a raw string, in which backslashes are kept literally, or double each backslash (\\):

path = r'C:\Users\udir'

path = 'C:\\Users\\udir'
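
As a quick check, the sketch below compares the two forms; the variable names raw_path and escaped_path are illustrative only. Both spellings produce exactly the same string.

```python
raw_path = r'C:\Users\udir'       # raw string: backslashes are kept literally
escaped_path = 'C:\\Users\\udir'  # each \\ is stored as a single backslash

print(raw_path == escaped_path)  # True
print(raw_path)                  # C:\Users\udir
print(len(raw_path))             # 13
```

One caveat: a raw string cannot end with a single backslash, so a path with a trailing backslash still needs the doubled form.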