Python Programming Made Easy (2016)
Chapter 5: Strings
Strings are amongst the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes the same as double quotes.
Creating strings is as simple as assigning a value to a variable. For example:
str1 = 'Hello World!'
str2 = "Python Programming"
Accessing Values in Strings
Python has no separate character type; a single character is simply a string of length one, and so is itself a substring. To access a substring, use square brackets with a single index (for one character) or a pair of indices (for a slice). The following is a simple example:
str1 = 'Hello World!'
str2 = "Python Programming"
print "str1[0]: ", str1[0]
print "str2[7:11]: ", str2[7:11]
When the above code is executed, it produces the following result:
Fig 5.1: Accessing strings output screenshot
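As a small sketch of the indexing rules described above, the snippet below reuses str1 and str2 and also shows negative indices and an omitted start index. The variable names first_char, last_char, middle and prefix are illustrative only. Each print call takes a single argument, so the code should run unchanged under both Python 2 and Python 3.

```python
# Indexing and slicing on the two strings used in this section.
str1 = 'Hello World!'
str2 = "Python Programming"

first_char = str1[0]     # a single index gives a string of length one
last_char = str1[-1]     # negative indices count from the end
middle = str2[7:11]      # characters at indices 7, 8, 9 and 10
prefix = str2[:6]        # an omitted start means "from the beginning"

print(first_char)
print(last_char)
print(middle)
print(prefix)
```

Note that the end index of a slice is excluded, which is why str2[7:11] has four characters.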
Updating Strings
We can "update" an existing string by (re)assigning a variable to another string. The new value can be built from the previous value or be a completely different string altogether. The following is a simple example:
str1 = 'Hello World!'
str2 = str1[:6] + 'Python'
print "Updated String :- ", str2
When the above code is executed, it produces the following result:
Fig 5.2: Updating strings
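The word "update" is in quotes because a string cannot be modified in place; only a new string can be built and assigned. The following minimal sketch illustrates this. The flag changed_in_place is a name invented for the illustration.

```python
str1 = 'Hello World!'

# Assigning to a position inside a string is not allowed.
try:
    str1[0] = 'J'
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Building a new string from a slice of the old one works.
str2 = 'J' + str1[1:]

print(changed_in_place)
print(str1)
print(str2)
```

The original string is left untouched; str2 refers to a new string object.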
Escape Characters
The following table lists the escape (non-printable) characters that can be represented with backslash notation.
An escape character is interpreted in single-quoted as well as double-quoted strings.
Backslash | Description
\a | Bell or alert
\b | Backspace
\f | Formfeed
\n | Newline
\nnn | Octal notation, where each n is in the range 0-7
\r | Carriage return
\t | Tab
\v | Vertical tab
\xnn | Hexadecimal notation, where each n is in the range 0-9, a-f, or A-F
Note: the sequences \cx, \C-x, \M-\C-x, \e and \s, and \x followed by anything other than two hexadecimal digits, appear in similar tables for other languages but are not escape sequences in Python.
Table 5.1: Escape characters
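A short sketch of how escape sequences from the table behave inside a string literal: each sequence counts as a single character, and the octal and hexadecimal forms are just another way of writing a character code. The variable names are illustrative.

```python
tab_and_newline = 'a\tb\n'   # 4 characters: a, tab, b, newline
octal_a = '\101'             # octal 101 is 65, the code of the letter A
hex_a = '\x41'               # hexadecimal 41 is also 65

print(len(tab_and_newline))
print(octal_a)
print(hex_a)
```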
String Special Operators
Assume string variable a holds 'Hello' and variable b holds 'Python', then:
Operator | Description | Example
+ | Concatenation - Adds values on either side of the operator | a + b results in HelloPython
* | Repetition - Creates a new string by concatenating multiple copies of the same string | a*2 results in HelloHello
[] | Index - Gives the character at the given index | a[1] results in e
[ : ] | Range Slice - Gives the characters from the given range | a[1:4] results in ell
in | Membership - Returns True if a character exists in the given string | 'H' in a results in True
not in | Membership - Returns True if a character does not exist in the given string | 'M' not in a results in True
r/R | Raw String - Suppresses the actual meaning of escape characters. The syntax for raw strings is exactly the same as for normal strings, except for the raw string prefix, the letter "r", which precedes the quotation marks. The "r" can be lowercase (r) or uppercase (R) and must be placed immediately before the first quote mark. | print r'\n' prints \n and print R'\n' prints \n
% | Format - Performs string formatting | See the next section
Table 5.2: String Special Operators
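The operators in Table 5.2 can be tried out directly. The snippet below is a small sketch using the same values of a and b; every print call takes one argument, so it should behave the same under Python 2 and Python 3.

```python
a = 'Hello'
b = 'Python'

print(a + b)          # concatenation
print(a * 2)          # repetition
print(a[1])           # index
print(a[1:4])         # range slice
print('H' in a)       # membership
print('M' not in a)   # negated membership
print(len(r'\n'))     # raw string keeps the backslash: 2 characters
```

The membership operators return the Boolean values True and False.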
String Formatting Operator
One of Python's coolest features is the string format operator %. This operator is unique to strings and makes up for the lack of functions from C's printf() family. The following is a simple example:
print "My name is %s and weight is %d kg!" % ('Tara', 50)
When the above code is executed, it produces the following result:
Fig 5.3: formatting using %
Here is the complete list of symbols that can be used with %:
Format Symbol | Conversion
%c | character
%s | string conversion via str() prior to formatting
%i | signed decimal integer
%d | signed decimal integer
%u | unsigned decimal integer (obsolete; behaves like %d)
%o | octal integer
%x | hexadecimal integer (lowercase letters)
%X | hexadecimal integer (uppercase letters)
%e | exponential notation (with lowercase 'e')
%E | exponential notation (with uppercase 'E')
%f | floating point real number
%g | the shorter of %f and %e
%G | the shorter of %f and %E
Table 5.3: format symbols
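A few of the symbols from Table 5.3 in action, as a minimal sketch. The ".2" precision used with %f is standard printf-style syntax that the table does not list; it is included here only to keep the output short. The variable names are illustrative.

```python
line = "My name is %s and weight is %d kg!" % ('Tara', 50)
hex_lower = "%x" % 255       # lowercase hexadecimal
hex_upper = "%X" % 255       # uppercase hexadecimal
octal = "%o" % 8             # octal
fixed = "%.2f" % 3.14159     # floating point, two digits after the point
char = "%c" % 65             # character with code 65

print(line)
print(hex_lower)
print(hex_upper)
print(octal)
print(fixed)
print(char)
```

When more than one value is formatted, the values are passed as a tuple, as in the first line.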
Built-in string functions
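Python includes the following built-in methods to manipulate strings. Note that len(), max() and min() are built-in functions that take the string as an argument rather than methods called on the string: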
SNo | Method | Description | Example
1 | capitalize() | Capitalizes the first letter of the string | "hello python".capitalize() returns 'Hello python'
2 | endswith(suffix, beg=0, end=len(string)) | Determines whether the string (or the substring between indices beg and end, if given) ends with suffix; returns True if so and False otherwise | "Computer".endswith("er") returns True
3 | find(str, beg=0, end=len(string)) | Determines whether str occurs in the string (or in the substring between indices beg and end, if given); returns the index if found and -1 otherwise | 'Computer'.find('put') returns 3 (the search starts from the beginning when no start index is given); 'Computer'.find('put', 2) returns 3; 'Computer'.find('put', 1, 3) returns -1, because 'put' does not lie between index 1 and index 3-1
4 | index(str, beg=0, end=len(string)) | Same as find(), but raises an exception if str is not found | "Computer".index('pat') raises ValueError: substring not found
5 | isalnum() | Returns True if the string has at least one character and all characters are alphanumeric, and False otherwise | 'Hello Python'.isalnum() returns False, because the space is not an alphanumeric character; "Python123".isalnum() returns True
6 | isalpha() | Returns True if the string has at least one character and all characters are alphabetic, and False otherwise | 'Python123'.isalpha() returns False; 'python'.isalpha() returns True
7 | isdigit() | Returns True if the string contains only digits and False otherwise | '1234'.isdigit() returns True
8 | islower() | Returns True if the string has at least one cased character and all cased characters are in lowercase, and False otherwise | 'HELLO python'.islower() returns False; 'hello python'.islower() returns True
9 | isnumeric() | Returns True if a unicode string contains only numeric characters and False otherwise | u"year2013".isnumeric() returns False
10 | isspace() | Returns True if the string contains only whitespace characters and False otherwise | ' '.isspace() returns True
11 | istitle() | Returns True if the string is properly "titlecased" and False otherwise | 'Hello World'.istitle() returns True
12 | isupper() | Returns True if the string has at least one cased character and all cased characters are in uppercase, and False otherwise | 'HELLO python'.isupper() returns False; 'HELLO PYTHON'.isupper() returns True
13 | join(seq) | Merges (concatenates) the strings in sequence seq into one string, using the string on which join() is called as the separator | "-".join(('10', 'oct', '2013')) returns '10-oct-2013'
14 | len(string) | Built-in function; returns the length of the string | len("Computer") returns 8
15 | lower() | Converts all uppercase letters in the string to lowercase | "Optical Fibre".lower() returns 'optical fibre'
16 | lstrip() | Removes all leading whitespace in the string | " Python ".lstrip() returns 'Python '
17 | max(str) | Built-in function; returns the maximum character of the string str (by character code) | max("Python") returns 'y'
18 | min(str) | Built-in function; returns the minimum character of the string str (by character code) | min("Python") returns 'P'
19 | replace(old, new [, max]) | Replaces all occurrences of old in the string with new, or at most max occurrences if max is given | "C++ is a powerful language".replace("C++", "Python") returns 'Python is a powerful language'
20 | rfind(str, beg=0, end=len(string)) | Same as find(), but searches backwards in the string | "Python Program".rfind("P") returns 7
21 | rindex(str, beg=0, end=len(string)) | Same as index(), but searches backwards in the string | "Python Program".rindex("P") returns 7
22 | rstrip() | Removes all trailing whitespace of the string | " Python ".rstrip() returns ' Python'
23 | split(str=" ", num=string.count(str)) | Splits the string at the delimiter str (whitespace if not provided) and returns a list of substrings; performs at most num splits if num is given | "1:Anand-2:Babu-3:Charles-4:Dravid".split('-') returns ['1:Anand', '2:Babu', '3:Charles', '4:Dravid']
24 | splitlines(keepends=False) | Splits the string at line boundaries and returns a list of the lines; the line breaks are removed unless keepends is true | "1:Anu-\n2:Banu-\n3:Charles-\n4:David".splitlines() returns ['1:Anu-', '2:Banu-', '3:Charles-', '4:David']; splitlines(True) returns ['1:Anu-\n', '2:Banu-\n', '3:Charles-\n', '4:David']
25 | startswith(str, beg=0, end=len(string)) | Determines whether the string (or the substring between indices beg and end, if given) starts with substring str; returns True if so and False otherwise | "Python is a free software".startswith("Python") returns True
26 | strip([chars]) | Performs both lstrip() and rstrip() on the string | " Python ".strip() returns 'Python'
27 | swapcase() | Inverts the case of all letters in the string | "python PROGRAMMING".swapcase() returns 'PYTHON programming'
28 | title() | Returns a "titlecased" version of the string, that is, all words begin with uppercase and the rest are lowercase | "python PROGRAMMING".title() returns 'Python Programming'
29 | upper() | Converts lowercase letters in the string to uppercase | "python".upper() returns 'PYTHON'
30 | zfill(width) | Returns the original string left-padded with zeros to a total of width characters; intended for numbers, zfill() retains any sign given (less one zero) | "1234".zfill(10) returns '0000001234'
Table 5.4: Built-in string functions
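Because every method in Table 5.4 returns a new value instead of changing the string, the methods are often applied one after another. The following sketch cleans up a sample sentence; the sample text and variable names are made up for the illustration.

```python
s = "  python programming made easy  "

trimmed = s.strip()                    # remove leading and trailing spaces
words = trimmed.split()                # list of words
title = trimmed.title()                # first letter of each word in uppercase
joined = "-".join(words)               # words joined with a hyphen
position = trimmed.find("made")        # index of the first match
missing = trimmed.find("hard")         # -1 because there is no match
replaced = trimmed.replace("easy", "simple")

print(title)
print(joined)
print(position)
print(missing)
print(replaced)
```

The original string s is unchanged after all of these calls.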
String constants
The following constants are defined in the string module, so the statement import string must be executed before they are used.
Constant | Description | Example
string.ascii_uppercase | A string containing all uppercase letters. | >>> string.ascii_uppercase 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
string.ascii_lowercase | A string containing all lowercase letters. | >>> string.ascii_lowercase 'abcdefghijklmnopqrstuvwxyz'
string.ascii_letters | A string containing both lowercase and uppercase letters. | >>> string.ascii_letters 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
string.digits | A string containing the digits. | >>> string.digits '0123456789'
string.hexdigits | A string containing the hexadecimal digits. | >>> string.hexdigits '0123456789abcdefABCDEF'
string.octdigits | A string containing the octal digits. | >>> string.octdigits '01234567'
string.punctuation | A string containing all the punctuation characters. | >>> string.punctuation '!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
string.whitespace | A string containing all ASCII characters that are considered whitespace. This includes the characters space, tab, linefeed, return, formfeed, and vertical tab. | >>> string.whitespace '\t\n\x0b\x0c\r '
string.printable | A string containing all characters that are considered printable: letters, digits, punctuation and whitespace. | >>> string.printable '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~ \t\n\r\x0b\x0c'
Table 5.6: string constants
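The string constants are mostly useful with the in operator, to test which kind of character is at hand. The sketch below counts letters, digits and punctuation marks in a piece of text. The function name count_kinds is hypothetical and not part of the string module.

```python
import string

def count_kinds(text):
    """Return (letters, digits, punctuation marks) found in text."""
    letters = digits = marks = 0
    for ch in text:
        if ch in string.ascii_letters:
            letters += 1
        elif ch in string.digits:
            digits += 1
        elif ch in string.punctuation:
            marks += 1
    return letters, digits, marks

print(count_kinds("Chapter 5: Strings!"))
```

Spaces fall into none of the three groups, so they are not counted.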
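Example 5.1: Write a program to check whether the string is a palindrome or not.
def palin():
    # raw_input() returns the text typed by the user as a string
    str = raw_input("Enter the String")
    l = len(str)
    p = l - 1
    index = 0
    # Compare characters from both ends, moving towards the middle
    while index < p:
        if str[index] == str[p]:
            index = index + 1
            p = p - 1
        else:
            print "String is not a palindrome"
            break
    else:
        # The loop ended without break, so every pair matched
        print "String is a Palindrome"
Execution steps:
Fig 5.4: Palindrome execution steps
Output: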
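As an alternative to the index-based loop of Example 5.1, the same check can be sketched with a slice. This is a different technique from the one in the example: the slice [::-1] uses a step of -1 to produce the reversed string, a form of slicing that this chapter does not otherwise cover. The function name is_palindrome is illustrative.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]

print(is_palindrome("madam"))
print(is_palindrome("python"))
```

The comparison is case-sensitive, just like the loop version.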
Regular expressions and Pattern matching
A regular expression is a sequence of letters and some special characters (also called meta characters). These special characters have symbolic meaning. The sequence formed by using meta characters and letters can be used to represent a group of patterns.
Regular expressions can be used in Python for matching a particular pattern by importing the re module.
Note: the re module includes functions for working with regular expressions.
The following table explains how the meta characters are used to form regular expressions.
S.No | MetaCharacter | Usage | Example
1 | [ ] | Used to match any one character from a set of characters | [star] The regular expression would match any one of the characters s, t, a or r. [a-z] The regular expression would match any one lowercase letter.
2 | ^ | Used to complement a set of characters (when it is the first character inside [ ]) | [^star] The regular expression would match any single character other than s, t, a or r.
3 | $ | Used to match the end of the string only | Star$ The regular expression would match Star in BlueStar but will not match Star in Stardom.
4 | * | Used to specify that the previous character can be matched zero or more times | B*e The regular expression would match e, Be, BBe and so on.
5 | + | Used to specify that the previous character can be matched one or more times | B+r The regular expression would match Br, BBr, BBBr and so on, but not r alone.
6 | ? | Used to specify that the previous character can be matched either once or zero times | B?ar The regular expression would match only ar or Bar.
7 | { } | The curly brackets accept two integer values. The first value specifies the minimum number of occurrences and the second value specifies the maximum number of occurrences | wate{1,4}r The regular expression would match only the strings water, wateer, wateeer or wateeeer.
Table 5.7: meta characters
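The behaviour of the repetition metacharacters is easy to check with the re module. The sketch below uses a small hypothetical helper, whole_match, which appends $ to a pattern so that the pattern has to cover the entire text; the helper is written for this illustration and is not part of re.

```python
import re

def whole_match(pattern, text):
    """Return True if pattern matches the whole of text."""
    return re.match(pattern + '$', text) is not None

print(whole_match('B*e', 'BBe'))              # B zero or more times, then e
print(whole_match('B+r', 'r'))                # at least one B is required
print(whole_match('B?ar', 'Bar'))             # B is optional
print(whole_match('wate{1,4}r', 'wateeer'))   # between one and four e
print(re.findall('[^star]', 'strawberry'))    # characters outside the set
```

Note that the repetition symbols apply only to the single character before them, not to the whole word.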
Functions from re module
Function | Description
re.compile() | Compiles the pattern into a pattern object. After compilation, the pattern object provides methods for various operations such as searching and substitution
re.match() | Determines whether the regular expression (RE) matches at the beginning of the string; returns a match object, or None if there is no match
m.group() | Method of the match object m returned by re.match() or re.search(); returns the string matched by the RE
m.start() | Method of the match object; returns the starting position of the match
m.end() | Method of the match object; returns the end position of the match
m.span() | Method of the match object; returns a tuple containing the (start, end) positions of the match
re.search() | Scans through the string and finds the first position where the RE matches; returns a match object, or None if there is no match
re.findall() | Finds all substrings where the RE matches, and returns them as a list
re.finditer() | Finds all substrings where the RE matches, and returns them as an iterator of match objects
Table 5.8: functions in re module
Example 5.2: Demonstration of re functions
import re
p = re.compile("hell*o")           # compile the pattern once
m = p.match("hellooooo python")    # match at the start of the string
print m.group()                    # hello
print m.start()                    # 0
print m.end()                      # 5
print m.span()                     # (0, 5)
m = re.search('hell*o', 'hellooo python')
print m                            # a match object
m = re.findall('hell*o', 'hello helloo python')
print m                            # ['hello', 'hello']
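The re.finditer() function from Table 5.8 is not exercised in Example 5.2. A minimal sketch of it, reusing the same pattern, could look like this; the variable names are illustrative.

```python
import re

pattern = re.compile('hell*o')
text = 'hello helloo python'

# search() returns only the first match
first = pattern.search(text)

# finditer() yields one match object per match, in order
spans = [m.span() for m in pattern.finditer(text)]

print(first.group())
print(spans)
```

Each item produced by finditer() is a match object, so group(), start(), end() and span() are all available on it.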
Solved Programs
1. Write a program to count the number of 's' in the string 'mississippi'.
Ans:
def lettercount():
    word = 'mississippi'
    count = 0
    for letter in word:
        if letter == 's':
            count = count + 1
    print(count)
lettercount()
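For comparison, the same count can be obtained with the string method count(), which returns the number of non-overlapping occurrences of a substring. This is an alternative to the loop in the answer, not a replacement for it.

```python
word = 'mississippi'
s_count = word.count('s')   # occurrences of the letter s
print(s_count)
```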
2. Write a program that reads a string and displays the longest substring of the given string having just the consonants.
Ans:
string = raw_input("Enter a string :")
length = len(string)
maxlength = 0
maxsub = ''
sub = ''
lensub = 0
for a in range(length):
    if string[a] in 'aeiou' or string[a] in 'AEIOU':
        # A vowel ends the current run of consonants
        if lensub > maxlength:
            maxsub = sub
            maxlength = lensub
        sub = ''
        lensub = 0
    else:
        # Every character that is not a vowel is treated as a consonant
        sub += string[a]
        lensub = len(sub)
# The longest run may be the one at the end of the string
if lensub > maxlength:
    maxsub = sub
    maxlength = lensub
print "Maximum length consonant substring is :", maxsub,
print "with", maxlength, "characters"
3. Write a program to determine if the given substring is present in the string.
Ans:
import re
substring = 'Rain'
search1 = re.search(substring, 'Rain Rain go away !')
if search1:
    position = search1.start()
    print "matched", substring, "at position", position
else:
    print "No match found"
4. Write a program to determine if the given substring (defined using meta characters) is present in the given string
Ans:
import re
p = re.compile('dance+')
search1 = re.search(p, 'Western dancers dance for English music well')
if search1:
    match = search1.group()
    print "matched =", match
    index = search1.start()
    print "at position", index
else:
    print "No match found"
5. Write a Python script that takes a string with multiple words, capitalizes the first letter of each word and forms a new string out of it.
Ans.
string = raw_input("Enter a string :")
length = len(string)
a = 0
string2 = ''  # empty string
while a < length:
    if a == 0:
        string2 += string[0].upper()
        a += 1
    elif string[a] == ' ' and a + 1 < length and string[a+1] != ' ':
        # Copy the space, then the capitalized first letter of the next word
        string2 += string[a]
        string2 += string[a+1].upper()
        a += 2
    else:
        string2 += string[a]
        a += 1
print "Original string :", string
print "Converted string :", string2
6. Which string method is used to implement the following?
a) To count the number of characters in the string.
b) To change the first character of the string to a capital letter.
c) To check whether a given character is a letter or a number.
d) To change lower case letters to upper case.
e) To change one character into another character.
Ans.
a) len(str) (a built-in function rather than a method)
b) str.capitalize() (str.title() capitalizes the first letter of every word)
c) str.isalpha() and str.isdigit()
d) str.upper()
e) str.replace(char, newchar)
7. Write a program to input any string and to find number of words in the string.
Ans.
str = raw_input("Enter a string: ")   # for example: Honesty is the best policy
words = str.split()
print len(words)
8. Consider the string str="A Friend Indeed". Write statements in Python to implement the following:
a) To display the last four characters.
b) To display the substring starting from index 4 and ending at index 8.
c) To check whether string has alphanumeric characters or not.
d) To trim the last four characters from the string.
e) To trim the first four characters from the string.
f) To display the starting index for the substring “end”.
g) To change the case of the given string.
h) To check if the string is in title case.
i) To replace all the occurrences of letter 'e' in the string with '*'.
Ans:
a) str[-4:]
b) str[4:9] (a slice excludes its end index, so 9 is needed to include index 8)
c) str.isalnum()
d) str = str[:-4]
e) str = str[4:]
f) str.find("end")
g) str.swapcase()
h) str.istitle()
i) str.replace('e', '*')
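The answers to question 8 can be checked with a short script. The sketch below uses the variable name s instead of str (so that the built-in name str is not hidden) and avoids reassigning s, so that every statement works on the original string.

```python
s = "A Friend Indeed"

print(s[-4:])                 # a) last four characters
print(s[4:9])                 # b) indices 4 to 8 inclusive
print(s.isalnum())            # c) False because of the spaces
print(s[:-4])                 # d) without the last four characters
print(s[4:])                  # e) without the first four characters
print(s.find("end"))          # f) starting index of "end"
print(s.swapcase())           # g) case changed
print(s.istitle())            # h) title case check
print(s.replace('e', '*'))    # i) every e replaced with *
```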
Practice Problems
1. Write a program in Python to count the number of vowels in a given word.
2. Write a program using regular expressions in Python to validate passwords in a given list of passwords.
3. Write a program to partition the string at the occurrence of a given letter.
4. Write a program in Python to sort a given array of student names in alphabetical order.
5. Consider the string str=" Hello Python". Write statements in Python to implement the following:
a) To display the last six characters.
b) To display the substring starting from index 2 and ending at index 6
c) To check whether string has alphanumeric characters or not.
d) To trim the last six characters from the string.
e) To trim the first six characters from the string.
f) To change the case of the given string.
g) To check if the string is in title case.