Strings can be thought of as sequences of characters, just as lists are sequences of arbitrary values. As such, many of the methods that work on lists also work on strings. Strings in fact have more functionality associated with them, because manipulating text involves many common and useful tasks that are specific to characters (as opposed to values of arbitrary type). We'll start with the methods familiar from lists. Note that the complete list of methods associated with strings is available in the Python documentation, which describes additional optional parameters not discussed here for the sake of brevity.
<string>.count(<substring>)
    returns the number of times substring occurs within the string.
<string>.find(<substring>)
    returns the index within the string of the first (from the left) occurrence of 'substring'. Returns -1 if substring cannot be found.
<string>.rfind(<substring>)
    returns the index within the string of the last (from the right) occurrence of 'substring'. Returns -1 if substring cannot be found.
<string>.index(<substring>)
    returns the index within the string of the first (from the left) occurrence of 'substring'. Causes an error if substring cannot be found.
<string>.rindex(<substring>)
    returns the index within the string of the last (from the right) occurrence of 'substring'. Causes an error if substring cannot be found.

Python 2.4.3 (#1, Oct 2 2006, 21:50:13)
[GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "The quick brown fox jumps slowly over the lazy cow"
>>> s.count("ow")
3
>>> s.find("brown")
10
>>> s.find("not here")
-1
>>> s.find("ow")
12
>>> s.rfind("ow")
48
>>> s.index("ow")
12
>>> s.rindex("ow")
48
>>> s.rindex("not here")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: substring not found
>>>
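The practical difference between find and index shows up when the substring may be missing. The following is a minimal sketch rather than part of the original session: find_all is a hypothetical helper name, and it relies on the optional start argument of find, one of the extra parameters described in the Python documentation.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        # An empty substring matches everywhere and would loop forever.
        raise ValueError("sub must not be empty")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume just past this match so that occurrences do not overlap.
        start = text.find(sub, start + len(sub))
    return positions


s = "The quick brown fox jumps slowly over the lazy cow"
ow_positions = find_all(s, "ow")
print(ow_positions)

# find() signals a miss with -1, so it can be tested with a plain comparison.
found = s.find("not here") != -1

# index() signals a miss with an exception, so it needs try/except.
try:
    s.index("not here")
    raised = False
except ValueError:
    raised = True
print(found, raised)
```

Use find when a missing substring is an ordinary outcome, and index when it would indicate a bug that should not pass silently.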
The most commonly used string methods are those that change the format of text. With them we can change the case of characters according to common patterns, pad the text with spaces on the left or right to justify it, center it within a given width, and strip out whitespace in various ways.
<string>.capitalize()
    returns a copy of the string with the first character in uppercase and all other characters in lowercase.
<string>.swapcase()
    returns a copy of the string with every character's case inverted.
<string>.center(<width>)
    returns a string of width 'width' with the original string centered, i.e. equally padded with spaces on the left and right, within it.
<string>.ljust(<width>)
    returns the original string left justified within a string of width 'width', i.e. padded on the right with spaces up to length 'width'.
<string>.rjust(<width>)
    returns the original string right justified within a string of width 'width', i.e. padded on the left with spaces to make a string of length 'width'.
<string>.lower()
    returns a copy of the original string, but with all characters in lowercase.
<string>.upper()
    returns a copy of the original string, but with all characters in uppercase.
<string>.strip()
    returns a copy of the string with all whitespace at the beginning and end of the string stripped away.
<string>.lstrip()
    returns a copy of the string with all whitespace at the beginning of the string stripped away.
<string>.rstrip()
    returns a copy of the string with all whitespace at the end of the string stripped away.
<string>.replace(<old>, <new>)
    returns a copy of the string in which all non-overlapping instances of 'old' are replaced by 'new'.

>>> "a sentence poorly capitalized".capitalize()
'A sentence poorly capitalized'
>>>
>>> "aBcD".swapcase()
'AbCd'
>>>
>>> "center me please".center(60)
'                      center me please                      '
>>>
>>> "I need some justification here".ljust(60)
'I need some justification here                              '
>>>
>>> "No! Real Justification, the RIGHT justification".rjust(60)
'             No! Real Justification, the RIGHT justification'
>>>
>>> "LOWER me Down".lower()
'lower me down'
>>>
>>> "raise Me UP".upper()
'RAISE ME UP'
>>>
>>> " I put my whitespace left, I put my whitespace right ".strip()
'I put my whitespace left, I put my whitespace right'
>>>
>>> " I strip it all off, and I shake all about ".lstrip()
'I strip it all off, and I shake all about '
>>>
>>> " and now I've been arrested for indecent exposure ".rstrip()
" and now I've been arrested for indecent exposure"
>>>
>>> "Sung to the tune of 'The h0ky p0ky'".replace("0ky","okey")
"Sung to the tune of 'The hokey pokey'"
>>>
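Because each of these methods returns a new string, they can be chained. As a rough sketch (the rows data and the column widths of 10 and 5 are made up for illustration), here is how strip, capitalize, ljust and rjust might be combined to lay out a small two-column table:

```python
# Hypothetical input: inconsistent case and stray whitespace.
rows = [("  apples ", 3), ("KIWIS", 12), ("figs", 125)]

lines = []
for name, quantity in rows:
    # strip() removes the stray whitespace, capitalize() normalises the case,
    # and ljust() pads the name on the right to a fixed width of 10.
    label = name.strip().capitalize().ljust(10)
    # rjust() pads the number on the left so the digits line up on the right.
    amount = str(quantity).rjust(5)
    lines.append(label + amount)

for line in lines:
    print(line)
```

Every line comes out exactly 15 characters wide, so the columns align when printed in a fixed-width font.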
After all that, let's cut to the chase: the interpolation operator on strings. This provides the majority of string formatting operations in a single consistent pattern. Learn it, understand it, appreciate its inner beauty!
Formally put, the interpolation operator interpolates a sequence of values (a tuple, or in some special cases a dictionary) into a string containing interpolation points (placeholders). Note that a list will not do here: it is treated as a single value. Wowsers, we say? Again, in English? The interpolation operator combines a string containing certain codes with a tuple of values, such that each value is inserted into the string at the position of its corresponding code, formatted according to that code, and replacing it. Example time.
>>> s = "My very %s monkey jumps swiftly under %i planets" % ("energetic", 9)
>>> s
'My very energetic monkey jumps swiftly under 9 planets'
>>>
Examining the above example, we had a string containing two strange % thingies, and a tuple containing 2 elements. Spot the correlation! 2 % thingies, 2 elements. When combined using the '%' operator, the contents of the tuple were 'merged into' the string at the points where the % thingies were, at their respective positions (by relative position left to right), replacing the % thingies.
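The one-to-one pairing of % thingies and tuple elements is enforced by Python. The sketch below (the variable names are illustrative only) shows what happens when the counts do not match, and the two edge cases that arise when there is only a single value:

```python
template = "My very %s monkey jumps swiftly under %i planets"

# Two % thingies, so the tuple must hold exactly two values.
sentence = template % ("energetic", 9)

# Too few or too many values both raise TypeError.
mismatches = 0
for values in [("energetic",), ("energetic", 9, "extra")]:
    try:
        template % values
    except TypeError:
        mismatches += 1

# With a single % thingie the tuple is optional...
one = "%i planets" % 9
# ...unless the value being inserted is itself a tuple, in which case it
# must be wrapped in a one-element tuple.
point = "point: %s" % ((1, 2),)

print(sentence)
print(mismatches, one, point)
```

Always writing the values as a tuple, even a one-element one such as (5,), avoids the last surprise.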
Time to get technical. And thingie is not a technical term, except amongst electrical engineers and biochemists. So firstly, the % thingie in the string is called a conversion specification. This is because all values in the sequence are converted to strings during the merge. It has a specific format, namely it starts with a '%' symbol, and must be at least two characters. It's easier to show the complete format in point form, so here it is...
%
(<mapping/key name>)      *optional
#, 0, -, ' ' (a space), + *optional
<field width>             *optional
.<precision>              *optional
<Conversion Type>         mandatory

>>> "An integer with field width of three: %3i"%(5,)
'An integer with field width of three:   5'
>>>
>>> "An integer left justified: %-3i"%(5,)
'An integer left justified: 5  '
>>>
>>> "An integer with leading zeros: %03i"%(5,)
'An integer with leading zeros: 005'
>>>
>>> "An integer right justified with forced +: %+3i"%(5,)
'An integer right justified with forced +:  +5'
>>>
>>> "A float: %f"%2.5
'A float: 2.500000'
>>>
>>> "A float: %.1f"%2.5
'A float: 2.5'
>>>
>>> "A float: %4.1f"%2.5
'A float:  2.5'
>>>
>>> "A float: %04.1f"%2.5
'A float: 02.5'
>>>
>>> "A float in sci notation: %06.1e"%(0.0000025)
'A float in sci notation: 2.5e-06'
>>>
>>> "A percentage symbol: %% %s"%(" ")
'A percentage symbol: %  '
>>>
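The session above does not exercise the optional mapping key. As a small sketch (the record dictionary and its keys are invented for illustration), a dictionary on the right of the % operator lets each conversion specification pick its value by name rather than by position:

```python
# Hypothetical data; any dictionary with string keys would do.
record = {"name": "widget", "price": 4.5, "stock": 7}

# %(name)-10s  : look up "name", left justify in a field of width 10
# %(price)8.2f : look up "price", width 8, two digits after the point
# %(stock)03i  : look up "stock", width 3, padded with leading zeros
line = "%(name)-10s|%(price)8.2f|%(stock)03i" % record
print(line)

# A key may be used more than once, and unused keys are simply ignored.
twice = "%(name)s, %(name)s, %(name)s!" % record
print(twice)
```

Named keys tend to be easier to read than a long positional tuple, and the same value can be reused without being repeated.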
Finally, there are a few miscellaneous methods that prove very useful when dealing with strings. These include:
<string>.isupper()
    returns True if the string contains at least one letter and all of its letters are uppercase.
<string>.islower()
    returns True if the string contains at least one letter and all of its letters are lowercase.
<string>.isalpha()
    returns True if the string contains only alphabetic characters.
<string>.isalnum()
    returns True if the string contains only alphabetic characters and/or digits.
<string>.isdigit()
    returns True if the string contains only digits.
<string>.isspace()
    returns True if the string contains only whitespace characters.
<string>.endswith(<substring>)
    returns True if the string ends with the substring 'substring'.
<string>.startswith(<substring>)
    returns True if the string starts with the substring 'substring'.
<string>.join(<sequence>)
    returns the elements of 'sequence' (which must be strings) concatenated in order with the string between each element.
<string>.split([substring])
    returns a list of strings, such that the string is split by 'substring' and each portion is an element of the returned list. If substring is not specified, the string is split on whitespace.
<string>.rsplit([substring])
    the same as split, but the search for the split string is performed from right to left.

>>> "The quick brown fox".endswith("dog")
False
>>>
>>> "The quick brown fox".endswith("fox")
True
>>>
>>> "The quick brown fox".startswith("A")
False
>>>
>>> "The quick brown fox".startswith("The ")
True
>>>
>>> ", ".join(['1', '2', '3', '4'])
'1, 2, 3, 4'
>>>
>>> "a, b, c, d".split(',')
['a', ' b', ' c', ' d']
>>>
>>> "a, b, c, d".split(', ')
['a', 'b', 'c', 'd']
>>>
>>> "abababa".split("bab")
['a', 'aba']
>>>
>>> "abababa".rsplit("bab")
['aba', 'a']
>>>
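These methods are often used together. The following sketch uses made-up input strings, and the second argument to rsplit (a maximum number of splits) is one of the optional parameters described in the Python documentation rather than in the text above:

```python
raw = "  The   quick\tbrown \n fox  "

# split() with no argument splits on runs of whitespace and drops the
# empty pieces, so joining on one space normalises the spacing.
words = raw.split()
normalised = " ".join(words)
print(normalised)

# isdigit() can filter the pieces produced by split(); the empty field
# is rejected because an empty string contains no digits.
fields = "42, abc, x9, , 7".split(", ")
numbers = [field for field in fields if field.isdigit()]
print(numbers)

# Limiting rsplit() to one split peels off only the last component.
filename = "archive.tar.gz"
stem, extension = filename.rsplit(".", 1)
print(stem, extension, filename.endswith(".gz"))
```

Splitting from the right is what makes the last example work: a plain split with the same limit would separate "archive" from "tar.gz" instead.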
"Laziness is a
%s."%("virtue")
?"%i days hath %s, %s, %s and %s. I
use my %s for the other %i, because I can't remember this rhyme for
%s"%(30, "September", "April", "June", "November", "knuckles", 8,
"...")
?"%02i/%02i/%04i"%(10,3,2009)
?"%5.3f"%(3.1415)
?