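1. Stripping Whitespace
Stripping whitespace is a basic string processing operation. Use lstrip() to strip leading whitespace (from the left), rstrip() to strip trailing whitespace (from the right), and strip() to strip both leading and trailing whitespace.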
s = "   This is a sentence with whitespace.   "
# {!r} prints the result with quotes, so any remaining whitespace is visible
print("Strip leading whitespace: {!r}".format(s.lstrip()))
print("Strip trailing whitespace: {!r}".format(s.rstrip()))
print("Strip all whitespace: {!r}".format(s.strip()))
Strip leading whitespace: 'This is a sentence with whitespace.   '
Strip trailing whitespace: '   This is a sentence with whitespace.'
Strip all whitespace: 'This is a sentence with whitespace.'
Interested in stripping characters other than whitespace? The same methods are useful here too: pass in the characters you want stripped.
s = "This is a sentence with unwanted characters.AAAAAAAA"
print("Strip unwanted characters: {}".format(s.rstrip("A")))
Strip unwanted characters: This is a sentence with unwanted characters.
Don't forget to check the string format() documentation when needed.
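One detail of the stripping methods is easy to miss: the argument is treated as a set of characters to remove, not as a literal prefix or suffix. The following is a minimal sketch of that behavior; the sample strings are made up for illustration.

```python
# The argument is a set of characters, not a substring:
# any mix of "x" and "y" is removed from both ends.
s = "xxyyThis is a sentence.yxyx"
stripped = s.strip("xy")
print(stripped)  # This is a sentence.

# Because of that, rstrip(".txt") can remove more than a ".txt" suffix:
# the final "t" of "report" is also in the set {".", "t", "x"}.
filename = "report.txt"
over_stripped = filename.rstrip(".txt")
print(over_stripped)  # repor
```

In Python 3.9 and later, str.removeprefix() and str.removesuffix() are available for removing a literal prefix or suffix.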
2. Splitting Strings
Python's split() method makes it easy to split a string into a list of smaller substrings.
split() method: https://docs.python.org/3/library/stdtypes.html#str.split
s = "KDnuggets is a fantastic resource"
print(s.split())
['KDnuggets', 'is', 'a', 'fantastic', 'resource']
By default, split() splits on whitespace, but you can also pass another character sequence to split on.
s = "these,words,are,separated,by,comma"
print(", separated split -> {}".format(s.split(",")))
s = "abacbdebfgbhhgbabddba"
print("b separated split -> {}".format(s.split("b")))
, separated split -> ['these', 'words', 'are', 'separated', 'by', 'comma']
b separated split -> ['a', 'ac', 'de', 'fg', 'hhg', 'a', 'dd', 'a']
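Two further behaviors of split() may be worth a quick illustration: calling it with no argument is not the same as passing a single space, and the optional maxsplit argument limits how many splits are made. This is a small sketch with made-up sample strings.

```python
s = "  KDnuggets   is a  fantastic resource "

# With no argument, runs of whitespace act as one separator
# and leading/trailing whitespace is ignored.
default_split = s.split()

# With an explicit " ", every single space is a separator,
# so consecutive spaces produce empty strings.
space_split = "a  b".split(" ")

# maxsplit limits the number of splits; the remainder stays in one piece.
limited = "these,words,are,separated".split(",", 2)

print(default_split)  # ['KDnuggets', 'is', 'a', 'fantastic', 'resource']
print(space_split)    # ['a', '', 'b']
print(limited)        # ['these', 'words', 'are,separated']
```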
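3. Joining List Elements into a String
Need the inverse of the operation above? No problem: Python's join() method combines the elements of a list into a single string.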
s = ["KDnuggets", "is", "a", "fantastic", "resource"]
print(" ".join(s))
KDnuggets is a fantastic resource
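That is indeed the case! What if you want to join the list elements with something other than a space? This may look a little unfamiliar, but it is just as easy.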
s = ["Eleven", "Mike", "Dustin", "Lucas", "Will"]
print(" and ".join(s))
Eleven and Mike and Dustin and Lucas and Will
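Note that join() only accepts strings. If the list holds other types, such as numbers, each element has to be converted first. A minimal sketch, using a made-up list of integers:

```python
numbers = [1, 2, 3]

# ", ".join(numbers) would raise TypeError because the items are ints,
# so convert each element to a string first.
joined = ", ".join(str(n) for n in numbers)
print(joined)  # 1, 2, 3

# join() undoes split() when the same separator is used.
words = "KDnuggets is a fantastic resource".split(" ")
round_trip = " ".join(words)
print(round_trip)  # KDnuggets is a fantastic resource
```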
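4. Reversing a String
Python has no built-in string reversal method. However, a string can be treated as a sequence of characters and reversed the same way the elements of a list are reversed.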
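As an illustration of this idea, here is a short sketch showing two common ways to reverse a string: a slice with a step of -1, and the built-in reversed() combined with join(). The variable names are chosen for this example only.

```python
s = "KDnuggets"

# A slice with step -1 walks the string from the last character to the first.
reversed_slice = s[::-1]

# reversed() yields the characters in reverse order; join() rebuilds a string.
reversed_join = "".join(reversed(s))

print("The reverse of KDnuggets is {}".format(reversed_slice))
print(reversed_slice == reversed_join)  # True
```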
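5. Converting Case
The upper(), lower(), and swapcase() methods convert between cases.
upper() method: https://docs.python.org/3/library/stdtypes.html#str.upper
lower() method: https://docs.python.org/3/library/stdtypes.html#str.lower
swapcase() method: https://docs.python.org/3/library/stdtypes.html#str.swapcase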
s = "KDnuggets"
print("KDnuggets as uppercase: {}".format(s.upper()))
print("KDnuggets as lowercase: {}".format(s.lower()))
print("KDnuggets as swapped case: {}".format(s.swapcase()))
KDnuggets as uppercase: KDNUGGETS
KDnuggets as lowercase: kdnuggets
KDnuggets as swapped case: kdNUGGETS
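A common use of case conversion is comparing strings while ignoring case. The sketch below is one possible approach, not the only one: it normalizes both sides with lower(), and also shows casefold(), a related built-in string method intended for caseless matching. The sample strings are assumptions made for illustration.

```python
a = "KDnuggets"
b = "kdNUGGETS"

# Normalizing both sides to lowercase makes the comparison case-insensitive.
same_ignoring_case = a.lower() == b.lower()
print(same_ignoring_case)  # True

# casefold() is more aggressive than lower(); for example it maps the
# German sharp s to "ss", which lower() leaves unchanged.
lower_match = "Straße".lower() == "STRASSE".lower()
casefold_match = "Straße".casefold() == "STRASSE".casefold()
print(lower_match)     # False
print(casefold_match)  # True
```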
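6. Checking for Substring Membership
The simplest way to check for substring membership in Python is the in operator, whose syntax reads much like natural language.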
s1 = "perpendicular"
s2 = "pen"
s3 = "pep"
print("pen in perpendicular -> {}".format(s2 in s1))
print("pep in perpendicular -> {}".format(s3 in s1))
pen in perpendicular -> True
pep in perpendicular -> False
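If you are more interested in finding the position of a substring within a string (rather than simply checking whether it is contained), the find() method may be more useful.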
s = "Does this string contain a substring?"
print("string location -> {}".format(s.find("string")))
print("spring location -> {}".format(s.find("spring")))
string location -> 10
spring location -> -1
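By default, find() returns the index of the first character of the first occurrence of the substring, or -1 if the substring is not found. When unsure about this default behavior, consult the documentation.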
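Beyond the default behavior, find() accepts an optional start index, which makes it possible to locate later occurrences, and the related index() method raises ValueError instead of returning -1. A brief sketch using the same sentence as above; the variable names are chosen for this example.

```python
s = "Does this string contain a substring?"

# First occurrence, searching from the beginning.
first = s.find("string")

# Passing a start index continues the search after the first match,
# which finds the "string" inside "substring".
second = s.find("string", first + 1)

print(first)   # 10
print(second)  # 30

# index() behaves like find() but raises ValueError when nothing is found.
try:
    s.index("spring")
    missing = False
except ValueError:
    missing = True
print(missing)  # True
```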
7. Replacing Substrings
What if you want to replace a substring once you have found it? Python's replace() string method takes care of this.
s1 = "The theory of data science is of the utmost importance."
s2 = "practice"
print("The new sentence: {}".format(s1.replace("theory", s2)))
The new sentence: The practice of data science is of the utmost importance.
If the same substring occurs multiple times, the optional count argument specifies the maximum number of successive replacements to make.
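To make the count argument concrete, here is a small sketch with a made-up string in which the same word appears three times.

```python
s = "spam spam spam"

# Without count, every occurrence is replaced.
replaced_all = s.replace("spam", "eggs")

# With count=2, only the first two occurrences (left to right) are replaced.
replaced_two = s.replace("spam", "eggs", 2)

print(replaced_all)  # eggs eggs eggs
print(replaced_two)  # eggs eggs spam
```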
8. Combining the Output of Multiple Lists
How do you combine multiple lists of strings in an element-wise fashion? The zip() function handles this.
countries = ["USA", "Canada", "UK", "Australia"]
cities = ["Washington", "Ottawa", "London", "Canberra"]
# zip() pairs the i-th country with the i-th city
for x, y in zip(countries, cities):
    print("The capital of {} is {}.".format(x, y))
The capital of USA is Washington.
The capital of Canada is Ottawa.
The capital of UK is London.
The capital of Australia is Canberra.
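One caveat with zip() is that it stops at the end of the shortest input, so unmatched items are silently dropped. The sketch below shows this with deliberately unequal lists, and uses itertools.zip_longest from the standard library as one way to keep the extra items; the "unknown" fill value is an arbitrary choice for this example.

```python
from itertools import zip_longest

countries = ["USA", "Canada", "UK"]
cities = ["Washington", "Ottawa"]  # one city short, on purpose

# zip() stops when the shorter list runs out, so "UK" is dropped.
pairs = list(zip(countries, cities))

# zip_longest() keeps going and fills the gaps with fillvalue.
padded = list(zip_longest(countries, cities, fillvalue="unknown"))

print(pairs)   # [('USA', 'Washington'), ('Canada', 'Ottawa')]
print(padded)  # [('USA', 'Washington'), ('Canada', 'Ottawa'), ('UK', 'unknown')]
```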
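9. Checking for Anagrams
Want to check whether one string in a pair is an anagram of the other? Algorithmically, all that is needed is to count the occurrences of each letter in each string and check whether the counts are equal, which the Counter class from the collections module does directly.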
from collections import Counter

def is_anagram(s1, s2):
    # Two strings are anagrams when their per-character counts match
    return Counter(s1) == Counter(s2)

s1 = "listen"
s2 = "silent"
s3 = "runner"
s4 = "neuron"
print("listen is an anagram of silent -> {}".format(is_anagram(s1, s2)))
print("runner is an anagram of neuron -> {}".format(is_anagram(s3, s4)))
listen is an anagram of silent -> True
runner is an anagram of neuron -> False
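The Counter comparison above is sensitive to case and to spaces, so a phrase pair such as "Dormitory" and "Dirty room" would not match as written. One possible variation, sketched here with hypothetical helper names (normalize and is_anagram_loose are not part of the original example), lowercases the input and ignores anything that is not a letter or digit before counting.

```python
from collections import Counter

def normalize(s):
    # Hypothetical helper: keep only letters and digits, lowercased.
    return [ch.lower() for ch in s if ch.isalnum()]

def is_anagram_loose(s1, s2):
    # Compare character counts after normalization.
    return Counter(normalize(s1)) == Counter(normalize(s2))

print(is_anagram_loose("Dormitory", "Dirty room"))  # True
print(is_anagram_loose("runner", "neuron"))         # False
```

For short strings, comparing sorted(s1) == sorted(s2) is an equivalent alternative to comparing Counter objects.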
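10. Checking for Palindromes
What if you want to check whether a given word is a palindrome? Algorithmically, create a reversed copy of the word, then use the == operator to check whether the two strings (the original and the reversed one) are equal.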
def is_palindrome(s):
    # Build the reversed string with a slice, then compare it to the original
    reverse = s[::-1]
    return s == reverse

s1 = "racecar"
s2 = "hippopotamus"
print("racecar is a palindrome -> {}".format(is_palindrome(s1)))
print("hippopotamus is a palindrome -> {}".format(is_palindrome(s2)))
racecar is a palindrome -> True
hippopotamus is a palindrome -> False
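The check above works on single lowercase words. For whole phrases, case, spaces, and punctuation usually need to be ignored first. The following sketch is one way to do that; is_palindrome_loose is a hypothetical name introduced for this example, and the choice to keep only alphanumeric characters is an assumption.

```python
def is_palindrome_loose(s):
    # Keep only letters and digits, lowercased, then compare with the reverse.
    cleaned = [ch.lower() for ch in s if ch.isalnum()]
    return cleaned == cleaned[::-1]

print(is_palindrome_loose("A man, a plan, a canal: Panama"))  # True
print(is_palindrome_loose("hippopotamus"))                    # False
```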