Computing the Relative Molecular Mass of a Chemical Formula in Python (with Complete Code)

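  • Steps
  • 1. Creating the relative atomic mass table
  • 2. Preprocessing
  • 3. Detecting and extracting the formula coefficient
  • 4. Computing the relative molecular mass of a bracket-free formula
  • 5. Complete code for bracket-free formulas
  • 6. Handling brackets
  • (1) Getting the bracket nesting depth
  • (2) Peeling layer by layer
  • (3) Handling "pseudo-elements"
  • Complete program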



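A chemical formula represents the composition of a pure substance using a combination of element symbols and numbers. A formula contains element symbols made of uppercase and lowercase letters as well as digits, brackets, and the "." character, so the program has to handle each of these kinds of symbol separately.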

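Steps

1. Creating the relative atomic mass table

Define a function and create a dictionary holding the (rounded) relative atomic masses of the chemical elements. Radioactive elements are not considered here.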

def mr(chemf):
	Ar = {'H':1,'He':4,'Li':7,'Be':9,'B':11,'C':12,'N':14,'O':16,'F':19,'Ne':20,'Na':23,'Mg':24,'Al':27,'Si':28,'P':31,'S':32,'Cl':35.5,'Ar':40,'K':39,'Ca':40,'Sc':45,'Ti':48,'V':51,'Cr':52,'Mn':55,'Fe':56,'Co':59,'Ni':59,'Cu':64,'Zn':65,'Ga':70,'Ge':73,'As':75,'Se':79,'Br':80,'Kr':84,'Rb':85.5,'Sr':88,'Y':89,'Zr':91,'Nb':93,'Mo':96,'Ru':101,'Rh':103,'Pd':106,'Ag':108,'Cd':112,'In':115,'Sn':119,'Sb':122,'Te':128,'I':127,'Xe':131,'Cs':133,'Ba':137,'La':139,'Ce':140,'Pr':141,'Nd':144,'Sm':150,'Eu':152,'Gd':157,'Tb':159,'Dy':162.5,'Ho':165,'Er':167,'Tm':169,'Yb':173,'Lu':175,'Hf':178.5,'Ta':181,'W':184,'Re':186,'Os':190,'Ir':192,'Pt':195,'Au':197,'Hg':201,'Tl':204,'Pb':207,'Bi':209}

2. Preprocessing

For formulas such as CuSO4.5H2O, it is convenient to split them into the two parts CuSO4 and 5H2O before calculating.

chemflist = chemf.split('.')
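
As a quick check of this step, the snippet below shows what `str.split` produces for a hydrate and for a formula without a dot. The helper name `split_hydrate` is an illustrative assumption and is not part of the original program.

```python
def split_hydrate(formula):
    """Split a formula such as 'CuSO4.5H2O' at each '.' (hypothetical helper)."""
    return formula.split('.')


print(split_hydrate('CuSO4.5H2O'))  # ['CuSO4', '5H2O']
print(split_hydrate('CO2'))         # ['CO2']
```

Note that a decimal coefficient such as 0.5 would also be split at its dot, so this approach assumes integer coefficients.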

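The next step multiplies each element's relative atomic mass by its coefficients and appends the product to a list; the list is summed at the end to give the result. Start by creating an empty list.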

mrlist = []

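3. Detecting and extracting the formula coefficient

After splitting, the formula CuSO4.5H2O becomes the list ['CuSO4', '5H2O']. Each item is processed separately, and the coefficient 5 in front of H2O has to be handled on its own.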

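for chemf0 in chemflist:
    # Detect and extract the leading coefficient of this part
    if chemf0[0].isdigit():
        firstnum = 0
        # Advance until the character is no longer a digit
        while chemf0[firstnum].isdigit():
            firstnum += 1
        k = int(chemf0[0:firstnum])
    else:
        k = 1
    chemf1 = chemf0 + ' ' # Trailing space prevents index-out-of-range errors later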
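
The same coefficient extraction can also be sketched with a regular expression. This is an alternative formulation, not the author's code, and the function name `leading_coefficient` is a hypothetical one chosen for illustration.

```python
import re


def leading_coefficient(part):
    """Return the integer at the start of `part`, or 1 if there is none."""
    match = re.match(r'\d+', part)
    return int(match.group()) if match else 1


print(leading_coefficient('5H2O'))   # 5
print(leading_coefficient('CuSO4'))  # 1
```

Unlike the index-based loop, this version does not read past the end of the string when the input is empty or consists only of digits.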

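4. Computing the relative molecular mass of a bracket-free formula

A bracket-free formula consists only of element symbols and digits, and an element symbol is either a single uppercase letter or an uppercase letter followed by one lowercase letter. It therefore suffices to find each uppercase letter, check whether the next character is lowercase, and then check whether the symbol is followed by a number.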

def cwb(kk,wbchemf): # kk is the coefficient, wbchemf is a bracket-free formula string
    returnlist = []
    # Walk through every character of the formula
    for c in range(len(wbchemf)):
        cc = wbchemf[c]
        # Check whether this is an element (starts with an uppercase letter)
        if cc.isupper():
            # Two-letter element symbol
            if wbchemf[c+1].islower():
                ele1 = wbchemf[c:c+2]
                numfirstplace = 2
            # One-letter element symbol
            else:
                ele1 = wbchemf[c]
                numfirstplace = 1
            # Detect and read the number that follows the element
            numornot = True
            numplace = numfirstplace
            while numornot:
                if wbchemf[c+numplace].isdigit():
                    numplace += 1
                else:
                    numornot = False
            if numplace <= numfirstplace:
                num1 = 1
            else:
                num1 = int(wbchemf[c+numfirstplace:c+numplace])
            # Multiply the coefficient, the count, and the relative atomic mass
            returnlist.append(kk * num1 * Ar[ele1])
    return returnlist

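The function returns a list whose items sum to the relative molecular mass of that part of the formula. Therefore, append the following after the code from step 3, inside the for loop: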

mrlist.append(cwb(k,chemf1))
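
For comparison, the scan for "uppercase letter, optional lowercase letter, optional number" can be written as a single regular expression. The sketch below is an alternative to the character-by-character loop, not the author's implementation; the name `masses_without_brackets` and the shortened table `AR` are assumptions made for the example.

```python
import re

# Small excerpt of the atomic-mass table, enough for the examples below.
AR = {'H': 1, 'C': 12, 'O': 16, 'S': 32, 'Cu': 64}


def masses_without_brackets(coefficient, formula):
    """Return coefficient * count * atomic mass for each element symbol found."""
    result = []
    # Each match is (symbol, digits); digits is '' when no count is written.
    for symbol, count in re.findall(r'([A-Z][a-z]?)(\d*)', formula):
        result.append(coefficient * (int(count) if count else 1) * AR[symbol])
    return result


print(masses_without_brackets(1, 'CuSO4'))  # [64, 32, 64]
print(masses_without_brackets(5, '5H2O'))   # [10, 80]
```

As in the loop version, a leading coefficient left in the string is simply skipped, because the pattern only starts matching at an uppercase letter.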

For bracket-free formulas, the results can now be summed and returned:

mrresult = 0
for mrresult1 in mrlist:
    for mrresult2 in mrresult1:
        mrresult += mrresult2
return mrresult
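
The nested accumulation can also be expressed with the built-in `sum`. The snippet below is a small sketch using hand-written sample data for CuSO4.5H2O; the variable `total` is introduced only for this illustration.

```python
# Per-part lists as step 4 would produce them for CuSO4 and 5H2O.
mrlist = [[64, 32, 64], [10, 80]]

# Flatten the nested lists and add everything up in one expression.
total = sum(mass for part in mrlist for mass in part)
print(total)  # 250
```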

5. Complete code for bracket-free formulas

def mr(chemf):
    Ar = {'H':1,'He':4,'Li':7,'Be':9,'B':11,'C':12,'N':14,'O':16,'F':19,'Ne':20,'Na':23,'Mg':24,'Al':27,'Si':28,'P':31,'S':32,'Cl':35.5,'Ar':40,'K':39,'Ca':40,'Sc':45,'Ti':48,'V':51,'Cr':52,'Mn':55,'Fe':56,'Co':59,'Ni':59,'Cu':64,'Zn':65,'Ga':70,'Ge':73,'As':75,'Se':79,'Br':80,'Kr':84,'Rb':85.5,'Sr':88,'Y':89,'Zr':91,'Nb':93,'Mo':96,'Ru':101,'Rh':103,'Pd':106,'Ag':108,'Cd':112,'In':115,'Sn':119,'Sb':122,'Te':128,'I':127,'Xe':131,'Cs':133,'Ba':137,'La':139,'Ce':140,'Pr':141,'Nd':144,'Sm':150,'Eu':152,'Gd':157,'Tb':159,'Dy':162.5,'Ho':165,'Er':167,'Tm':169,'Yb':173,'Lu':175,'Hf':178.5,'Ta':181,'W':184,'Re':186,'Os':190,'Ir':192,'Pt':195,'Au':197,'Hg':201,'Tl':204,'Pb':207,'Bi':209}
    chemflist = chemf.split('.')
    mrlist = []
    def cwb(kk,wbchemf):
        returnlist = []
        for c in range(len(wbchemf)):
            cc = wbchemf[c]
            # Check whether this is an element (starts with an uppercase letter)
            if cc.isupper():
                # Two-letter element symbol
                if wbchemf[c+1].islower():
                    ele1 = wbchemf[c:c+2]
                    numfirstplace = 2
                # One-letter element symbol
                else:
                    ele1 = wbchemf[c]
                    numfirstplace = 1
                # Detect and read the number that follows the element
                numornot = True
                numplace = numfirstplace
                while numornot:
                    if wbchemf[c+numplace].isdigit():
                        numplace += 1
                    else:
                        numornot = False
                if numplace <= numfirstplace:
                    num1 = 1
                else:
                    num1 = int(wbchemf[c+numfirstplace:c+numplace])
                returnlist.append(kk * num1 * Ar[ele1])
        return returnlist
    for chemf0 in chemflist:
        # Detect and extract the leading coefficient of this part
        if chemf0[0].isdigit():
            firstnum = 0
            while chemf0[firstnum].isdigit():
                firstnum += 1
            k = int(chemf0[0:firstnum])
        else:
            k = 1
        chemf1 = chemf0 + ' '
        mrlist.append(cwb(k,chemf1))
    mrresult = 0
    for mrresult1 in mrlist:
        for mrresult2 in mrresult1:
            mrresult += mrresult2
    return mrresult

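6. Handling brackets

For convenience, all brackets in the input formula must be written as round brackets "(" and ")". Simple formulas have only one level of brackets, such as Fe(OH)3, while more complex ones contain nested brackets, such as (Fe2(OH)2(H2O)8)2. The program therefore first determines the nesting depth of the brackets and then peels them off layer by layer from the inside out: the contents of a bracket are treated as a single unit, their relative mass is computed, and the unit is defined as a "pseudo-element".

(1) Getting the bracket nesting depth

After the code from step 3, add a check for brackets and compute the nesting depth.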

# If the formula contains brackets
if '(' in chemf1:
    # Compute the bracket nesting depth, stored in brac1
    brac0 = brac1 = 0
    for each1 in chemf1:
        if each1 == '(':
            brac0 += 1
        elif each1 == ')':
            brac0 -= 1
        if brac0 > brac1:
            brac1 = brac0

In addition, create a dictionary at the start of the function to store the pseudo-elements.

newAr = {}
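
The depth computation from this step can be tried on its own. The sketch below wraps the same counting idea in a function; the name `max_bracket_depth` is an illustrative assumption and does not appear in the original program.

```python
def max_bracket_depth(formula):
    """Return the deepest level of nested round brackets in `formula`."""
    depth = deepest = 0
    for ch in formula:
        if ch == '(':
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ')':
            depth -= 1
    return deepest


print(max_bracket_depth('CO2'))                # 0
print(max_bracket_depth('Fe(OH)3'))            # 1
print(max_bracket_depth('(Fe2(OH)2(H2O)8)2'))  # 2
```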

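(2) Peeling layer by layer

The bracketed parts at the level currently being processed are cut out of the formula, defined as pseudo-elements, and substituted for the bracketed parts in the original formula; this is repeated from the innermost level outward.
At the start of the function, create a variable used to name the pseudo-elements: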

usednum = 0

Then add the peeling code after the code from step (1), inside the `if '(' in chemf1:` block:

# Peel layer by layer, starting from the deepest level
for grade in range(brac1, 0, -1):
    # Record the nesting level of every character
    gradelist = []
    for n1 in range(len(chemf1)):
        if chemf1[n1] == '(':
            brac0 += 1
        elif chemf1[n1] == ')':
            brac0 -= 1
        gradelist.append(brac0)
    # Collect the first and last index of every run at the current level
    placelist = []
    for g1 in range(len(gradelist)):
        if gradelist[g1] == grade and (g1 == 0 or g1 > 0 and gradelist[g1-1] != grade):
            placelist.append(g1)
        if gradelist[g1] == grade and gradelist[g1+1] != grade:
            placelist.append(g1)
    clist = []
    for p1 in range(0, len(placelist), 2):
        # A run starts at '(' and ends just before ')', so +2 includes the ')'
        clist.append(chemf1[placelist[p1]:placelist[p1+1]+2]) # clist holds the bracketed parts
    # Replace each bracketed part with a pseudo-element
    cnum = len(clist)
    for cnum1 in range(cnum):
        usednum += 1
        usedname = '|' + str(usednum) + '|' # Pseudo-elements are named "|usednum|", usednum being a positive integer
        usedlist = cwb(1,clist[cnum1] + ' ')
        usedmr = 0
        for used1 in usedlist:
            usedmr += used1
        newAr[usedname] = usedmr # Store the pseudo-element and its relative mass in newAr
        chemf1 = chemf1.replace(clist[cnum1], usedname) # Substitute it for the bracketed part in the formula
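
To see how the formula string changes during peeling, the sketch below replaces every innermost bracket group with a "|n|" placeholder. It uses a regular expression rather than the level list built above, so it is a different technique that produces the same kind of rewritten string; `peel_innermost` and its `start` parameter are hypothetical names introduced for this example.

```python
import re


def peel_innermost(formula, start=0):
    """Replace every innermost '(...)' group with a '|n|' placeholder.

    Returns the rewritten formula and a dict mapping placeholder -> group text.
    """
    groups = {}
    counter = start

    def substitute(match):
        nonlocal counter
        counter += 1
        name = '|' + str(counter) + '|'
        groups[name] = match.group(1)  # text between the brackets
        return name

    # '[^()]*' only matches groups that contain no further brackets.
    rewritten = re.sub(r'\(([^()]*)\)', substitute, formula)
    return rewritten, groups


step1, groups1 = peel_innermost('(Fe2(OH)2(H2O)8)2')
print(step1)    # (Fe2|1|2|2|8)2
print(groups1)  # {'|1|': 'OH', '|2|': 'H2O'}

step2, groups2 = peel_innermost(step1, start=len(groups1))
print(step2)    # |3|2
print(groups2)  # {'|3|': 'Fe2|1|2|2|8'}
```

Each pass leaves a string in which the peeled groups look like single elements followed by their counts, which is the form the next step has to parse.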

(3) Handling "pseudo-elements"

Pseudo-elements need their own branch in the cwb function.
At the start of the cwb function, set:

eleornot = 0

and add the following branches to the function:

elif cc == '|' and eleornot % 2 == 0:
    # Opening '|': find the length of the pseudo-element name
    distance = 1
    while wbchemf[c+distance].isdigit():
        distance += 1
    ele1 = wbchemf[c:c+distance+1]
    # Detect and read the count that follows the pseudo-element
    numfirstplace = distance + 1
    numornot = True
    numplace = numfirstplace
    while numornot:
        if wbchemf[c+numplace].isdigit():
            numplace += 1
        else:
            numornot = False
    if numplace <= numfirstplace:
        num1 = 1
    else:
        num1 = int(wbchemf[c+numfirstplace:c+numplace])
    eleornot += 1
    returnlist.append(kk * num1 * newAr[ele1]) # Look the mass up in the pseudo-element dictionary newAr
elif cc == '|' and eleornot % 2 != 0:
    # Closing '|': only update the counter
    eleornot += 1
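
The combined parsing of ordinary elements and pseudo-elements can be illustrated with one regular expression that accepts either kind of token. This is a sketch under stated assumptions, not the author's code: `masses_with_pseudo`, `AR`, and `new_ar` are names chosen for the example, and the two tables are filled in by hand.

```python
import re

AR = {'H': 1, 'O': 16, 'Fe': 56}   # excerpt of the atomic-mass table
new_ar = {'|1|': 17, '|2|': 18}    # pseudo-elements standing for OH and H2O


def masses_with_pseudo(coefficient, formula):
    """Return one mass contribution per element or '|n|' pseudo-element."""
    result = []
    # A token is an element symbol or a '|n|' placeholder, plus an optional count.
    for token, count in re.findall(r'([A-Z][a-z]?|\|\d+\|)(\d*)', formula):
        table = new_ar if token.startswith('|') else AR
        result.append(coefficient * (int(count) if count else 1) * table[token])
    return result


print(masses_with_pseudo(1, 'Fe2|1|2|2|8'))  # [112, 34, 144]
```

Because the placeholder is matched as a whole, no counter is needed here to tell an opening "|" from a closing one.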

Complete program

The full code is as follows:

def mr(chemf):
    Ar = {'H':1,'He':4,'Li':7,'Be':9,'B':11,'C':12,'N':14,'O':16,'F':19,'Ne':20,'Na':23,'Mg':24,'Al':27,'Si':28,'P':31,'S':32,'Cl':35.5,'Ar':40,'K':39,'Ca':40,'Sc':45,'Ti':48,'V':51,'Cr':52,'Mn':55,'Fe':56,'Co':59,'Ni':59,'Cu':64,'Zn':65,'Ga':70,'Ge':73,'As':75,'Se':79,'Br':80,'Kr':84,'Rb':85.5,'Sr':88,'Y':89,'Zr':91,'Nb':93,'Mo':96,'Ru':101,'Rh':103,'Pd':106,'Ag':108,'Cd':112,'In':115,'Sn':119,'Sb':122,'Te':128,'I':127,'Xe':131,'Cs':133,'Ba':137,'La':139,'Ce':140,'Pr':141,'Nd':144,'Sm':150,'Eu':152,'Gd':157,'Tb':159,'Dy':162.5,'Ho':165,'Er':167,'Tm':169,'Yb':173,'Lu':175,'Hf':178.5,'Ta':181,'W':184,'Re':186,'Os':190,'Ir':192,'Pt':195,'Au':197,'Hg':201,'Tl':204,'Pb':207,'Bi':209}
    newAr = {}
    usednum = 0
    chemflist = chemf.split('.')
    mrlist = []
    def cwb(kk,wbchemf):
        returnlist = []
        eleornot = 0
        for c in range(len(wbchemf)):
            cc = wbchemf[c]
            # Check whether this is an element (starts with an uppercase letter)
            if cc.isupper():
                # Two-letter element symbol
                if wbchemf[c+1].islower():
                    ele1 = wbchemf[c:c+2]
                    numfirstplace = 2
                # One-letter element symbol
                else:
                    ele1 = wbchemf[c]
                    numfirstplace = 1
                # Detect and read the number that follows the element
                numornot = True
                numplace = numfirstplace
                while numornot:
                    if wbchemf[c+numplace].isdigit():
                        numplace += 1
                    else:
                        numornot = False
                if numplace <= numfirstplace:
                    num1 = 1
                else:
                    num1 = int(wbchemf[c+numfirstplace:c+numplace])
                returnlist.append(kk * num1 * Ar[ele1])
            elif cc == '|' and eleornot % 2 == 0:
                distance = 1
                while wbchemf[c+distance].isdigit():
                    distance += 1
                ele1 = wbchemf[c:c+distance+1]
                numfirstplace = distance + 1
                numornot = True
                numplace = numfirstplace
                while numornot:
                    if wbchemf[c+numplace].isdigit():
                        numplace += 1
                    else:
                        numornot = False
                if numplace <= numfirstplace:
                    num1 = 1
                else:
                    num1 = int(wbchemf[c+numfirstplace:c+numplace])
                eleornot += 1
                returnlist.append(kk * num1 * newAr[ele1])
            elif cc == '|' and eleornot % 2 != 0:
                eleornot += 1
        return returnlist
    for chemf0 in chemflist:
        # Detect and extract the leading coefficient of this part
        if chemf0[0].isdigit():
            firstnum = 0
            while chemf0[firstnum].isdigit():
                firstnum += 1
            k = int(chemf0[0:firstnum])
        else:
            k = 1
        chemf1 = chemf0 + ' '
        # If the formula contains brackets
        if '(' in chemf1:
            # Compute the bracket nesting depth, stored in brac1
            brac0 = brac1 = 0
            for each1 in chemf1:
                if each1 == '(':
                    brac0 += 1
                elif each1 == ')':
                    brac0 -= 1
                if brac0 > brac1:
                    brac1 = brac0
            # Peel layer by layer, starting from the deepest level
            for grade in range(brac1, 0, -1):
                gradelist = []
                for n1 in range(len(chemf1)):
                    if chemf1[n1] == '(':
                        brac0 += 1
                    elif chemf1[n1] == ')':
                        brac0 -= 1
                    gradelist.append(brac0)
                placelist = []
                for g1 in range(len(gradelist)):
                    if gradelist[g1] == grade and (g1 == 0 or g1 > 0 and gradelist[g1-1] != grade):
                        placelist.append(g1)
                    if gradelist[g1] == grade and gradelist[g1+1] != grade:
                        placelist.append(g1)
                clist = []
                for p1 in range(0, len(placelist), 2):
                    clist.append(chemf1[placelist[p1]:placelist[p1+1]+2])
                cnum = len(clist)
                for cnum1 in range(cnum):
                    usednum += 1
                    usedname = '|' + str(usednum) + '|'
                    usedlist = cwb(1,clist[cnum1] + ' ')
                    usedmr = 0
                    for used1 in usedlist:
                        usedmr += used1
                    newAr[usedname] = usedmr
                    chemf1 = chemf1.replace(clist[cnum1], usedname)
        mrlist.append(cwb(k,chemf1))
    mrresult = 0
    for mrresult1 in mrlist:
        for mrresult2 in mrresult1:
            mrresult += mrresult2
    return mrresult

Sample results:

>>> mr('CO2')
44
>>> mr('(NH4)2Fe(SO4)2.6H2O')
392
>>> mr('CuSO4.5H2O')
250
>>> mr('(Fe2(OH)2(H2O)8)2')
580
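
As an independent cross-check of these four results, the sketch below computes the same masses with a stack instead of pseudo-elements. It is a different technique from the one developed in this article; the function name `mass_with_stack` and the shortened table `AR` are assumptions made for the example.

```python
import re

# Excerpt of the atomic-mass table covering the four sample formulas.
AR = {'H': 1, 'C': 12, 'N': 14, 'O': 16, 'S': 32, 'Fe': 56, 'Cu': 64}


def mass_with_stack(formula):
    """Compute the relative molecular mass using one stack entry per bracket level."""
    total = 0
    for part in formula.split('.'):
        lead = re.match(r'\d*', part)          # optional leading coefficient
        k = int(lead.group()) if lead.group() else 1
        stack = [0]                            # running mass of each open level
        tokens = re.findall(r'([A-Z][a-z]?|[()])(\d*)', part[lead.end():])
        for token, count in tokens:
            n = int(count) if count else 1
            if token == '(':
                stack.append(0)
            elif token == ')':
                inner = stack.pop()
                stack[-1] += inner * n         # count written after ')'
            else:
                stack[-1] += AR[token] * n
        total += k * stack[0]
    return total


for f in ['CO2', '(NH4)2Fe(SO4)2.6H2O', 'CuSO4.5H2O', '(Fe2(OH)2(H2O)8)2']:
    print(f, mass_with_stack(f))
```

With the rounded atomic masses used here, the four values agree with the output shown above.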