[TOC]
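How to change Series and DataFrame objects:
1. Arithmetic operations first align the operands on their row and column indexes, then compute. Positions missing from either operand are filled with NaN, so the result is typically floating point.
2. Operations between 2-D and 1-D objects, and between 1-D and 0-D objects (scalars), are broadcast.
3. Binary operations written with the + - * / operators produce a new object.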
I. Arithmetic operations on data types
1.1 Simple arithmetic operations
Example 1:
import pandas as pd
import numpy as np
a = pd.DataFrame(np.arange(12).reshape(3,4))
a
Out[42]:
0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
b = pd.DataFrame(np.arange(20).reshape(4,5))
b
Out[44]:
0 1 2 3 4
0 0 1 2 3 4
1 5 6 7 8 9
2 10 11 12 13 14
3 15 16 17 18 19
a+b
Out[45]:
0 1 2 3 4
0 0.0 2.0 4.0 6.0 NaN
1 9.0 11.0 13.0 15.0 NaN
2 18.0 20.0 22.0 24.0 NaN
3 NaN NaN NaN NaN NaN
a*b
Out[46]:
0 1 2 3 4
0 0.0 1.0 4.0 9.0 NaN
1 20.0 30.0 42.0 56.0 NaN
2 80.0 99.0 120.0 143.0 NaN
3 NaN NaN NaN NaN NaN
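The float output above comes from the NaN padding, not from arithmetic itself. The following is a minimal sketch of that distinction, reusing the `a` and `b` from Example 1; the names `same` and `mixed` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
b = pd.DataFrame(np.arange(20).reshape(4, 5))

# Identical labels on both sides: nothing needs padding, so integers stay integers.
same = a + a

# Different labels: the result covers the union of the labels, and positions
# missing from either operand become NaN, which forces a float dtype.
mixed = a + b

print(same.dtypes.unique())   # an integer dtype (int32 or int64, platform dependent)
print(mixed.dtypes.unique())  # float64
print(mixed.shape)            # (4, 5)

# The operands are unchanged; + returned a new object.
print(a.shape)                # (3, 4)
```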
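1.2 Operations through methods
Method | Description |
---|---|
.add(d,**kwargs) | Addition; accepts optional parameters such as fill_value and axis |
.sub(d,**kwargs) | Subtraction; accepts optional parameters such as fill_value and axis |
.mul(d,**kwargs) | Multiplication; accepts optional parameters such as fill_value and axis |
.div(d,**kwargs) | Division; accepts optional parameters such as fill_value and axis |
Example 2: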
import pandas as pd
import numpy as np
a = pd.DataFrame(np.arange(12).reshape(3,4))
b = pd.DataFrame(np.arange(20).reshape(4,5))
b.add(a,fill_value=100) # positions missing from one operand are treated as 100
Out[47]:
0 1 2 3 4
0 0.0 2.0 4.0 6.0 104.0
1 9.0 11.0 13.0 15.0 109.0
2 18.0 20.0 22.0 24.0 114.0
3 115.0 116.0 117.0 118.0 119.0
a.mul(b,fill_value=0)
Out[48]:
0 1 2 3 4
0 0.0 1.0 4.0 9.0 0.0
1 20.0 30.0 42.0 56.0 0.0
2 80.0 99.0 120.0 143.0 0.0
3 0.0 0.0 0.0 0.0 0.0
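As a hedged sketch of how `fill_value` behaves: it stands in for a value that is missing from one operand only, and a position that is missing from both operands still comes out as NaN. The Series `s1` and `s2` below are made-up data for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
b = pd.DataFrame(np.arange(20).reshape(4, 5))

# Without fill_value, the method form gives the same result as the operator form.
print(a.add(b).equals(a + b))  # True

# With fill_value, a position missing from only one operand is treated as
# that value before the operation is applied.
filled = b.add(a, fill_value=100)
print(filled.loc[3, 4])        # 19 + 100 = 119.0

# Hypothetical data: "y" is NaN in both Series, so it stays NaN.
s1 = pd.Series([1.0, np.nan], index=["x", "y"])
s2 = pd.Series([np.nan, 5.0], index=["y", "z"])
both = s1.add(s2, fill_value=0)
print(both)
```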
1.3 Operations between different dimensions
import pandas as pd
import numpy as np
b = pd.DataFrame(np.arange(20).reshape(4,5))
b
Out[50]:
0 1 2 3 4
0 0 1 2 3 4
1 5 6 7 8 9
2 10 11 12 13 14
3 15 16 17 18 19
c = pd.Series(np.arange(4))
c
Out[52]:
0 0
1 1
2 2
3 3
dtype: int32
c-10
Out[53]:
0 -10
1 -9
2 -8
3 -7
dtype: int32
b-c
Out[54]:
0 1 2 3 4
0 0.0 0.0 0.0 0.0 NaN
1 5.0 5.0 5.0 5.0 NaN
2 10.0 10.0 10.0 10.0 NaN
3 15.0 15.0 15.0 15.0 NaN
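Note 1: Operations between different dimensions are broadcast. By default a 1-D Series takes part along axis 1: its index is matched against the column labels, so c is subtracted from every row of b.
To operate along axis 0 instead, specify axis: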
b.sub(c,axis=0)
Out[55]:
0 1 2 3 4
0 0 1 2 3 4
1 4 5 6 7 8
2 8 9 10 11 12
3 12 13 14 15 16
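The two axis choices can be compared side by side. This is a minimal sketch using the same `b` and `c`; the names `by_columns`, `by_rows`, and `expected` are illustrative, and the NumPy expression is only an equivalent way of reading the axis=0 case.

```python
import numpy as np
import pandas as pd

b = pd.DataFrame(np.arange(20).reshape(4, 5))
c = pd.Series(np.arange(4))

# Default (axis=1, "columns"): c's index is aligned with b's column labels,
# so c[j] is subtracted from column j. Column 4 has no partner and becomes NaN.
by_columns = b.sub(c)  # same result as b - c

# axis=0 ("index"): c's index is aligned with b's row labels,
# so c[i] is subtracted from every element of row i.
by_rows = b.sub(c, axis=0)

# Equivalent NumPy broadcasting for the axis=0 case: treat c as a column vector.
expected = b.to_numpy() - c.to_numpy()[:, np.newaxis]
print((by_rows.to_numpy() == expected).all())  # True
```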
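II. Comparison operations
1. Comparison operators only compare elements that share the same index labels; no padding is performed.
2. Operations between 2-D and 1-D objects, and between 1-D and 0-D objects, are broadcast.
3. Binary operations written with > < >= <= == != produce a Boolean object.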
a = pd.DataFrame(np.arange(12).reshape(3,4))
a
Out[57]:
0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
b = pd.DataFrame(np.arange(12,0,-1).reshape(3,4))
b
Out[59]:
0 1 2 3
0 12 11 10 9
1 8 7 6 5
2 4 3 2 1
a>b
Out[60]:
0 1 2 3
0 False False False False
1 False False False True
2 True True True True
Note: the operands here have the same dimension, so their shapes must match.
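To illustrate what happens when the shapes do not match, the sketch below compares the 3x4 and 4x5 frames from section 1.1 with an operator. The `raised` flag and `b_trimmed` are names introduced only for this example, and the exact error message may vary between pandas versions.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
b = pd.DataFrame(np.arange(20).reshape(4, 5))

try:
    a > b  # labels differ: 3x4 versus 4x5
    raised = False
except ValueError as err:
    raised = True
    print("comparison rejected:", err)

# Restricting b to a's row and column labels makes the comparison valid.
b_trimmed = b.loc[a.index, a.columns]
print(a < b_trimmed)
```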
a = pd.DataFrame(np.arange(12).reshape(3,4))
a
Out[65]:
0 1 2 3
0 0 1 2 3
1 4 5 6 7
2 8 9 10 11
c = pd.Series(np.arange(4))
c
Out[67]:
0 0
1 1
2 2
3 3
dtype: int32
a>c # different dimensions: broadcast, along axis 1 by default
Out[68]:
0 1 2 3
0 False False False False
1 True True True True
2 True True True True
c>0
Out[69]:
0 False
1 True
2 True
3 True
dtype: bool
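The comparison operators have method counterparts that accept an axis, in the same way `.sub` does for arithmetic. The original text does not mention them; the sketch below uses `DataFrame.gt`, and the per-row thresholds in `row_thresholds` are made-up values for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.arange(12).reshape(3, 4))
c = pd.Series(np.arange(4))

# The operator form matches the Series against the column labels,
# which is also what .gt does by default.
print((a > c).equals(a.gt(c)))  # True

# One threshold per row instead: align the Series with the row labels.
row_thresholds = pd.Series([1, 5, 9])
per_row = a.gt(row_thresholds, axis=0)
print(per_row)
```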
Reference: teaching videos by Song Tian (嵩天), Beijing Institute of Technology