Python3: Summary Notes on NumPy Methods

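Original

Author: 深度不学习 · 2022-10-27 19:48:10 · Blog category: Exchange and Learning

Tags: numpy, python, arrays, data, data types

© Copyright belongs to the author. This is an original work by 51CTO blog author 深度不学习; please contact the author for permission to repost.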

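Contents

Before we start
I. Creating arrays with NumPy
II. Checking data types
III. Specifying data types
IV. Specifying the number of decimal places
V. Reading and reshaping array dimensions
VI. The flatten(), where(), repeat(), tile(), intersect1d(), setdiff1d(), and unique() methods
VII. Arithmetic between arrays
VIII. Axes, directions, and transposition
IX. Slicing and indexing
X. Clipping, concatenation, and the ternary operation
XI. Constructing square matrices
XII. Various computation functions
XIII. Random functions and random seeds; copies and views
XIV. nan and inf
XV. Loading files
XVI. Controlling the output format
Exercise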

Before we start

Convert a list to a numpy.array: temp = np.array(my_list)
Convert a numpy.array to a list: arr = temp.tolist()
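
The round trip above can be sketched as follows. The sample data in my_list is made up for illustration; tolist() also converts the NumPy scalars back to plain Python numbers.

```python
import numpy as np

my_list = [[1, 2, 3], [4, 5, 6]]   # hypothetical sample data
temp = np.array(my_list)           # nested list -> 2-D ndarray
arr = temp.tolist()                # ndarray -> nested Python list

print(temp.shape)        # (2, 3)
print(arr == my_list)    # True
print(type(arr[0][0]))   # <class 'int'>
```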

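I. Creating arrays with NumPy

1. np.arange takes the same start, stop, and step arguments as range; unlike range, it also accepts floating-point values.
2. All three approaches below produce a numpy.ndarray.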

import numpy as np
res = np.array([0,1,2,3,4,5,6,7,8,9])
res_1 = np.array(range(10))
res_2 = np.arange(10)

All three print as: [0 1 2 3 4 5 6 7 8 9]
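
A short sketch of the start, stop, and step arguments of np.arange. The variable names are chosen here for illustration only.

```python
import numpy as np

evens = np.arange(0, 10, 2)     # start=0, stop=10 (exclusive), step=2
halves = np.arange(0, 2, 0.5)   # a float step, which range() does not allow

print(evens.tolist())    # [0, 2, 4, 6, 8]
print(halves.tolist())   # [0.0, 0.5, 1.0, 1.5]
```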

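II. Checking data types

1. type() reports the type of the variable itself.
2. The dtype attribute reports the type of the elements stored in the array.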

res = np.arange(10)
print(type(res))
print(res.dtype)

Output:
<class 'numpy.ndarray'>
int64  (the default integer dtype is platform dependent and may be int32)

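III. Specifying data types

1. Specify the type when creating the array

res_3 = np.arange(10,dtype='float32')

2. Change the data type of an existing array

res_4 = res_3.astype('int32')

astype converts the float32 data to int32 and returns a new array, which is assigned to res_4.
The data type of the original res_3 is unchanged.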
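
A minimal check that astype leaves the source array alone. The truncated array is an extra illustration, not part of the original notes: converting floats to integers drops the fractional part.

```python
import numpy as np

res_3 = np.arange(10, dtype='float32')
res_4 = res_3.astype('int32')      # returns a new array

print(res_3.dtype)   # float32 (unchanged)
print(res_4.dtype)   # int32

# Float-to-int conversion truncates toward zero rather than rounding.
truncated = np.array([1.9, -1.9]).astype('int32')
print(truncated.tolist())   # [1, -1]
```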

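IV. Specifying the number of decimal places

res_5 = 0.124
res_5 = np.round(res_5,2) # keep 2 decimal places; values exactly halfway are rounded to the nearest even digit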
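
np.round does not always round halves up: values exactly halfway go to the nearest even value. A small sketch, with inputs chosen here for illustration:

```python
import numpy as np

halves = np.round(np.array([0.5, 1.5, 2.5, 3.5]))
print(halves.tolist())               # [0.0, 2.0, 2.0, 4.0]

low = float(np.round(0.124, 2))
high = float(np.round(0.126, 2))
print(low, high)                     # 0.12 0.13
```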

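V. Reading and reshaping array dimensions

1. The shape attribute gives the size of each dimension

res_6 = np.array([[1,2,3],[4,5,6]]) # build a 2-D array
res_6.shape[0] # size of the first dimension: 2, the number of rows
res_6.shape[1] # size of the second dimension: 3, the number of columns

2. The reshape method changes the shape of an array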

res_7 = np.array([1,2,3,4,5,6,7,8,9,10,11,12]).reshape((3,4))

This builds a 2-D array with three rows and four columns.
res_7 prints as:
[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]

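res_9 = np.arange(24).reshape((2,3,4))
# Reshapes the 24 values into a 3-D array: two blocks, each holding three rows of four values.
# The number of values passed to reshape is the number of dimensions: one value gives 1-D, two give 2-D, three give 3-D.
# To go back to 1-D, write reshape((24,)).

A -1 in reshape, as in reshape(2,-1) or reshape(-1,3), means that only the other dimensions are specified and NumPy works out the size of the -1 dimension itself.
For example, calling reshape(2,-1) on an array of ten values resolves the -1 to 5 and produces an array of shape (2, 5).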
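
The -1 placeholder can be sketched like this. The last call is an added illustration of what happens when no size fits.

```python
import numpy as np

ten = np.arange(10)
shape_a = ten.reshape(2, -1).shape                 # -1 resolves to 5
shape_b = ten.reshape(-1, 5).shape                 # -1 resolves to 2
shape_c = np.arange(24).reshape(2, 3, -1).shape    # -1 resolves to 4
print(shape_a, shape_b, shape_c)   # (2, 5) (2, 5) (2, 3, 4)

# 10 values cannot be split into 3 equal rows, so this raises ValueError.
try:
    ten.reshape(3, -1)
    failed = False
except ValueError:
    failed = True
print(failed)   # True
```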

VI. The flatten(), where(), repeat(), tile(), intersect1d(), setdiff1d(), and unique() methods

1. flatten() returns the data of an array as a new 1-D array.
It returns a copy, so the original array keeps its shape and contents.

res_9 = np.arange(24).reshape((4,6))
res_10 = res_9.flatten()


res_9:
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]
res_10:
 [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
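
A quick sketch showing that the flattened result is a copy: writing to it does not touch the original array.

```python
import numpy as np

res_9 = np.arange(24).reshape((4, 6))
res_10 = res_9.flatten()
res_10[0] = 100            # modify the flattened copy only

print(res_9.shape)         # (4, 6)
print(res_9[0, 0])         # 0, the original is unchanged
print(res_10[0])           # 100
```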

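The where() method
np.where can be used in two ways:
1. np.where(condition, x, y)
Where the condition holds, the result takes the value from x; elsewhere it takes the value from y.
2. np.where(condition)
With only a condition and no x or y, it returns the indices of the elements that satisfy the condition (that is, the nonzero elements), as a tuple with one index array per dimension.

For example:

arr = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
res = np.where(arr % 2 == 1,-1,arr) # where does not modify the original array
# Replaces the odd numbers with -1, a bit like sub for regular expressions
Output res: [ 0 -1  2 -1  4 -1  6 -1  8 -1]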

a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])
res = np.where(a == b)
print(res)
Output: (array([1, 3, 5, 7]),) # 1, 3, 5, 7 are the indices at which a and b hold the same value
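
For a 2-D input, the one-argument form returns one index array per dimension. A small sketch with a made-up array named grid:

```python
import numpy as np

grid = np.arange(6).reshape((2, 3))    # [[0 1 2], [3 4 5]]
rows, cols = np.where(grid > 3)        # row indices and column indices

print(rows.tolist(), cols.tolist())    # [1, 1] [1, 2]
print(grid[rows, cols].tolist())       # [4, 5]
```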

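The repeat() method
Usage 1: repeating the elements of an array

# Generate random integers in [0, 5) with shape 1 row by 4 columns, then repeat that row 3 times along axis 0 (stacked vertically)

pop = np.random.randint(0, 5, size=(1, 4)).repeat(3, axis=0)
print(pop)  # one possible output:
[[1 4 0 4]
 [1 4 0 4]
 [1 4 0 4]]

a = np.array([1,2,3])
res = np.repeat(a,3)
# Without an axis argument, the array is flattened and each element is repeated three times
Output: [1 1 1 2 2 2 3 3 3]

Usage 2: repeating a scalar

b = np.repeat(1, 10) # the result is a 1-D array

This repeats 1 ten times.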

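The tile() method
tile() builds a new array by repeating the whole array along each axis.

b = np.array([[1, 2], [3, 4]])
np.tile(b, 2)  # repeat b twice along the last axis
array([[1, 2, 1, 2],
       [3, 4, 3, 4]])
np.tile(b, (2, 1)) # repeat twice along axis 0 and once along axis 1
array([[1, 2],
       [3, 4],
       [1, 2],
       [3, 4]])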
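
The difference between repeat() and tile() is easiest to see side by side: repeat duplicates each element in place, while tile duplicates the array as a whole.

```python
import numpy as np

a = np.array([1, 2, 3])
repeated = np.repeat(a, 2)   # each element twice
tiled = np.tile(a, 2)        # the whole array twice

print(repeated.tolist())     # [1, 1, 2, 2, 3, 3]
print(tiled.tolist())        # [1, 2, 3, 1, 2, 3]
```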

The intersect1d() method
numpy.intersect1d() finds the intersection of two arrays and returns the sorted, unique values that appear in both.

a = np.array([1,2,3,2,3,4,3,4,5,6])
b = np.array([7,2,10,2,7,4,9,4,9,8])
np.intersect1d(a,b)
Output:
[2 4]

setdiff1d() takes the set difference: the sorted, unique values of the first array that are not in the second.

a = np.array([1,2,3,4,5])
b = np.array([5,6,7,8,9])
res = np.setdiff1d(a,b)
print(res)

Output:
[1 2 3 4]
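
Unlike the intersection, the set difference depends on argument order. A short sketch using the same two arrays:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([5, 6, 7, 8, 9])

a_minus_b = np.setdiff1d(a, b)   # values of a that are not in b
b_minus_a = np.setdiff1d(b, a)   # values of b that are not in a
common = np.intersect1d(a, b)    # values present in both

print(a_minus_b.tolist())   # [1, 2, 3, 4]
print(b_minus_a.tolist())   # [6, 7, 8, 9]
print(common.tolist())      # [5]
```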

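VII. Arithmetic between arrays

1. When an array is combined with a single number, every element of the array is combined with that number.
2. When two arrays have the same shape, arithmetic is applied element by element.
3. When the shapes differ but are compatible, as with the (4,1) and (4,5) arrays below, NumPy broadcasts the smaller array across the larger one.
4. inf is positive infinity, -inf is negative infinity, and nan means Not a Number.

res_11 = np.arange(4).reshape((4,1))
res_12 = np.arange(20).reshape((4,5))

Value of res_11:
[[0]
 [1]
 [2]
 [3]]

Value of res_12:
[[ 0  1  2  3  4]
 [ 5  6  7  8  9]
 [10 11 12 13 14]
 [15 16 17 18 19]]

Result of res_12 - res_11:
[[ 0  1  2  3  4]
 [ 4  5  6  7  8]
 [ 8  9 10 11 12]
 [12 13 14 15 16]]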
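
The subtraction above works because the (4,1) column is stretched across the five columns of the (4,5) array. A sketch that spells this out, using np.tile to build the explicit equivalent (the names col, mat, and row are chosen here):

```python
import numpy as np

col = np.arange(4).reshape((4, 1))
mat = np.arange(20).reshape((4, 5))

diff = mat - col                        # col is broadcast across the columns
explicit = mat - np.tile(col, (1, 5))   # same result with the stretch written out
print(diff.shape)                       # (4, 5)
print(np.array_equal(diff, explicit))   # True

# A 1-D array of length 5 is broadcast across the rows instead.
row = np.arange(5)
print((mat - row)[0].tolist())          # [0, 0, 0, 0, 0]
```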

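The unique() method

a = np.unique(A)

For a 1-D array or a list, unique removes the duplicate elements and returns a new array of the distinct elements, sorted from smallest to largest.

c,s = np.unique(b,return_index=True)

return_index=True additionally returns, in s, the index in the original array at which each unique value first occurs. c receives the de-duplicated array.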
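
A small sketch of return_index with made-up input: indexing the original array with s reproduces the unique values.

```python
import numpy as np

b = np.array([3, 1, 2, 3, 1])
c, s = np.unique(b, return_index=True)

print(c.tolist())                # [1, 2, 3]  sorted unique values
print(s.tolist())                # [1, 2, 0]  first position of each value in b
print(np.array_equal(b[s], c))   # True
```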

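VIII. Axes, directions, and transposition

1. Axis: also called a direction and identified by a number. A 1-D array has only axis 0, a 2-D array has axes 0 and 1, and a 3-D array has axes 0, 1, and 2.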


2. Three ways to transpose an array:

res_13 = np.arange(24).reshape((4,6))
print(res_13.transpose())
print(res_13.T)
print(res_13.swapaxes(1,0))
# The arguments to swapaxes are the two axes to exchange; here axis 0 and axis 1 are swapped
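
To make the axis numbering concrete, this sketch uses sum() as an example reduction (an addition for illustration) and checks that the three transposes agree.

```python
import numpy as np

res_13 = np.arange(24).reshape((4, 6))

# Reducing along an axis removes that axis from the shape.
per_column = res_13.sum(axis=0)   # collapses the 4 rows
per_row = res_13.sum(axis=1)      # collapses the 6 columns
print(per_column.shape, per_row.shape)   # (6,) (4,)

# All three transposes give the same (6, 4) result.
print(res_13.T.shape)                                      # (6, 4)
print(np.array_equal(res_13.T, res_13.transpose()))        # True
print(np.array_equal(res_13.T, res_13.swapaxes(1, 0)))     # True
```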

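IX. Slicing and indexing

1. Slices count from 0 and are half-open: the start is included and the end is excluded.
2. For a 2-D array, res_13[2] selects one row, and res_13[2:] selects several rows (every row from row 2 onward).
3. res_13[[0,2,3]] selects rows 0, 2, and 3, that is, several non-consecutive rows.
4. The general form is res_13[row, column]. For example, res_13[2,3] selects the element in the third row and fourth column.
5. res_13[1,:] selects the row at index 1 with all of its columns.
6. The two can be combined: res_13[[0,2,3],2:] selects rows 0, 2, and 3 and the columns from index 2 onward.
7. res_13[[0,2],[0,1]] selects the element at row 0, column 0 and the element at row 2, column 1.

A comparison produces a boolean array. For a 4×6 array res_14:

print(res_14<10)

prints output of this form:
[[ True  True  True  True  True  True]
 [ True  True  True  True False False]
 [False False  True  True False False]
 [False False  True  True False False]]

res_14[res_14<10] = 0  # every position that is True is assigned 0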
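
The indexing forms above, run against a concrete 4×6 array. The array masked is a copy introduced here so that the boolean assignment does not alter res_13.

```python
import numpy as np

res_13 = np.arange(24).reshape((4, 6))

print(res_13[2].tolist())                # [12, 13, 14, 15, 16, 17]
print(res_13[2, 3])                      # 15
print(res_13[[0, 2], [0, 1]].tolist())   # [0, 13]
print(res_13[[0, 3], 4:].tolist())       # [[4, 5], [22, 23]]

# Boolean indexing: assign 0 wherever the condition is True.
masked = res_13.copy()
masked[masked < 10] = 0
print(masked[1].tolist())                # [0, 0, 0, 0, 10, 11]
```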

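X. Clipping, concatenation, and the ternary operation

1. Ternary operation

res_15 = np.where(res_14<3,1,99)

Elements of res_14 that are less than 3 become 1, and all others become 99.
The result is a new array assigned to res_15; res_14 itself is unchanged.

2. Clipping

The clip() method

res_16 = np.arange(24).reshape(4,6)
res_16 = res_16.clip(10,15)
# values below 10 become 10, and values above 15 become 15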
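
A brief check of what clip() does to the values. The result is stored under a new name, clipped, to show that clip() itself returns a new array.

```python
import numpy as np

res_16 = np.arange(24).reshape(4, 6)
clipped = res_16.clip(10, 15)

print(clipped.min(), clipped.max())   # 10 15
print(clipped[1].tolist())            # [10, 10, 10, 10, 10, 11]
print(res_16[0, 0])                   # 0, the source array is untouched
```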

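3. Concatenation

np.hstack((res_16,res_15))  # horizontal concatenation: res_16 on the left, res_15 on the right
np.vstack((res_16,res_15)) # vertical concatenation: res_16 on top, res_15 below

np.r_ and np.c_:

np.c_: concatenates along the second axis (column-wise).

np.c_[np.array([1,2,3]), np.array([4,5,6])]
array([[1, 4],
       [2, 5],
       [3, 6]])

np.c_[np.array([[1,2,3]]), 0, 0, np.array([[4,5,6]])]
array([[1, 2, 3, 0, 0, 4, 5, 6]])

Note that np.c_ treats every input as a matrix: a 1-D array is written horizontally, but np.c_ uses it as a column vector.

np.r_: concatenates along the first axis (row-wise).

np.r_[np.array([1,2,3]), 0, 0, np.array([4,5,6])]
array([1, 2, 3, 0, 0, 4, 5, 6])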
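
The resulting shapes summarize the four concatenation helpers. The arrays top and bottom are placeholders made up for this sketch.

```python
import numpy as np

top = np.zeros((2, 3))
bottom = np.ones((2, 3))

h = np.hstack((top, bottom))   # side by side: columns add up
v = np.vstack((top, bottom))   # stacked: rows add up
print(h.shape, v.shape)        # (2, 6) (4, 3)

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
print(np.c_[x, y].shape)       # (3, 2)  each 1-D input becomes a column
print(np.r_[x, y].shape)       # (6,)    1-D inputs are joined end to end
```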

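4. Swapping rows or columns within an array. This is similar to swapping two variables in Python without a third variable: a,b=b,a
For example:

res_16[:,[0,2]] = res_16[:,[2,0]] # swap columns 0 and 2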
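
The swap is safe because indexing with a list on the right-hand side produces a copy before the assignment happens. A sketch on a small 3×4 array (the size is chosen here for readability), with a row swap added for comparison:

```python
import numpy as np

res_16 = np.arange(12).reshape(3, 4)

res_16[:, [0, 2]] = res_16[:, [2, 0]]   # swap columns 0 and 2
print(res_16[0].tolist())               # [2, 1, 0, 3]

res_16[[0, 1], :] = res_16[[1, 0], :]   # swap rows 0 and 1
print(res_16[0].tolist())               # [6, 5, 4, 7]
print(res_16[1].tolist())               # [2, 1, 0, 3]
```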

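XI. Constructing square matrices

1. Arrays filled with 0 or 1

res_17 = np.zeros((3,3))   # the argument is the shape: (rows, columns)
res_18 = np.ones((4,4))

2. Identity matrix

np.eye(3) # the argument is the size of the square matrix: 1 on the diagonal and 0 elsewhere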
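
Two details worth seeing in code: these constructors produce floats unless a dtype is given, and np.eye builds the identity matrix, so multiplying by it changes nothing. The matrix m is made up for this sketch.

```python
import numpy as np

print(np.zeros((3, 3)).dtype)              # float64
print(np.ones((2, 4), dtype=int).shape)    # (2, 4)

identity = np.eye(3)
m = np.arange(9.0).reshape(3, 3)
print(np.array_equal(m @ identity, m))     # True
```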

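XII. Various computation functions

1. Finding the positions of the maximum and minimum values.

res_19 = np.eye(4)
ans = np.argmax(res_19,axis=0) # axis is the direction to search along; axis=0 gives one index per column
ans1 = np.argmin(res_19,axis=0)
print(res_19)
# Output
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
print(ans,ans1) # Output: [0 1 2 3] [1 0 0 0]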
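
The effect of the axis argument is easier to see on a non-symmetric array. The values in scores are invented for this sketch.

```python
import numpy as np

scores = np.array([[3, 9, 1],
                   [7, 2, 8]])

by_column = np.argmax(scores, axis=0)   # row index of the maximum in each column
by_row = np.argmax(scores, axis=1)      # column index of the maximum in each row
overall = np.argmax(scores)             # index into the flattened array

print(by_column.tolist())   # [1, 0, 1]
print(by_row.tolist())      # [1, 2]
print(int(overall))         # 1
```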

Other functions:

[Figure: table of other computation functions]

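XIII. Random functions and random seeds; copies and views

1. Random functions

res_20 = np.random.rand(2,2)  # 2x2 array of random floats in [0, 1)
res_21 = np.random.randn(3,3) # 3x3 array drawn from the standard normal distribution: mean 0 (the peak of the curve lies on the y-axis), standard deviation 1
res_22 = np.random.randint(10,20,(4,2)) # random integers from 10 (inclusive) to 20 (exclusive), shape 4 rows by 2 columns
res_23 = np.random.uniform(10,20,(2,3)) # random floats in [10, 20), shape 2 rows by 3 columns

2. Random seed
seed() fixes the starting state of the random number generator, so the same seed always yields the same sequence of random numbers. For example:

np.random.seed(1)
res_24 = np.random.randint(1,5,(2,2))
np.random.seed(1)
res_25 = np.random.randint(1,5,(2,2))

Both res_24 and res_25 are generated right after seed(1), so the two arrays are identical.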

Now consider:

np.random.seed(1)
res_24 = np.random.randint(1,5,(2,2))
np.random.seed(2)
res_25 = np.random.randint(1,5,(2,2))
np.random.seed(1)
res_26 = np.random.randint(1,5,(2,2))
np.random.seed(2)
res_27 = np.random.randint(1,5,(2,2))

Here res_24 and res_26 are the same array,
and res_25 and res_27 are the same array.
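
The claim can be checked directly. In this sketch the second draw, which is made without re-seeding, simply continues the sequence; only re-seeding restarts it.

```python
import numpy as np

np.random.seed(1)
first = np.random.randint(1, 5, (2, 2))
second = np.random.randint(1, 5, (2, 2))   # continues the same sequence

np.random.seed(1)                          # restart the sequence
again = np.random.randint(1, 5, (2, 2))

print(np.array_equal(first, again))        # True
```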

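3. Copies and views
a = b: no copy is made; a and b are two names for the same array, so a change made through either name is visible through the other.
a = b[:]: a view. The slice creates a new array object a, but a and b share the same data, so changing the elements of one changes the other.
a = b.copy(): an independent copy with its own data (comparable to a deep copy); a and b do not affect each other.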
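
The three cases can be told apart in code. The names alias, view, and independent are chosen here for illustration; np.shares_memory reports whether two arrays use the same underlying data.

```python
import numpy as np

b = np.arange(5)
alias = b                # same object under another name
view = b[:]              # new object, shared data
independent = b.copy()   # new object, own data

b[0] = 99

print(alias is b, view is b)            # True False
print(alias[0], view[0])                # 99 99
print(independent[0])                   # 0
print(np.shares_memory(view, b))        # True
print(np.shares_memory(independent, b)) # False
```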

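XIV. nan and inf

1. When a local file is read as float, missing values are represented as nan.

2. An undefined calculation (such as infinity minus infinity) also produces nan.

3. In NumPy, dividing a nonzero number by 0 produces inf, while 0 divided by 0 produces nan.

4. Both nan and inf are of type float.

5. Two nan values are not equal to each other.

6. Any arithmetic involving nan yields nan.

7. Assigning nan to an element of an array

Code that fails:

res_16 = np.arange(24).reshape((4,6))
res_16[3,3] = np.nan

This raises ValueError: cannot convert float NaN to integer
A float nan cannot be stored in the integer array res_16, so the fix is to convert res_16 to float.

Corrected code: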

res_16 = res_16.astype(float)
res_16[3,3] = np.nan

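8. Counting the nan values in an array

np.count_nonzero(res_25) # counts the nonzero elements
np.count_nonzero(res_25!=res_25) # counts the nan values in res_25
res_25 != res_25 is True only where the value is nan, and False counts as 0 while True counts as 1,
so the number of nonzero entries is the number of nan values.

The isnan() function can also be used:

temp = np.count_nonzero(np.isnan(t1))
This counts the nan values in t1.
np.isnan(t1) returns a boolean array that is True where the value is nan and False elsewhere.

9. nan values are usually replaced with the mean or the median, or the rows containing nan are dropped.

10. nan should not simply be replaced with 0, because the zeros would distort statistics such as the mean.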
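
Several of the points above can be verified in a few lines. The array t1 here is a small made-up example, and np.nansum is mentioned only as one NumPy function that skips nan; it is not part of the original notes.

```python
import numpy as np

print(np.nan == np.nan)             # False: nan is not equal to itself
print(np.isnan(np.inf - np.inf))    # True: infinity minus infinity is nan
print(type(np.nan), type(np.inf))   # both are float

t1 = np.arange(6).astype(float)     # [0. 1. 2. 3. 4. 5.]
t1[[1, 4]] = np.nan

count_a = np.count_nonzero(t1 != t1)        # nan is the only value unequal to itself
count_b = np.count_nonzero(np.isnan(t1))    # same count via isnan
print(count_a, count_b)                     # 2 2

print(np.isnan(t1.sum()))           # True: nan propagates through arithmetic
print(np.nansum(t1))                # 10.0: sum that ignores nan
```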

XV. Loading files

frame: the path of the file to load
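
The notes do not say which loading function is meant. As a sketch, assuming np.loadtxt (whose first parameter is named fname in NumPy, and which accepts either a path or a file-like object), loading might look like the following. The in-memory text stands in for a real file, and np.genfromtxt is an additional function used here to show missing values becoming nan.

```python
import io
import numpy as np

csv_text = "1,2,3\n4,5,6\n"   # stands in for the contents of a file on disk

data = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=float)
print(data.shape)             # (2, 3)

# unpack=True transposes the result, so each column can be taken separately.
cols = np.loadtxt(io.StringIO(csv_text), delimiter=",", unpack=True)
print(cols.shape)             # (3, 2)

# genfromtxt tolerates missing fields and fills them with nan for float data.
gappy = np.genfromtxt(io.StringIO("1,,3\n4,5,6\n"), delimiter=",")
print(np.isnan(gappy[0, 1]))  # True
```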

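XVI. Controlling the output format

The np.set_printoptions() function

np.set_printoptions(precision=4)  # print floats with 4 digits after the decimal point
np.array([1.123456789])
[1.1235]

np.set_printoptions(threshold=5) # arrays with more than 5 elements are summarized, with ... in place of the middle
np.arange(10)
[0 1 2 ... 7 8 9]

precision: the number of digits printed after the decimal point; the default is 8
threshold: the total number of elements above which the output is summarized with ... instead of being printed in full
Setting threshold=sys.maxsize (after import sys) prints arrays in full, with no ellipsis hiding the middle values; older NumPy releases also accepted threshold=np.nan for this, which current releases reject
suppress: with suppress=True, small floating-point numbers are printed in fixed-point rather than scientific notation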
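
np.set_printoptions changes the settings globally. A sketch using the np.printoptions context manager instead, which accepts the same options but restores the previous settings on exit:

```python
import sys
import numpy as np

with np.printoptions(precision=4):
    short = str(np.array([1.123456789]))
print(short)      # [1.1235]

with np.printoptions(threshold=5):
    summary = str(np.arange(10))
print(summary)    # [0 1 2 ... 7 8 9]

# With threshold=sys.maxsize even a long array is printed without an ellipsis.
with np.printoptions(threshold=sys.maxsize):
    full = str(np.arange(2000))
print("..." in full)   # False
```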

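Exercise

Replace the nan values in an array with the mean

import numpy as np

'''
Exercise: replace the nan values in an array with the column mean
'''
t1 = np.arange(12).reshape((3,4)).astype("float")
t1[1,2:] = np.nan


for i in range(t1.shape[1]): # iterate over each column
    temp = t1[:,i]  # the current column; temp is a view into t1
    nan_num = np.count_nonzero(temp!=temp) # count the nan values
    if nan_num != 0:
        # take the values that are not nan and compute their mean
        temp_not_nan = temp[temp==temp] # only non-nan values are equal to themselves
        avg = temp_not_nan.mean() # the mean of the non-nan values
        temp[np.isnan(temp)] = avg # assign the mean to the nan positions

print(t1) # the nan values in row 1 are now 6.0 and 7.0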
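
The same result can be reached without an explicit loop. This is an alternative sketch, not the author's solution: it uses np.nanmean, which computes the mean while ignoring nan, and np.where to pick the column mean wherever a value is missing.

```python
import numpy as np

t1 = np.arange(12).reshape((3, 4)).astype("float")
t1[1, 2:] = np.nan

col_means = np.nanmean(t1, axis=0)               # per-column mean, ignoring nan
filled = np.where(np.isnan(t1), col_means, t1)   # means are broadcast across rows

print(col_means.tolist())   # [4.0, 5.0, 6.0, 7.0]
print(filled[1].tolist())   # [4.0, 5.0, 6.0, 7.0]
```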