Introduction to Python Data Analysis (Part 4): Pandas Index Operations

Repost

mb6063e3ee33cbf 2021-04-04 23:27:40

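The Index object

The index of a Series and the index of a DataFrame are both Index objects.

Example code:

# ser_obj and df_obj2 are assumed to have been created earlier in this series
print(type(ser_obj.index))
print(type(df_obj2.index))
print(df_obj2.index)

Output:

<class 'pandas.indexes.range.RangeIndex'>
<class 'pandas.indexes.numeric.Int64Index'>
Int64Index([0, 1, 2, 3], dtype='int64')

This output comes from an older pandas release; module paths and index class names differ in newer versions.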

Index objects are immutable, which keeps the data safe.

Example code:

# Index objects are immutable
df_obj2.index[0] = 2

Output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-7f40a356d7d1> in <module>()
      1 # Index objects are immutable
----> 2 df_obj2.index[0] = 2

/Users/Power/anaconda/lib/python3.6/site-packages/pandas/indexes/base.py in __setitem__(self, key, value)
   1402
   1403     def __setitem__(self, key, value):
-> 1404         raise TypeError("Index does not support mutable operations")
   1405
   1406     def __getitem__(self, key):

TypeError: Index does not support mutable operations
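Because single elements of an Index cannot be assigned, changing labels means replacing the whole index or building a renamed copy. The following is a minimal sketch of both options; the frame `df` and the labels `r1`, `r2`, `r3` are made up for illustration and do not come from the original article.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df = pd.DataFrame({"x": [10, 20, 30]})

error_message = None
try:
    df.index[0] = 2  # item assignment on an Index is rejected
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Option 1: assign a complete new index to the frame
df.index = ["r1", "r2", "r3"]

# Option 2: build a new frame with selected labels renamed
renamed = df.rename(index={"r1": "first"})
print(renamed)
```

The first option rebinds the `index` attribute to a new Index object, so the immutability of the old one is never violated.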

Common Index types

Index: generic index
Int64Index: integer index
MultiIndex: hierarchical index
DatetimeIndex: timestamp index
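The types listed above can be seen by constructing a few indexes and printing their class names. This is a small sketch with invented sample values. It leaves out `Int64Index` on purpose, since that class was removed in pandas 2.0 and integer labels are now held by a plain `Index` with an integer dtype.

```python
import pandas as pd

plain = pd.Index(["a", "b", "c"])               # generic Index
default = pd.Series([1, 2, 3]).index            # default index of a Series
multi = pd.MultiIndex.from_tuples(
    [("x", 1), ("x", 2), ("y", 1)]              # two-level hierarchical index
)
dates = pd.date_range("2021-04-01", periods=3, freq="D")  # timestamps

for idx in (plain, default, multi, dates):
    print(type(idx).__name__)
```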

Series indexing

The index argument specifies the row labels.

Example code:

import pandas as pd

ser_obj = pd.Series(range(5), index=['a', 'b', 'c', 'd', 'e'])
print(ser_obj.head())

Output:

a    0
b    1
c    2
d    3
e    4
dtype: int64

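Row indexing

ser_obj['label'], ser_obj[pos]

Example code:

# Row indexing
print(ser_obj['b'])  # by label
# by position; recent pandas versions deprecate this fallback in favor of ser_obj.iloc[2]
print(ser_obj[2])

Output: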

1
2
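The same two lookups can be written with the explicit accessors, which avoids any ambiguity between labels and positions. This is a sketch that rebuilds the Series from the example above under the name `ser`.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

by_label = ser.loc["b"]      # label-based lookup
by_position = ser.iloc[2]    # position-based lookup (third element)

print(by_label)
print(by_position)
```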

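Slice indexing

ser_obj[2:4], ser_obj['label1':'label3']

Note that slicing by label includes the end label, whereas slicing by position excludes the end position.

Example code:

# Slice indexing
print(ser_obj[1:3])
print(ser_obj['b':'d'])

Output: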

b    1
c    2
dtype: int64
b    1
c    2
d    3
dtype: int64
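The difference between the two slice forms is easiest to see by comparing the labels each one returns. This is a short sketch using `.iloc` and `.loc`, again on a Series named `ser` rebuilt for the purpose.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

pos_slice = ser.iloc[1:3]        # positions 1 and 2; position 3 is excluded
label_slice = ser.loc["b":"d"]   # labels b through d; the end label is included

print(list(pos_slice.index))
print(list(label_slice.index))
```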

Non-contiguous indexing

ser_obj[['label1', 'label2', 'label3']]

Example code:

# Non-contiguous indexing
print(ser_obj[[0, 2, 4]])
print(ser_obj[['a', 'e']])

Output:

a    0
c    2
e    4
dtype: int64
a    0
e    4
dtype: int64
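Lists of positions and lists of labels can also be passed to the explicit accessors. This sketch assumes the same five-element Series as above, rebuilt as `ser`.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

picked_by_pos = ser.iloc[[0, 2, 4]]      # first, third and fifth elements
picked_by_label = ser.loc[["a", "e"]]    # elements labelled a and e

print(picked_by_pos)
print(picked_by_label)
```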

Boolean indexing

Example code:

# Boolean indexing
ser_bool = ser_obj > 2
print(ser_bool)
print(ser_obj[ser_bool])
print(ser_obj[ser_obj > 2])

Output:

a    False
b    False
c    False
d     True
e     True
dtype: bool
d    3
e    4
dtype: int64
d    3
e    4
dtype: int64
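Boolean masks can be combined with `&`, `|` and `~`. Each comparison needs its own parentheses because these operators bind more tightly than `>` and `<`. The thresholds below are arbitrary values chosen for illustration.

```python
import pandas as pd

ser = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

between = ser[(ser > 1) & (ser < 4)]   # both conditions must hold
either = ser[(ser < 1) | (ser > 3)]    # at least one condition holds
negated = ser[~(ser > 2)]              # rows where the condition is False

print(between)
print(either)
print(negated)
```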

DataFrame indexing

The columns argument specifies the column labels.

Example code:

import numpy as np
import pandas as pd

df_obj = pd.DataFrame(np.random.randn(5, 4), columns=['a', 'b', 'c', 'd'])
print(df_obj.head())

Output:

          a         b         c         d
0 -0.241678  0.621589  0.843546 -0.383105
1 -0.526918 -0.485325  1.124420 -0.653144
2 -1.074163  0.939324 -0.309822 -0.209149
3 -0.716816  1.844654 -2.123637 -1.323484
4  0.368212 -0.910324  0.064703  0.486016

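Column indexing

df_obj['label']

Selecting with a single label returns a Series; wrapping the label in a list, df_obj[['label']], returns a one-column DataFrame.

Example code:

# Column indexing
print(df_obj['a'])  # returns a Series

Output:

0   -0.241678
1   -0.526918
2   -1.074163
3   -0.716816
4    0.368212
Name: a, dtype: float64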
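The return type depends on whether a single label or a list of labels is passed. The sketch below checks this on a randomly filled frame; the seed is an arbitrary choice so the run is repeatable, and the values will not match the output shown above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only
df = pd.DataFrame(rng.standard_normal((5, 4)), columns=["a", "b", "c", "d"])

col = df["a"]        # single label: a Series named "a"
frame = df[["a"]]    # list with one label: a DataFrame with one column

print(type(col).__name__, col.shape)
print(type(frame).__name__, frame.shape)
```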

Non-contiguous indexing

df_obj[['label1', 'label2']]

Example code:

# Non-contiguous indexing
print(df_obj[['a', 'c']])

Output:

          a         c
0 -0.241678  0.843546
1 -0.526918  1.124420
2 -1.074163 -0.309822
3 -0.716816 -2.123637
4  0.368212  0.064703
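When a list of labels is passed, the result keeps the columns in the order of the list, so the same syntax can also reorder columns. This is a small sketch with a made-up frame; `df.loc[:, [...]]` is shown as the equivalent explicit form.

```python
import pandas as pd

# Hypothetical data used only for this illustration
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

subset = df[["c", "a"]]            # columns come back in the requested order
via_loc = df.loc[:, ["c", "a"]]    # explicit form: all rows, listed columns

print(subset)
print(subset.equals(via_loc))
```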

