Python split: Iterating and Splitting from the End; Reading a File Backward in Python

Reposted

mob6454cc784c23 2024-03-01 16:25:39

Tags: readline and seek; mode parameters (r, r+, w, w+, a, a+); os module; reading non-text files; context managers. Category: Python, backend development

I. File mode parameters

1. Steps for working with a file: open ---> operate ---> close

2. Mode parameters

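Mode	Description
r (default)	(1) Read only, no writing (2) Raises an error if the file does not exist
r+	(1) Read and write (2) Raises an error if the file does not exist (3) Does not truncate; writing starts at the current file pointer position, which is the beginning of the file right after opening
w	(1) Write only (2) Truncates any existing content (3) No error if the file does not exist; a new file is created and written
w+	(1) Read and write (2) Truncates any existing content (3) No error if the file does not exist; a new file is created and written
a	(1) Write only (2) No error if the file does not exist; a new file is created (3) Existing content is kept and every write is appended at the end
a+	(1) Read and write (2) No error if the file does not exist; a new file is created (3) Existing content is kept and every write is appended at the end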
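
The differences between the modes are easiest to see by running them against a scratch file. The following is a minimal sketch, assuming a temporary directory and a made-up file name demo.txt (neither comes from the original article):

```python
import os
import tempfile

# Hypothetical scratch file inside a temporary directory (an assumption for this demo).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'demo.txt')

# 'r' on a missing file raises FileNotFoundError.
try:
    open(path, 'r')
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

# 'w' creates the file; opening with 'w' again truncates it first.
with open(path, 'w') as f:
    f.write('first\n')
with open(path, 'w') as f:
    f.write('second\n')

# 'a' keeps the existing content and appends at the end.
with open(path, 'a') as f:
    f.write('third\n')

with open(path, 'r') as f:
    content = f.read()

print(missing_raises)   # True
print(repr(content))    # 'second\nthird\n'
```

The line 'first' is gone because the second 'w' open emptied the file, while 'a' left 'second' in place.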

3. Checking whether a file is readable or writable

f.readable()    # True if the file can be read
f.writable()    # True if the file can be written
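
As a quick cross-check of the mode table, the sketch below opens one scratch file in each mode and records what readable() and writable() report. The file name and the results dictionary are assumptions made for this illustration:

```python
import os
import tempfile

# Hypothetical scratch file (an assumption for this demo).
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

results = {}
for mode in ('w', 'w+', 'a', 'a+', 'r', 'r+'):
    # The 'w' pass runs first, so the file exists by the time 'r' and 'r+' open it.
    with open(path, mode) as f:
        results[mode] = (f.readable(), f.writable())

for mode, (can_read, can_write) in results.items():
    print(mode, can_read, can_write)
```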


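4. Using the modes (r+ as an example)

Goal: verify that in r+ mode, writing starts at the current file pointer position

(1) Read first, then write

Check where the written content lands: after the whole file has been read, the pointer is at the end of the file, so the new content is placed after the existing content.

(2) Write first

Check where the content lands after writing: the pointer starts at position 0, so the new content overwrites the beginning of the file.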
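
Both cases can be reproduced with a small script. This is a sketch under stated assumptions: make_file is a hypothetical helper, and the text 'hello world' and 'XYZ' are made-up sample data.

```python
import os
import tempfile

def make_file(text):
    """Create a scratch file (hypothetical helper) holding `text` and return its path."""
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
    with open(path, 'w') as f:
        f.write(text)
    return path

# Case 1: read the whole file first, then write.
path1 = make_file('hello world')
with open(path1, 'r+') as f:
    f.read()              # the pointer is now at the end of the file
    f.write('XYZ')        # so the new text lands after the old content
with open(path1) as f:
    read_then_write = f.read()

# Case 2: write immediately after opening.
path2 = make_file('hello world')
with open(path2, 'r+') as f:
    f.write('XYZ')        # the pointer starts at 0, so this overwrites 'hel'
with open(path2) as f:
    write_first = f.read()

print(read_then_write)    # hello worldXYZ
print(write_first)        # XYZlo world
```

Note that r+ never truncates: in the second case the rest of the old content ('lo world') is still there after the overwritten part.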


II. Ways to read a file

1. Reading plain-text files

(1) read()

f = open('/tmp/passwd','r+')
print(f.read())
f.close()


(2) readline()

f = open('/tmp/passwd','r+')
print(f.readline())
f.close()


f = open('/tmp/passwd','r+')
print(f.read(4))               # read the first 4 characters
print(f.readline(),end='')     # read the rest of the current line
f.close()


(3) readlines()

readlines(): reads the file and returns a list in which each element is one line of the file

f = open('/tmp/passwd','r+')
print(f.readlines())
f.close()


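As the output shows, readlines() returns a list, and each item keeps its trailing newline character. How can these newlines be removed? Both methods below use strip(), which also removes any other leading and trailing whitespace.

Method 1:
print([line.strip() for line in f.readlines()])
Method 2 (run it on a freshly opened file, because Method 1 has already read to the end):
print(list(map(lambda x:x.strip(),f.readlines())))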
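
When only the newline should go, or when the file is large, a few variations may be useful. The sketch below uses io.StringIO as a stand-in for an open text file (an assumption so the example needs no file on disk), with made-up sample lines:

```python
import io

# io.StringIO stands in for an open text file so the example needs no file on disk.
f = io.StringIO('  root:x:0:0\nbin:x:1:1\n')

# Iterating over the file yields one line at a time, without building the full list first.
stripped = [line.strip() for line in f]

f.seek(0)
# rstrip('\n') removes only the trailing newline and keeps leading spaces.
newline_only = [line.rstrip('\n') for line in f]

f.seek(0)
# read().splitlines() splits on line boundaries and drops them.
split_lines = f.read().splitlines()

print(stripped)       # ['root:x:0:0', 'bin:x:1:1']
print(newline_only)   # ['  root:x:0:0', 'bin:x:1:1']
print(split_lines)    # ['  root:x:0:0', 'bin:x:1:1']
```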


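2. The seek method

seek moves the file pointer.

The first argument is the offset: a value > 0 moves toward the end of the file, a value < 0 moves toward the beginning.

The second argument (whence) is the reference point for the offset: 0 (the beginning of the file, the default), 1 (the current position), 2 (the end of the file). In text mode, whence values 1 and 2 only accept an offset of 0; nonzero relative offsets require binary mode.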
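
The three reference points can be tried out in binary mode, where nonzero relative offsets are allowed. This is a minimal sketch that uses io.BytesIO as a stand-in for a file opened with 'rb'; the byte string is made-up sample data:

```python
import io

# io.BytesIO behaves like a file opened in binary mode ('rb'); the data is made up.
f = io.BytesIO(b'line1\nline2\nline3\n')

f.seek(6, 0)               # 6 bytes from the beginning of the file
from_start = f.read(5)     # reads b'line2' and leaves the pointer at 11

f.seek(-5, 1)              # 5 bytes back from the current position
from_current = f.read(5)   # reads b'line2' again

f.seek(-6, 2)              # 6 bytes before the end of the file
from_end = f.read()        # reads the last line, b'line3\n'

end_position = f.tell()    # 18, the total length in bytes

print(from_start, from_current, from_end, end_position)
```

Seeking relative to the end like this is also the usual starting point for reading the tail of a file without loading all of it.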

(1) After using seek to move the pointer back to the beginning, the content just written can be read

f = open('/tmp/passwd','r+')
print(f.tell())
print(f.write('redhat'))
print(f.tell())
f.seek(0,0)
print(f.tell())
print(f.read())
f.close()


(2) Without moving the pointer, a read that follows a write starts from the position where the write ended

f = open('/tmp/passwd','r+')
print(f.tell())
print(f.write('redhat'))
print(f.tell())
print(f.read())
f.close()


3. Reading non-text (binary) files

Modes for text files: r, r+, w, w+, a, a+

Modes for binary files: rb, rb+, wb, wb+, ab, ab+

f = open('redhat.jpg',mode='rb')    # read a binary file, such as an image
content = f.read()
f.close()

f1 = open('hello.jpg',mode='wb')    # write the content of redhat.jpg to hello.jpg (a copy)
f1.write(content)
f1.close()

The run prints nothing, which means the file was read and written successfully. Both redhat.jpg and hello.jpg now appear in the project panel on the left.
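
Reading the whole file with a single read() is fine for a small image but holds the entire content in memory. For larger files, a chunked copy is one option. The following is a sketch, not the article's method: copy_file, the chunk size, and the file names src.bin and dst.bin are all assumptions made for illustration.

```python
import os
import tempfile

def copy_file(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in binary mode, chunk_size bytes at a time (hypothetical helper)."""
    with open(src, 'rb') as fin, open(dst, 'wb') as fout:
        while True:
            chunk = fin.read(chunk_size)
            if not chunk:          # read() returns b'' at end of file
                break
            fout.write(chunk)

# Made-up binary data standing in for an image.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'src.bin')
dst = os.path.join(workdir, 'dst.bin')
data = bytes(range(256)) * 1000
with open(src, 'wb') as f:
    f.write(data)

copy_file(src, dst)
with open(dst, 'rb') as f:
    copied = f.read()

print(copied == data)   # True
```

The standard library's shutil.copyfile covers the same need without a hand-written loop.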


4. Context managers

with open('/tmp/passwd') as f1,open('/tmp/passwd1','w+') as f2:
    f2.write(f1.read())
    f2.seek(0,0)
    print(f2.read())

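To open several files in one with statement, separate them with a comma (,). The files are closed automatically when the block ends. This syntax is supported in Python 3 and in Python 2.7, but not in earlier Python 2 releases.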
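
The main benefit of the with statement is that the file is closed even when the block fails. The sketch below checks this with the closed attribute; the scratch file name and the simulated ValueError are assumptions for the demo:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical scratch file

with open(path, 'w') as f:
    f.write('hello')
closed_after_with = f.closed       # True: leaving the with block closed the file

try:
    with open(path) as g:
        raise ValueError('simulated failure')
except ValueError:
    pass
closed_after_error = g.closed      # True: closed even though the block raised

print(closed_after_with, closed_after_error)   # True True
```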


Exercise: create the file data.txt with 100000 lines, each holding an integer between 1 and 100

import random
# Method 1: open and close the file explicitly
f = open('data.txt','w+')
for i in range(100000):
    f.write(str(random.randint(1,100)) + '\n')
f.seek(0,0)
print(f.read())
f.close()
# Method 2: the with statement closes the file automatically
with open('data2.txt','w+') as f:
    for i in range(100000):
        f.write(str(random.randint(1,100)) + '\n')

III. Using the os module

1. Get the operating system type

posix: Linux and other Unix-like systems (including macOS); nt: Windows

import os
print(os.name)


2. Show detailed operating system information (os.uname() is available on Unix-like systems only)

import os
info = os.uname()
print(info)
print(info.sysname)
print(info.nodename)


3. View environment variables

import os
print(os.environ)
print(os.environ.get('PATH'))
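
os.environ behaves like a dictionary, so get() returns None for a variable that is not set and also accepts a fallback value. A minimal sketch, assuming a made-up variable name that is not defined on the system:

```python
import os

# 'DEMO_UNSET_VARIABLE_12345' is a made-up name; remove it first in case it happens to exist.
name = 'DEMO_UNSET_VARIABLE_12345'
os.environ.pop(name, None)

missing = os.environ.get(name)                     # None, no exception
fallback = os.environ.get(name, 'default-value')   # the supplied default

os.environ[name] = '42'                            # values are always strings
value = os.environ[name]

print(missing, fallback, value)
```

Indexing with os.environ[name] raises KeyError for an unset variable, which is why get() is the safer choice for optional settings.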


4. Check whether a path is absolute

import os
print(os.path.isabs('/tmp/hello/westos'))   # the path being checked does not have to exist
print(os.path.isabs('hello'))


5. Build an absolute path

import os
print(os.path.abspath('hello.png'))
print(os.path.join('/home/kiosk','hello.png'))
print(os.path.join(os.path.abspath('.'),'hello.png'))


6. Get the directory name or the file name

import os
filename = '/home/kiosk/PycharmProjects/20190322/day05/hello.png'
# get the file name from the path
print(os.path.basename(filename))
# get the directory name from the path
print(os.path.dirname(filename))


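7. Create / delete directories

import os
os.mkdir('test')             # create a single directory
os.makedirs('test/file')     # create nested directories recursively
os.rmdir('test/file')        # rmdir only removes an empty directory, so remove the inner one first
os.rmdir('test')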
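
Because os.rmdir only removes empty directories, removing a directory that still has content raises OSError. The sketch below shows that failure and one way around it with shutil.rmtree; it works inside a temporary directory, which is an assumption made to keep the demo from touching real files:

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()                  # scratch area; an assumption for this demo
test_dir = os.path.join(base, 'test')
nested = os.path.join(test_dir, 'file')

os.makedirs(nested)                        # creates 'test' and 'test/file'
os.makedirs(nested, exist_ok=True)         # no error if the directory already exists

try:
    os.rmdir(test_dir)                     # fails: 'test' still contains 'file'
    rmdir_failed = False
except OSError:
    rmdir_failed = True

shutil.rmtree(test_dir)                    # removes the directory and everything in it
still_exists = os.path.exists(test_dir)
os.rmdir(base)                             # base is empty now, so rmdir succeeds

print(rmdir_failed, still_exists)          # True False
```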

8. Create / delete files

import os
os.mknod('a.txt')    # create an empty file (Unix only)
os.remove('a.txt')    # delete the file

9. Rename a file

import os
os.rename('data.txt','data1.txt')

10. Check whether a file or directory exists

import os
print(os.path.exists('imgs'))


11. Split the file name from its extension

import os
print(os.path.splitext('hello.jpg'))


12. Split the directory name from the file name

import os
print(os.path.split('/tmp/hello/hello.jpg'))
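
The path helpers above are pure string operations and combine naturally. The sketch below uses posixpath, the module that os.path resolves to on Linux and macOS, so that the results are the same on every platform; the sample path is made up and does not need to exist:

```python
import posixpath  # the implementation behind os.path on Linux and macOS

filename = '/tmp/hello/hello.tar.gz'   # made-up path

directory, name = posixpath.split(filename)   # same as (dirname, basename)
stem, ext = posixpath.splitext(name)          # splits at the last dot only

print(directory)   # /tmp/hello
print(name)        # hello.tar.gz
print(stem, ext)   # hello.tar .gz

# join() puts the pieces back together with the right separator.
rebuilt = posixpath.join(directory, stem + ext)

# If a later component is absolute, join() discards everything before it.
absolute_wins = posixpath.join('/home/kiosk', '/etc/passwd')
print(rebuilt, absolute_wins)
```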


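IV. Walking a directory tree

import os
from os.path import join

for root,dirs,files in os.walk('/var/log'):
    print(root)         # the directory currently being visited
    print(dirs)         # names of the subdirectories directly under root
    print(files)        # names of the files directly under root
    for name in files:
        print(join(root,name))      # full path of each file (absolute because the starting path is absolute)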
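
A common use of os.walk is collecting files that match some condition. The following is a sketch under stated assumptions: find_files is a hypothetical helper, and the small directory tree is created in a temporary directory only so that the example is reproducible.

```python
import os
import tempfile

def find_files(top, suffix):
    """Return sorted paths under `top` whose names end with `suffix` (hypothetical helper)."""
    matches = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(suffix):
                matches.append(os.path.join(root, name))
    return sorted(matches)

# Build a small made-up tree: x.log, a/y.log, a/b/z.txt
top = tempfile.mkdtemp()
os.makedirs(os.path.join(top, 'a', 'b'))
for rel in ('x.log', os.path.join('a', 'y.log'), os.path.join('a', 'b', 'z.txt')):
    with open(os.path.join(top, rel), 'w') as f:
        f.write('demo\n')

logs = find_files(top, '.log')
relative_logs = [os.path.relpath(p, top) for p in logs]
print(relative_logs)
```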


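Exercise:

1. Generate a file ips.txt with 1200 lines, each line a random IP address from the 172.25.254.0/24 network;

2. Read ips.txt and find the 10 IP addresses that occur most often in the file;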

import random

def create_ip(filename):
    # all host addresses in 172.25.254.0/24 (.1 to .254)
    ip = ['172.25.254.' + str(i) for i in range(1,255)]
    # 'w' starts from an empty file, so repeated runs do not accumulate lines
    with open(filename,'w') as f:
        for i in range(1200):
            f.write(random.choice(ip) + '\n')

def sorted_by_ip(filename,count=10):
    ips_dict = dict()
    with open(filename) as f:
        for ip in f:
            ip = ip.strip()
            if ip in ips_dict:
                ips_dict[ip] += 1
            else:
                ips_dict[ip] = 1
    # sort by occurrence count, highest first, and keep the top `count` entries
    sorted_ip = sorted(ips_dict.items(),key=lambda x:x[1],reverse=True)[:count]
    return sorted_ip

create_ip('ips.txt')
print(sorted_by_ip('ips.txt'))
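
The counting and sorting can also be delegated to collections.Counter from the standard library. This is an alternative sketch, not the approach used in the exercise solution; top_ips is a hypothetical helper, and the io.StringIO object with a few made-up addresses stands in for the opened ips.txt.

```python
from collections import Counter
import io

def top_ips(lines, count=10):
    """Count stripped, non-empty lines and return the `count` most common (hypothetical helper)."""
    counter = Counter(line.strip() for line in lines if line.strip())
    return counter.most_common(count)

# Made-up sample standing in for an open file; any iterable of lines works.
sample = io.StringIO(
    '172.25.254.1\n172.25.254.2\n172.25.254.1\n'
    '172.25.254.3\n172.25.254.1\n172.25.254.2\n'
)
result = top_ips(sample, count=2)
print(result)   # [('172.25.254.1', 3), ('172.25.254.2', 2)]
```

With a real file the call would look like `with open('ips.txt') as f: print(top_ips(f))`.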
