本次的内容为python的应用,关于日期、文件、词云统计应用,均多应用对象思想,及字典。
习题一
要求:1.初始化start_day,end_day两个日期
from datetime import datetime
start_day=datetime(2019,4,1)
end_day=datetime(2019,4,30)
其它时间数据生成要用datetime或date模块的方法编程实现
2.不能使用calendar模块生成
以下是代码内容:
1 from datetime import *
2
3 start_day=datetime(2019,4,1)
4 end_day=datetime(2019,4,30)
5 day=end_day-start_day#记总天数
6
7 print(start_day.strftime('\t\t\t%Y/%m'))#输出年份与月份
8 print("周日\t周一\t周二\t周三\t周四\t周五\t周六")
9
10 first_day=start_day.weekday()#第一日的是周几
11 count=0#计是否换行数
12 space=0#计空格数
13
14 #第一天前面的空格数
15 while space <= first_day:
16 space += 1 #空格数控制格式
17 print("\t", end="")
18 count += 1 #计换行数控制格式
19
20 that_day = 1#计第一天为一号
21 while that_day <= day.days:#显示每天
22 print(that_day,end="\t")
23 that_day += 1
24 count += 1
25 if (count % 7 == 0):#每计七个数进行换行
26 print("\n")
以下是运行结果:
本题更多的是格式上的规划,通过循环,控制输入与格式达到输出结果,呈现出想要的格式。
代码中有可以通过更改起始时间及结束时间来,控制该输出。
题目不难,更多的是逻辑上要清晰,考虑好循环的内容。
习题二
要求:1.参考“三国演义”词频统计程序,实现对红楼梦出场人物的频次统计。
2.将红楼梦出场人物的频次统计结果用词云显示。
以下是代码内容:
1 import jieba
2 excludes = {"什么","一个","我们","那里","你们","如今","说道","知道","起来","这里","出来","他们","众人","自己",
3 "奶奶","一面","只见","怎么","姑娘","两个","没有","不是","不知","这个","听见","这样","进来","这是",
4 "告诉","就是","咱们","东西","回来","只是","大家","老爷","只得","丫头","这些","不敢","出去","所以",
5 "不过","的话","不好","姐姐"}
6 txt = open("红楼梦.txt", "r", encoding='utf8').read() #打开文件并定义
7
8 words = jieba.lcut(txt)
9
10 counts = {} #定义字典
11
12 for word in words:
13 if len(word) == 1:
14 continue
15 elif (word == "宝玉" or word == "宝玉道"or word == "宝二爷"
16 or word == "混世魔王"or word == "怡红公子"or word == "绛洞花主"
17 or word == "无事忙"or word == "遮天大王"or word == "富贵闲人"or word =="贾宝玉"):
18 rword = "贾宝玉"
19 elif word == "黛玉" or word == "黛玉道"or word =="林黛玉":
20 rword = "林黛玉"
21 elif word == "宝钗" or word == "宝钗道"or word =="薛宝钗":
22 rword = "薛宝钗"
23 elif word == "姨太太" or word == "薛姨妈":
24 rword = "薛姨妈"
25 elif word == "老祖宗" or word == "老太太"or word == "史太君"or word =="贾母":
26 rword = "贾母"
27 elif word == "太太" or word == "二太太":
28 rword = "王夫人"
29 elif word == "熙凤" or word == "熙凤道"or word == "凤姐"or word == "凤姐儿"or word == "王熙凤":
30 rword = "王熙凤"
31 elif word == "平儿" or word == "袭人"or word == "小平":
32 rword = "平儿"
33 elif word == "探春" or word == "探春道":
34 rword = "贾探春"
35 elif word == "晴雯" or word == "勇晴雯"or word == "芙蓉仙子"or word == "病西施":
36 rword = "晴雯"
37 else:
38 rword = word
39 counts[rword] = counts.get(rword, 0) + 1 #词汇加入字典
40
41 #从字典中删除无用词
42 for word in excludes:
43 del (counts[word])
44
45 #字典转换为列表
46 items = list(counts.items())
47
48 #lambda是一个隐函数,是固定写法
49 items.sort(key=lambda x: x[1], reverse=True)
50
51 for i in range(10): #出现的词频统计
52 word, count = items[i] #将键和值分别赋予列表word和count
53 print("{0:<10}{1:>7}".format(word, count)) #0:<10左对齐,宽度10,”>5"右对齐
以下是运行结果:
本题更多的是在对代码原理的理解后对,词云统计的使用。
根据源代码,进行修改,通过增加限制条件,
限制词云统计中的词汇,来搜索出你想要的对应信息的数据。
在每一次运行结束,通过在exclude中的增加词语,来规避,不想要的数据。
通过增加IF、elif条件判断来使数据更合理更符合预期。
代码内容不难,更多的在于理解代码内容,收集信息,与不厌其烦地去修改筛选条件。
本次习题结束。
所以说很多时候不是你不会,只是缺少更多的思考,更多的细心罢了..