How to get data out of the tags in HTML after capturing packets with Python: packet capture with PyCharm

Reposted

西门吹雪 2024-06-15 05:47:55

Article tags: git, github, HTTPS. Article category: Python, backend development

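I. Task

1. Basic task: use the Fiddler packet-capture tool together with code to monitor, in real time, the detailed price information of a product on Pupu (朴朴).

The result of the experiment is shown below: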

[Screenshot: output of the program]

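II. Approach

1. First, learn how to use Fiddler and configure the software by following the steps in the PPT slides.

2. Study Python web-scraping code; CSDN and Bilibili have many tutorials on scraping data.

3. Download PyCharm, an IDE for writing Python. The Professional edition is paid, so the Community edition was used.

4. Run the code in PyCharm to get the result for the basic task.

5. Install Git, set up a remote connection to GitHub, and upload the .py file to the repository.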

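III. Implementation

1. By default, Fiddler does not capture HTTPS sessions, so this has to be enabled manually. Start Fiddler, click [Tools] - [Options], switch to the HTTPS tab in the window that opens, tick all the checkboxes for capturing HTTPS connections, and click OK.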

[Screenshot: the HTTPS tab in Fiddler's Options window]
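Once HTTPS capture is enabled, it can help to send the script's own requests through Fiddler so they show up next to the app's traffic. The helper below is a minimal sketch under one assumption: Fiddler is listening on its default address, 127.0.0.1:8888. The function name `fiddler_proxies` is made up for illustration and is not part of the original script.

```python
def fiddler_proxies(host="127.0.0.1", port=8888):
    """Build a proxies mapping in the shape the requests library accepts.

    Assumption: Fiddler listens on host:port (127.0.0.1:8888 by default).
    """
    proxy_url = "http://{}:{}".format(host, port)
    # Both plain HTTP and HTTPS traffic are tunnelled through the same proxy.
    return {"http": proxy_url, "https": proxy_url}


proxies = fiddler_proxies()
print(proxies)
```

It would be passed along as `requests.get(url, headers=head, proxies=fiddler_proxies(), verify=False)`. Because Fiddler re-signs HTTPS traffic with its own certificate, the certificate check fails unless that certificate is trusted, which is likely why the script uses `verify=False`.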

2. Find the required information in Fiddler.

[Screenshot: the captured request and its response in Fiddler]
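Before writing the request code, the response body seen in Fiddler can be checked offline. The sketch below parses a hypothetical body that has the same field names the script in step 3 reads (`data.name`, `data.spec`, `data.price`, `data.market_price`, `data.share_content`). The values are invented for illustration and are not captured data, and `extract_product` is a hypothetical helper name.

```python
import json

# Hypothetical response body: the field names match those used in step 3,
# the values are made up for illustration.
SAMPLE_BODY = """
{"data": {"name": "Sample product", "spec": "500g", "price": 1290,
          "market_price": 1590, "share_content": "Sample description"}}
"""


def extract_product(body):
    """Parse the JSON text and pick out the fields of interest."""
    data = json.loads(body)["data"]
    return {
        "name": data["name"],
        "spec": data["spec"],
        # The script divides both price fields by 100 before printing them.
        "price": int(data["price"]) / 100,
        "market_price": int(data["market_price"]) / 100,
        "share_content": data["share_content"],
    }


product = extract_product(SAMPLE_BODY)
print(product["name"], product["spec"], product["price"], product["market_price"])
```

Keeping the parsing in a function that takes plain text makes it easy to paste in a body copied from Fiddler and confirm the field names before any network code is involved.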

3. Python code.
  
import json
import time

import requests


def url_requests():
    # Product detail URL found in the Fiddler capture
    url = 'https://j1.pupuapi.com/client/product/storeproduct/detail/4dcdeca2-f5a3-4be8-9e2f-e099889a23a0/0b777094-c58b-4fd0-843c-8c9d264d1e88'
    # Request headers
    head = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/53.0.2785.143 Safari/537.36 MicroMessenger/7.0.9.501 NetType/WIFI '
                      'MiniProgramEnv/Windows WindowsWechat'
    }

    # verify=False skips the TLS certificate check
    response = requests.get(url, headers=head, verify=False)

    data = json.loads(response.text)["data"]  # parse the JSON string into a dict
    name = data["name"]  # product name
    spec = data["spec"]  # product specification
    price = str(int(data["price"]) / 100)  # original price (API value divided by 100)
    market_price = str(int(data["market_price"]) / 100)  # second value in "original price / discounted price"
    share_content = data["share_content"]  # detailed product description
    # Print the product information
    print("---------------Product: " + name + "----------------")
    print("Specification: " + spec)
    print("Original price: " + price)
    print("Original price / discounted price: " + price + "/" + market_price)
    print("Details: " + share_content)
    print('\n')
    print("---------------" + name + "------------------")
    # Print the price 15 times, 5 seconds apart.
    # Note: this reuses the price fetched above; the request is not repeated.
    for _ in range(15):
        now = time.strftime('%Y-%m-%d %H:%M:%S')
        print(now + ", price: " + price)
        time.sleep(5)


# Entry point
if __name__ == '__main__':
    url_requests()  # call the function
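The loop above prints the price that was fetched once before the loop started. For monitoring that follows price changes, the request would need to be repeated on every round. The following is a minimal sketch of that idea, not the original author's code: `monitor_price` and its `fetch_price` parameter are hypothetical names, and `fetch_price` stands for any callable that returns the current price (in the script above it would wrap the `requests.get` call and the JSON parsing). A stand-in fetcher is used here so the example runs without network access.

```python
import time


def monitor_price(fetch_price, rounds=15, interval=5, sleep=time.sleep):
    """Call fetch_price() once per round and print a timestamped line.

    fetch_price is a hypothetical callable returning the current price.
    Returns the list of prices seen, oldest first.
    """
    history = []
    last = None
    for _ in range(rounds):
        price = fetch_price()  # re-query on every round
        stamp = time.strftime('%Y-%m-%d %H:%M:%S')
        note = ""
        if last is not None and price != last:
            note = " (changed from " + str(last) + ")"
        print(stamp + ", price: " + str(price) + note)
        history.append(price)
        last = price
        sleep(interval)
    return history


# Stand-in fetcher with made-up prices, used instead of a real request.
fake_prices = iter([12.9, 12.9, 11.5])
demo = monitor_price(lambda: next(fake_prices), rounds=3, interval=0)
```

Passing the fetcher in as an argument keeps the timing loop separate from the network code, so the loop can be tried with fake data first and pointed at the real request afterwards.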

4. Push the code to the remote repository with git.

[Screenshot: pushing to the remote repository with git]