Java洪君: Python Crawler 2

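Original

楚君android 2024-06-23 01:08:26 ©Copyright

Tags: HTML, python. Category: Python, Backend Development

©Copyright belongs to the author: this is an original work by 51CTO blog author 楚君android. Please contact the author for permission to repost; otherwise legal action will be pursued.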

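import requests
from bs4 import BeautifulSoup

# Douban movie comments page (HTML)
url = 'https://movie.douban.com/subject/26363254/comments'

# Send a browser-like User-Agent; Douban may reject the default one used by requests
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
# Stop early on an HTTP error instead of parsing an error page
response.raise_for_status()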

# Scraping Douban movie reviews with Python

# Parse the HTML content with BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# Find the list of review elements in the parsed HTML
comment_list = soup.find_all('div', class_='comment')

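# Parse the details of each review
for comment in comment_list:
    # Commenter's name, looked up as an <a> tag with an empty class attribute
    name_tag = comment.find('a', class_='')
    commenter = name_tag.text.strip() if name_tag else 'unknown'
    # Review text; skip blocks that have none
    content_tag = comment.find('p', class_='comment-content')
    if content_tag is None:
        continue
    content = content_tag.text.strip()
    # Rating
    rating_tag = comment.find('span', class_='rating')
    # Some reviews may have no rating
    rating = rating_tag.get('title', 'none') if rating_tag else 'none'
    # Print the commenter, rating, and review text
    print(f'Commenter: {commenter}, Rating: {rating}')
    print(f'Review: {content}\n')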
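
The parsing loop above can only be checked by hitting the live site. One way to make it easier to test is to move it into a function that takes an HTML string, sketched below. The function name `parse_comments`, the `SAMPLE_HTML` snippet, and the `comment-info` wrapper span are assumptions made for illustration; the sample markup is not copied from Douban and the real page layout may differ.

```python
from bs4 import BeautifulSoup


def parse_comments(html):
    """Return a list of dicts with commenter, rating and content."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for comment in soup.find_all('div', class_='comment'):
        content_tag = comment.find('p', class_='comment-content')
        if content_tag is None:
            # Skip blocks that carry no review text
            continue
        # Assumed layout: the name link sits inside a "comment-info" span
        info = comment.find('span', class_='comment-info')
        name_tag = info.find('a') if info else None
        rating_tag = comment.find('span', class_='rating')
        results.append({
            'commenter': name_tag.get_text(strip=True) if name_tag else 'unknown',
            'rating': rating_tag.get('title', 'none') if rating_tag else 'none',
            'content': content_tag.get_text(strip=True),
        })
    return results


# Hypothetical sample markup for offline testing, not real Douban output
SAMPLE_HTML = """
<div class="comment">
  <span class="comment-info">
    <a href="#" class="">alice</a>
    <span class="allstar40 rating" title="Recommended"></span>
  </span>
  <p class="comment-content"><span class="short"> Great film. </span></p>
</div>
<div class="comment">
  <span class="comment-info"><a href="#" class="">bob</a></span>
  <p class="comment-content"><span class="short">No rating given.</span></p>
</div>
"""

for item in parse_comments(SAMPLE_HTML):
    print(item)
```

With this split, the network request stays in one place and `parse_comments(response.text)` can be reused on the fetched page, while the selectors can be adjusted against saved HTML without sending repeated requests.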