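I wrote this code after learning basic Python syntax, following a video I found online. It scrapes keywords (search suggestions) from the 360 search engine.
Code: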
#!/usr/bin/python
# encoding: utf-8

import urllib
import urllib2
import re
import time
from random import choice

# Candidate proxies; only used if the proxy lines below are uncommented.
ipList = ['1.9.189.65:3128', '27.24.158.130:80', '27.24.158.154:80']

listKeyWords = ["集团", "科技"]
for item in listKeyWords:
    ip = choice(ipList)
    gjc = urllib.quote(item)  # percent-encode the keyword for the query string
    url = "http://sug.so.360.cn/suggest?callback=suggest_so&encodein=utf-8&encodeout=utf-8&word=" + gjc
    # "GET" is part of the request line, not a header, so it is not listed here.
    headers = {
        "Host": "sug.so.360.cn",
        "Referer": "http://www.so.com/",
        "User-Agent": "Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/37.0.2062.120 Chrome/37.0.2062.120 Safari/537.36"
    }

    #proxy_support = urllib2.ProxyHandler({"http":"http://"+ip})

    #opener = urllib2.build_opener(proxy_support)
    #urllib2.install_opener(opener)
    req = urllib2.Request(url)

    for key in headers:
        req.add_header(key, headers[key])

    html = urllib2.urlopen(req).read()
    print html

    # Extract every double-quoted string from the response.
    ss = re.findall(r'"(.*?)"', html)
    for suggestion in ss:
        print suggestion

    time.sleep(3)  # pause between requests
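It mainly uses a few modules from Python's standard library (urllib, urllib2, re, time and random); see the official documentation for how to use them.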
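The script above targets Python 2. Here is a rough sketch of how the same steps might look in Python 3, where `urllib2` was merged into `urllib.request` and `urllib.quote` moved to `urllib.parse.quote`. The function names (`build_url`, `extract_quoted`, `make_opener`, `fetch_suggestions`) are my own, made up for illustration. The sketch also assumes the endpoint still answers as it did when the script was written, which I have not verified.

```python
import re
import urllib.parse
import urllib.request

# Same endpoint and query parameters as the original script.
SUGGEST_URL = ("http://sug.so.360.cn/suggest?callback=suggest_so"
               "&encodein=utf-8&encodeout=utf-8&word=")

# Headers taken from the original script (User-Agent shortened here).
HEADERS = {
    "Referer": "http://www.so.com/",
    "User-Agent": "Mozilla/5.0 (X11; Linux i686)",
}


def build_url(keyword):
    """Percent-encode the keyword as UTF-8 and append it to the suggest URL."""
    return SUGGEST_URL + urllib.parse.quote(keyword)


def extract_quoted(text):
    """Return every double-quoted substring, using the same regex as the original."""
    return re.findall(r'"(.*?)"', text)


def make_opener(proxy=None):
    """Build an opener; send HTTP traffic through proxy ("host:port") if given."""
    if proxy:
        handler = urllib.request.ProxyHandler({"http": "http://" + proxy})
        return urllib.request.build_opener(handler)
    return urllib.request.build_opener()


def fetch_suggestions(keyword, proxy=None):
    """Request suggestions for one keyword and return the quoted strings found."""
    req = urllib.request.Request(build_url(keyword), headers=HEADERS)
    with make_opener(proxy).open(req, timeout=10) as resp:
        body = resp.read().decode("utf-8", errors="replace")
    return extract_quoted(body)
```

Calling `fetch_suggestions("集团")` would make the network request. `build_url` and `extract_quoted` can be tried offline, for example `extract_quoted('suggest_so({"a":["b","c"]})')`, where the input string is a made-up sample and not a real response.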