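I recently ran into a tricky problem that took me a long time to figure out. I wanted to scrape the data on http://www.hkexnews.hk/sdw/search/mutualmarket_c.aspx, but the page requires you to pick a date first, which means I have to send a POST request to it. While doing that I found two form parameters whose values look random and change between page loads: __VIEWSTATE and __EVENTVALIDATION.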

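The solution is as follows:

    import re
    import urllib.request

    def get_hiddenvalue(url):
        # Fetch the page once with a plain GET so the server issues fresh hidden-field values.
        request = urllib.request.Request(url)
        response = urllib.request.urlopen(request)
        html = response.read().decode('utf-8')  # python3
        # (.*?) lazily captures everything up to the closing quote of the value attribute.
        VIEWSTATE = re.findall(r'<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />', html, re.I)
        EVENTVALIDATION = re.findall(r'<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />', html, re.I)
        return VIEWSTATE[0], EVENTVALIDATION[0]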

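In other words: write a function that first requests the page once, read the two values out of the returned HTML, and only then send the POST request with them included. Solved!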
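The regular expressions in get_hiddenvalue depend on the exact attribute order and spacing of the input tags, so they stop matching if the markup shifts even slightly. As a rough alternative sketch (not what I originally used), the standard-library html.parser can collect every hidden input on the page. The names HiddenInputParser and extract_hidden_fields, and the sample HTML, are made up for illustration.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect the name/value pair of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<input ... />) are also routed here by HTMLParser.
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def extract_hidden_fields(html):
    """Return a dict of all hidden form fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Hypothetical sample markup, shaped like the hidden fields on the real page.
sample = (
    '<form>'
    '<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />'
    '<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />'
    '<input type="text" name="q" value="ignored" />'
    '</form>'
)
fields = extract_hidden_fields(sample)
print(fields["__VIEWSTATE"], fields["__EVENTVALIDATION"])
```

A side benefit is that any other hidden field on the page, such as __VIEWSTATEGENERATOR, is picked up the same way instead of being hard-coded.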

The full source code is as follows:

    import requests
    import urllib.request
    import re

    NIAN = '2017'  # year
    YUE = '12'     # month
    RI = '30'      # day
    url = 'http://www.hkexnews.hk/sdw/search/mutualmarket_c.aspx'

    def get_hiddenvalue(url):
        # Fetch the page once with a plain GET so the server issues fresh hidden-field values.
        request = urllib.request.Request(url)
        response = urllib.request.urlopen(request)
        html = response.read().decode('utf-8')  # python3
        # (.*?) lazily captures everything up to the closing quote of the value attribute.
        VIEWSTATE = re.findall(r'<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />', html, re.I)
        EVENTVALIDATION = re.findall(r'<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />', html, re.I)
        return VIEWSTATE[0], EVENTVALIDATION[0]

    VIEWSTATE, EVENTVALIDATION = get_hiddenvalue(url)

    data = {
        '__EVENTVALIDATION': EVENTVALIDATION,
        '__VIEWSTATE': VIEWSTATE,
        '__VIEWSTATEGENERATOR': 'EC4ACD6F',
        'btnSearch.x': '23',
        'btnSearch.y': '12',
        'ddShareholdingDay': RI,      # day goes in the Day field
        'ddShareholdingMonth': YUE,
        'ddShareholdingYear': NIAN,   # year goes in the Year field
        'today': '20180509'
    }

    # Send the form, including the freshly fetched hidden values.
    html_post = requests.post(url, data=data)
    print(html_post.text)
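With three separate string constants it is easy to put the year, month and day into the wrong form fields. Here is a small sketch that builds the payload from a datetime.date instead. build_form_data is a hypothetical helper, and it assumes the drop-downs accept zero-padded two-digit day and month values and that today is formatted as YYYYMMDD like the '20180509' above; I have not checked either against the page.

```python
from datetime import date


def build_form_data(day, hidden_fields, today=None):
    """Return the POST payload for one shareholding date.

    hidden_fields is a dict holding __VIEWSTATE, __EVENTVALIDATION, etc.
    """
    today = today or date.today()
    data = dict(hidden_fields)  # copy so the caller's dict is not modified
    data.update({
        "btnSearch.x": "23",
        "btnSearch.y": "12",
        # Assumption: day and month are sent zero-padded to two digits.
        "ddShareholdingDay": f"{day.day:02d}",
        "ddShareholdingMonth": f"{day.month:02d}",
        "ddShareholdingYear": str(day.year),
        # Assumption: YYYYMMDD, matching the hard-coded '20180509'.
        "today": today.strftime("%Y%m%d"),
    })
    return data


payload = build_form_data(
    date(2017, 12, 30),
    {"__VIEWSTATE": "abc", "__EVENTVALIDATION": "xyz"},  # placeholder values
    today=date(2018, 5, 9),
)
print(payload["ddShareholdingYear"], payload["ddShareholdingMonth"], payload["ddShareholdingDay"])
```

The resulting dict can be passed straight to requests.post(url, data=payload) in place of the hand-written data dict.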