一、python2

  1. Sending a GET request
# -*- coding: utf-8 -*-
import urllib2
url = "http://localhost:80/webtest/test?name=xuejianbest"
req = urllib2.Request(url)
response = urllib2.urlopen(req)
page_html = response.read()
print page_html
  2. If the data argument passed to urlopen is not empty, a POST request is sent:
# -*- coding: utf-8 -*-
import urllib2
import urllib

url = "http://localhost:80/webtest/test?name=xuejianbest"
req = urllib2.Request(url)
values = {}
values["age"] = "23"
values["sex"] = "男"
data = urllib.urlencode(values)
print data
response = urllib2.urlopen(req, data)
page_html = response.read()
print page_html
  3. You can add a browser User-Agent to the request headers to simulate access from a browser:
# -*- coding: utf-8 -*-
import urllib2
user_agent= r'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.2669.400 QQBrowser/9.6.10990.400'
headers = { r'User-Agent' : user_agent }
url = "http://localhost:80/webtest/test"
req = urllib2.Request(url, headers = headers)
response = urllib2.urlopen(req)
page_html = response.read()
print page_html
  4. To make several requests share one session, add the cookie to the headers of the later requests (a cookielib-based alternative is sketched after this example):

# -*- coding: utf-8 -*-
import urllib2
user_agent= r'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.104 Safari/537.36 Core/1.53.2669.400 QQBrowser/9.6.10990.400'
headers = { r'User-Agent' : user_agent }
url = "http://localhost:80/webtest/test"
req = urllib2.Request(url, headers = headers)
response = urllib2.urlopen(req)
cookie = response.headers.get('Set-Cookie') # get the cookie from the first response
print cookie
page_html = response.read()
print page_html

req.add_header('Cookie', cookie) # add the cookie to later request headers so they belong to the same session
response = urllib2.urlopen(req)
page_html = response.read()
print page_html
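
Rather than copying the Set-Cookie header by hand, the Python 2 standard library can manage cookies for you. The sketch below is a minimal alternative, assuming the same local test URL as above: it builds an opener with cookielib and HTTPCookieProcessor, so every request made through that opener automatically resends the cookies it has received.

# -*- coding: utf-8 -*-
import cookielib
import urllib2

cookie_jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))

url = "http://localhost:80/webtest/test"
print opener.open(url).read()  # first request: the server sets the session cookie
print opener.open(url).read()  # second request: the stored cookie is sent back automatically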

二、python3

1. GET request

import requests
url = 'https://xxxxx?name=aaa'
cookies = {'Cookie':'xxxxx'}  # dict of cookie name -> value pairs sent with the request
r = requests.get(url, cookies = cookies)

print(r.text)
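
Query parameters can also be passed as a dict via the params argument instead of being hard-coded into the URL; a small sketch, using the same placeholder URL:

import requests
url = 'https://xxxxx'
params = {'name': 'aaa'}        # requests appends ?name=aaa to the URL
r = requests.get(url, params=params)
print(r.url)                    # the URL that was actually requested
print(r.text)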


2. POST request

import requests
url = 'https://xxxxx?name=aaa'
postdata = {'age': '23'}  # example form fields sent in the request body
p = requests.post(url, data=postdata)
print(p.text)

3. session

For crawling, it is generally recommended to use a Session object and to set the request headers; a Session keeps track of cookies across requests.

import requests

s = requests.Session()
headers = {'Host':'www.xxx.com'}
postdata = {'name':'aaa'}
url = 'http://xxxxx'
s.headers.update(headers)  # headers set on the session are sent with every request it makes
r = s.post(url, data=postdata)
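
To see that the Session really records the cookies, continue the snippet above with a follow-up request through the same s; the stored cookies are sent back automatically:

print(s.cookies.get_dict())   # cookies the session has accumulated from previous responses
r2 = s.get(url)               # these cookies are sent automatically on later requests
print(r2.text)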