HTTP Requests: A Brief Look at Advanced Usage of Requests (Part 2)

    print(res.status_code)
except ReadTimeout:
    print("Caught a timeout exception")
2.1.3 allow_redirects
Controls whether redirects are followed. Requests follows them by default for every method except HEAD.
>>> import requests
>>> r = requests.get('http://github.com')
>>> r.url
'https://github.com/'
>>> r.status_code
200
>>> r.history
[<Response [301]>]
# With GET, OPTIONS, POST, PUT, PATCH or DELETE, redirects can be disabled through the allow_redirects parameter
>>> r = requests.get('http://github.com', allow_redirects=False)
>>> r.status_code
301
>>> r.history
[]
# Enable redirects for HEAD (they are off by default for this method)
>>> r = requests.head('http://github.com', allow_redirects=True)
>>> r.url
'https://github.com/'
>>> r.history
[<Response [301]>]
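The same behaviour can be reproduced without touching the network. The sketch below is only an illustration: it assumes a throwaway local server (the `RedirectHandler` class and the `/old` and `/new` paths are made up for this example) and compares the default behaviour, `allow_redirects=False`, and a plain HEAD request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old answers 301 -> /new, anything else answers 200."""

    def _respond(self, send_body):
        if self.path == "/old":
            self.send_response(301)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"arrived"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            if send_body:
                self.wfile.write(body)

    def do_GET(self):
        self._respond(send_body=True)

    def do_HEAD(self):
        self._respond(send_body=False)

    def log_message(self, *args):
        pass  # silence the default request logging


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    followed = session.get(base + "/old")                             # default: follow
    not_followed = session.get(base + "/old", allow_redirects=False)  # stop at the 301
    head_default = session.head(base + "/old")                        # HEAD: not followed

print(followed.status_code, followed.url, followed.history)
print(not_followed.status_code, not_followed.headers["Location"], not_followed.history)
print(head_default.status_code)

server.shutdown()
server.server_close()
```

When the redirect is not followed, the target is still available in the `Location` header of the 3xx response, which is handy if the caller wants to decide for itself whether to continue.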
import requests
import re
# First request
r1 = requests.get('https://github.com/login')
r1_cookie = r1.cookies.get_dict()  # initial cookies (not yet authenticated)
authenticity_token = re.findall(r'name="authenticity_token".*?value="(.*?)"', r1.text)[0]  # CSRF token scraped from the page
# Second request: POST the account and password to the login endpoint, carrying the initial cookies and the token
data = {
    'commit': 'Sign in',
    'utf8': '✓',
    'authenticity_token': authenticity_token,
    'login': 'xxxxxx@qq.com',
    'password': 'password'
}
# Test 1: allow_redirects=False is not given, so a Location header in the response is followed to the new page;
# r2 is the response of that new page
r2 = requests.post('https://github.com/session',
                   data=data,
                   cookies=r1_cookie)
print(r2.status_code)  # 200
print(r2.url)  # the page reached after the redirect
print(r2.history)  # the responses received before the redirect
print(r2.history[0].text)  # the body of the pre-redirect response
# Test 2: allow_redirects=False is given, so the Location header is not followed even if present;
# r2 is still the response of the original page
r2 = requests.post('https://github.com/session',
                   data=data,
                   cookies=r1_cookie,
                   allow_redirects=False)
print(r2.status_code)  # 302
print(r2.url)  # the pre-redirect page, https://github.com/session
print(r2.history)  # []
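The difference between the two tests can be tried out without a real account. The following is a minimal sketch that assumes a hypothetical local login endpoint (`LoginHandler`, the `logged_in` cookie and the `/home` path are invented here and say nothing about how GitHub actually behaves). It also uses `requests.Session`, which keeps cookies between requests so they do not have to be passed by hand.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class LoginHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a login endpoint (not GitHub's real behaviour)."""

    def do_POST(self):
        # Consume the form body, then set a cookie and redirect to /home.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self.send_response(302)
        self.send_header("Set-Cookie", "logged_in=yes; Path=/")
        self.send_header("Location", "/home")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def do_GET(self):
        # Echo back whatever Cookie header the client sent.
        body = self.headers.get("Cookie", "").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default request logging


server = HTTPServer(("127.0.0.1", 0), LoginHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"
form = {"login": "user", "password": "secret"}  # placeholder credentials

with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    # Stop at the 302 to inspect what the login endpoint itself returned.
    raw = session.post(base + "/session", data=form, allow_redirects=False)
    # Follow the redirect; the session sends the stored cookie to /home.
    final = session.post(base + "/session", data=form)

print(raw.status_code, raw.headers["Location"], raw.cookies.get_dict())
print(final.status_code, final.url, final.text)

server.shutdown()
server.server_close()
```

Stopping at the 302 is useful when the interesting data (a `Set-Cookie` header, the `Location` target) lives on the intermediate response rather than on the final page.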
2.1.4 proxies
As with headers, the proxies argument is a dict.
import requests
import re

def get_html(url):
    # Map each URL scheme to the proxy that should carry it
    proxy = {
        'http': 'http://120.25.253.234:812',
        'https': 'http://163.125.222.244:8123'
    }
    heads = {}
    heads['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.221 Safari/537.36 SE 2.X MetaSr 1.0'
    req = requests.get(url, headers=heads, proxies=proxy)
    html = req.text
    return html

def get_ipport(html):
    # Non-greedy groups so that each match stops at its own closing </td>
    regex = r'<td data-title="IP">(.+?)</td>'
    iplist = re.findall(regex, html)
    regex2 = r'<td data-title="PORT">(.+?)</td>'
    portlist = re.findall(regex2, html)
    # "类型" ("type") is the column title used in the scraped page itself
    regex3 = r'<td data-title="类型">(.+?)</td>'
    typelist = re.findall(regex3, html)
    sumray = []
    # Pair the three columns row by row as "type,ip:port"
    for i, p, t in zip(iplist, portlist, typelist):
        a = t + ',' + i + ':' + p
        sumray.append(a)
    print('Proxies')
    print(sumray)

if __name__ == '__main__':
    url = 'http://www.baidu.com'
    get_ipport(get_html(url))
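Some sites add anti-abuse protection: large volumes of frequent requests may trigger a CAPTCHA, a redirect to a login verification page, or an IP ban. When that happens, routing requests through a proxy is one way to keep access working.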
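One common way to apply this is to keep a small pool of proxies and fall back to the next one when a proxy is unreachable. The sketch below is an assumption-laden illustration, not a production rotation scheme: `get_via_pool`, `FakeProxy` and `unused_port` are hypothetical helpers, and the "proxy" is a local server that simply answers every request itself so the example runs offline.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class FakeProxy(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an HTTP proxy: it answers every GET itself."""

    def do_GET(self):
        body = b"via proxy"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default request logging


def unused_port():
    """Return a local port with nothing listening on it (simulates a dead proxy)."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def get_via_pool(url, proxy_pool, timeout=3):
    """Try each proxy in order; return (response, proxy) for the first that works."""
    with requests.Session() as session:
        session.trust_env = False  # use only the proxies passed in explicitly
        for proxy in proxy_pool:
            try:
                resp = session.get(url, proxies={"http": proxy, "https": proxy},
                                   timeout=timeout)
                return resp, proxy
            except (requests.exceptions.ConnectionError, requests.exceptions.Timeout):
                continue  # this proxy is unreachable, move on to the next one
    return None, None


server = HTTPServer(("127.0.0.1", 0), FakeProxy)
threading.Thread(target=server.serve_forever, daemon=True).start()
live = f"http://127.0.0.1:{server.server_port}"
dead = f"http://127.0.0.1:{unused_port()}"

# The target host never has to resolve: for plain HTTP the client talks to the proxy only.
resp, used = get_via_pool("http://example.invalid/", [dead, live])
print(used == live, resp.text)

server.shutdown()
server.server_close()
```

`ProxyError` is a subclass of `requests.exceptions.ConnectionError`, so catching the parent class covers both refused connections and proxy-level failures.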
Besides basic HTTP proxies, requests also supports proxies that use the SOCKS protocol.
# Install SOCKS support (run in a shell)
pip3 install "requests[socks]"
# Send the request through the proxy
import requests
proxies = {
    'http': 'socks5://user:password@host:port',
    'https': 'socks5://user:password@host:port'
}
res = requests.get('http://www.baidu.com', proxies=proxies)
print(res.status_code)  # 200
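2.1.5 hooks
Hooks are callbacks. The requests library supports only one hook, response, which runs a user-defined function when a response comes back. It can be used to print information, run checks on the response, or attach extra information to it.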
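A minimal sketch of the response hook follows. It assumes a throwaway local server so that it runs offline; `OkHandler`, `record_call`, `tag_response` and the `body_length` attribute are names invented for this illustration. Hooks are passed as a dict whose `response` key holds one callable or a list of callables.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class OkHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that always answers 200 with a short body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default request logging


seen = []


def record_call(resp, *args, **kwargs):
    """Hook 1: note the method and status code of every response."""
    seen.append((resp.request.method, resp.status_code))


def tag_response(resp, *args, **kwargs):
    """Hook 2: attach extra information to the response object."""
    resp.body_length = len(resp.content)
    return resp  # a returned response replaces the original; returning None keeps it


server = HTTPServer(("127.0.0.1", 0), OkHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

with requests.Session() as session:
    session.trust_env = False  # ignore proxy environment variables for this local demo
    r = session.get(url, hooks={"response": [record_call, tag_response]})

print(seen, r.body_length)

server.shutdown()
server.server_close()
```

Hooks can also be registered once on a session, for example with `session.hooks["response"].append(record_call)`, so that every request made through that session triggers them.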
