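These notes collect some small exercises and key points from learning about cookies and sessions.
Key point 1: where cookies and sessions come from
HTTP is a stateless protocol: each request is handled on its own, and once a request and its response between browser and server are complete, the server keeps no memory of the exchange.
As a result, the server cannot tell on its own which user a request comes from, nor remember anything about that user. Cookies and sessions were introduced to solve this.
Cookies do more than enable automatic login (a cookie typically carries the session's identifier). A site can also use cookies to record your browsing history and infer your preferences; combined with a recommendation algorithm, it can then push customized content to you.
Of course, a cookie is not valid forever. It has an expiry time, and once it expires you simply obtain a new one.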
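To make the idea concrete, here is a minimal sketch of how a server might pair a cookie with a session. It uses only the standard library; the in-memory `SESSIONS` store, the `session_id` cookie name, the one-hour lifetime, and the `login` / `whoami` helpers are all assumptions for illustration, not how any particular site works.

```python
import secrets
import time
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> (username, expiry timestamp).
SESSIONS = {}
SESSION_TTL = 3600  # assumed lifetime, in seconds


def login(username, now=None):
    """Create a session and return the Set-Cookie header value for it."""
    now = time.time() if now is None else now
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = (username, now + SESSION_TTL)
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["max-age"] = SESSION_TTL
    return cookie["session_id"].OutputString()


def whoami(cookie_header, now=None):
    """Look up the user from the Cookie header of a later request."""
    now = time.time() if now is None else now
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "session_id" not in cookie:
        return None
    entry = SESSIONS.get(cookie["session_id"].value)
    if entry is None:
        return None
    username, expires_at = entry
    return username if now < expires_at else None


set_cookie = login("spiderman", now=0)
# On later requests the browser sends back only the name=value part.
cookie_header = set_cookie.split(";")[0]
print(whoami(cookie_header, now=10))    # within the lifetime
print(whoami(cookie_header, now=4000))  # after the session has expired
print(whoami("", now=10))               # request without any cookie
```

The cookie itself holds only an identifier; the user information stays on the server side, which is why an expired or missing cookie means logging in again.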
Key point 2: how to use cookies
Getting cookies
res = requests.post(url, headers=headers, data=data)
cookies = res.cookies
Using cookies
res = requests.post(url, headers=headers, data=data, cookies=cookies)
PS: data is a dict.
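To see what the `cookies` and `data` arguments turn into, the request can be built without being sent. This is only a sketch: the URL, the form field, and the cookie value below are made up, and no network access takes place.

```python
import requests

# Hypothetical values, used only for illustration.
url = "https://example.com/post"
data = {"comment": "hello"}
cookies = {"session_id": "abc123"}

# Build the request without sending it, so we can inspect what would go out.
prepared = requests.Request("POST", url, data=data, cookies=cookies).prepare()

print(prepared.headers["Cookie"])        # the cookies become a Cookie header
print(prepared.headers["Content-Type"])  # a dict is sent as a form-encoded body
print(prepared.body)
```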
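Key point 3: using a session, and the role cookies play along the way
session = requests.session()
requests.session() creates a session object, which amounts to opening a dedicated conversation that keeps cookies for us automatically. At this point, though, its cookie jar is still empty.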
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[]>
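login = session.post(login_url, data=login_data, headers=headers)
The login request is sent with post inside this session, carrying an empty cookie jar plus the username and password. Once the username and password are verified, the server sends back cookies, which the session stores. Later requests can use these cookies to access the site without supplying the username and password again.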
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ce293f09056a29f312ea6de87972ceac5d163d5ca00f12fc53cc2535294a7f7ae for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
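comment = session.post(comment_url, data=comment_data, headers=headers)
The comment request is sent with post inside the same session. This time no username or password is needed, because the session already holds the cookies the server returned on the previous request.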
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ce293f09056a29f312ea6de87972ceac5d163d5ca00f12fc53cc2535294a7f7ae for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555590783%7CBNYpZI8bplOTGChwGNAPactQQc4PUfGFAY5WFx01Igv%7Ca4d98d3fe22d30a73ecda7699820e7d5d538b687d195c18e6f1e79c88a55248b for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
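The way a session attaches its stored cookies can be checked offline. In the sketch below, `session.cookies.set(...)` stands in for what a successful login response would do; the domain, cookie name, and URLs are assumptions, and nothing is actually sent.

```python
import requests

session = requests.session()

# Stand-in for the Set-Cookie headers of a successful login response.
session.cookies.set("session_id", "abc123", domain="example.com", path="/")

# Prepare (but do not send) a request to the same site.
request = requests.Request("POST", "https://example.com/comment", data={"comment": "hi"})
prepared = session.prepare_request(request)
print(prepared.headers.get("Cookie"))  # the session adds the stored cookie

# A request to a different host does not receive that cookie.
other = session.prepare_request(requests.Request("GET", "https://other.example.org/"))
print(other.headers.get("Cookie"))
```

This is why the comment request above needs no credentials: the cookie jar is consulted on every request made through the session, and only cookies matching the target domain and path are attached.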
Key point 4: saving cookies
session = requests.session()
login = session.post(login_url, data=login_data, headers=headers)
Get the cookies
print(type(session.cookies))
-- <class 'requests.cookies.RequestsCookieJar'>
print(session.cookies)
-- <RequestsCookieJar[<Cookie 328dab9653f517ceea1f6dfce2255032=2584219941bfcd0f4a161828d7340553 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_logged_in_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433 for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_test_cookie=WP+Cookie+check for wordpress-edu-3autumn.localprod.forc.work/>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e for wordpress-edu-3autumn.localprod.forc.work/wp-admin>, <Cookie wordpress_sec_9927dadafec8b913479e6af0fba5e181=spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e for wordpress-edu-3autumn.localprod.forc.work/wp-content/plugins>]>
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
Convert the cookies to a dict
print(type(cookies_dict))
-- <class 'dict'>
print(cookies_dict)
-- {'328dab9653f517ceea1f6dfce2255032': '2584219941bfcd0f4a161828d7340553', 'wordpress_logged_in_9927dadafec8b913479e6af0fba5e181': 'spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433', 'wordpress_test_cookie': 'WP+Cookie+check', 'wordpress_sec_9927dadafec8b913479e6af0fba5e181': 'spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e'}
cookies_str = json.dumps(cookies_dict)
Convert the dict to a string
print(type(cookies_str))
-- <class 'str'>
print(cookies_str)
-- {"328dab9653f517ceea1f6dfce2255032": "2584219941bfcd0f4a161828d7340553", "wordpress_logged_in_9927dadafec8b913479e6af0fba5e181": "spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7C7356d08183376075cb450dd37ee0f0234d809409aaacd09b817c61b6bb9e0433", "wordpress_test_cookie": "WP+Cookie+check", "wordpress_sec_9927dadafec8b913479e6af0fba5e181": "spiderman%7C1555568045%7CkYUFJRt5ChpenXTSNaMaBnFLuXvpwZkgqfmNDzE2aSb%7Cae47c12929ccc844b49063d1a2cc034b8eccaf846059b7b33d49bdc980ec9c7e"}
with open('cookies.str', 'w', encoding='utf-8') as strfile:
    strfile.write(cookies_str)
Write the string to a file
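One detail worth noticing in the output above: the jar printed five cookies, but the dict has only four keys, because the two `wordpress_sec_...` cookies share a name and differ only in path. The sketch below reproduces this with made-up cookie names and values; it suggests that the dict form keeps just name and value, so domain and path are dropped on the way.

```python
import json
import requests

jar = requests.cookies.RequestsCookieJar()
# Hypothetical cookies: same name, different paths, plus one more cookie.
jar.set("token", "admin-value", domain="example.com", path="/wp-admin")
jar.set("token", "plugins-value", domain="example.com", path="/wp-content/plugins")
jar.set("test_cookie", "1", domain="example.com", path="/")

cookies_dict = requests.utils.dict_from_cookiejar(jar)
print(len(jar), len(cookies_dict))  # the dict has one entry per name

# Round trip through JSON, as in the notes, then rebuild a jar.
restored = requests.utils.cookiejar_from_dict(json.loads(json.dumps(cookies_dict)))
print([(c.name, c.domain, c.path) for c in restored])
```

For the login-and-comment exercise here this loss did not seem to matter, but it is something to keep in mind if a site relies on path-specific cookies.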
The whole process of getting and saving cookies:
session = requests.session()

# Log in; the session stores the cookies returned by the server.
login = session.post(login_url, data=login_data, headers=headers)

# CookieJar -> dict -> JSON string -> file
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
cookies_str = json.dumps(cookies_dict)
with open('cookies.str', 'w', encoding='utf-8') as strfile:
    strfile.write(cookies_str)

# The reading part is disabled here by wrapping it in a string.
'''
with open('cookies.str', 'r', encoding='utf-8') as cookies_str_read:
    cookies_dict_read = json.loads(cookies_str_read.read())
cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)
session.cookies = cookies_read
comment = session.post(comment_url, data=comment_data, headers=headers)
print(comment)
'''
Key point 5: reading cookies
Reading cookies is exactly the reverse of saving them.
The whole process of reading and using cookies:
session = requests.session()

# The login-and-save part is disabled here by wrapping it in a string.
'''
login = session.post(login_url, data=login_data, headers=headers)
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
cookies_str = json.dumps(cookies_dict)
with open('cookies.str', 'w', encoding='utf-8') as strfile:
    strfile.write(cookies_str)
'''

# file -> JSON string -> dict -> CookieJar
with open('cookies.str', 'r', encoding='utf-8') as cookies_str_read:
    cookies_dict_read = json.loads(cookies_str_read.read())
cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)

# Give the saved cookies to the session, then post without logging in.
session.cookies = cookies_read
comment = session.post(comment_url, data=comment_data, headers=headers)
print(comment)
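As an alternative to the dict-plus-JSON approach in these notes (not what the notes themselves use), the standard library's `http.cookiejar.LWPCookieJar` can save and load a jar together with each cookie's domain and path. The sketch below is self-contained; the `make_cookie` helper, the cookie names and values, and the temporary file name are all assumptions for illustration.

```python
import os
import tempfile
from http.cookiejar import Cookie, LWPCookieJar


def make_cookie(name, value, domain, path):
    """Build a session cookie (no expiry) for the given domain and path."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path=path, path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.lwp")

# Save: two cookies with the same name but different paths.
jar = LWPCookieJar(cookie_file)
jar.set_cookie(make_cookie("token", "admin-value", "example.com", "/wp-admin"))
jar.set_cookie(make_cookie("token", "plugins-value", "example.com", "/wp-content/plugins"))
# ignore_discard=True is needed because session cookies are skipped by default.
jar.save(ignore_discard=True)

# Load: both cookies come back, each with its own path.
loaded = LWPCookieJar(cookie_file)
loaded.load(ignore_discard=True)
print(sorted((c.name, c.path, c.value) for c in loaded))
```

With requests, a jar loaded this way could presumably be copied into a session via `session.cookies.update(loaded)`; whether the extra fidelity is worth it depends on the site.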
The complete code:
import requests
import json

headers = {
    'Connection': 'keep-alive',
    'Pragma': 'no-cache',
    'Cache-Control': 'no-cache',
    'Origin': 'https://wordpress-edu-3autumn.localprod.forc.work',
    'Upgrade-Insecure-Requests': '1',
    'Content-Type': 'application/x-www-form-urlencoded',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'zh-CN,zh;q=0.9'
}

login_url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-login.php'
login_data = {
    'log': 'spiderman',
    'pwd': 'crawler334566',
    'wp-submit': '登录',
    'redirect_to': 'https://wordpress-edu-3autumn.localprod.forc.work',
    'testcookie': '1'
}

comment_url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-comments-post.php'
comment_data = {
    'comment': '最新的评论内容',
    'submit': '发表评论',
    'comment_post_ID': '15',
    'comment_parent': '0'
}

session = requests.session()

# Log in, then save the cookies: CookieJar -> dict -> JSON string -> file.
login = session.post(login_url, data=login_data, headers=headers)
cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
cookies_str = json.dumps(cookies_dict)
with open('cookies.str', 'w', encoding='utf-8') as strfile:
    strfile.write(cookies_str)

# Read the cookies back: file -> JSON string -> dict -> CookieJar.
with open('cookies.str', 'r', encoding='utf-8') as cookies_str_read:
    cookies_dict_read = json.loads(cookies_str_read.read())
cookies_read = requests.utils.cookiejar_from_dict(cookies_dict_read)

# Use the restored cookies to post a comment.
session.cookies = cookies_read
comment = session.post(comment_url, data=comment_data, headers=headers)
print(comment)
Below is the teacher's complete code. The functional pieces are wrapped in functions, which makes them very convenient to call.
import requests, json
session = requests.session()
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36'}

def cookies_read():
    with open('cookies.txt', 'r') as cookies_txt:
        cookies_dict = json.loads(cookies_txt.read())
    cookies = requests.utils.cookiejar_from_dict(cookies_dict)
    return cookies
# The function above reads the cookies.

def sign_in():
    url = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-login.php'
    data = {'log': input('Please enter your username: '),
            'pwd': input('Please enter your password: '),
            'wp-submit': '登录',
            'redirect_to': 'https://wordpress-edu-3autumn.localprod.forc.work/wp-admin/',
            'testcookie': '1'}
    session.post(url, headers=headers, data=data)
    cookies_dict = requests.utils.dict_from_cookiejar(session.cookies)
    cookies_str = json.dumps(cookies_dict)
    with open('cookies.txt', 'w') as f:
        f.write(cookies_str)
# The last few lines of the function above store the cookies.

def write_message():
    url_2 = 'https://wordpress-edu-3autumn.localprod.forc.work/wp-comments-post.php'
    data_2 = {
        'comment': input('Please enter the comment you want to post: '),
        'submit': '发表评论',
        'comment_post_ID': '13',
        'comment_parent': '0'
    }
    return session.post(url_2, headers=headers, data=data_2)
# The function above posts a comment.

# Use the saved cookies if the file exists; otherwise log in first.
try:
    session.cookies = cookies_read()
except FileNotFoundError:
    sign_in()
    session.cookies = cookies_read()

num = write_message()
if num.status_code == 200:
    print('Success!')
else:
    # The saved cookies no longer work: log in again and retry.
    sign_in()
    session.cookies = cookies_read()
    num = write_message()
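The control flow at the end of the teacher's code (try the cached cookies, log in again if that fails, then retry once) can be isolated and exercised without touching the network. In the sketch below, `post_with_cached_login` and the three fake callables are hypothetical names introduced here, and the 403 status is made up; how the real site responds to stale cookies is not stated in these notes.

```python
def post_with_cached_login(load_cookies, sign_in, post_comment):
    """Try cached cookies first; fall back to a fresh login if needed.

    The three arguments are hypothetical callables standing in for the
    cookies_read, sign_in and write_message functions above.
    """
    try:
        load_cookies()
    except FileNotFoundError:
        # No cookie file yet: log in once, which also writes the file.
        sign_in()
        load_cookies()

    status = post_comment()
    if status != 200:
        # The cached cookies were probably rejected: log in again, retry once.
        sign_in()
        load_cookies()
        status = post_comment()
    return status


# Fakes simulating stale cookies: the first post fails, the retry succeeds.
calls = []
statuses = iter([403, 200])


def fake_load():
    calls.append("load")


def fake_sign_in():
    calls.append("sign_in")


def fake_post():
    calls.append("post")
    return next(statuses)


result = post_with_cached_login(fake_load, fake_sign_in, fake_post)
print(result, calls)
```

Passing the steps in as callables keeps the retry logic testable on its own. Note that a 200 status only says the request was accepted at the HTTP level; checking the response body would be a stricter test of whether the comment was really posted.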