Scraping Ukulele Images from Baidu Tieba with Single-Threaded Python 3
Posted by anglecv 3 months ago

Summary: downloading images from Baidu Tieba with single-threaded Python, reading the JSON by hand.

Background

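> I have been looking for ukulele sheet music lately and found that the images on Tieba are a decent resource, so I put together a script to download them.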

Code

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author: actanble
# @Date:   2017-9-12 11:59:24

import urllib.request as ur
import os

class Spyder():

    def __init__(self, url):
        self.url = url

    def open_url(self):
        headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20100101 Firefox/23.0'}
        req = ur.Request(url=self.url, headers=headers)  # Python 2: urllib2.Request()
        response = ur.urlopen(req)  # Python 2: urllib2.urlopen()
        return response.read()

def main():
    import json, requests

    # json.txt holds one picture-list URL per line; strip the trailing newlines.
    with open("json.txt", "r") as f:
        urls = [line.strip() for line in f if line.strip()]

    # Collect the pic_id of every picture returned by each list URL.
    res = []
    for url1 in urls:
        json1 = json.loads(requests.get(url1).text)
        print(json1["data"]["pic_list"])
        for x in json1["data"]["pic_list"]:
            res.append(x["pic_id"])

    def get_img_url(pid):
        return "https://imgsa.baidu.com/forum/pic/item/" + pid + ".jpg"

    # Write one image URL per line to pic12.txt.
    with open("pic12.txt", "w") as f:
        for x in res:
            f.write(get_img_url(x) + "\n")

    # Number of image URLs collected.
    print(len(res))


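def run():
    main()

    def test_write():
        # Create the output directory once; a no-op if it already exists.
        os.makedirs("img_ukulele", exist_ok=True)
        with open("pic12.txt", "r") as f1:
            # Strip the trailing newline from each URL and skip blank lines.
            imgs = [line.strip() for line in f1 if line.strip()]
        for i, img in enumerate(imgs, start=1):
            # Fetch first, so a failed request does not leave an empty file behind.
            data = Spyder(img).open_url()
            with open("./img_ukulele/" + str(i) + ".jpg", "wb") as f:
                f.write(data)

    # Download every image listed in pic12.txt.
    test_write()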


if __name__ == "__main__":
    run()
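
The JSON handling in main() can be tried without touching the network. The following is only a minimal sketch, assuming the response has the shape the script relies on (a "data" object holding a "pic_list" whose items each carry a "pic_id"). The helper names extract_pic_ids and build_img_url are hypothetical, and the sample payload and its pic_id values are made up for illustration.

```python
import json

# Same prefix that get_img_url uses in the script above.
IMG_BASE = "https://imgsa.baidu.com/forum/pic/item/"


def extract_pic_ids(text):
    """Return every pic_id found under data.pic_list, or [] if the keys are missing."""
    payload = json.loads(text)
    pic_list = (payload.get("data") or {}).get("pic_list") or []
    return [item["pic_id"] for item in pic_list if "pic_id" in item]


def build_img_url(pic_id):
    """Turn a pic_id into a full-size image URL."""
    return IMG_BASE + pic_id + ".jpg"


# Made-up payload that mimics the assumed response shape.
sample = json.dumps({"data": {"pic_list": [{"pic_id": "abc123"}, {"pic_id": "def456"}]}})

for pic_id in extract_pic_ids(sample):
    print(build_img_url(pic_id))
```

Keeping the parsing separate from the HTTP request makes it easier to see whether a failure comes from the network or from an unexpected response body.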

Attachment

json.txt

http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121321&pn=1&ps=1&pe=40&info=1&_=1505195101160
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121321&pn=1&ps=121&pe=160&wall_type=v&_=1505195222386
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121321&pn=1&ps=161&pe=200&wall_type=v&_=1505195222704
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121321&pn=1&ps=41&pe=80&wall_type=v&_=1505195168755
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121321&pn=1&ps=81&pe=120&wall_type=v&_=1505195221831
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121318&pn=1&ps=1&pe=40&info=1&_=1505195457441
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121318&pn=1&ps=41&pe=80&wall_type=v&_=1505195465864
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121318&pn=1&ps=81&pe=120&wall_type=v&_=1505195490499
http://tieba.baidu.com/photo/g/bw/picture/list?kw=ukulele&alt=jview&rn=200&tid=2125121318&pn=1&ps=121&pe=160&wall_type=v&_=1505195492589
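
The URLs above were collected by hand, but they follow a visible pattern, so they could probably be generated instead. The sketch below is an assumption based only on the captured URLs: ps and pe look like the first and last picture index of a 40-item slice, the first slice carries info=1 while later ones carry wall_type=v, and the trailing _ value looks like a millisecond timestamp. The function name build_list_urls and its parameters are hypothetical, and the endpoint's real semantics are not confirmed here.

```python
import time
from urllib.parse import urlencode

BASE = "http://tieba.baidu.com/photo/g/bw/picture/list"


def build_list_urls(tid, last=200, step=40, kw="ukulele", rn=200):
    """Build one list URL per slice of `step` pictures, from index 1 up to `last`."""
    urls = []
    for start in range(1, last + 1, step):
        params = {
            "kw": kw,
            "alt": "jview",
            "rn": rn,
            "tid": tid,
            "pn": 1,
            "ps": start,                          # assumed: first picture index of the slice
            "pe": min(start + step - 1, last),    # assumed: last picture index of the slice
        }
        # The captured URLs use info=1 for the first slice and wall_type=v afterwards.
        if start == 1:
            params["info"] = 1
        else:
            params["wall_type"] = "v"
        # Assumed to be a cache-busting timestamp in milliseconds.
        params["_"] = int(time.time() * 1000)
        urls.append(BASE + "?" + urlencode(params))
    return urls


if __name__ == "__main__":
    for url in build_list_urls(2125121321, last=200) + build_list_urls(2125121318, last=160):
        print(url)
```

Writing the printed lines to json.txt would give the script the same kind of input as the hand-collected list, provided the assumptions about the parameters hold.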