文档章节

8-[多线程] 进程池线程池

o
 osc_x4h57ch8
发布于 2018/04/24 00:33
字数 1088
阅读 7
收藏 0
def

精选30+云产品,助力企业轻松上云!>>>

1、为甚需要进程池,线程池

 

介绍

官网:https://docs.python.org/dev/library/concurrent.futures.html

concurrent.futures模块提供了高度封装的异步调用接口
ThreadPoolExecutor:线程池,提供异步调用
ProcessPoolExecutor: 进程池,提供异步调用
Both implement the same interface, which is defined by the abstract Executor class.

    

 

 

2、基本方法

1、submit(fn, *args, **kwargs)    异步提交任务

2、map(func, *iterables, timeout=None, chunksize=1)     取代for循环submit的操作

3、shutdown(wait=True) 
相当于进程池的pool.close()+pool.join()操作
wait=True,等待池内所有任务执行完毕回收完资源后才继续
wait=False,立即返回,并不会等待池内的任务执行完毕
但不管wait参数为何值,整个程序都会等到所有任务执行完毕
submit和map必须在shutdown之前

4、result(timeout=None)    取得结果

5、add_done_callback(fn)    回调函数

  

 

3、进程池

The ProcessPoolExecutor class is an Executor subclass that uses a pool of processes to execute calls asynchronously. 
ProcessPoolExecutor uses the multiprocessing module, which allows it to side
-step the Global Interpreter Lock but also means that only picklable objects can be executed and returned. class concurrent.futures.ProcessPoolExecutor(max_workers=None, mp_context=None) An Executor subclass that executes calls asynchronously using a pool of at most max_workers processes. If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is lower or equal to 0, then a ValueError will be raised.

 

from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
import os
import time

def task(name):
    print('%s is running 《pid: %s》' % (name, os.getpid()))
    time.sleep(2)

if __name__ == '__main__':
    # p = Process(target=task, args=('子',))
    # p.start

    pool = ProcessPoolExecutor(4)  # 进程池max_workers:4个
    for i in range(10):     # 总共执行10次,每次4个进程的执行
        pool.submit(task, '子进程%s' % i)

    print('')

 

 

 

 

4、线程池

ThreadPoolExecutor is an Executor subclass that uses a pool of threads to execute calls asynchronously.
class concurrent.futures.ThreadPoolExecutor(max_workers=None, thread_name_prefix='')
An Executor subclass that uses a pool of at most max_workers threads to execute calls asynchronously.

Changed in version 3.5: If max_workers is None or not given, 
it will default to the number of processors on the machine, multiplied by 5, 
assuming that ThreadPoolExecutor is often used to overlap I/O instead of CPU work and the number of workers should be higher than the number of workers for ProcessPoolExecutor.

New in version 3.6: The thread_name_prefix argument was added to allow users to control the threading.
Thread names for worker threads created by the pool for easier debugging.

 

 

 

 5、map函数:取代了for+submit

from concurrent.futures import ThreadPoolExecutor,ProcessPoolExecutor

import os,time,random
def task(n):
    print('%s is runing' %os.getpid())
    time.sleep(random.randint(1,3))
    return n**2

if __name__ == '__main__':

    executor=ThreadPoolExecutor(max_workers=3)

    # for i in range(11):
    #     future=executor.submit(task,i)

    executor.map(task,range(1,12)) #map取代了for+submit

 

 

 6、异步调用与回调机制

(1)提交任务的两种方式

# 提交任务的两种方式
# 1、同步调用     提交完任务后,拿到结果,再执行下一行代码,导致程序是串行执行
# 2、异步调用    提交完任务后,不用等待任务执行完毕

  

(2)同步调用

from concurrent.futures import ThreadPoolExecutor
import time
import random


# 吃饭
def eat(name):
    print('%s is eat' % name)
    time.sleep(random.randint(1,5))
    ret = random.randint(7, 13) * '#'
    return {'name': name, 'ret': ret}


# 称重
def weight(body):
    name = body['name']
    size = len(body['ret'])
    print('%s 现在的体重是%s' %(name, size))


if __name__ == '__main__':
    pool = ThreadPoolExecutor(15)

    rice1 = pool.submit(eat, 'alex').result()   # 取得结果       # 执行函数eat
    weight(rice1)                                               # 执行函数weight

    rice2 = pool.submit(eat, 'jack').result()   
    weight(rice2)

    rice3 = pool.submit(eat, 'tom').result()    
    weight(rice3)



(2)同步调用2

 

   (3)回调函数

   

  

 

  (4)是钩子函数?

钩子函数是Windows消息处理机制的一部分,通过设置“钩子”,应用程序可以在系统级对所有消息、事件进行过滤,访问在正常情况下无法访问的消息。钩子的本质是一段用以处理系统消息的程序,通过系统调用,把它挂入系统 --- 百度百科的定义

     

对于前端来说,钩子函数就是指再所有函数执行前,我先执行了的函数,即 钩住 我感兴趣的函数,只要它执行,我就先执行。此概念(或者说现象)跟AOP(面向切面编程)很像

  

 7.线程池爬虫应用

(1)requests模块

import requests

# 输入网址,得到网址的源代码

response = requests.get('http://www.cnblogs.com/venicid/p/8923096.html')
print(response)    # 输出<Response [200]>
print(response.text)    # 以文本格式输出

 

 

(2)线程池爬虫

import requests
import time
from concurrent.futures import ThreadPoolExecutor


# 输入网址,得到网址的源代码
def get_code(url):
    print('GET ', url)
    response = requests.get(url)
    time.sleep(3)
    code = response.text
    return {'url': url, 'code': code}


# 打印源代码的长度
def print_len(ret):
    ret = ret.result()
    url = ret['url']
    code_len = len(ret['code'])
    print('%s length is %s' % (url, code_len))

if __name__ == '__main__':


    url_list = [
            'http://www.cnblogs.com/venicid/default.html?page=2',
            'http://www.cnblogs.com/venicid/p/8747383.html',
            'http://www.cnblogs.com/venicid/p/8923096.html',
        ]
    pool = ThreadPoolExecutor(2)
    for i in url_list:
        pool.submit(get_code, i).add_done_callback(print_len)

    pool.map(get_code, url_list)

 

o
粉丝 0
博文 500
码字总数 0
作品 0
私信 提问
加载中
请先登录后再评论。

暂无文章

OSChina 周日乱弹 —— 那么长的绳子,你这是放风筝呢

Osc乱弹歌单(2020)请戳(这里) 【今日歌曲】 @ 巴拉迪维:黑豹乐队的单曲《无地自容》 耳畔突然响起旋律,是那首老歌。中国摇滚有了《一无所有》不再一无所有;中国摇滚有了《无地自容》不...

小小编辑
36分钟前
31
1
《吐血整理》-顶级程序员书单集

你知道的越多,你不知道的越多 给岁月以文明,而不是给文明以岁月 前言 王潇:格局决定了一个人的梦想,梦想反过来决定行为。 那格局是什么呢? 格局是你能够看见的深度、广度和密度。 王潇认...

敖丙
2019/12/11
8
0
我可以在Android版式中加下划线吗? - Can I underline text in an Android layout?

问题: 如何在Android布局xml文件中定义带下划线的文本? 解决方案: 参考一: https://stackoom.com/question/A31z/我可以在Android版式中加下划线吗 参考二: https://oldbug.net/q/A31z/...

法国红酒甜
39分钟前
18
0
干掉ELK | 使用Prometheus+Grafana搭建监控平台

什么是Prometheus? Prometheus是由SoundCloud开发的开源监控报警系统和时序列数据库(TSDB)。Prometheus使用Go语言开发,是Google BorgMon监控系统的开源版本。 Prometheus的特点 · 多维度...

木九天
58分钟前
34
0
拉勾网拉你上勾

预览 需求简介 拉勾网是一个互联网行业的一个招聘网站,上面有许多职位,于是乎,小编想提取指定职位的基本信息(职位名,薪水,工作经验,工作地点,教育背景),然后插入 MongoDB 数据库,...

木下瞳
2019/04/17
20
0

没有更多内容

加载失败,请刷新页面

加载更多

返回顶部
顶部