Why Fibers Are More Efficient for Concurrent IO

Original
2018/11/01 11:17

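Java is going to support coroutines in a future JDK release through Project Loom. The project was started by Ron Pressler, the author of Quasar, and the related concepts are all explained there.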

Fiber: a coroutine (a lightweight user-mode thread), referred to simply as a fiber in the rest of this post.

How a fiber executes

Disclaimer: this is a rough and inaccurate description of the scheduling in the kernel and in the go runtime aimed at explaining the concepts, not at being an exact or detailed explanation of the real system.

As you may (or may not) know, a single CPU core can't actually run two programs at the same time: a core has only one execution thread, which executes one instruction at a time. The direct consequence on early systems was that you couldn't run two programs at the same time, since each program needs (system-wise) a dedicated thread.

The solution currently adopted is called pseudo-parallelism: given a number of logical threads (e.g. multiple programs), the system executes one of the logical threads for a certain amount of time and then switches to the next one. By using really small time slices (on the order of milliseconds), you give the human user the illusion of parallelism. This operation is called scheduling.

The Go language doesn't use this system directly: it implements its own scheduler that runs on top of the system scheduler and schedules the execution of the goroutines itself, avoiding the performance cost of using a real thread for each goroutine. This type of system is called light or green threads.

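The text above describes the fiber-based concurrency model. A fiber is a sequential stream of instructions that can be suspended and resumed, and it is scheduled by a user-space runtime rather than directly by the kernel. The efficiency comes from the fact that a CPU core can run many fibers back to back on the same kernel thread, without a kernel-level switch in between.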

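Compare this with multithreading, for example Java's 1:1 thread model. The kernel has to keep switching between threads, and every switch adds context switch overhead, which becomes significant when the number of threads is very large.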
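The suspend-and-resume idea can be sketched with a toy scheduler. This is only an illustration under simplifying assumptions: `ToyFiberScheduler`, `fiber`, and `runAll` are names invented here, each "fiber" is modeled as a step function instead of a real stack, and everything runs on the calling thread. It is not how Go or Quasar implement scheduling.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical toy scheduler: every name here is invented for illustration.
public class ToyFiberScheduler {

    // A "fiber" is modeled as a step function: each call runs until the next
    // suspension point and returns true while there is more work to do.
    public static BooleanSupplier fiber(String name, int steps, List<String> log) {
        int[] done = {0}; // saved state, standing in for the fiber's stack
        return () -> {
            log.add(name + ":" + done[0]);
            done[0]++;
            return done[0] < steps;
        };
    }

    // Runs all fibers round-robin on the calling thread. Switching fibers is
    // just a queue operation in user space; no kernel thread switch happens.
    public static List<String> runAll(int fibers, int stepsPerFiber) {
        List<String> log = new ArrayList<>();
        Deque<BooleanSupplier> runQueue = new ArrayDeque<>();
        for (int i = 0; i < fibers; i++) {
            runQueue.add(fiber("f" + i, stepsPerFiber, log));
        }
        while (!runQueue.isEmpty()) {
            BooleanSupplier next = runQueue.poll();
            if (next.getAsBoolean()) {
                runQueue.add(next); // suspended: goes to the back of the queue
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(runAll(2, 2)); // prints [f0:0, f1:0, f0:1, f1:1]
    }
}
```

The two fibers interleave on one thread, and the cost of moving from one to the other is a queue poll rather than a kernel context switch.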

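When a fiber makes a system call (which requires a switch from user mode to kernel mode), a separate kernel thread is used to carry out the kernel-side work. Once that work completes, the fiber is notified through a callback and can be scheduled again. See Go scheduler: Ms, Ps & Gs.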
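The hand-off described above can be approximated with plain JDK classes. In this rough sketch, a single scheduler thread drains a run queue, a blocking call is pushed to a separate thread, and a callback puts the waiting fiber's continuation back on the run queue. `BlockingOffloadSketch`, `slowCall`, and the `STOP` marker are hypothetical names, and the model is far simpler than the real Go scheduler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: all names are invented for illustration.
public class BlockingOffloadSketch {

    private static final Runnable STOP = () -> {}; // tells the scheduler loop to exit

    // Stands in for a blocking system call.
    static String slowCall() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "result";
    }

    public static List<String> run() {
        List<String> log = new ArrayList<>(); // touched only by the scheduler thread
        BlockingQueue<Runnable> runQueue = new LinkedBlockingQueue<>();
        ExecutorService blockingPool = Executors.newSingleThreadExecutor();

        // Fiber A, part 1: runs up to the blocking call, then gives up the scheduler.
        runQueue.add(() -> {
            log.add("A: before blocking call");
            CompletableFuture
                .supplyAsync(BlockingOffloadSketch::slowCall, blockingPool)
                // Callback: re-enqueue A's continuation on the scheduler.
                .thenAccept(result -> runQueue.add(() -> {
                    log.add("A: resumed with " + result);
                    runQueue.add(STOP);
                }));
        });
        // Fiber B: gets the scheduler thread while A is waiting.
        runQueue.add(() -> log.add("B: runs while A waits"));

        try {
            for (Runnable task = runQueue.take(); task != STOP; task = runQueue.take()) {
                task.run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            blockingPool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log shows A starting, B running while A's call is in flight, and A resuming with the result. The scheduler thread itself never blocks on the slow call.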

For details on how fibers are scheduled, see the paper Analysis of the Go runtime scheduler and the blog post Go's work-stealing scheduler.

Fibers and IO

First read the blog post The Go netpoller, which describes the programming model for fiber IO. Below we use code to look at how IO operations can be executed efficiently.

IO models:
BIO: one thread per connection. Throughput depends on the number of threads.
NIO: reactor + handler. Throughput is limited by CPU, bandwidth, and file descriptors (not threads).

Programming models:

Async thread model

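import java.util.concurrent.CompletableFuture;

CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
    try {
        // Simulate a blocking IO request; it occupies a pool thread for 200 ms.
        Thread.sleep(200);
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
    }
    return "result";
});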

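This is the usual combination of asynchronous invocation and blocking IO. The concurrency of the IO operation is bounded by the size of the thread pool, and when the pool is made too large it causes other problems for the system: the CPU spends a lot of its time switching between threads.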
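The bound imposed by the pool size can be observed directly. The sketch below is a hypothetical measurement helper (`PoolBoundSketch` and `peakConcurrency` are names made up here) that submits more blocking tasks than there are pool threads and records how many were in flight at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: all names are invented for illustration.
public class PoolBoundSketch {

    // Submits `tasks` blocking calls to a pool of `poolSize` threads and
    // returns the highest number of calls that were in flight at once.
    public static int peakConcurrency(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            futures.add(CompletableFuture.runAsync(() -> {
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(50); // stands in for a blocking IO request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                }
            }, pool));
        }
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) {
        // 20 blocking requests on 4 threads: the peak can never exceed 4.
        System.out.println(peakConcurrency(4, 20));
    }
}
```

However many requests are queued, the number being served at any moment never exceeds the pool size, so raising IO concurrency in this model means adding threads.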

Fiber model

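import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.fibers.SuspendExecution;
import co.paralleluniverse.strands.SuspendableRunnable;

new Fiber<Void>(new SuspendableRunnable() {
    @Override
    public void run() throws SuspendExecution, InterruptedException {
        try {
            // Simulate a blocking IO operation. Fiber.sleep suspends only this
            // fiber; Thread.sleep would block the thread that carries it.
            Fiber.sleep(200);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        bar(); // call bar
    }
}).start();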

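Fibers, by contrast, can be created by the millions, so the level of concurrency can be of the same order as the number of sockets. This is where the performance advantage at high concurrency comes from, and the code can still be written as ordinary synchronous calls.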
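Project Loom's fibers eventually shipped in the JDK as virtual threads, so the same idea can be tried without Quasar. The sketch below assumes JDK 21 or later; `VirtualThreadSketch` and `runSleepers` are hypothetical names, and the sleep merely stands in for a blocking IO call.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, assuming JDK 21+ (virtual threads).
public class VirtualThreadSketch {

    // Starts `count` virtual threads that each "block" for 200 ms and
    // returns how many of them ran to completion.
    public static int runSleepers(int count) {
        AtomicInteger completed = new AtomicInteger();
        // close() at the end of the try block waits for all submitted tasks.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < count; i++) {
                executor.submit(() -> {
                    try {
                        Thread.sleep(200); // parks the virtual thread, not a kernel thread
                        completed.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runSleepers(100_000));
    }
}
```

Each task is written as plain blocking code, yet the number of concurrent sleepers is not tied to the size of any platform thread pool.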

Comparison with Netty

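Netty is a network IO framework built on an asynchronous IO model and an event-driven design, and its goal is likewise to make full use of the CPU.
For network requests, both Netty and Fiber + BIO can achieve a non-blocking effect. I have not benchmarked the two against each other yet and will do so when I get the chance.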

Synchronous and asynchronous programming models

This has led to two programming styles:

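  1. The reactive style, that is, the reactive programming supported by Spring 5 through the WebFlux + Netty stack.
  2. The fiber style. Because fibers are still under development for a future JDK release, take a look at Quasar in the meantime; its documentation describes a complete solution.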
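The difference between the two styles shows up mainly in how dependent steps are written. The sketch below uses only JDK classes: `CompletableFuture` stands in for a reactive pipeline (it is not WebFlux), and `fetchUser` and `fetchOrders` are hypothetical placeholders for two remote calls.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch: all names are invented for illustration.
public class StyleComparisonSketch {

    // Placeholders for two dependent remote calls.
    static String fetchUser(int id) { return "user" + id; }
    static String fetchOrders(String user) { return user + ":orders"; }

    // Asynchronous style: each step is composed as a callback on the previous one.
    public static CompletableFuture<String> asyncStyle(int id) {
        return CompletableFuture
            .supplyAsync(() -> fetchUser(id))
            .thenCompose(user -> CompletableFuture.supplyAsync(() -> fetchOrders(user)))
            .thenApply(String::toUpperCase);
    }

    // Synchronous style: the same steps as straight-line code. When run on
    // a fiber, each blocking call suspends the fiber instead of a thread.
    public static String syncStyle(int id) {
        String user = fetchUser(id);
        String orders = fetchOrders(user);
        return orders.toUpperCase();
    }

    public static void main(String[] args) {
        System.out.println(asyncStyle(7).join()); // prints USER7:ORDERS
        System.out.println(syncStyle(7));         // prints USER7:ORDERS
    }
}
```

Both methods compute the same value; the fiber style keeps the control flow sequential, while the asynchronous style spreads it across callbacks.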