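A few days ago I wrote about merging the results of concurrent work, which got me thinking about Fork/Join, because as I see it merging concurrent results is exactly the Join step of Fork/Join. So today I am writing down my own, fairly basic, understanding of Fork/Join.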
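1. What is Fork/Join
Oracle's official definition: the Fork/Join framework is an implementation of the ExecutorService interface that helps you take advantage of multiple processors. It splits a large task into smaller tasks that run concurrently, making full use of the available resources to improve the application's performance.
Because Fork/Join implements ExecutorService, its tasks also run in a thread pool. What sets it apart is the work-stealing algorithm: idle threads can steal tasks from busy threads and help execute them. (My own understanding of work stealing is roughly this: every thread in the pool has its own task queue, and the threads do not interfere with each other. Each thread keeps taking tasks from its own queue and executing them. Strictly speaking the queue is double-ended: by default the owner works at one end, while thieves take from the opposite end. If at some point one thread's queue is empty while other threads still have tasks queued, the idle thread takes a task from another thread's queue and runs it, as if it had stolen someone else's work.)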
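To make the two ends of the queue concrete, here is a toy, single-threaded sketch. It does not show how ForkJoinPool is implemented internally: the class WorkStealingSketch and its methods are made up for illustration, and a plain ArrayDeque stands in for a worker's queue. It only models the default ordering, where the owner takes its newest task and a thief takes the oldest one.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: one deque standing in for one worker's queue, no real threads.
class WorkStealingSketch {
    private final Deque<String> tasks = new ArrayDeque<>();

    // The owning worker pushes new subtasks onto one end of its own deque.
    void push(String task) {
        tasks.addLast(task);
    }

    // The owner takes back its most recently pushed task (LIFO).
    String popOwn() {
        return tasks.pollLast();
    }

    // An idle worker steals the oldest task from the opposite end (FIFO).
    String steal() {
        return tasks.pollFirst();
    }

    public static void main(String[] args) {
        WorkStealingSketch busy = new WorkStealingSketch();
        busy.push("t1");
        busy.push("t2");
        busy.push("t3");
        System.out.println("owner runs: " + busy.popOwn());
        System.out.println("thief runs: " + busy.steal());
    }
}
```

Because the owner and the thief work at opposite ends, they rarely compete for the same task, and in a recursive split the oldest task is usually the largest remaining piece of work.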
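The core of the Fork/Join framework is the ForkJoinPool class, which extends AbstractExecutorService. It implements the work-stealing algorithm and executes ForkJoinTask instances.
Below is the original text of Oracle's official definition: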
The fork/join framework is an implementation of the ExecutorService interface that helps you take advantage of multiple processors. It is designed for work that can be broken into smaller pieces recursively. The goal is to use all the available processing power to enhance the performance of your application.
As with any ExecutorService implementation, the fork/join framework distributes tasks to worker threads in a thread pool. The fork/join framework is distinct because it uses a work-stealing algorithm. Worker threads that run out of things to do can steal tasks from other threads that are still busy.
The center of the fork/join framework is the ForkJoinPool class, an extension of the AbstractExecutorService class. ForkJoinPool implements the core work-stealing algorithm and can execute ForkJoinTask processes.
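Since ForkJoinPool is an ExecutorService, it can be used anywhere an ordinary executor is expected. The sketch below illustrates only that point; the class PoolAsExecutorDemo and the method runOnForkJoinPool are names invented for this example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;

class PoolAsExecutorDemo {
    // Submits an ordinary Callable through the plain ExecutorService interface.
    static int runOnForkJoinPool(Callable<Integer> job) {
        // The no-argument constructor sizes the pool to the number of available processors.
        ExecutorService executor = new ForkJoinPool();
        try {
            return executor.submit(job).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnForkJoinPool(() -> 6 * 7)); // prints 42
    }
}
```

A plain Callable like this gains nothing from work stealing; the benefit appears only when the task splits itself into subtasks, as described in the next section.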
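2. Basic usage of Fork/Join
(1) The Fork/Join task class
As mentioned above, Fork/Join is about splitting one large task into a number of smaller ones, so the first step is naturally to split the task. The general shape is: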
if (the task is small enough) {
    do the work directly
} else {
    split the task into two smaller parts
    run both parts and wait for their results
}
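The pseudocode above can be sketched as a small, self-contained task. This is an illustration only, not the demo from this article: SumTask, its THRESHOLD of 1,000 and the helper method sum are all assumptions made for the example. It uses invokeAll to "run both parts and wait for their results".

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums numbers[from, to).
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // arbitrary cut-off chosen for this sketch
    private final int[] numbers;
    private final int from;
    private final int to;

    SumTask(int[] numbers, int from, int to) {
        this.numbers = numbers;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        // The task is small enough: do the work directly.
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        // Otherwise split into two halves, run both, and wait for both to finish.
        int middle = from + (to - from) / 2;
        SumTask left = new SumTask(numbers, from, middle);
        SumTask right = new SumTask(numbers, middle, to);
        invokeAll(left, right);
        return left.join() + right.join();
    }

    // Convenience entry point that runs the task on the shared common pool.
    static long sum(int[] numbers) {
        return ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        int[] numbers = new int[10_000];
        for (int i = 0; i < numbers.length; i++) {
            numbers[i] = i + 1;
        }
        System.out.println(sum(numbers)); // prints 50005000
    }
}
```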
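To implement a ForkJoinTask we need a class that extends RecursiveTask or RecursiveAction, with the logic above placed in its compute method and adapted to our own business needs. Both RecursiveTask and RecursiveAction extend ForkJoinTask; the difference is that RecursiveTask returns a result while RecursiveAction does not. Below is a demo I wrote that picks out the elements containing "a" from a list of strings: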
@Override
protected List<String> compute() {
    // When the range between start and end is smaller than the threshold, do the actual filtering
    if (end - this.start < threshold) {
        List<String> temp = list.subList(this.start, end);
        return temp.parallelStream().filter(s -> s.contains("a")).collect(Collectors.toList());
    } else {
        // Otherwise, split the big task into two smaller tasks
        int middle = (this.start + end) / 2;
        ForkJoinTest left = new ForkJoinTest(list, this.start, middle, threshold);
        ForkJoinTest right = new ForkJoinTest(list, middle, end, threshold);
        // Run the two subtasks in parallel
        left.fork();
        right.fork();
        // Merge the results of the two subtasks; copy into a new ArrayList first,
        // because Collectors.toList() does not guarantee a mutable list
        List<String> join = new ArrayList<>(left.join());
        join.addAll(right.join());
        return join;
    }
}
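The demo above extends RecursiveTask because it returns a list. For contrast, here is a minimal sketch of a RecursiveAction, which returns nothing and is therefore a natural fit for in-place work. The class DoubleInPlaceAction, its threshold of 4 and the helper doubleAll are invented for this illustration.

```java
import java.util.Arrays;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveAction;

// Hypothetical example: doubles every element of values[from, to) in place.
class DoubleInPlaceAction extends RecursiveAction {
    private static final int THRESHOLD = 4; // deliberately tiny so the split is exercised
    private final int[] values;
    private final int from;
    private final int to;

    DoubleInPlaceAction(int[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected void compute() {
        // Small enough: modify the slice directly. There is nothing to return.
        if (to - from <= THRESHOLD) {
            for (int i = from; i < to; i++) {
                values[i] *= 2;
            }
            return;
        }
        // The two halves touch disjoint index ranges, so they can safely run in parallel.
        int middle = from + (to - from) / 2;
        invokeAll(new DoubleInPlaceAction(values, from, middle),
                  new DoubleInPlaceAction(values, middle, to));
    }

    static void doubleAll(int[] values) {
        ForkJoinPool.commonPool().invoke(new DoubleInPlaceAction(values, 0, values.length));
    }

    public static void main(String[] args) {
        int[] values = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        doubleAll(values);
        System.out.println(Arrays.toString(values));
    }
}
```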
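(2) The runner class
With the task class in place we can start calling it. First we need a Fork/Join thread pool, ForkJoinPool; then we submit a ForkJoinTask to the pool and obtain the result. ForkJoinPool's submit method takes a ForkJoinTask as its argument and also returns a ForkJoinTask, whose get method returns the result of the execution.
The code is as follows: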
ForkJoinPool pool = new ForkJoinPool();
// Submit the splittable ForkJoinTask
ForkJoinTask<List<String>> future = pool.submit(forkJoinService);
System.out.println(future.get());
// Shut down the thread pool
pool.shutdown();
And with that, we have completed a simple Fork/Join program.
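Besides submit followed by get, ForkJoinPool also offers invoke, which blocks until the task finishes and returns the result directly. The sketch below puts the two side by side. The class SubmitVersusInvokeDemo and the task CountEvens are hypothetical names made up for this comparison; the second variant also assumes that using the shared ForkJoinPool.commonPool() is acceptable for your application.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

class SubmitVersusInvokeDemo {
    // Hypothetical tiny task: counts the even numbers in [from, to).
    static class CountEvens extends RecursiveTask<Integer> {
        private final int from;
        private final int to;

        CountEvens(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Integer compute() {
            if (to - from <= 100) {
                int count = 0;
                for (int i = from; i < to; i++) {
                    if (i % 2 == 0) {
                        count++;
                    }
                }
                return count;
            }
            int middle = from + (to - from) / 2;
            CountEvens left = new CountEvens(from, middle);
            CountEvens right = new CountEvens(middle, to);
            invokeAll(left, right);
            return left.join() + right.join();
        }
    }

    // submit returns immediately; get blocks and reports failures as ExecutionException.
    static int viaSubmit(int from, int to) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            ForkJoinTask<Integer> future = pool.submit(new CountEvens(from, to));
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    // invoke blocks until the task is done; the common pool needs no explicit shutdown.
    static int viaInvoke(int from, int to) {
        return ForkJoinPool.commonPool().invoke(new CountEvens(from, to));
    }

    public static void main(String[] args) {
        System.out.println(viaSubmit(0, 1000)); // prints 500
        System.out.println(viaInvoke(0, 1000)); // prints 500
    }
}
```

submit is the better fit when the caller has other work to do before it needs the result; invoke is shorter when the caller would block right away anyway.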
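Tip: in Java 8, the parallelSort() methods of java.util.Arrays and the methods in the java.util.stream package also use Fork/Join. (Careful readers may have noticed that I use a parallel stream inside my Fork/Join task, so the Fork/Join here is actually redundant, because parallel streams already run on Fork/Join. This is only a demo with no practical use, though, so it does not matter.)
Quoting the official text: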
One such implementation, introduced in Java SE 8, is used by the java.util.Arrays class for its parallelSort() methods. These methods are similar to sort(), but leverage concurrency via the fork/join framework. Parallel sorting of large arrays is faster than sequential sorting when run on multiprocessor systems.
Another implementation of the fork/join framework is used by methods in the java.util.streams package, which is part of Project Lambda scheduled for the Java SE 8 release.
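Both built-in facilities mentioned in the quote can be tried without writing a ForkJoinTask at all. The following is a small sketch; the class BuiltInForkJoinDemo and its two helper methods are names invented for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class BuiltInForkJoinDemo {
    // Sorts a copy of the input with Arrays.parallelSort, leaving the input untouched.
    static int[] sortedCopy(int[] values) {
        int[] copy = Arrays.copyOf(values, values.length);
        Arrays.parallelSort(copy);
        return copy;
    }

    // The same filter as the demo task, expressed as a parallel stream.
    static List<String> containingA(List<String> words) {
        return words.parallelStream()
                .filter(s -> s.contains("a"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedCopy(new int[] {5, 3, 9, 1})));
        System.out.println(containingA(Arrays.asList("a", "b", "ba", "cd", "xa")));
    }
}
```

By default, parallel streams run on the common ForkJoinPool. For arrays below a minimum size, Arrays.parallelSort simply falls back to the sequential sort, so a speed-up should only be expected on large inputs.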
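The complete code is attached for future reference. (Note that the full version splits the list with subList instead of passing start and end indexes as in the excerpt above.)
1. Define an abstract class (intended for extension; it serves no real purpose in this example and can be omitted):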
import java.util.concurrent.RecursiveTask;

/**
 * Description: abstract base class for ForkJoin tasks
 * Designer: jack
 * Date: 2017/8/3
 * Version: 1.0.0
 */
public abstract class ForkJoinService<T> extends RecursiveTask<T> {
    @Override
    protected abstract T compute();
}
2. Define the task class
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/**
 * Description: ForkJoin task class
 * Designer: jack
 * Date: 2017/8/3
 * Version: 1.0.0
 */
public class ForkJoinTest extends ForkJoinService<List<String>> {

    private final int threshold;     // threshold below which the list is filtered directly
    private final List<String> list; // list to be split

    private ForkJoinTest(List<String> list, int threshold) {
        this.list = list;
        this.threshold = threshold;
    }

    @Override
    protected List<String> compute() {
        // When the list is shorter than the threshold, do the actual filtering
        if (list.size() < threshold) {
            return list.parallelStream().filter(s -> s.contains("a")).collect(Collectors.toList());
        } else {
            // Otherwise, split the big task into two smaller tasks
            int middle = list.size() / 2;
            List<String> leftList = list.subList(0, middle);
            List<String> rightList = list.subList(middle, list.size());
            ForkJoinTest left = new ForkJoinTest(leftList, threshold);
            ForkJoinTest right = new ForkJoinTest(rightList, threshold);
            // Run the two subtasks in parallel
            left.fork();
            right.fork();
            // Merge the results of the two subtasks; copy into a new ArrayList first,
            // because Collectors.toList() does not guarantee a mutable list
            List<String> join = new ArrayList<>(left.join());
            join.addAll(right.join());
            return join;
        }
    }

    /**
     * Creates a ForkJoinTest task.
     * A new task is returned on every call: a cached static instance would keep the
     * first list and threshold it was given, and a task that has already completed
     * would not run again when resubmitted.
     *
     * @param list      list to process
     * @param threshold threshold
     * @return a ForkJoinTest task
     */
    public static ForkJoinService<List<String>> getInstance(List<String> list, int threshold) {
        return new ForkJoinTest(list, threshold);
    }
}
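The compute method in this article forks both halves and then joins them, which leaves the current thread waiting while both halves run elsewhere. A commonly used alternative is to fork one half and compute the other half in the current thread. The sketch below shows that variant under a few assumptions: the class name ContainsAFilterTask and the helper filter are made up, the leaf uses a plain loop instead of a parallel stream, and the threshold is clamped to at least 1 so the recursion always terminates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical variant of the demo task, using the fork / compute / join pattern.
class ContainsAFilterTask extends RecursiveTask<List<String>> {
    private final List<String> list;
    private final int start;
    private final int end;
    private final int threshold;

    ContainsAFilterTask(List<String> list, int start, int end, int threshold) {
        this.list = list;
        this.start = start;
        this.end = end;
        this.threshold = Math.max(1, threshold); // guard against endless splitting
    }

    @Override
    protected List<String> compute() {
        if (end - start <= threshold) {
            // Leaf: filter sequentially; the parallelism comes from the task splitting.
            List<String> matches = new ArrayList<>();
            for (String s : list.subList(start, end)) {
                if (s.contains("a")) {
                    matches.add(s);
                }
            }
            return matches;
        }
        int middle = start + (end - start) / 2;
        ContainsAFilterTask left = new ContainsAFilterTask(list, start, middle, threshold);
        ContainsAFilterTask right = new ContainsAFilterTask(list, middle, end, threshold);
        left.fork();                                // queue the left half for another worker
        List<String> rightResult = right.compute(); // run the right half in this thread
        List<String> result = new ArrayList<>(left.join());
        result.addAll(rightResult);
        return result;
    }

    static List<String> filter(List<String> input, int threshold) {
        return ForkJoinPool.commonPool()
                .invoke(new ContainsAFilterTask(input, 0, input.size(), threshold));
    }

    public static void main(String[] args) {
        List<String> input = java.util.Arrays.asList("a", "b", "ba", "cd", "xa");
        System.out.println(filter(input, 2));
    }
}
```

The result keeps the order of the input because the left half's matches are always placed before the right half's.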
3. Runner class
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

/**
 * Description: Fork/Join runner class
 * Designer: jack
 * Date: 2017/8/3
 * Version: 1.0.0
 */
public class Test {
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        String[] strings = {"a", "ah", "b", "ba", "ab", "ac", "sd", "fd", "ar", "te", "se", "te",
                "sdr", "gdf", "df", "fg", "gh", "oa", "ah", "qwe", "re", "ty", "ui"};
        List<String> stringList = new ArrayList<>(Arrays.asList(strings));
        ForkJoinPool pool = new ForkJoinPool();
        ForkJoinService<List<String>> forkJoinService = ForkJoinTest.getInstance(stringList, 20);
        // Submit the splittable ForkJoinTask
        ForkJoinTask<List<String>> future = pool.submit(forkJoinService);
        System.out.println(future.get());
        // Shut down the thread pool
        pool.shutdown();
    }
}
Source code: http://git.oschina.net/jack90john/forkJoin