Overview
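ForkJoinPool is a thread pool class added in JDK 1.7. Fork/Join is a parallel implementation of the divide-and-conquer approach, a simple and efficient design technique that can deliver good parallel performance. Its goal is to help applications benefit from multiprocessor hardware by using all available computing power. The parallel operations added in JDK 1.8, such as Arrays.parallelSort and parallel streams, are built on it as well. Within the JUC framework as a whole, ForkJoinPool is considerably more complex than most other classes.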
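In the java.util.concurrent package, the Fork/Join framework is implemented mainly by ForkJoinPool, ForkJoinWorkerThread, and ForkJoinTask, which are closely intertwined. A ForkJoinPool runs only tasks of type ForkJoinTask. In practice it also accepts Runnable and Callable tasks, but these are wrapped in a ForkJoinTask before they run. ForkJoinWorkerThread is the worker thread that executes ForkJoinTasks.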
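The wrapping of plain tasks can be observed from the outside. The following is a minimal sketch, assuming a custom pool with the default thread factory; the class and method names are made up for illustration. It submits a Callable and a Runnable and checks that each comes back as a ForkJoinTask and runs on a ForkJoinWorkerThread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; only the java.util.concurrent calls are real API.
class PlainTaskSubmission {

    // Submits a plain Callable; the pool hands back a ForkJoinTask wrapper.
    static int submitCallable() {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            Callable<Integer> plain = () -> 6 * 7;
            ForkJoinTask<Integer> wrapped = pool.submit(plain);
            return wrapped.join(); // wait for the result
        } finally {
            pool.shutdown();
        }
    }

    // Submits a plain Runnable and records which kind of thread ran it.
    static boolean runsOnWorkerThread() {
        ForkJoinPool pool = new ForkJoinPool(2);
        try {
            AtomicBoolean onWorker = new AtomicBoolean();
            Runnable plain = () ->
                onWorker.set(Thread.currentThread() instanceof ForkJoinWorkerThread);
            ForkJoinTask<?> wrapped = pool.submit(plain);
            wrapped.join(); // wait until the Runnable has finished
            return onWorker.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("callable result: " + submitCallable());
        System.out.println("ran on ForkJoinWorkerThread: " + runsOnWorkerThread());
    }
}
```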
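ForkJoinPool implements divide-and-conquer in parallel: a task is recursively split into subtasks, so that the pool can make better use of system resources and apply as much of the available computing power as possible to the work.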
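As an illustration of recursive splitting, here is a minimal sketch that sums an array with a RecursiveTask. The class name RangeSumTask and the threshold of 1,000 elements are assumptions chosen for the example, not values taken from the JDK or from any tuning guide.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical example task: sums data[from, to) by divide and conquer.
class RangeSumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1_000; // illustrative cutoff for sequential work
    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    RangeSumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            // Small enough: solve directly.
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        // Too large: split in two halves.
        int mid = (from + to) >>> 1;
        RangeSumTask left = new RangeSumTask(data, from, mid);
        RangeSumTask right = new RangeSumTask(data, mid, to);
        left.fork();                      // schedule the left half asynchronously
        long rightSum = right.compute();  // work on the right half in this thread
        return left.join() + rightSum;    // combine the partial results
    }

    // Convenience entry point that runs the task in a private pool.
    static long sum(long[] data) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new RangeSumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data)); // prints 50005000
    }
}
```

Forking one half and computing the other in the current thread keeps the calling worker busy instead of parking it immediately in join().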
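Another feature of ForkJoinPool is its use of the work-stealing algorithm: every worker thread in the pool tries to find and execute tasks submitted to the pool or subtasks created by other active tasks, and blocks waiting for work if none exist. This makes ForkJoinPool efficient when tasks spawn subtasks and when many small tasks are submitted. With asyncMode set to true in the constructor, the pool is also well suited to event-style tasks that never need to be joined.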
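In ForkJoinPool, each worker thread (ForkJoinWorkerThread) owns a task queue (WorkQueue). A worker processes tasks from its own queue first (in LIFO order by default); when that queue is empty, it picks another queue at random and steals tasks from it in FIFO order.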
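The two access patterns on a work queue can be pictured with a toy model. The sketch below is a single-threaded illustration built on ArrayDeque, not the JDK's WorkQueue (which is a lock-free array-based structure); the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy, non-thread-safe model of one worker's queue. It only illustrates
// which end the owner and a thief take tasks from.
class StealingDequeModel {
    private final Deque<String> tasks = new ArrayDeque<>();

    // The owning worker pushes newly forked tasks at the top.
    void push(String task) {
        tasks.addLast(task);
    }

    // The owning worker takes the most recently pushed task (LIFO).
    String pop() {
        return tasks.pollLast();
    }

    // Another worker steals the oldest task from the base (FIFO).
    String steal() {
        return tasks.pollFirst();
    }

    public static void main(String[] args) {
        StealingDequeModel queue = new StealingDequeModel();
        queue.push("a");
        queue.push("b");
        queue.push("c");
        System.out.println("owner pops:   " + queue.pop());   // c
        System.out.println("thief steals: " + queue.steal()); // a
    }
}
```

In a divide-and-conquer workload the oldest task in a queue tends to be the largest remaining piece of work, so stealing from the base hands a thief a big chunk and reduces how often it has to steal again.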
Tasks in a ForkJoinPool fall into two kinds:
- Tasks submitted from outside the pool (submission tasks, for example those passed to execute or submit).
- Subtasks produced by fork (worker tasks).
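Both kinds of task are stored in the WorkQueue array, but they are never mixed in the same queue. Internally, ForkJoinPool uses a randomized hashing scheme to associate work queues with their worker threads.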
- Submission tasks go into the even-indexed slots of the WorkQueue array.
- Worker tasks go into the odd-indexed slots of the WorkQueue array.
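The even/odd split comes down to a little bit masking. The sketch below is modeled loosely on the JDK 8 source as I read it (a mask named SQMASK with value 0x007e for submission queues, and `(seed << 1) | 1` for worker queues); treat the constants and helper names as assumptions, and note that later JDK versions organize this differently.

```java
// Hypothetical helper class; not part of the JDK.
class QueueIndexSketch {
    // Assumed to mirror JDK 8's SQMASK: the lowest bit is cleared, so any
    // index masked with it is even (at most 64 submission slots).
    static final int SQMASK = 0x007e;

    // Slot for an externally submitted task. 'probe' stands in for the
    // submitting thread's random probe value; tableSize is a power of two >= 2.
    static int submissionIndex(int probe, int tableSize) {
        return probe & (tableSize - 1) & SQMASK;
    }

    // Slot for a new worker's queue. Setting the lowest bit before masking
    // forces the index to be odd.
    static int workerIndex(int seed, int tableSize) {
        return ((seed << 1) | 1) & (tableSize - 1);
    }

    public static void main(String[] args) {
        int tableSize = 16;
        for (int v = 0; v < 5; v++) {
            System.out.println("value " + v
                + " -> submission slot " + submissionIndex(v, tableSize)
                + ", worker slot " + workerIndex(v, tableSize));
        }
    }
}
```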
To make sense of the ForkJoinPool source code, start by reading the long comment at the top of the class:
/**
* An {@link ExecutorService} for running {@link ForkJoinTask}s.
* A {@code ForkJoinPool} provides the entry point for submissions
* from non-{@code ForkJoinTask} clients, as well as management and
* monitoring operations.
*
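* Note: ForkJoinPool is a thread pool for running ForkJoinTasks.
* It is also the entry point for submitting tasks that are not ForkJoinTasks, and it exposes management and monitoring operations.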
*
* <p>A {@code ForkJoinPool} differs from other kinds of {@link
* ExecutorService} mainly by virtue of employing
* <em>work-stealing</em>: all threads in the pool attempt to find and
* execute tasks submitted to the pool and/or created by other active
* tasks (eventually blocking waiting for work if none exist). This
* enables efficient processing when most tasks spawn other subtasks
* (as do most {@code ForkJoinTask}s), as well as when many small
* tasks are submitted to the pool from external clients. Especially
* when setting <em>asyncMode</em> to true in constructors, {@code
* ForkJoinPool}s may also be appropriate for use with event-style
* tasks that are never joined.
*
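* Note: what sets ForkJoinPool apart from other ExecutorService
* implementations is its use of work stealing.
* Every thread in the pool tries to find and run tasks submitted to the pool
* or subtasks created by other active tasks, and eventually blocks if there are none.
* This is efficient when most tasks spawn subtasks,
* and when many small tasks are submitted from external clients. With asyncMode set to true in the constructor,
* the pool is also suitable for event-style tasks that are never joined.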
*
* <p>A static {@link #commonPool()} is available and appropriate for
* most applications. The common pool is used by any ForkJoinTask that
* is not explicitly submitted to a specified pool. Using the common
* pool normally reduces resource usage (its threads are slowly
* reclaimed during periods of non-use, and reinstated upon subsequent
* use).
*
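* Note: the static commonPool() method returns a shared pool that suits most applications.
* Any ForkJoinTask that is not explicitly submitted to a specific pool runs in the common pool.
* Using the common pool usually reduces resource usage
* (its threads are slowly reclaimed while idle and reinstated on later use).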
*
* <p>For applications that require separate or custom pools, a {@code
* ForkJoinPool} may be constructed with a given target parallelism
* level; by default, equal to the number of available processors.
* The pool attempts to maintain enough active (or available) threads
* by dynamically adding, suspending, or resuming internal worker
* threads, even if some tasks are stalled waiting to join others.
* However, no such adjustments are guaranteed in the face of blocked
* I/O or other unmanaged synchronization. The nested {@link
* ManagedBlocker} interface enables extension of the kinds of
* synchronization accommodated.
*
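* Note: applications that need a separate or custom pool can construct a
* ForkJoinPool with a target parallelism level, which defaults to the number of available processors.
* Even when some tasks are stalled waiting to join other tasks,
* the pool dynamically adds, suspends, or resumes worker threads
* to keep enough threads active (or available).
* No such adjustment is guaranteed, however, for blocked I/O or other unmanaged synchronization.
* The nested ManagedBlocker interface extends the kinds of synchronization the pool can accommodate.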
*
* <p>In addition to execution and lifecycle control methods, this
* class provides status check methods (for example
* {@link #getStealCount}) that are intended to aid in developing,
* tuning, and monitoring fork/join applications. Also, method
* {@link #toString} returns indications of pool state in a
* convenient form for informal monitoring.
*
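* Note: besides execution and lifecycle control methods, ForkJoinPool provides status check methods
* (such as getStealCount) that help with developing, tuning, and monitoring fork/join applications.
* The toString method also reports the pool state in a form convenient for informal monitoring.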
*
* <p>As is the case with other ExecutorServices, there are three
* main task execution methods summarized in the following table.
* These are designed to be used primarily by clients not already
* engaged in fork/join computations in the current pool. The main
* forms of these methods accept instances of {@code ForkJoinTask},
* but overloaded forms also allow mixed execution of plain {@code
* Runnable}- or {@code Callable}- based activities as well. However,
* tasks that are already executing in a pool should normally instead
* use the within-computation forms listed in the table unless using
* async event-style tasks that are not usually joined, in which case
* there is little difference among choice of methods.
*
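* Note: like other ExecutorService implementations, ForkJoinPool has three main task execution methods, summarized in the table below.
* They are intended mainly for clients that are not already running fork/join computations in the current pool.
* The main forms of these methods accept ForkJoinTask instances,
* and overloaded forms also allow plain Runnable- or Callable-based tasks to be mixed in.
* Tasks already executing in a pool should normally use the within-computation forms listed in the table instead.
*
* The table below is written in HTML: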
*
* <table BORDER CELLPADDING=3 CELLSPACING=1>
* <caption>Summary of task execution methods</caption>
* <tr>
* <td></td>
* <td ALIGN=CENTER> <b>Call from non-fork/join clients</b></td>
* <td ALIGN=CENTER> <b>Call from within fork/join computations</b></td>
* </tr>
* <tr>
* <td> <b>Arrange async execution</b></td>
* <td> {@link #execute(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#fork}</td>
* </tr>
* <tr>
* <td> <b>Await and obtain result</b></td>
* <td> {@link #invoke(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#invoke}</td>
* </tr>
* <tr>
* <td> <b>Arrange exec and obtain Future</b></td>
* <td> {@link #submit(ForkJoinTask)}</td>
* <td> {@link ForkJoinTask#fork} (ForkJoinTasks <em>are</em> Futures)</td>
* </tr>
* </table>
*
* <p>The common pool is by default constructed with default
* parameters, but these may be controlled by setting three
* {@linkplain System#getProperty system properties}:
* <ul>
* <li>{@code java.util.concurrent.ForkJoinPool.common.parallelism}
* - the parallelism level, a non-negative integer
* <li>{@code java.util.concurrent.ForkJoinPool.common.threadFactory}
* - the class name of a {@link ForkJoinWorkerThreadFactory}
* <li>{@code java.util.concurrent.ForkJoinPool.common.exceptionHandler}
* - the class name of a {@link UncaughtExceptionHandler}
* </ul>
*
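* Note: the common pool is built with default parameters, which can be overridden through three system properties:
* 1. java.util.concurrent.ForkJoinPool.common.parallelism
* the parallelism level, a non-negative integer
* 2. java.util.concurrent.ForkJoinPool.common.threadFactory
* the class name of a ForkJoinWorkerThreadFactory
* 3. java.util.concurrent.ForkJoinPool.common.exceptionHandler
* the class name of an UncaughtExceptionHandler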
*
* If a {@link SecurityManager} is present and no factory is
* specified, then the default pool uses a factory supplying
* threads that have no {@link Permissions} enabled.
* The system class loader is used to load these classes.
* Upon any error in establishing these settings, default parameters
* are used. It is possible to disable or limit the use of threads in
* the common pool by setting the parallelism property to zero, and/or
* using a factory that may return {@code null}. However doing so may
* cause unjoined tasks to never be executed.
*
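* Note: if a SecurityManager is present and no factory is specified,
* the default pool uses a factory that supplies threads with no Permissions enabled.
* The system class loader is used to load these classes.
* If any error occurs while establishing these settings, default parameters are used.
* Setting the parallelism property to zero, or using a factory that may return null,
* disables or limits the use of threads in the common pool.
* Doing so, however, may cause unjoined tasks never to be executed.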
*
* <p><b>Implementation notes</b>: This implementation restricts the
* maximum number of running threads to 32767. Attempts to create
* pools with greater than the maximum number result in
* {@code IllegalArgumentException}.
*
* Note: this implementation caps the number of running threads at 32767.
* Attempting to create a pool with more than that maximum results in an IllegalArgumentException.
*
* <p>This implementation rejects submitted tasks (that is, by throwing
* {@link RejectedExecutionException}) only when the pool is shut down
* or internal resources have been exhausted.
*
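* Note: this implementation rejects submitted tasks (by throwing RejectedExecutionException)
* only when the pool is shut down or internal resources have been exhausted.
*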
* @since 1.7
* @author Doug Lea
*/
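The table of task execution methods in the comment above can be exercised with a small program. This is a sketch under stated assumptions: the Fib task, the pool size of 4, and the input 15 are all made up for the example, and naive Fibonacci is used only because it forks readily, not because it is a sensible way to compute the number.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

// Hypothetical demo class covering each cell of the Javadoc table.
class ExecutionMethodsDemo {

    // Illustrative task: naive recursive Fibonacci.
    static class Fib extends RecursiveTask<Integer> {
        private final int n;

        Fib(int n) {
            this.n = n;
        }

        @Override
        protected Integer compute() {
            if (n <= 1) {
                return n;
            }
            Fib first = new Fib(n - 1);
            first.fork();                           // within a computation: arrange async execution
            int second = new Fib(n - 2).invoke();   // within a computation: await and obtain result
            return second + first.join();           // a forked task is itself a Future
        }
    }

    // Runs the same computation through invoke, submit, and execute.
    static int[] runAll() {
        ForkJoinPool pool = new ForkJoinPool(4);
        try {
            // External client: await and obtain the result.
            int viaInvoke = pool.invoke(new Fib(15));

            // External client: arrange execution and obtain a Future.
            ForkJoinTask<Integer> future = pool.submit(new Fib(15));
            int viaSubmit = future.join();

            // External client: arrange async execution, then read the task itself.
            Fib async = new Fib(15);
            pool.execute(async);
            int viaExecute = async.join();

            // Informal monitoring, as mentioned in the class comment.
            System.out.println("steals so far: " + pool.getStealCount());
            System.out.println(pool);

            return new int[] {viaInvoke, viaSubmit, viaExecute};
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] results = runAll();
        System.out.println(results[0] + " " + results[1] + " " + results[2]); // 610 610 610
    }
}
```

The steal count and the toString output vary from run to run, so they are useful for a rough look at pool activity rather than for exact assertions.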