I. Preface
Thread pools are a common source of confusion for beginners, and in this class we will go over some of the most common misconceptions about them. To illustrate these points clearly, let's use a specific thread pool definition as our running example:
ThreadPoolExecutor executor = new ThreadPoolExecutor(20, 50, 100L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(100)); // core 20, max 50, keep-alive 100 (the time unit was omitted in the original; MILLISECONDS is assumed here)
II. Misconceptions about the timing of thread pool creation
- concern: If 120 tasks are submitted to this thread pool (assuming none of them completes while submission is in progress), how many active threads will there normally be, and how many tasks will be waiting in the queue?
The key to answering this question lies in understanding the thread pool's underlying mechanism. When a task is submitted, the pool first creates a new core thread, up to the core pool size (20 here). Once all core threads are busy, further tasks are placed in the queue (capacity 100 here). Only when the queue is full does the pool create additional threads beyond the core, up to the maximum pool size (50 here); and if the pool is already at its maximum and the queue is still full, the rejection policy is applied. For the pool above, 120 tasks therefore yield 20 active threads and 100 queued tasks: the queue never overflows, so no threads beyond the core are created.
- suggestion: For gateway systems facing heavy front-end traffic, a useful optimization is to set the core pool size equal to the maximum pool size. This avoids the service-response jitter ("burrs") caused by threads being repeatedly created and destroyed whenever the system hovers around the scaling threshold. The reasoning is analogous to the JVM recommendation of setting -Xms (initial heap size) and -Xmx (maximum heap size) to the same value: both practices reduce performance fluctuations caused by dynamic resource adjustment and keep the system running steadily.
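The fill order described above (core threads first, then the queue) can be verified with a minimal runnable sketch. It submits 120 tasks that all block on a latch, so none completes during submission; the class and variable names are illustrative:

```java
import java.util.concurrent.*;

public class PoolFillDemo {
    public static void main(String[] args) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                20, 50, 100L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(100));
        CountDownLatch hold = new CountDownLatch(1);
        for (int i = 0; i < 120; i++) {
            // each task blocks until released, so none exits during submission
            executor.execute(() -> {
                try { hold.await(); } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        System.out.println("threads: " + executor.getPoolSize());     // 20 (core threads only)
        System.out.println("queued:  " + executor.getQueue().size()); // 100 (queue exactly full)
        hold.countDown();
        executor.shutdown();
    }
}
```

Submitting a 121st task under these conditions would create an extra thread beyond the core; only after the pool reached 50 threads with a full queue would the rejection policy fire.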
III. Is more threads better?
1. More threads are not always better
The specific reasons can be attributed to the following:
- Each thread consumes memory. On a typical 64-bit HotSpot JVM, the default stack size of each thread is about 1 MB (adjustable with the -Xss startup parameter). An excessive number of threads therefore significantly increases memory consumption and hurts the efficient use of system resources.
- If the sum of the time it takes to create and destroy threads exceeds the time it takes to actually perform the task, then creating additional threads is pointless and adds to the burden on the system.
- Too many threads may also cause the operating system to make frequent thread context switches, which not only increases CPU overhead, but also reduces the amount of time the CPU has to efficiently execute user code, which adversely affects system performance.
2. What is the appropriate number of threads?
According to Little's Law, the number of requests in a system equals the product of the request arrival rate and the average time each request spends in the system.
From this, an appropriate thread pool size can be estimated with the following formula:
Thread pool size = ((thread IO time + thread CPU time) / thread CPU time) * number of CPUs
3. An example:
When the server has an 8-core CPU, if a task thread's CPU execution time is 20 ms and its waiting time (e.g., network IO, disk IO) is 80 ms, the theoretical optimal number of threads is (waiting time + CPU time) / CPU time * CPU cores = (80 ms + 20 ms) / 20 ms * 8 = 40. That is, 40 threads may achieve optimal performance when other system load and resource contention are ignored. This conclusion is theoretical, however, and needs to be adjusted against the system's actual behavior in deployment.
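The arithmetic above can be captured in a small helper; this is a sketch, and the method name is made up for illustration:

```java
public class PoolSizing {
    // theoretical optimum: ((IO wait + CPU time) / CPU time) * number of cores
    static int optimalThreads(double ioMillis, double cpuMillis, int cpuCores) {
        return (int) Math.round((ioMillis + cpuMillis) / cpuMillis * cpuCores);
    }

    public static void main(String[] args) {
        // 80 ms of IO wait, 20 ms of CPU work, 8 cores -> 40 threads
        System.out.println(optimalThreads(80, 20, 8)); // prints 40
    }
}
```

Note the pure CPU-bound limit of the formula: with no IO wait it degenerates to one thread per core.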
It is worth noting that a complex system often deploys multiple thread pools, which compete with each other for CPU, network bandwidth, memory and other valuable resources. Therefore, the setting of the optimal number of threads also needs to take into account the overall load condition of the system, resource utilization, and the actual execution characteristics of each task, and be verified and optimized through performance testing.
IV. What is the appropriate thread pool queue length setting?
Improper thread pool queue configuration can lead to serious consequences, such as delayed execution of tasks and users not being able to get the results in time, or OOM (OutOfMemoryError) errors due to memory exhaustion. To avoid these problems, here are some suggestions on how to set the queue length reasonably:
- Explicitly specify the queue size: Avoid using the default maximum value (e.g., Integer.MAX_VALUE), as this can lead to unlimited memory usage and eventually trigger a memory overflow. Explicitly setting a reasonable queue length limit is the key to preventing such problems.
- Adjusting queue size based on actual scenarios: For tasks without strict runtime limitations, although a larger queue can be set to accommodate more tasks, system stability and task protection under abnormal conditions should be considered at the same time, such as system reboot that may lead to task loss. Therefore, when increasing the queue size, you need to weigh the task persistence and system security.
- Tasks for C-end users need a finely calculated queue size: for tasks with strict response-time requirements, such as services for C-end (consumer) users, queue capacity must be computed from the task execution speed and the service timeout. For example, if there are 20 core threads, a single task takes 500 ms, and the promised response timeout is 2000 ms, the queue size can be calculated as 20 * ((2000 / 500) - 1) = 60. This ensures that queued tasks still have a chance to be processed before the timeout, while avoiding the request-timeout failures caused by an over-long queue, thus preserving the effectiveness and timeliness of the service response.
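The queue-sizing rule from the last bullet can be expressed directly; this is a sketch and `queueCapacity` is an illustrative name:

```java
public class QueueSizing {
    // capacity = coreThreads * (timeout / taskTime - 1): any task admitted
    // beyond this could not finish before the promised timeout anyway
    static int queueCapacity(int coreThreads, long taskMillis, long timeoutMillis) {
        return (int) (coreThreads * (timeoutMillis / taskMillis - 1));
    }

    public static void main(String[] args) {
        // 20 core threads, 500 ms per task, 2000 ms timeout -> capacity 60
        System.out.println(queueCapacity(20, 500, 2000)); // prints 60
    }
}
```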
V. The discard strategy has its pitfalls
Issue 1: With the rejection policy set to DiscardPolicy or DiscardOldestPolicy, calling get() on a Future object can block indefinitely
In Java concurrent programming, a thread pool (e.g., ExecutorService) is a powerful tool for managing a set of concurrently executing threads. When the thread pool reaches its maximum capacity, however, newly submitted tasks must still be dealt with somehow, and that behavior is defined by a rejection policy (RejectedExecutionHandler). DiscardPolicy and DiscardOldestPolicy are two common rejection policies: the former silently discards the new task, the latter discards the oldest task in the queue, in both cases without any notification or further processing.
- problem analysis
- DiscardPolicy: When the thread pool is unable to accept a new task, this policy silently discards the newly submitted task without throwing an exception or returning any error. This means that if you rely on the execution results of tasks and don't monitor their submission status in other ways, you may lose important tasks without realizing it.
- DiscardOldestPolicy: unlike DiscardPolicy, this policy will try to make room for new tasks by discarding the longest waiting task in the queue. However, again, it does not provide any feedback to the task submitter unless you have additional mechanisms to track the execution status of the task.
- Future object's get() method blocking problem
When using either of these rejection strategies and there is a rejected task, if you try to call the get() method to get the result via a Future object obtained by previously submitting the task, you may run into a situation where the thread is blocked indefinitely. This is because the get() method waits for the task to complete and returns its result, but if the task is never actually executed (because it was discarded), then the calling thread waits forever unless a timeout is set.
- Setting a timeout: A timeout should always be specified when calling get() (i.e., use get(long timeout, TimeUnit unit)) to prevent threads from waiting indefinitely.
try {
    Future<Result> future = executor.submit(task);
    Result result = future.get(10, TimeUnit.SECONDS); // wait up to 10 seconds for the result
    // process the result
} catch (TimeoutException e) {
    // handle the timeout: the task was rejected, or it is simply taking too long to execute
} catch (InterruptedException | ExecutionException e) {
    // handle other possible exceptions
}
- Monitor thread pool status: Regularly monitor the state of the thread pool (e.g., queue size, number of active threads) so that you can take action when necessary, such as adjusting the pool size or optimizing the task-processing logic.
- Use other rejection policies: If task loss is unacceptable, consider CallerRunsPolicy (the rejected task is executed directly in the submitting thread) or a custom rejection policy that provides explicit feedback or handling logic.
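As a sketch of that last suggestion, the rejection policy is simply passed as the final constructor argument; the pool parameters are reused from the preface example and the TimeUnit is assumed:

```java
import java.util.concurrent.*;

public class CallerRunsExample {
    public static void main(String[] args) {
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                20, 50, 100L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(100),
                // when the pool and queue are both full, the submitting thread
                // runs the task itself instead of the task being silently dropped
                new ThreadPoolExecutor.CallerRunsPolicy());
        executor.execute(() ->
                System.out.println("ran on " + Thread.currentThread().getName()));
        executor.shutdown();
    }
}
```

CallerRunsPolicy also provides natural back-pressure: while the submitting thread is busy running a rejected task, it cannot submit more work.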
Issue 2: Task exceptions go unnoticed when get() is never called on the Future object
When a task is submitted with submit(), the method returns a Future object representing the result of the asynchronous computation. However, if an exception is thrown during task execution and you neither catch it inside the task nor retrieve the result by calling get(), the exception is never seen outside the thread pool.
1. Detailed description of the problem:
- Exception lost: If an exception occurs in the task and is not caught, it is wrapped in an ExecutionException and thrown when get() is called. But if get() is never called, the exception is silently lost, leaving you no way of knowing why the task execution failed.
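A short sketch of this failure mode: the first submission's exception vanishes without a trace, while calling get() on the second surfaces the same exception as an ExecutionException (class and variable names are illustrative):

```java
import java.util.concurrent.*;

public class LostExceptionDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Runnable failing = () -> { throw new IllegalStateException("boom"); };

        pool.submit(failing); // exception is swallowed: no stack trace, nothing printed

        Future<?> future = pool.submit(failing);
        try {
            future.get(); // the swallowed exception resurfaces here
        } catch (ExecutionException e) {
            System.out.println("task failed: " + e.getCause());
        }
        pool.shutdown();
    }
}
```

Note that this swallowing behavior is specific to submit(); a task passed to execute() would instead reach the thread's UncaughtExceptionHandler.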
2. Recommendations and examples:
- Catching exceptions in tasks: Use try-catch blocks inside tasks to catch and handle exceptions that may occur. This can be done by printing logs, sending alerts, etc. so that if the task fails it can be detected and handled in time.
Runnable task = () -> {
    try {
        // execute the task logic
    } catch (Exception e) {
        // catch the exception and handle it, e.g. by logging or alerting
        log.error("Task execution failed", e); // assumes a logger named `log`; the original omitted the identifier
    }
};
- Call get() and specify a timeout: Even if you have already handled exceptions inside the task, it is still advisable to call get(long timeout, TimeUnit unit) to retrieve the result and to handle any ExecutionException that may be thrown, ensuring all exceptions are dealt with properly.
Future<?> future = executor.submit(task);
try {
    future.get(10, TimeUnit.SECONDS); // wait for the task to finish; surfaces any exception it threw
} catch (TimeoutException | InterruptedException | ExecutionException e) {
    // handle the exception
}
VI. Beware of Thread Pool Sharing for Multiple Services
Having multiple lines of business share a single thread pool harbors several pitfalls:
- The difficulty of accommodating the unique needs of each line of business makes thread pool optimization complex and inefficient;
- Once there is a problem with task processing in one business, its inefficiency or incorrect processing may spill over and affect the efficiency and stability of task execution in other lines of business;
- In the problem troubleshooting phase, due to thread pool sharing, it is difficult to quickly locate the problem in a specific line of business directly by conventional means such as thread pool name.
Therefore, it is recommended to adopt the thread pool isolation strategy to ensure that the tasks of each line of business are processed in an independent thread pool environment from the beginning of the design, so as to ensure that they do not interfere with each other and run stably.
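One lightweight way to implement such isolation, and to make thread names point back at the owning business line during troubleshooting, is a separate pool per business line built with a naming ThreadFactory. This is a sketch; all names and pool parameters are illustrative:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class IsolatedPools {
    // factory that stamps each thread with its business line's name,
    // so thread dumps and logs immediately identify the owner
    static ThreadFactory named(String prefix) {
        AtomicInteger n = new AtomicInteger(1);
        return r -> new Thread(r, prefix + "-" + n.getAndIncrement());
    }

    // one independent pool per business line: a stall in one cannot starve the other
    static final ExecutorService ORDER_POOL = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(200), named("order"));
    static final ExecutorService PAYMENT_POOL = new ThreadPoolExecutor(
            10, 10, 0L, TimeUnit.MILLISECONDS,
            new LinkedBlockingQueue<>(200), named("payment"));

    public static void main(String[] args) {
        ORDER_POOL.execute(() ->
                System.out.println(Thread.currentThread().getName())); // prints order-1
        ORDER_POOL.shutdown();
        PAYMENT_POOL.shutdown();
    }
}
```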
VII. Other potential risks
- Stale data when ThreadLocal is combined with a thread pool: Because threads in a pool are reused, data stored in a ThreadLocal by one task may be incorrectly read or modified when the thread is later assigned a different task, leading to wrong data.
- Blocking caused by nested parent-child thread pools in the line of business: In complex business logic, if there are parent-child thread pools nested with each other, the normal operation of the parent thread pool may be affected by the blocking or abnormality of the child thread pool, which may even lead to the blocking of the entire business process at a single point, affecting the overall performance and stability of the system.
- Idle and Wasteful Thread Pool Resources: In some business scenarios, thread pools are not fully utilized after initialization, especially when some business functions are offline or adjusted, these idle thread pools still occupy system resources, resulting in unnecessary waste of resources.
- Inefficiency due to real-time creation of thread pools: Creating thread pools in real-time during request processing not only fails to take advantage of thread reuse, but also may increase system overhead due to frequent creation and destruction of threads. It is recommended to define the thread pool as a static shared variable, which is created at the application startup or initialization phase, so that it can be reused throughout the application lifecycle, thus improving performance and resource utilization.
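For the first risk above, the usual defense is to clear ThreadLocal state in a finally block before the pooled thread is reused. A minimal sketch, with illustrative names:

```java
import java.util.concurrent.*;

public class ThreadLocalCleanup {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.submit(() -> {
            CONTEXT.set("user-A");
            try {
                // ... task logic that reads CONTEXT.get() ...
            } finally {
                CONTEXT.remove(); // clear before the thread goes back to the pool
            }
        }).get();
        // the same pooled thread no longer sees the previous task's value
        String leaked = pool.submit(() -> CONTEXT.get()).get();
        System.out.println(leaked); // null
        pool.shutdown();
    }
}
```

Without the remove() call, the second task running on the same pooled thread would observe "user-A", the classic cross-task contamination described above.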