BOE interview: How does CMS work?

The CMS (Concurrent Mark Sweep) garbage collector is designed to keep pause times as short as possible, which made it one of the most widely used garbage collectors before JDK 9. So why are CMS pauses so short, and how does the collector actually work? Let's take a look.

How CMS works

CMS achieves short pause times because of the way it works: the short pauses are a direct consequence of its design.

"Shortest pause time" means that the application is paused for as short a time as possible during garbage collection. That is, the Stop The World (STW) phases should be as brief as possible, because the shorter the STW, the sooner the application threads resume and the more responsive the application stays.

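As a rough way to see how much time a JVM spends on garbage collection, the sketch below reads the standard GarbageCollectorMXBean counters before and after an allocation loop. The class name GcTimeProbe and the loop size are made up for illustration. Also note that getCollectionTime() reports approximate accumulated collection time, which may not correspond exactly to STW pause time; GC logs are the more precise source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeProbe {
    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalGcMillis();
        // Allocate short-lived garbage so the collector has something to do.
        // (Hypothetical workload; a JIT may optimize some of it away.)
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        long after = totalGcMillis();
        System.out.println("Approximate GC time during loop (ms): " + (after - before));
    }
}
```

The printed value depends on the JVM, the collector in use, and the heap size, so it is only meaningful for comparing runs on the same setup.
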
So how does CMS work?

At a high level, a CMS garbage collection consists of the following two steps:

Marking
Clearing (sweeping)

In the above process, the "marking" phase takes a lot of time, while the "clearing" phase takes less time (because clearing only needs to reclaim the garbage objects that marking has already identified).

So what can be done to improve overall efficiency and keep the pause time as short as possible?

The CMS designers reasoned as follows: since the "marking" phase is the time-consuming one, split it into stages and, where possible, let it run alongside the application threads. The less of the work that needs STW (a global stop), the shorter the pause time.

Therefore, the designers of CMS split the "marking" phase of garbage collection into the following three phases:

Initial marking (STW): Marks only the objects that are directly referenced by GC Roots, which is very fast.
Concurrent marking (runs concurrently with user threads): Starting from the objects directly referenced by GC Roots, traverses and marks the rest of the reachable object graph, which takes longer.
Remarking (STW): Corrects the marks for objects whose references changed while user threads were running during the previous "concurrent marking" step.
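
The three marking phases can be sketched with a toy object graph. This is a simplified, single-threaded simulation and not how HotSpot implements CMS: the class MarkPhasesSketch, the object names, and the "dirty" set that records objects written to during concurrent marking are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MarkPhasesSketch {
    final Map<String, List<String>> refs = new HashMap<>(); // object -> objects it references
    final Set<String> marked = new HashSet<>();
    final Set<String> dirty = new HashSet<>(); // objects written to during concurrent marking

    void addObject(String name, String... targets) {
        refs.put(name, new ArrayList<>(Arrays.asList(targets)));
    }

    // Phase 1 (STW in a real collector): mark only what GC Roots reference directly.
    void initialMark(List<String> rootTargets) {
        marked.addAll(rootTargets);
    }

    // Shared helper: mark everything reachable from the given starting objects.
    void traceFrom(List<String> start) {
        Deque<String> work = new ArrayDeque<>(start);
        while (!work.isEmpty()) {
            String obj = work.pop();
            for (String child : refs.getOrDefault(obj, Collections.<String>emptyList())) {
                if (marked.add(child)) {
                    work.push(child);
                }
            }
        }
    }

    // Phase 2 (concurrent in a real collector): traverse from the initially marked objects.
    void concurrentMark() {
        traceFrom(new ArrayList<>(marked));
    }

    // Simulates a user thread allocating a new object and storing it into 'holder'
    // after 'holder' was already scanned; the holder is remembered as dirty.
    void userThreadStoresNew(String holder, String newObject) {
        refs.put(newObject, new ArrayList<>());
        refs.get(holder).add(newObject);
        dirty.add(holder);
    }

    // Phase 3 (STW in a real collector): rescan the dirty objects that are live.
    void remark() {
        List<String> liveDirty = new ArrayList<>();
        for (String d : dirty) {
            if (marked.contains(d)) {
                liveDirty.add(d);
            }
        }
        traceFrom(liveDirty);
        dirty.clear();
    }

    public static void main(String[] args) {
        MarkPhasesSketch gc = new MarkPhasesSketch();
        gc.addObject("A", "B");
        gc.addObject("B", "C");
        gc.addObject("C");
        gc.addObject("E"); // unreachable: garbage
        gc.initialMark(Arrays.asList("A"));
        gc.concurrentMark();
        gc.userThreadStoresNew("C", "D"); // happens "during" concurrent marking
        gc.remark();
        System.out.println(new TreeSet<>(gc.marked)); // prints [A, B, C, D]
    }
}
```

Without the remark step, D would stay unmarked even though it is reachable through C, which is why a short second pause is needed to correct the marks.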

The specific implementation process is described below:

Across the whole marking process, only initial marking and remarking need STW. Initial marking only marks the objects directly referenced by GC Roots, and remarking only corrects the marks that were invalidated by user threads during the "concurrent marking" phase, so both phases finish quickly and the total STW pause time is very short.

To shorten pauses further, once the "marking" phase is complete CMS also clears garbage concurrently: the GC threads run alongside the user threads, so this step needs no STW and adds nothing to the pause time.

The complete CMS execution process is as follows:

Initial marking (STW): Marks only the objects that are directly referenced by GC Roots, which is very fast.
Concurrent marking (runs concurrently with user threads): Starting from the objects directly referenced by GC Roots, traverses and marks the rest of the reachable object graph, which takes longer.
Remarking (STW): Corrects the marks for objects whose references changed while user threads were running during the previous "concurrent marking" step.
Concurrent clearing (runs concurrently with user threads): Reclaims the garbage objects, that is, the objects left unmarked, while the user threads keep running.
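
The final clearing step can be illustrated with a minimal sketch. It assumes a toy heap modelled as an array of slots and a set of already-marked object names; the class SweepSketch and the slot layout are hypothetical, and the real collector does this work on actual memory while user threads run.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SweepSketch {
    /**
     * Frees every slot whose object was not marked.
     * Surviving objects stay where they are; nothing is moved.
     */
    static String[] sweep(String[] heapSlots, Set<String> marked) {
        String[] after = heapSlots.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && !marked.contains(after[i])) {
                after[i] = null; // slot becomes free space
            }
        }
        return after;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "E", "B", "F", "C", "D"};
        Set<String> marked = new HashSet<>(Arrays.asList("A", "B", "C", "D"));
        System.out.println(Arrays.toString(sweep(heap, marked)));
        // prints [A, null, B, null, C, D]
    }
}
```

Because survivors are left in place, the freed slots end up scattered between live objects rather than forming one contiguous free region.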

The execution process is shown in the following figure:

Post-lesson Reflections

Although CMS can achieve the shortest possible pause time, it has some drawbacks, such as the problem of memory fragmentation. Why does it suffer from memory fragmentation? And how can we solve the memory fragmentation problem?

This article has been included in my interview mini-site, which contains modules such as Redis, JVM, Concurrency, MySQL, Spring, Spring MVC, Spring Boot, Spring Cloud, MyBatis, Design Patterns, Message Queuing and more.