
Learn about the various garbage collectors used in the early days

Popularity:649 ℃/2024-11-07 08:17:29

The HotSpot VM provides 7 garbage collectors; in the usual diagram of them, a line connecting two collectors indicates that the pair can be used together.

Parallel collection: multiple garbage collection threads work in parallel, while the user threads remain in a waiting state.

Concurrent collection: the user threads and the garbage collection threads work at the same time (not necessarily in parallel; they may alternate). The user program keeps running while the garbage collector runs on another CPU.

Throughput: the ratio of the CPU time spent running user code to the total CPU time consumed, that is, Throughput = Time to run user code / (Time to run user code + Garbage collection time). For example, if the VM runs for a total of 100 minutes and garbage collection takes 1 minute, the throughput is 99%.
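The throughput formula can be checked with a few lines of Java. This is only a minimal sketch: the class and method names (GcThroughput, throughput) are made up for illustration, and the numbers are the 100-minute example from the text.

```java
// Hypothetical helper; the class and method names are illustrative only.
class GcThroughput {

    /** Throughput = user time / (user time + GC time), returned as a fraction in [0, 1]. */
    static double throughput(double userCodeMinutes, double gcMinutes) {
        double total = userCodeMinutes + gcMinutes;
        if (total <= 0) {
            throw new IllegalArgumentException("total run time must be positive");
        }
        return userCodeMinutes / total;
    }

    public static void main(String[] args) {
        // 100 minutes in total: 99 running user code, 1 spent in garbage collection.
        System.out.printf("throughput = %.0f%%%n", throughput(99, 1) * 100);
    }
}
```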

Serial / Serial Old Collector

Features: single-threaded, simple and efficient (compared with the single-threaded mode of other collectors); the young-generation Serial collector uses the copying algorithm. In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so by concentrating on garbage collection it naturally achieves the highest single-threaded collection efficiency. While it collects, it must suspend all other worker threads until it finishes (Stop The World). Parameter: -XX:+UseSerialGC, which selects Serial for the young generation and Serial Old for the old generation.
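One way to see which collectors a startup flag actually selected is to ask the running JVM through the standard java.lang.management API. The sketch below is illustrative only: the class name ActiveCollectors and the helper collectorNames are made up, while GarbageCollectorMXBean and ManagementFactory are part of the JDK.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang.management calls are real JDK API.
class ActiveCollectors {

    /** Names of the collectors the current JVM is running with. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) since JVM start.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Started with -XX:+UseSerialGC, a HotSpot JVM typically reports the pair Copy and MarkSweepCompact. The exact names are an implementation detail and may differ between JVM versions.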

Safepoint: all other threads are made to stop at such a point, because garbage collection moves object addresses and a running thread might otherwise fail to find a moved object. Because the collector is serial, there is only one garbage collection thread, and while it performs the collection the other threads are blocked.

Serial Old is the old-generation version of the Serial collector; it uses the mark-compact algorithm.

Features:

  • single-threaded collector

    • High collection efficiency; object references do not change during collection

    • Long STW pauses

  • Usage scenario: small heaps of up to a few tens of megabytes; best suited to simple services or single-CPU services, where it avoids the overhead of thread interaction.

  • Pros: works well with small heaps and executes efficiently on a single-core CPU.

  • Cons: unsuitable for large heaps and multi-core CPUs, where collection takes very long.

ParNew Collector

Young generation: -XX:+UseParNewGC; paired with CMS in the old generation.

The ParNew collector is actually a multithreaded version of the Serial collector.

  • Features: multi-threaded. By default the ParNew collector starts as many collection threads as there are CPUs. In environments with very many CPUs, the -XX:ParallelGCThreads parameter can be used to limit the number of garbage collection threads. It has the same Stop The World problem as the Serial collector.

CMS Collector

Old generation: -XX:+UseConcMarkSweepGC; paired with ParNew in the young generation.

Concurrent Mark Sweep is an old-generation collector whose goal is the shortest possible collection pause time.

  • Features: implemented on the basis of the mark-sweep algorithm. Concurrent collection and low pauses, but it produces memory fragmentation.

The running process is divided into the following 4 steps:

  1. Initial marking: Stop The World occurs. Only the objects directly referenced by GC Roots, and the old-generation objects referenced from the young generation, are marked. This phase is fast because it does not trace any further down; only one level is marked.

For example, in Math math = new Math(); the object created by new Math() is directly referenced by math. Anything below it that is only indirectly referenced, such as other member variables assigned in the constructor, is not recorded in this phase.

  2. Concurrent marking: starting from the objects directly referenced by GC Roots, the entire reference chain is traversed and marked. This takes a long time, but the user threads do not have to stop and run concurrently with the garbage collection thread. Because the user threads keep running, the state of already marked objects may change. This phase uses the three-color marking algorithm: a color attribute is recorded for each object as it is scanned, with different colors representing different stages, so the scanning position is remembered and nothing has to be rescanned from the start after a CPU time-slice switch.
  3. Remark: corrects the marks of the objects whose marking changed during concurrent marking because the user program kept running. This phase is still Stop The World and takes longer than initial marking.
  4. Concurrent sweep: once marking is finished, the user threads are resumed and the garbage collection thread starts to clear the unmarked areas. Objects newly created during this phase are marked black and are not processed in any way.

During the concurrent marking and concurrent clearing processes, which are the most time-consuming of the entire process, the collector thread can work alongside the user thread without needing to make a pause.
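The four phases and their pause behavior can be summarized in a small model. This is a sketch for orientation only, not HotSpot code: the enum CmsPhase and its members are illustrative names for the phases described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of the four CMS phases described above; not HotSpot code.
enum CmsPhase {
    INITIAL_MARK(true),      // marks only objects directly reachable from GC Roots
    CONCURRENT_MARK(false),  // traverses the whole reference chain alongside user threads
    REMARK(true),            // corrects marks that changed during concurrent marking
    CONCURRENT_SWEEP(false); // reclaims unmarked objects alongside user threads

    final boolean stopTheWorld;

    CmsPhase(boolean stopTheWorld) {
        this.stopTheWorld = stopTheWorld;
    }

    /** The phases during which user threads are paused. */
    static List<CmsPhase> pausingPhases() {
        List<CmsPhase> result = new ArrayList<>();
        for (CmsPhase phase : values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (CmsPhase phase : values()) {
            System.out.println(phase + (phase.stopTheWorld ? " (Stop The World)" : " (concurrent)"));
        }
    }
}
```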

  • Application scenarios: services that care about response speed and want the shortest possible pauses in order to give users a better experience, for example web applications and B/S services.

  • Pros:

    • Concurrent collection;

    • Relatively short STW, low pauses;

  • Drawbacks:

    • Low throughput: low pause times come at the expense of throughput, so CPU utilization is not high enough.
    • Memory fragmentation: CMS is essentially a mark-sweep collector, so it produces memory fragmentation. When fragmentation is severe and a large object cannot be allocated, a Full GC is triggered; in special scenarios the Serial collector is used for it, resulting in an uncontrollable pause.
    • Cannot handle floating garbage, must reserve space, and may hit Concurrent Mode Failure. Floating garbage is garbage produced during the concurrent sweep phase because the user threads keep running; it can only be reclaimed at the next GC. Because of floating garbage, part of the memory has to be set aside, which means CMS cannot wait until the old generation is almost full before collecting, as other collectors do. If the reserved memory is not enough to hold the floating garbage, a Concurrent Mode Failure occurs and the VM temporarily falls back to Serial Old instead of CMS, resulting in a longer pause.
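Why CMS has to start early can be illustrated with simple arithmetic. The following is a toy model with made-up numbers, not a description of HotSpot internals; the class CmsHeadroom, its method, and the assumption of a constant promotion rate are all hypothetical.

```java
// Toy model with made-up numbers; the names are hypothetical and not HotSpot internals.
class CmsHeadroom {

    /**
     * Returns true if the old generation would overflow before a concurrent cycle
     * finishes, which is the situation reported as Concurrent Mode Failure.
     * Memory filled while the cycle runs cannot be reclaimed until the cycle ends.
     */
    static boolean wouldFailConcurrently(long capacityMb, long usedAtStartMb,
                                         long promotedPerSecondMb, long cycleSeconds) {
        long usedAtEndMb = usedAtStartMb + promotedPerSecondMb * cycleSeconds;
        return usedAtEndMb > capacityMb;
    }

    public static void main(String[] args) {
        // 1000 MB old generation, 20 MB/s promoted while a 5 s concurrent cycle runs.
        System.out.println("start at 95% full: " + wouldFailConcurrently(1000, 950, 20, 5));
        System.out.println("start at 70% full: " + wouldFailConcurrently(1000, 700, 20, 5));
    }
}
```

Starting the cycle at 95% occupancy overflows (950 + 100 > 1000), while starting at 70% leaves enough headroom. In HotSpot the occupancy at which a CMS cycle starts can be tuned with -XX:CMSInitiatingOccupancyFraction.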

Three-color marking algorithm

  • Black: the object has been scanned, and all objects it references have been scanned as well.

  • Gray: the object has been scanned, but the objects it references have not all been scanned yet.

  • White: the object has not been scanned yet. At the end of marking, all objects that are still white are unreachable and can be reclaimed.
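The three colors can be sketched as a small marking loop over an object graph. This is a minimal single-threaded illustration under simplifying assumptions, not how HotSpot implements marking; TriColorMarking, Node and demo are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Minimal single-threaded sketch of three-color marking; all names are illustrative.
class TriColorMarking {

    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE; // every object starts out unscanned

        Node(String name) {
            this.name = name;
        }
    }

    /** Marks everything reachable from the roots; whatever stays WHITE is garbage. */
    static void mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.color == Color.WHITE) {
                root.color = Color.GRAY;
                gray.push(root);
            }
        }
        while (!gray.isEmpty()) {
            Node current = gray.pop();
            for (Node ref : current.refs) {
                if (ref.color == Color.WHITE) {
                    ref.color = Color.GRAY; // discovered, but its own references are not scanned yet
                    gray.push(ref);
                }
            }
            current.color = Color.BLACK; // all of its references have been looked at
        }
    }

    /** Builds R -> A -> B plus one unreachable node, marks from R, and returns all four nodes. */
    static List<Node> demo() {
        Node r = new Node("R");
        Node a = new Node("A");
        Node b = new Node("B");
        Node orphan = new Node("orphan");
        r.refs.add(a);
        a.refs.add(b);
        mark(Arrays.asList(r));
        return Arrays.asList(r, a, b, orphan);
    }

    public static void main(String[] args) {
        for (Node node : demo()) {
            System.out.println(node.name + ": " + node.color);
        }
    }
}
```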

Problem scenario of the three-color marking algorithm: when a business thread changes an object reference at the wrong moment, object B is never scanned and is reclaimed as garbage.

public class Demo3 {

    public static void main(String[] args) {
        R r = new R();
        r.a = new A();
        B b = new B();
        r.a.b = b; // A references B
        // GC thread: traverses from the GC root R. R is black; A, referenced by R, is gray
        // and its fields have not been scanned yet. The time slice switches.
        r.b = b; // business thread changes the reference: the black R now references B
        r.a.b = null; // the original reference from A is set to null
        // GC thread comes back and continues its last scan: A has no references,
        // so A turns black and b is assumed to be unreferenced.
        // GC reclaims b, and the business thread can no longer use r.b.
    }
}

class R {
    A a;
    B b;
}

class A {
    B b;
}

class B {
}
When the GC thread is about to scan A, the CPU time slice switches and the business thread changes the object references. When the time slice returns to the GC thread, it continues with object A, finds that A no longer has any references, marks A black, and finishes scanning. B is therefore never scanned: it stays white, is treated as garbage, and is reclaimed in the sweep phase. A live object has been wrongly reclaimed, which causes an exception in the business code.

CMS's solution to this mis-marking is Write Barrier + Incremental Update: when a black object gains a reference to a white object, the write barrier records the black object where the reference changed and turns it back to gray so that it is scanned again. In the scenario above, R is re-marked gray when the business thread changes the reference, and is then rescanned. Incremental Update can still miss objects in special scenarios.
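The lost-object scenario and the effect of the write barrier can be simulated in a single thread by interleaving the GC steps and the business-thread writes by hand. This is a sketch under that simplifying assumption, not CMS's actual implementation (which tracks changed references differently inside the VM); IncrementalUpdateDemo and all of its members are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Single-threaded simulation of the R/A/B scenario; all names are illustrative.
class IncrementalUpdateDemo {

    enum Color { WHITE, GRAY, BLACK }

    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
        Color color = Color.WHITE;
    }

    private final Deque<Obj> grayStack = new ArrayDeque<>();
    private final boolean writeBarrier;

    IncrementalUpdateDemo(boolean writeBarrier) {
        this.writeBarrier = writeBarrier;
    }

    private void pushGray(Obj o) {
        o.color = Color.GRAY;
        grayStack.push(o);
    }

    /** GC side: scan one gray object, then blacken it. */
    void markStep() {
        Obj current = grayStack.pop();
        for (Obj ref : current.refs) {
            if (ref.color == Color.WHITE) {
                pushGray(ref);
            }
        }
        current.color = Color.BLACK;
    }

    /** GC side: keep scanning until no gray object is left. */
    void markToEnd() {
        while (!grayStack.isEmpty()) {
            markStep();
        }
    }

    /** Business-thread side: store a new reference, optionally through the write barrier. */
    void addReference(Obj holder, Obj target) {
        holder.refs.add(target);
        if (writeBarrier && holder.color == Color.BLACK) {
            pushGray(holder); // incremental update: the black holder becomes gray again
        }
    }

    /** Replays the scenario and returns B's color once marking has finished. */
    static Color colorOfBAfterMarking(boolean writeBarrier) {
        IncrementalUpdateDemo gc = new IncrementalUpdateDemo(writeBarrier);
        Obj r = new Obj();
        Obj a = new Obj();
        Obj b = new Obj();
        r.refs.add(a);
        a.refs.add(b);

        gc.pushGray(r);
        gc.markStep(); // R is black, A is gray, B is still white

        // "Time slice switch": the business thread moves the reference to B from A to R.
        gc.addReference(r, b);
        a.refs.remove(b);

        gc.markToEnd(); // the GC thread resumes
        return b.color;
    }

    public static void main(String[] args) {
        System.out.println("without write barrier, B is " + colorOfBAfterMarking(false));
        System.out.println("with write barrier, B is " + colorOfBAfterMarking(true));
    }
}
```

Without the barrier B stays WHITE and would be swept even though R references it. With the barrier R is rescanned and B ends up BLACK.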

public class Demo3 {

    public static void main(String[] args) {
        // Another problem with Incremental Update
        R r = new R();
        A a = new A();
        A b = new A();
        r.a1 = a;
        // GC thread: r has scanned a1 but not yet a2, so r is still gray; the time slice switches.
        r.a1 = b; // business thread changes the reference
        // The write barrier marks r gray, but r is already gray, so nothing changes.
        // GC thread continues: it scans a2 and marks r black; object b has been missed again.
    }
}

class R {
    A a1;
    A a2;
}

class A {
}

When GC thread 1 is marking O, has finished marking O's attribute O.1, and is about to mark O.2, the business thread sets O.1 = B. This marks O gray again, although it is already gray. GC thread 1 then switches back, finishes marking O.2, considers O fully marked, and marks O black, so object B is missed. To deal with this problem of Incremental Update, CMS has to pause all threads in the remark phase and scan the changed references again.

Throughput-first Parallel

  • multi-threaded

  • Larger heap memory, multi-core CPU

  • Shortest total STW (Stop The World: all other worker threads are stopped) time per unit of time

  • Garbage collector used by default in JDK 1.8

Parallel Scavenge collector

Young-generation collector, implemented on the basis of the copying algorithm. Its defining feature is that throughput comes first, so it is also known as the throughput-first collector. It is a multi-threaded collector capable of parallel collection: multiple garbage collection threads run simultaneously, which reduces garbage collection time and increases throughput. The Parallel Scavenge collector focuses on throughput and efficient use of CPU resources, whereas the CMS collector focuses more on the pause time of user threads.

The Parallel Scavenge collector provides two parameters for precise control of throughput: the -XX:MaxGCPauseMillis parameter, which controls the maximum garbage collection pause time, and the -XX:GCTimeRatio parameter, which directly sets the throughput.

  • -XX:MaxGCPauseMillis: the value is a number of milliseconds greater than 0. The collector will try to ensure that memory reclamation does not take longer than the user-set value.

  • -XX:GCTimeRatio: the value is an integer greater than 0 and less than 100. It expresses the ratio of garbage collection time to application time: with a value of N, garbage collection may take up to 1/(1+N) of the total time. For example, 19 allows up to 5% of the time (1/(1+19)) for garbage collection.
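The arithmetic behind -XX:GCTimeRatio is easy to get backwards, so a short calculation may help. HotSpot documents a value of N as a target of at most 1/(1+N) of the total time spent in garbage collection; the sketch below only restates that formula, and the class and method names are hypothetical.

```java
// Hypothetical helper that restates the documented 1 / (1 + N) relationship.
class GcTimeRatio {

    /** Largest fraction of total time that GC may take for -XX:GCTimeRatio=n. */
    static double maxGcFraction(int gcTimeRatio) {
        if (gcTimeRatio <= 0) {
            throw new IllegalArgumentException("GCTimeRatio must be greater than 0");
        }
        return 1.0 / (1 + gcTimeRatio);
    }

    /** The throughput target implied by the same setting. */
    static double targetThroughput(int gcTimeRatio) {
        return 1.0 - maxGcFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {19, 99}) {
            System.out.printf("GCTimeRatio=%d: GC at most %.1f%%, throughput at least %.1f%%%n",
                    n, maxGcFraction(n) * 100, targetThroughput(n) * 100);
        }
    }
}
```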

The goal of this collector is to achieve a controllable throughput. Another point worth noting is the GC adaptive regulation strategy, the single most important difference from the ParNew collector.

GC adaptive regulation strategy: the Parallel Scavenge collector accepts the -XX:+UseAdaptiveSizePolicy parameter. When the switch is on, there is no need to manually specify the size of the young generation (-Xmn), the ratio of Eden to the Survivor spaces (-XX:SurvivorRatio), the size threshold above which objects are allocated directly in the old generation (-XX:PretenureSizeThreshold), and so on. The VM collects performance monitoring information while the system runs and dynamically adjusts these parameters to provide the most suitable pause time or the highest throughput. This kind of regulation is called the GC adaptive regulation strategy.

  • Usage scenarios: heaps of a few gigabytes; background computing services or services without much interaction, where throughput must be ensured.

  • Pros: controllable throughput, parallel collection.

  • Cons: STW during collection, and the pause time grows as the heap grows.

Parallel Old Collector

The old-generation version of the Parallel Scavenge collector.

Features: multi-threaded, uses the mark-compact algorithm (the old generation has no Survivor spaces).

  • Response time is prioritized

  • multi-threaded

  • Larger heap memory, multi-core CPU

  • Keep each individual STW pause as short as possible (interfere with the running threads as little as possible)
