Quartz Cluster Enhancement 🎉

Reproduced with attribution/funnyzpc/p/18534034

Apart from mee_admin, quartz cluster enhancement is the open source project I have put the most time and effort into. It took me more than 4 months to develop, so it wasn't easy.
Let's start with a brief overview: quartz cluster enhancement is a rework of the quartz 2.3.2 release. It drops some functionality and addresses a number of existing pain points, including but not limited to the following.

1. Modification of the serialization and storage of task parameters

The default parameter storage changes from a blob (large field) to a fixed-length string, so that what you edit in the admin backend is exactly what gets stored. Internally a library handles json string serialization and deserialization, which makes the parameters more convenient to use.
The parameter configuration also changes from the original K/V (object serialization) to a (Map) object (eg. {"aa":1}) or an (Array) list (eg. [11,true,{"bb":"her"}]). Because the parameters are json, the execution context lets you fetch them directly as a map or as a list, which is easier and faster.
Incidentally, you can also take out the original stored json string 😊
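As a rough illustration of the two accepted parameter shapes, the sketch below checks whether a string looks like a top-level json object or array and fits a fixed-length column before it is stored. The class name `JobDataCheck`, the `Shape` enum, and the 512-byte limit are all assumptions made for this example; the article does not state the real column width, and the project uses a json library rather than this bracket check.

```java
import java.nio.charset.StandardCharsets;

final class JobDataCheck {
    // Hypothetical limit: the real width of the parameter column is not given in the article.
    static final int MAX_BYTES = 512;

    enum Shape { MAP, LIST }

    /**
     * Returns the top-level shape of a json parameter string.
     * Only the outer brackets and the byte length are checked;
     * this is not a full json validation.
     */
    static Shape shapeOf(String json) {
        if (json == null) {
            throw new IllegalArgumentException("parameter is null");
        }
        String s = json.trim();
        if (s.getBytes(StandardCharsets.UTF_8).length > MAX_BYTES) {
            throw new IllegalArgumentException("parameter exceeds " + MAX_BYTES + " bytes");
        }
        if (s.startsWith("{") && s.endsWith("}")) {
            return Shape.MAP;   // e.g. {"aa":1}
        }
        if (s.startsWith("[") && s.endsWith("]")) {
            return Shape.LIST;  // e.g. [11,true,{"bb":"her"}]
        }
        throw new IllegalArgumentException("expected a json object or array");
    }

    public static void main(String[] args) {
        System.out.println(shapeOf("{\"aa\":1}"));                     // prints MAP
        System.out.println(shapeOf("[11,true,{\"bb\":\"her\"}]"));     // prints LIST
    }
}
```

A check like this is cheap to run in the admin backend before saving, so an old-style K/V string such as `aa=1` is rejected early instead of failing later at deserialization time.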

2. Separation of the execution side and the configuration management side

This is a big change. On the execution side, you only need to add the quartz-core dependency. For the admin backend, you only need to connect to the execution side's database and use the quartz-client dependency to configure tasks. Managing the execution side of a cluster or a distributed deployment also only needs this client (provided it connects to the same database tables).
A change like this would have been unthinkable in the original quartz, where the execution side is awkwardly coupled to the configuration side, which is a real headache and a puzzle 😂

3. Use of optimistic locks

Again, this is a significant change. The original version uses row-level pessimistic locks on the lock table, where every change contends for the same STATE_ACCESS or TRIGGER_ACCESS row. Each lock acquire and release also has to go through a thread-local variable (ThreadLocal), which seems unnecessary, is very bulky, and carries a relatively large system overhead.
quartz cluster enhancement abandons the lock table's pessimistic locks and uses UPDATE ... WHERE ... instead (optimistic locking). There is still some query overhead, but the overhead of writing to the db is greatly reduced.
Note that optimistic locking may still fall short in places, but no bug has shown up in testing so far (perhaps the tests are not strict enough...). If you have a good idea, please let me know haha~ 🤦
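To make the UPDATE ... WHERE ... idea concrete, here is a minimal in-memory sketch. A map stands in for one table, and `updateWhere` mimics a conditional update that reports the affected row count, so only the caller whose expected state still matches wins. The class name, the state names `INIT` and `EXECUTING`, and the method names are hypothetical and are not taken from the project.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class OptimisticClaim {
    // In-memory stand-in for one table: row id -> state (hypothetical names).
    private final Map<Long, String> stateById = new ConcurrentHashMap<>();

    void insert(long id, String state) {
        stateById.put(id, state);
    }

    /**
     * Mimics: UPDATE t SET state = :next WHERE id = :id AND state = :expected
     * Returns the affected row count: 1 if this caller won, 0 if the row
     * is missing or another caller already changed the state.
     */
    int updateWhere(long id, String expected, String next) {
        // replace(key, oldValue, newValue) is atomic on ConcurrentHashMap.
        return stateById.replace(id, expected, next) ? 1 : 0;
    }

    String stateOf(long id) {
        return stateById.get(id);
    }

    public static void main(String[] args) {
        OptimisticClaim table = new OptimisticClaim();
        table.insert(1L, "INIT");
        System.out.println(table.updateWhere(1L, "INIT", "EXECUTING")); // prints 1
        System.out.println(table.updateWhere(1L, "INIT", "EXECUTING")); // prints 0
    }
}
```

With a real database the same pattern is an `executeUpdate` call whose return value is compared with 1; a node that gets 0 simply skips the task, and no lock row is ever held.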

4. Simplified table structure and fewer tables

Since the underlying layer removes the trigger, the in-memory standalone mode, and other related logic, the table structure and logic have been simplified accordingly: from 11 tables in the original version down to 4 tables. Yes, that's right~ four tables are all it takes 💪!
The overall table design largely draws on mee_timed, another scheduled-task component I wrote. The four tables currently are:

QRTZ_APP application table: defines the application, which is especially useful for managing distributed applications. The execution side adds or updates its row automatically at startup, so no data has to be added manually.
QRTZ_NODE node table: defines the execution nodes under an application, which is very useful for clustered applications. For example, for an incremental release you can start and stop nodes to update the execution side step by step. The data in this table is likewise added or updated automatically, very convenient!
QRTZ_JOB task configuration table: holds the basic information of a task (excluding execution times; one job corresponds to multiple execution configurations/time items). This table mainly defines the task's execution class and the associated application information.
QRTZ_EXECUTE execution configuration table: each execution configuration must be associated with a job, and it also has its own state so that the admin backend can operate on it.

5. Removed `group` (Group)

Group is a rarely used concept. In the vast majority of cases quartz's group is very confusing for developers (at least in most of the projects I have worked on). Used carelessly, group can behave differently from what you expect, and because this extra group layer exists, task configuration is also slightly bloated... 🤨🤨🤨
Removing group also removes TriggerKey and its associated logic, which essentially plays down the concept and use of group. The benefits speak for themselves.

6. Compatible with the original quartz configuration items and integration methods

For the commonly used springboot framework, integration is not much different from the original quartz. The main difference is that you only need to import the tables provided by quartz cluster enhancement. Configuration classes and methods in the starter (autoconfigure) that no longer do anything are kept for compatibility.
Then there are slight changes to the original context, mainly the removal of TriggerKey and the changes brought by the enhanced json parameter passing. With basic use there is no difference~ 😅😅😅😅

7. Misfire (MisfireHandler) and ClusterManager handling

This change is a pretty big one too.
Let's start with what a misfire is. In quartz, every execution time point is calculated by the Trigger of the corresponding type (eg: CronTriggerImpl, SimpleTriggerImpl). When an unforeseen error or downtime occurs, the execution time point cannot move forward, and if there is no compensation or recovery mechanism, the fireTrigger task scan can no longer find the task, so the task stops running. That is a misfire (Misfire).
I'm not explaining it very well, so please correct me haha 😂
The original quartz handles both of these pieces under concurrency (clustering) with a global pessimistic lock, so once a Misfire is triggered the tasks may behave unpredictably. On top of that, MisfireHandler and ClusterManager are two separate polling threads that share the same lock, so their logic suffers from timeliness problems (latency).
So quartz cluster enhancement does this differently 😉😉: it combines the two into one. ClusterMisfireHandler handles the Misfire task as well as the node check and cleanup tasks. It still uses database locks, but concurrency inside ClusterMisfireHandler is optimized to ensure that only one node per application executes at a time, which is very important!
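The stuck execution time point can be illustrated with a small calculation. The sketch below assumes a task that fires at a fixed interval and moves a stale next-fire time forward to the first scheduled point that is not in the past. It is only an illustration of the catch-up idea, not the logic ClusterMisfireHandler actually uses, and the class and method names are hypothetical.

```java
final class MisfireCatchUp {
    /**
     * Returns the smallest time of the form nextFire + k * interval (k >= 0)
     * that is not before now. If nextFire is already in the future or equal
     * to now, nothing was missed and it is returned unchanged.
     */
    static long advance(long nextFireMillis, long intervalMillis, long nowMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        if (nextFireMillis >= nowMillis) {
            return nextFireMillis;
        }
        long behind = nowMillis - nextFireMillis;
        // Ceiling division: number of whole intervals needed to reach or pass now.
        long steps = (behind + intervalMillis - 1) / intervalMillis;
        return nextFireMillis + steps * intervalMillis;
    }

    public static void main(String[] args) {
        // Node was down: next fire time stuck at 1000 ms, now is 12000 ms, interval 5000 ms.
        System.out.println(advance(1_000L, 5_000L, 12_000L)); // prints 16000
    }
}
```

Without a step like this, a scan that only looks for fire times near the current time never sees the stale row again, which is the situation described above.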

8. Other

Thread pool management is simplified, while staying compatible with the original SimpleThreadPool.
Because of the change in how parameters are passed, there is no longer any need to make large-field operations on stored parameters compatible with the databases of various vendors.

Architectural design

`quartz-client` + admin backend integration effect

Known and possible problems at present

Both ClusterMisfireHandler, which maintains the cluster and handles misfires, and QuartzSchedulerThread, which scans for tasks, run at a fixed frequency: QuartzSchedulerThread loops once every 5s and ClusterMisfireHandler loops once every 15s. This leads to the following problems.
If you add a task at 10 o'clock, it can be executed 5s after 10 o'clock at the earliest, which is a result of the current task-scan frequency and the current structure.
QuartzSchedulerThread scans a batch of tasks and then loops until their execution time. If you shut down the node or the application, the tasks stop only when the current execution is complete.
QuartzSchedulerThread's handling of locks (how to ensure that a task is executed by only one node in a clustered environment) may be flawed, but I cannot pin it down at present; it will take time and user testing to find the root cause.
Removing group is bad news for users who rely on groups, so count this as one 😂
Only postgresql, mysql, and oracle are currently supported and tested, which is a usage limitation.
Although the client sdk (quartz-client) has been tested and passes, it may still have unknown bugs and design flaws.