Analyzing Redis
What is Redis
Redis is essentially an in-memory Key-Value database: the entire dataset is held in memory for operation, and the data is periodically flushed to disk through asynchronous operations for persistence.
Because it operates purely in memory, Redis performs exceptionally well, handling over 100,000 reads and writes per second, which makes it one of the fastest Key-Value databases available.
Redis Underpinnings
For the underpinnings of Redis, see https:///backend/redis2/#_1-preamble, which is a very detailed article.
Threading Model for Redis
Internally, Redis uses a file event handler. This handler is single-threaded, which is why Redis is called a single-threaded model. It uses an IO multiplexing mechanism to listen on multiple sockets at the same time; sockets that generate events are pushed into a memory queue, and the event dispatcher selects the appropriate event handler according to the type of event on each socket.
Structure of the file event handler:
- Multiple sockets
- IO multiplexing program
- File event dispatcher
- Event handlers (connection accept handler, command request handler, command reply handler)
- When Redis starts up and initializes, it associates the AE_READABLE event of the listening socket with the connection accept handler.
- If a client initiates a connection to Redis, the listening socket produces an AE_READABLE event, which is handled by the connection accept handler (the one associated at startup). The connection accept handler establishes the connection with the client, creates the socket that will serve that client, and associates that socket's AE_READABLE event with the command request handler.
- If the client then sends a command to Redis (say, set k1 v1), the socket produces an AE_READABLE event, which the IO multiplexer pushes into the memory queue. The event dispatcher takes the event from the queue, and since the socket's AE_READABLE event is associated with the command request handler, the dispatcher hands the event to it; the handler reads the command from the event and executes it. When the operation completes, Redis associates the socket's AE_WRITABLE event with the command reply handler.
- When the client is ready to receive data, the socket in Redis produces an AE_WRITABLE event, which is likewise pushed into the queue, taken out by the event dispatcher, and handed to the command reply handler, which writes the prepared response into the socket for the client to read.
- Once the command reply handler has finished writing, it removes the association between the socket's AE_WRITABLE event and itself.
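To make the pattern concrete, here is a minimal, hypothetical sketch of a single-threaded event loop in C (not Redis source: Redis's ae library wraps epoll/kqueue/select, and the port and canned "+OK" reply here are placeholders). One select() loop plays all three roles: accepting connections, reading requests, and writing replies:

#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void) {
    // Listening socket, standing in for the one Redis binds at startup
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(6379); /* illustrative port */
    bind(listener, (struct sockaddr *)&addr, sizeof(addr));
    listen(listener, 16);

    int clients[FD_SETSIZE] = {0};
    int maxfd = listener;

    for (;;) {
        // IO multiplexing step: watch every socket at once
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(listener, &rfds);
        for (int fd = 0; fd <= maxfd; fd++)
            if (clients[fd]) FD_SET(fd, &rfds);
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) <= 0) continue;

        // "Connection accept handler": a readable listener means a new client
        if (FD_ISSET(listener, &rfds)) {
            int c = accept(listener, NULL, NULL);
            if (c >= 0 && c < FD_SETSIZE) {
                clients[c] = 1;
                if (c > maxfd) maxfd = c;
            }
        }

        // "Command request handler": readable client sockets carry commands
        for (int fd = 0; fd <= maxfd; fd++) {
            if (fd == listener || !clients[fd] || !FD_ISSET(fd, &rfds)) continue;
            char buf[512];
            ssize_t n = read(fd, buf, sizeof(buf));
            if (n <= 0) { close(fd); clients[fd] = 0; continue; }
            // "Command reply handler": write the canned response back;
            // real Redis defers this until the socket is writable
            write(fd, "+OK\r\n", 5);
        }
    }
}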
Single-threaded processing flow
- The main thread handles both network I/O and command execution:
  - In single-threaded mode, the Redis main thread is responsible for reading requests from clients, executing commands, and sending responses. All work is done sequentially, in the order the requests arrive.
  - The main thread polls all client connections and processes requests one by one.
- Processing client A's request:
  - The main thread first reads the SET key1 value1 request from client A.
  - As soon as the read completes, the main thread parses and executes the command, setting key1 to value1.
  - The main thread then sends the OK result back to client A.
- Processing client B's request:
  - Next, the main thread reads the GET key1 request from client B.
  - After the read completes, the main thread parses and executes the command, looking up the value of key1, which is value1.
  - The main thread returns value1 to client B.
- Processing client C's request:
  - Finally, the main thread reads the SET key2 value2 request from client C.
  - The main thread parses and executes the command, setting key2 to value2.
  - The OK result is returned to client C.
Specific step-by-step explanation
- Step 1: Sequential processing of network I/O and command execution
  - Redis polls the connections of clients A, B, and C in turn and reads the request data from each. In the main thread, network I/O and command execution happen synchronously, meaning Redis finishes all operations for one client before moving on to the next.
- Step 2: Command parsing and execution
  - Once the main thread has read a complete command, it immediately parses and executes it. For example, after reading SET key1 value1 from client A, it sets key1 to value1 and returns OK.
- Step 3: Response write-back
  - After executing a command, the main thread immediately sends the response back to the client. For example, for client B's GET key1 request, the main thread looks up value1 and sends it to client B.
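For reference, the same three requests as they would look from the client side in redis-cli:

127.0.0.1:6379> SET key1 value1
OK
127.0.0.1:6379> GET key1
"value1"
127.0.0.1:6379> SET key2 value2
OK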
Multithreading mechanism
Sample Client Request
Assume that 3 clients are sending requests to Redis at the same time:
- Client A sends SET key1 value1
- Client B sends GET key1
- Client C sends SET key2 value2
Multi-threaded I/O Processing Flow
- Network I/O phase:
  - Redis's 4 I/O threads start working, and each thread is responsible for receiving data from a different client. For example:
    - I/O thread 1 reads the SET key1 value1 request from client A.
    - I/O thread 2 reads the GET key1 request from client B.
    - I/O thread 3 reads the SET key2 value2 request from client C.
- Main thread command parsing and execution:
  - Once the I/O threads have received the complete request data from the clients, they hand it over to the Redis main thread.
  - The main thread is responsible for parsing and executing the commands:
    - First, the main thread handles SET key1 value1, setting key1 to value1.
    - Then the main thread handles GET key1, reading and returning the value of key1 (value1).
    - Finally, the main thread handles SET key2 value2, setting key2 to value2.
- Network response phase:
  - After the commands are executed, the main thread hands the results back to the I/O threads:
    - I/O thread 1 returns the OK response to client A.
    - I/O thread 2 returns value1 to client B.
    - I/O thread 3 returns OK to client C.
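This I/O threading mode is off by default. A sketch of the redis.conf settings that would enable it (available since Redis 6.0; the thread count of 4 matches the example above):

# Use 4 I/O threads in total (the main thread counts as one of them)
io-threads 4
# By default the extra threads only write replies; also let them
# read and parse incoming requests
io-threads-do-reads yes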
Underlying principles of memory eviction
1. Eviction process
The Redis memory eviction process runs as follows:
1. Each time Redis executes a command, if a maximum memory size (maxmemory) and an eviction policy have been configured, it attempts to evict keys;
2. First, it evaluates whether the used memory (excluding the memory occupied by the two buffers used for master-slave replication) exceeds maxmemory. If not, it returns directly; otherwise it calculates how much memory needs to be freed and starts evicting eligible keys according to the policy. It samples each database in turn: the range of data to sample from is determined by the policy, while the number of samples is determined by the maxmemory-samples configuration;
3. After sampling, Redis tries to put the samples into the pre-initialized EvictionPoolLRU array, which serves as a temporary buffer; once the array is full, all the keys in it are deleted;
4. If memory is still insufficient after one round of deletion, the previous step repeats: the remaining keys from the sample fill the array again and are deleted, until enough memory has been freed or all sampled keys have been deleted (if memory is still insufficient at that point, the whole eviction process runs again).
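For reference, a hypothetical redis.conf setup that would activate this process (the 2gb limit is an arbitrary example value):

# Start evicting once the dataset grows past this limit
maxmemory 2gb
# Evict any key using the approximate LRU algorithm
maxmemory-policy allkeys-lru
# Number of keys sampled per eviction attempt
maxmemory-samples 5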
The sampling step involves picking keys at random from the dictionary. Because keys are hash-distributed across the table, many buckets are empty, so purely random probing can be very inefficient. Redis therefore uses a special approach: it traverses several consecutive buckets, and if they are all empty it jumps to another random position, then traverses several consecutive buckets again, repeating this cycle until sampling ends.
You can refer to the source code to understand the process:
unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count) {
    unsigned long j; /* internal hash table id, 0 or 1. */
    unsigned long tables; /* 1 or 2 tables? */
    unsigned long stored = 0, maxsizemask;
    unsigned long maxsteps;

    if (dictSize(d) < count) count = dictSize(d);
    maxsteps = count*10;

    // If the dictionary is rehashing, help migrate a few buckets first
    for (j = 0; j < count; j++) {
        if (dictIsRehashing(d))
            _dictRehashStep(d);
        else
            break;
    }

    tables = dictIsRehashing(d) ? 2 : 1;
    maxsizemask = d->ht[0].sizemask;
    if (tables > 1 && maxsizemask < d->ht[1].sizemask)
        maxsizemask = d->ht[1].sizemask;

    // Start from a random position in the larger table
    unsigned long i = random() & maxsizemask;
    unsigned long emptylen = 0;
    // Stop once enough samples have been collected or the step limit is reached
    while(stored < count && maxsteps--) {
        for (j = 0; j < tables; j++) {
            // While rehashing, buckets in ht[0] below rehashidx are already
            // empty, so skip them (or jump into ht[1] if out of its range)
            if (tables == 2 && j == 0 && i < (unsigned long) d->rehashidx) {
                if (i >= d->ht[1].size)
                    i = d->rehashidx;
                else
                    continue;
            }
            // Index out of range for this table, try the next one
            if (i >= d->ht[j].size) continue;
            dictEntry *he = d->ht[j].table[i];
            // Traverse consecutive buckets; if several in a row are empty,
            // jump to a random new position and repeat
            if (he == NULL) {
                emptylen++;
                if (emptylen >= 5 && emptylen > count) {
                    i = random() & maxsizemask;
                    emptylen = 0;
                }
            } else {
                emptylen = 0;
                // Collect every entry in this non-empty bucket
                while (he) {
                    *des = he;
                    des++;
                    he = he->next;
                    stored++;
                    if (stored == count) return stored;
                }
            }
        }
        // Move on to the next bucket
        i = (i+1) & maxsizemask;
    }
    return stored;
}
2. LRU implementation
LRU stands for Least Recently Used. Broadly speaking, LRU evicts, from a batch of keys, the key whose last access time is earliest.
It is a very common cache eviction algorithm, and cache libraries such as Guava Cache and Caffeine provide similar implementations. We can also build an LRU cache on top of the JDK's LinkedHashMap.
2.1 Approximate LRU
Traditional implementations of the LRU algorithm typically maintain a linked list and move a node to the head of the list whenever it is accessed. Over time, the list's nodes end up sorted by most recent access time, and when the cache reaches its size limit we simply remove the tail node, dropping the least recently used entry.
However, for Redis, maintaining and manipulating such a linked list on every key access would be an unacceptable extra cost, so the LRU in Redis is an approximate LRU (NearlyLRU).
Each time a key is accessed, Redis records the access time in a structure. When keys need to be evicted, it samples the data and removes, from the sample, the key with the earliest access time.
It is characterized by:
- Sampling happens only when needed, so there is no need to maintain a linked list over the full dataset, which avoids extra memory consumption.
- An access only records the operation time in a structure and never touches linked-list nodes, which avoids extra performance cost.
Of course, there are trade-offs: this implementation also means that Redis's LRU is not 100% accurate, and the evicted key is not necessarily the one with the earliest last access time among all keys.
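To make the difference concrete, here is a toy sketch of the approximate approach (hypothetical code, not Redis source; the entry layout and values are invented for illustration). Each entry stores only a last-access clock, and eviction compares a handful of random samples instead of walking a global linked list:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct {
    const char *key;
    long last_access; /* clock value recorded on each access */
} entry;

/* Sample `samples` random entries (duplicates are possible, which is
 * fine for a sketch) and return the index of the coldest one. */
static int pick_victim(entry *tab, int n, int samples) {
    int victim = rand() % n;
    for (int i = 1; i < samples; i++) {
        int j = rand() % n;
        if (tab[j].last_access < tab[victim].last_access)
            victim = j;
    }
    return victim;
}

int main(void) {
    srand((unsigned)time(NULL));
    entry tab[] = {
        {"k1", 100}, {"k2", 300}, {"k3", 50}, {"k4", 200}, {"k5", 400},
    };
    int victim = pick_victim(tab, 5, 5); /* like maxmemory-samples = 5 */
    printf("evict %s (last_access=%ld)\n", tab[victim].key, tab[victim].last_access);
    return 0;
}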
2.2 Sample size
Based on the above, it is easy to see that the larger the sample, the more accurately LRU picks its victim, and the higher the relative overhead. Redis therefore lets us configure the sample size through maxmemory-samples (the default is 5), striking a balance between performance and accuracy.
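The sample size can also be changed at runtime; for example (10 is an arbitrary illustrative value):

127.0.0.1:6379> CONFIG SET maxmemory-samples 10
OK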
3. LFU implementation
LFU stands for Least Frequently Used. It is characterized as follows:
- It is again an approximate algorithm based on sampling, and maxmemory-samples applies to it as well.
- Instead of comparing last access times, it compares how frequently the data is accessed; when eviction occurs, the key with the lowest access frequency is evicted first.
Its implementation is basically the same as LRU's, with some improvements in the counting part.
3.1 Probability counters
In redisObject, the structure Redis uses to store data, there is a 24-bit lru field:
- When the LRU algorithm is used, it records the timestamp of the last access.
- When the LFU algorithm is used, it is split into two parts: the high 16 bits record the last decrement time, and the low 8 bits record the logarithmic counter.
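For reference, this is how the field appears in the Redis source (server.h, lightly abridged):

#define LRU_BITS 24
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;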
The core of LFU is the access frequency counter (hereafter simply counter), a special value between 0 and 255 that is dynamically adjusted on every key access through a mechanism of time decay and probabilistic increment.
This kind of probability-based counter, which counts a large number of events in a very small amount of memory, is called a Morris counter; it is one implementation of probabilistic counting.
3.2 Time decay
Whenever a key is accessed, the counter is first decayed according to the time difference between the current time and the key's last access time.
The decay amount depends on the lfu_decay_time configuration, which represents one decay period. Roughly speaking, the counter is decremented by one for every elapsed decay period.
For example, with lfu_decay_time set to 1 minute, if 3 minutes and 30 seconds have passed since the key was last accessed, the counter is decremented by 3.
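The corresponding decay routine in the Redis source (evict.c, comments ours) reads the two halves of the lru field and subtracts one from the counter per elapsed period; LFUTimeElapsed() returns the minutes elapsed since the stored last-decrement time:

unsigned long LFUDecrAndReturn(robj *o) {
    // High 16 bits: last decrement time, in minutes
    unsigned long ldt = o->lru >> 8;
    // Low 8 bits: the logarithmic counter
    unsigned long counter = o->lru & 255;
    // Whole decay periods elapsed since ldt
    unsigned long num_periods = server.lfu_decay_time ?
                                LFUTimeElapsed(ldt) / server.lfu_decay_time : 0;
    if (num_periods)
        counter = (num_periods > counter) ? 0 : counter - num_periods;
    return counter;
}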
3.3 Probabilistic increment
After the decay completes, Redis increments the counter with a probability derived from the lfu_log_factor configuration.
Here's the source code directly:
/* Logarithmically increment a counter. The greater the current counter
 * value, the less likely it is to really be incremented. Saturates at 255. */
uint8_t LFULogIncr(uint8_t counter) {
    // Already at the maximum value of 255: return directly
    if (counter == 255) return 255;
    // Get a random value between 0 and 1
    double r = (double)rand()/RAND_MAX;
    // Subtract the initial value from the current counter to get baseval
    double baseval = counter - LFU_INIT_VAL;
    if (baseval < 0) baseval = 0;
    // Derive a probability p from baseval*server.lfu_log_factor+1
    double p = 1.0/(baseval*server.lfu_log_factor+1);
    // When r < p, increment the counter
    if (r < p) counter++;
    return counter;
}
In short, reading straight from the code, the larger counter and lfu_log_factor are, the smaller the increment probability. For example, with lfu_log_factor at 10 and counter at 15 (so baseval is 10), p = 1/(10*10+1) ≈ 0.0099, meaning roughly 100 accesses are needed on average for a single increment.
Of course, the number of accesses also matters. The figure originally shown here is the official table given by Redis:
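(Reproduced from the comments in redis.conf: the counter value reached after a given number of hits, for several lfu_log_factor settings.)

+--------+------------+------------+------------+------------+------------+
| factor | 100 hits   | 1000 hits  | 100K hits  | 1M hits    | 10M hits   |
+--------+------------+------------+------------+------------+------------+
| 0      | 104        | 255        | 255        | 255        | 255        |
| 1      | 18         | 49         | 255        | 255        | 255        |
| 10     | 10         | 18         | 142        | 255        | 255        |
| 100    | 8          | 11         | 49         | 143        | 255        |
+--------+------------+------------+------------+------------+------------+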
3.4 Initial value of the counter
To prevent new keys from being evicted outright because their counter is 0, Redis sets the initial counter value to 5 by default (the LFU_INIT_VAL seen in the code above).
3.5 Selection of Sample Size
It is worth noting that when the dataset is relatively large, a sample size that is too small means each round samples very few keys, so the weight difference between hot and cold data produced by time decay becomes inconspicuous. The advantages of the LFU algorithm are then hard to realize, and even relatively hot data may frequently be evicted by mistake.
So if you have chosen LFU as your eviction policy and also have a relatively large dataset, consider setting the sample size larger as well.