Background
I recently ran into a case where a Redis instance experienced a sudden memory spike: used_memory peaked at 78.9G, while the instance's maxmemory was configured at only 16G. As a result, a large amount of data was evicted from the instance.
Here is part of the INFO MEMORY output from when the problem occurred:
# Memory
used_memory:84716542624
used_memory_human:78.90G
used_memory_rss:104497676288
used_memory_rss_human:97.32G
used_memory_peak:84716542624
used_memory_peak_human:78.90G
used_memory_peak_perc:100.00%
used_memory_overhead:75682545624
used_memory_startup:906952
used_memory_dataset:9033997000
used_memory_dataset_perc:10.66%
allocator_allocated:84715102264
allocator_active:101370822656
allocator_resident:102303637504
total_system_memory:810745470976
total_system_memory_human:755.07G
used_memory_lua:142336
used_memory_lua_human:139.00K
used_memory_scripts:6576
used_memory_scripts_human:6.42K
number_of_cached_scripts:13
maxmemory:17179869184
maxmemory_human:16.00G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.20
allocator_frag_bytes:16655720392
Memory spikes that lead to data eviction are a fairly common problem in Redis. When facing such problems, many people lack a clear analytical approach and often mistakenly attribute the growth to replication, RDB persistence, or similar operations. Next, let's look at how to analyze this type of problem systematically.
This article consists of the following sections:
- Where does used_memory in INFO come from?
- What is used_memory?
- What does used_memory typically consist of?
- Changes in memory statistics in Redis 7.
- Trigger conditions for data eviction: once used_memory exceeds maxmemory, is eviction necessarily triggered?
- Finally, a script is shared to help analyze, in real time, which specific part of memory consumption is driving the growth of used_memory.
Where does used_memory in INFO come from?
When we execute the INFO command, Redis calls the genRedisInfoString function to generate its output.
sds genRedisInfoString(const char *section) {
...
/* Memory */
if (allsections || defsections || !strcasecmp(section,"memory")) {
...
size_t zmalloc_used = zmalloc_used_memory();
...
if (sections++) info = sdscat(info,"\r\n");
info = sdscatprintf(info,
"# Memory\r\n"
"used_memory:%zu\r\n"
"used_memory_human:%s\r\n"
"used_memory_rss:%zu\r\n"
...
"lazyfreed_objects:%zu\r\n",
zmalloc_used,
hmem,
server.cron_malloc_stats.process_rss,
...
lazyfreeGetFreedObjectsCount()
);
freeMemoryOverheadData(mh);
}
...
return info;
}
As you can see, the used_memory value comes from zmalloc_used, which is obtained via the zmalloc_used_memory() function.
size_t zmalloc_used_memory(void) {
size_t um;
atomicGet(used_memory,um);
return um;
}
The implementation of zmalloc_used_memory() is simple: it reads the value of used_memory atomically.
What is used_memory?
used_memory is a static variable of type redisAtomic size_t, where redisAtomic is an alias for the _Atomic type. _Atomic is a keyword introduced by the C11 standard for declaring atomic types; it ensures that operations on such types are atomic in a multithreaded environment, avoiding data races.
#define redisAtomic _Atomic
static redisAtomic size_t used_memory = 0;
used_memory is updated mainly through two macros:
#define update_zmalloc_stat_alloc(__n) atomicIncr(used_memory,(__n))
#define update_zmalloc_stat_free(__n) atomicDecr(used_memory,(__n))
Among them, update_zmalloc_stat_alloc(__n) is called when memory is allocated and atomically increases used_memory by __n, while update_zmalloc_stat_free(__n) is called when memory is freed and atomically decreases used_memory by __n. These two macros ensure that used_memory is updated accurately during memory allocation and freeing, avoiding data races caused by concurrent operations.
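To see what _Atomic buys here, consider the following stand-alone demo (hypothetical code, not Redis source): two threads update a counter the same way the two macros do. With a plain size_t the final value would be unpredictable; with _Atomic it is always 0.
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic size_t counter = 0;

/* Simulates update_zmalloc_stat_alloc: atomically add to the counter. */
static void *alloc_worker(void *arg) {
    for (int i = 0; i < 1000000; i++) atomic_fetch_add(&counter, 64);
    return NULL;
}

/* Simulates update_zmalloc_stat_free: atomically subtract from the counter. */
static void *free_worker(void *arg) {
    for (int i = 0; i < 1000000; i++) atomic_fetch_sub(&counter, 64);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, alloc_worker, NULL);
    pthread_create(&t2, NULL, free_worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %zu\n", atomic_load(&counter)); /* always 0 */
    return 0;
}
Compile with cc -std=c11 -pthread. Redis wraps these operations in its atomicIncr/atomicDecr macros so that it can fall back to GCC builtins or mutexes on platforms without C11 atomics.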
Whenever memory is allocated or freed through the memory allocator (common allocators include glibc's malloc, tcmalloc, and jemalloc; Redis generally uses jemalloc), update_zmalloc_stat_alloc or update_zmalloc_stat_free is called at the same time to update the used_memory value.
In Redis, memory management is mainly implemented through the following two functions:
void *ztrymalloc_usable(size_t size, size_t *usable) {
ASSERT_NO_SIZE_OVERFLOW(size);
void *ptr = malloc(MALLOC_MIN_SIZE(size)+PREFIX_SIZE);
if (!ptr) return NULL;
#ifdef HAVE_MALLOC_SIZE
size = zmalloc_size(ptr);
update_zmalloc_stat_alloc(size);
if (usable) *usable = size;
return ptr;
#else
...
#endif
}
void zfree(void *ptr) {
...
if (ptr == NULL) return;
#ifdef HAVE_MALLOC_SIZE
update_zmalloc_stat_free(zmalloc_size(ptr));
free(ptr);
#else
...
#endif
}
Among them:
- The ztrymalloc_usable function allocates memory. It first calls malloc to allocate the memory; if the allocation succeeds, it updates used_memory via update_zmalloc_stat_alloc.
- The zfree function frees memory. Before releasing the memory, it adjusts used_memory via update_zmalloc_stat_free and then calls free to release it.
This mechanism ensures that Redis is able to accurately track memory allocations and releases and thus effectively manage memory usage.
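This pattern is easy to reproduce outside Redis. Below is a minimal sketch of the same bookkeeping, assuming glibc's malloc_usable_size() in place of zmalloc_size() (hypothetical demo code, not the actual zmalloc implementation):
#include <malloc.h>   /* malloc_usable_size (glibc-specific) */
#include <stdatomic.h>
#include <stdlib.h>

static _Atomic size_t used = 0;

/* Allocate memory and account for the block's real usable size,
 * which may be larger than the requested size. */
void *tracked_malloc(size_t size) {
    void *ptr = malloc(size);
    if (!ptr) return NULL;
    atomic_fetch_add(&used, malloc_usable_size(ptr));
    return ptr;
}

/* Subtract the block's usable size before handing it back to the allocator. */
void tracked_free(void *ptr) {
    if (ptr == NULL) return;
    atomic_fetch_sub(&used, malloc_usable_size(ptr));
    free(ptr);
}

/* Analogous to zmalloc_used_memory(). */
size_t tracked_used_memory(void) {
    return atomic_load(&used);
}
This is also why Redis tracks zmalloc_size(ptr) rather than the requested size: the allocator may round allocations up, and used_memory is meant to reflect what was actually handed out.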
What does used_memory typically consist of?
used_memory consists of two main parts:
- The data itself, corresponding to used_memory_dataset in INFO.
- The overhead of managing and maintaining the data structures internally, corresponding to used_memory_overhead in INFO.
Note that used_memory_dataset is not calculated from the number of keys and the memory they occupy; it is obtained by subtracting used_memory_overhead from used_memory.
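Using the INFO output at the beginning of this article as a worked example: 84716542624 (used_memory) - 75682545624 (used_memory_overhead) = 9033997000 bytes, which is exactly the reported used_memory_dataset, about 10.66% of the total (used_memory_dataset_perc). In other words, almost all of the 78.9G in this incident was overhead rather than data.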
Next, let's focus on where used_memory_overhead comes from. Redis actually provides a dedicated function, getMemoryOverheadData, to calculate this part of the memory overhead.
struct redisMemOverhead *getMemoryOverheadData(void) {
int j;
// mem_total is used to accumulate total memory overhead, which is eventually assigned to used_memory_overhead.
size_t mem_total = 0;
// mem is used to calculate memory usage for each section.
size_t mem = 0;
// Call zmalloc_used_memory() to get used_memory.
size_t zmalloc_used = zmalloc_used_memory();
// Allocate memory for a redisMemOverhead structure using zcalloc.
struct redisMemOverhead *mh = zcalloc(sizeof(*mh));
...
// Add the memory usage at Redis startup server.initial_memory_usage to the total memory overhead.
mem_total += server.initial_memory_usage;
mem = 0;
// Add the memory overhead of the replication backlog buffer to the total memory overhead.
if (server.repl_backlog)
mem += zmalloc_size(server.repl_backlog);
mh->repl_backlog = mem;
mem_total += mem;
/* Computing the memory used by the clients would be O(N) if done
* here online. We use our values computed incrementally by
* clientsCronTrackClientsMemUsage(). */
// Calculate client-side memory overhead
mh->clients_slaves = server.stat_clients_type_memory[CLIENT_TYPE_SLAVE];
mh->clients_normal = server.stat_clients_type_memory[CLIENT_TYPE_MASTER]+
server.stat_clients_type_memory[CLIENT_TYPE_PUBSUB]+
server.stat_clients_type_memory[CLIENT_TYPE_NORMAL];
mem_total += mh->clients_slaves;
mem_total += mh->clients_normal;
// Calculate memory overhead for AOF Buffer and AOF Rewrite Buffer
mem = 0;
if (server.aof_state != AOF_OFF) {
mem += sdsZmallocSize(server.aof_buf);
mem += aofRewriteBufferSize();
}
mh->aof_buffer = mem;
mem_total+=mem;
// Calculate the memory overhead of the Lua script cache
mem = server.lua_scripts_mem;
mem += dictSize(server.lua_scripts) * sizeof(dictEntry) +
dictSlots(server.lua_scripts) * sizeof(dictEntry*);
mem += dictSize(server.repl_scriptcache_dict) * sizeof(dictEntry) +
dictSlots(server.repl_scriptcache_dict) * sizeof(dictEntry*);
if (listLength(server.repl_scriptcache_fifo) > 0) {
mem += listLength(server.repl_scriptcache_fifo) * (sizeof(listNode) +
sdsZmallocSize(listNodeValue(listFirst(server.repl_scriptcache_fifo))));
}
mh->lua_caches = mem;
mem_total+=mem;
// Calculate the memory overhead of the databases: traverse all databases (server.dbnum). For each database, calculate the memory overhead of the main dictionary (db->dict) and the expires dictionary (db->expires).
for (j = 0; j < server.dbnum; j++) {
redisDb *db = server.db+j;
long long keyscount = dictSize(db->dict);
if (keyscount==0) continue;
mh->total_keys += keyscount;
mh->db = zrealloc(mh->db,sizeof(mh->db[0])*(mh->num_dbs+1));
mh->db[mh->num_dbs].dbid = j;
mem = dictSize(db->dict) * sizeof(dictEntry) +
dictSlots(db->dict) * sizeof(dictEntry*) +
dictSize(db->dict) * sizeof(robj);
mh->db[mh->num_dbs].overhead_ht_main = mem;
mem_total+=mem;
mem = dictSize(db->expires) * sizeof(dictEntry) +
dictSlots(db->expires) * sizeof(dictEntry*);
mh->db[mh->num_dbs].overhead_ht_expires = mem;
mem_total+=mem;
mh->num_dbs++;
}
// Assign the calculated mem_total to mh->overhead_total.
mh->overhead_total = mem_total;
// Calculate the memory overhead of the data (zmalloc_used - mem_total) and store it in mh->dataset.
mh->dataset = zmalloc_used - mem_total;
mh->peak_perc = (float)zmalloc_used*100/mh->peak_allocated;
/* Metrics computed after subtracting the startup memory from
* the total memory. */
size_t net_usage = 1;
if (zmalloc_used > mh->startup_allocated)
net_usage = zmalloc_used - mh->startup_allocated;
mh->dataset_perc = (float)mh->dataset*100/net_usage;
mh->bytes_per_key = mh->total_keys ? (net_usage / mh->total_keys) : 0;
return mh;
}
Based on the analysis of the code above, used_memory_overhead consists of the following parts:
- server.initial_memory_usage: memory usage at Redis startup, corresponding to used_memory_startup in INFO.
- mh->repl_backlog: memory overhead of the replication backlog buffer, corresponding to mem_replication_backlog in INFO.
- mh->clients_slaves: memory overhead of slave clients, corresponding to mem_clients_slaves in INFO.
- mh->clients_normal: memory overhead of other clients, corresponding to mem_clients_normal in INFO.
- mh->aof_buffer: memory overhead of the AOF buffer and the AOF rewrite buffer, corresponding to mem_aof_buffer in INFO. The AOF buffer is used to stage data before it is written to the AOF file; the AOF rewrite buffer holds new writes that arrive during an AOF rewrite.
- mh->lua_caches: memory overhead of the Lua script cache, corresponding to used_memory_scripts in INFO. Introduced in Redis 5.0.
- The memory overhead of the database dictionaries, which is not shown in INFO and must be viewed through MEMORY STATS (the field names match the overhead_ht_main and overhead_ht_expires values computed above):
17) "db.0"
18) 1) "overhead.hashtable.main"
    2) (integer) 2536870912
    3) "overhead.hashtable.expires"
    4) (integer) 0
Of these memory overheads, used_memory_startup is essentially constant, mem_replication_backlog is capped by repl-backlog-size, used_memory_scripts is generally insignificant, and the dictionary overhead is proportional to the amount of data.
So there are three main items to focus on: mem_clients_slaves, mem_clients_normal, and mem_aof_buffer.
- mem_aof_buffer: pay attention to the size of the buffers during AOF rewrites.
- mem_clients_slaves and mem_clients_normal: both are clients, and their memory composition is the same. Client memory overhead consists of the following three parts (the relevant configuration defaults are shown after this list):
  - The input buffer, used to temporarily store client commands; its size is capped by client-query-buffer-limit.
  - The output buffer, used to cache data to be sent to the client; its size is controlled by client-output-buffer-limit. If the output buffer exceeds the hard limit, or stays above the soft limit for a sustained period, the client is closed.
  - The memory occupied by the client object itself.
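For reference, these limits are configured in redis.conf. The defaults in recent versions look roughly like the following (the slave class was renamed replica in Redis 5.0; the three values are the hard limit, the soft limit, and the soft-limit duration in seconds):
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
A hard limit of 0 disables the check, so normal clients are unrestricted by default, while a replica is disconnected once its output buffer exceeds 256mb, or stays above 64mb for 60 seconds.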
Redis 7 Changes in Memory Statistics
In Redis 7, memory overhead is also counted for the following items:
- mh->cluster_links: memory overhead of cluster links, corresponding to mem_cluster_links in INFO.
- mh->functions_caches: memory overhead of the Function cache, corresponding to used_memory_functions in INFO.
- The memory overhead of the key-to-slot mapping in cluster mode, corresponding to overhead.db.hashtable.slot-to-keys in MEMORY STATS.
In addition, Redis 7 introduces Multi-Part AOF, a feature that removes the AOF rewrite buffer.
Note that the memory calculations for mh->repl_backlog and mh->clients_slaves have also changed.
Prior to Redis 7, mh->repl_backlog counted the size of the replication backlog buffer, and mh->clients_slaves counted the memory usage of all slave node clients.
if (server.repl_backlog)
mem += zmalloc_size(server.repl_backlog);
mh->repl_backlog = mem;
mem_total += mem;
mem = 0;
// Iterate through all slave node clients, totaling up their output buffer, input buffer memory usage, and the memory footprint of the client objects themselves.
if (listLength(server.slaves)) {
listIter li;
listNode *ln;
listRewind(server.slaves,&li);
while((ln = listNext(&li))) {
client *c = listNodeValue(ln);
mem += getClientOutputBufferMemoryUsage(c);
mem += sdsAllocSize(c->querybuf);
mem += sizeof(client);
}
}
mh->clients_slaves = mem;
Because each slave node allocates a separate replication buffer (i.e., the output buffer of the slave's client connection), this implementation wastes memory as the number of slaves grows. What's more, when client-output-buffer-limit is set too large and there are too many slave nodes, the master node can easily run OOM.
To address this problem, Redis 7 introduces a global replication buffer. Both the replication backlog buffer (repl-backlog) and the replication buffers of slave nodes share this buffer.
The replBufBlock structure is used to store one block of the global replication buffer.
typedef struct replBufBlock {
int refcount; /* Number of replicas or repl backlog using. */
long long id; /* The unique incremental number. */
long long repl_offset; /* Start replication offset of the block. */
size_t size, used;
char buf[];
} replBufBlock;
Each replBufBlock contains a refcount field that records how many replication instances (the master's replication backlog buffer plus the slave nodes) reference the block. When a new slave node is added, Redis does not allocate a new replication buffer for it; instead, it increments the refcount of the existing replBufBlock, as sketched below.
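The following is hypothetical demo code, not actual Redis source: server.repl_buffer_blocks, ref_repl_buf_node, and ref_block_pos are real Redis 7 names, but the function itself is simplified for illustration.
/* Simplified sketch: a newly attached replica just references the tail
 * block of the shared global replication buffer instead of getting a copy. */
void attachReplicaToGlobalBuffer(client *replica) {
    listNode *ln = listLast(server.repl_buffer_blocks);
    replBufBlock *tail = ln ? listNodeValue(ln) : NULL;
    if (tail) {
        tail->refcount++;                 /* shared, not copied */
        replica->ref_repl_buf_node = ln;  /* where this replica starts reading */
        replica->ref_block_pos = tail->used;
    }
}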
Accordingly, the way mh->repl_backlog and mh->clients_slaves are calculated has changed in Redis 7.
if (listLength(server.slaves) &&
(long long)server.repl_buffer_mem > server.repl_backlog_size)
{
mh->clients_slaves = server.repl_buffer_mem - server.repl_backlog_size;
mh->repl_backlog = server.repl_backlog_size;
} else {
mh->clients_slaves = 0;
mh->repl_backlog = server.repl_buffer_mem;
}
if (server.repl_backlog) {
/* The approximate memory of rax tree for indexed blocks. */
mh->repl_backlog +=
server.repl_backlog->blocks_index->numnodes * sizeof(raxNode) +
raxSize(server.repl_backlog->blocks_index) * sizeof(void*);
}
mem_total += mh->repl_backlog;
mem_total += mh->clients_slaves;
Specifically, if the global replication buffer is larger than repl-backlog-size, then the replication backlog (mh->repl_backlog) is counted as repl-backlog-size and the remainder is counted as memory used by the slaves (mh->clients_slaves). If the global replication buffer is smaller than or equal to repl-backlog-size, mh->repl_backlog takes the size of the global replication buffer directly and mh->clients_slaves is counted as 0.
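As a hypothetical illustration: with repl-backlog-size set to 160MB, a global replication buffer of 620MB is reported as mem_replication_backlog = 160MB and mem_clients_slaves = 460MB, whereas a global buffer of 100MB is reported as mem_replication_backlog = 100MB and mem_clients_slaves = 0.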
In addition, since a Rax tree is introduced to index some of the nodes in the global replication buffer, replicating the backlog buffer also requires calculating the memory overhead of the Rax tree.
Trigger conditions for data eviction
Many people have the misconception that as long as used_memory is greater than maxmemory, data eviction will be triggered. This is not the case.
The following conditions must all be met for data eviction to be triggered:
- maxmemory must be greater than 0.
- maxmemory-policy must not be noeviction.
- Memory usage must meet a specific condition: not used_memory greater than maxmemory, but used_memory minus mem_not_counted_for_evict greater than maxmemory.
The value of mem_not_counted_for_evict can be obtained with the INFO command; it is calculated in the freeMemoryGetNotCountedMemory function.
size_t freeMemoryGetNotCountedMemory(void) {
size_t overhead = 0;
int slaves = listLength(server.slaves);
if (slaves) {
listIter li;
listNode *ln;
listRewind(server.slaves,&li);
while((ln = listNext(&li))) {
client *slave = listNodeValue(ln);
overhead += getClientOutputBufferMemoryUsage(slave);
}
}
if (server.aof_state != AOF_OFF) {
overhead += sdsalloc(server.aof_buf)+aofRewriteBufferSize();
}
return overhead;
}
The freeMemoryGetNotCountedMemory function totals the size of the replication buffers of all slave nodes, the AOF buffer, and the AOF rewrite buffer. In other words, when Redis decides whether data needs to be evicted, it first deducts from used_memory the memory occupied by the slaves' replication buffers, the AOF buffer, and the AOF rewrite buffer, as the sketch below shows.
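Putting the three conditions together, the decision looks roughly like this (a condensed sketch modeled on Redis's getMaxmemoryState() and performEvictions(), not verbatim source):
/* Returns 1 if eviction should run, 0 otherwise (simplified sketch). */
int evictionNeeded(void) {
    if (server.maxmemory == 0) return 0;                            /* condition 1 */
    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION) return 0; /* condition 2 */
    size_t mem_used = zmalloc_used_memory();
    size_t overhead = freeMemoryGetNotCountedMemory();              /* mem_not_counted_for_evict */
    mem_used = (mem_used > overhead) ? mem_used - overhead : 0;
    return mem_used > server.maxmemory;                             /* condition 3 */
}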
Redis Memory Analysis Script
Finally, let me share a script that can help us quickly analyze Redis memory usage. From its output, we can see at a glance how much memory each part of Redis consumes and identify which specific part is responsible for the growth of used_memory.
Script address: https://github.com/slowtech/dba-toolkit/blob/master/redis/redis_mem_usage_analyzer.py
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379
Metric(2024-09-12 04:52:42) Old Value New Value(+3s) Change per second
==========================================================================================
Summary
---------------------------------------------
used_memory 16.43G 16.44G 1.1M
used_memory_dataset 11.93G 11.93G 22.66K
used_memory_overhead 4.51G 4.51G 1.08M
Overhead(Total) 4.51G 4.51G 1.08M
---------------------------------------------
mem_clients_normal 440.57K 440.52K -18.67B
mem_clients_slaves 458.41M 461.63M 1.08M
mem_replication_backlog 160M 160M 0B
mem_aof_buffer 0B 0B 0B
used_memory_startup 793.17K 793.17K 0B
used_memory_scripts 0B 0B 0B
mem_hashtable 3.9G 3.9G 0B
Evict & Fragmentation
---------------------------------------------
maxmemory 20G 20G 0B
mem_not_counted_for_evict 458.45M 461.73M 1.1M
mem_counted_for_evict 15.99G 15.99G 2.62K
maxmemory_policy volatile-lru volatile-lru
used_memory_peak 16.43G 16.44G 1.1M
used_memory_rss 16.77G 16.77G 1.32M
mem_fragmentation_bytes 345.07M 345.75M 232.88K
Others
---------------------------------------------
keys 77860000 77860000 0.0
instantaneous_ops_per_sec 8339 8435
lazyfree_pending_objects 0 0 0.0
The script collects Redis memory data at regular intervals (set by the -i parameter; the default is 3 seconds). It then compares the newly collected data (New Value) with the previous data (Old Value) and calculates the per-second change (Change per second).
The output is divided into four main sections:
- Summary: a summary section; used_memory = used_memory_dataset + used_memory_overhead.
- Overhead(Total): shows the memory consumption of each item in used_memory_overhead. Overhead(Total) is the sum of all the items and should theoretically equal used_memory_overhead.
- Evict & Fragmentation: shows key metrics related to eviction and memory fragmentation, where mem_counted_for_evict = used_memory - mem_not_counted_for_evict. Data eviction is triggered only when mem_counted_for_evict exceeds maxmemory.
- Others: other important metrics, including keys (the total number of keys), instantaneous_ops_per_sec (operations per second), and lazyfree_pending_objects (the number of objects pending asynchronous deletion).
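Reading the sample output above with this in mind: used_memory is growing by about 1.1M per second, and virtually all of that growth comes from mem_clients_slaves (1.08M per second), so the slave output buffers are where to look first.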
If mem_clients_normal or mem_clients_slaves is large, you can specify the --client option to view the memory usage of each client:
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379 --client
ID Address Name Age Command User Qbuf Omem Total Memory
----------------------------------------------------------------------------------------------------
216 10.0.1.75:37811 721 psync default 0B 232.83M 232.85M
217 10.0.1.22:35057 715 psync default 0B 232.11M 232.13M
453 10.0.0.198:51172 0 client default 26B 0B 60.03K
...
Among them:
- Qbuf: the size of the input buffer.
- Omem: the size of the output buffer.
- Total Memory: the total memory occupied by the connection.
The results are sorted by Total Memory in descending order.