Background
I recently ran into a case where a Redis instance experienced a sudden memory spike: used_memory peaked at 78.9G, while the instance's maxmemory was configured at only 16G. As a result, a large amount of data was evicted from the instance.
Here is part of the INFO MEMORY output from when the problem occurred:
# Memory
used_memory:84716542624
used_memory_human:78.90G
used_memory_rss:104497676288
used_memory_rss_human:97.32G
used_memory_peak:84716542624
used_memory_peak_human:78.90G
used_memory_peak_perc:100.00%
used_memory_overhead:75682545624
used_memory_startup:906952
used_memory_dataset:9033997000
used_memory_dataset_perc:10.66%
allocator_allocated:84715102264
allocator_active:101370822656
allocator_resident:102303637504
total_system_memory:810745470976
total_system_memory_human:755.07G
used_memory_lua:142336
used_memory_lua_human:139.00K
used_memory_scripts:6576
used_memory_scripts_human:6.42K
number_of_cached_scripts:13
maxmemory:17179869184
maxmemory_human:16.00G
maxmemory_policy:volatile-lru
allocator_frag_ratio:1.20
allocator_frag_bytes:16655720392
Memory spikes that lead to data eviction are a fairly common problem in Redis. When facing such problems, many people lack a clear analytical approach and often mistakenly attribute the growth to replication, RDB persistence, or similar operations. Next, let's look at how to analyze this type of problem systematically.
This article consists of the following sections:
- Where does used_memory in INFO come from?
- What is used_memory?
- What does used_memory typically consist of?
- Changes in memory statistics in Redis 7.
- Trigger conditions for data eviction: once used_memory exceeds maxmemory, is eviction necessarily triggered?
- Finally, a script is shared to help analyze, in real time, which specific part of memory consumption is driving the growth of used_memory.
Where does used_memory in INFO come from?
When we execute the INFO command, Redis calls the genRedisInfoString function to generate its output.
sds genRedisInfoString(const char *section) {
...
/* Memory */
if (allsections || defsections || !strcasecmp(section,"memory")) {
...
size_t zmalloc_used = zmalloc_used_memory();
...
if (sections++) info = sdscat(info,"\r\n");
info = sdscatprintf(info,
"# Memory\r\n"
"used_memory:%zu\r\n"
"used_memory_human:%s\r\n"
"used_memory_rss:%zu\r\n"
...
"lazyfreed_objects:%zu\r\n",
zmalloc_used,
hmem,
server.cron_malloc_stats.process_rss,
...
lazyfreeGetFreedObjectsCount()
);
freeMemoryOverheadData(mh);
}
...
return info;
}
As you can see, the used_memory value comes from zmalloc_used, which is obtained via the zmalloc_used_memory() function.
size_t zmalloc_used_memory(void) {
size_t um;
atomicGet(used_memory,um);
return um;
}
The implementation of zmalloc_used_memory() is simple: it reads the value of used_memory atomically.
What is used_memory?
used_memory is a static variable of type redisAtomic size_t, where redisAtomic is an alias for the _Atomic type. _Atomic is a keyword introduced by the C11 standard for declaring atomic types; it ensures that operations on such types are atomic in a multithreaded environment, avoiding data races.
#define redisAtomic _Atomic
static redisAtomic size_t used_memory = 0;
used_memory is updated mainly through two macros:
#define update_zmalloc_stat_alloc(__n) atomicIncr(used_memory,(__n))
#define update_zmalloc_stat_free(__n) atomicDecr(used_memory,(__n))
Among them, update_zmalloc_stat_alloc(__n) is called when memory is allocated and atomically increases used_memory by __n, while update_zmalloc_stat_free(__n) is called when memory is freed and atomically decreases used_memory by __n. These two macros ensure that used_memory is updated accurately during memory allocation and freeing, avoiding data races caused by concurrent operations.
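To see what _Atomic buys here, consider the following stand-alone demo (hypothetical code, not Redis source): two threads update a counter the same way the two macros do. With a plain size_t the final value would be unpredictable; with _Atomic it is always 0.
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static _Atomic size_t counter = 0;

/* Simulates update_zmalloc_stat_alloc: atomically add to the counter. */
static void *alloc_worker(void *arg) {
    for (int i = 0; i < 1000000; i++) atomic_fetch_add(&counter, 64);
    return NULL;
}

/* Simulates update_zmalloc_stat_free: atomically subtract from the counter. */
static void *free_worker(void *arg) {
    for (int i = 0; i < 1000000; i++) atomic_fetch_sub(&counter, 64);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, alloc_worker, NULL);
    pthread_create(&t2, NULL, free_worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %zu\n", atomic_load(&counter)); /* always 0 */
    return 0;
}
Compile with cc -std=c11 -pthread. Redis wraps these operations in its atomicIncr/atomicDecr macros so that it can fall back to GCC builtins or mutexes on platforms without C11 atomics.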
Whenever memory is allocated or freed through the memory allocator (common allocators include glibc's malloc, tcmalloc, and jemalloc; Redis generally uses jemalloc), update_zmalloc_stat_alloc or update_zmalloc_stat_free is called at the same time to update the used_memory value.
In Redis, memory management is mainly implemented through the following two functions:
void *ztrymalloc_usable(size_t size, size_t *usable) {
ASSERT_NO_SIZE_OVERFLOW(size);
void *ptr = malloc(MALLOC_MIN_SIZE(size)+PREFIX_SIZE);
if (!ptr) return NULL;
#ifdef HAVE_MALLOC_SIZE
size = zmalloc_size(ptr);
update_zmalloc_stat_alloc(size);
if (usable) *usable = size;
return ptr;
#else
...
#endif
}
void zfree(void *ptr) {
...
if (ptr == NULL) return;
#ifdef HAVE_MALLOC_SIZE
update_zmalloc_stat_free(zmalloc_size(ptr));
free(ptr);
#else
...
#endif
}
Among them:
- The ztrymalloc_usable function allocates memory. It first calls malloc to allocate the memory; if the allocation succeeds, it updates used_memory via update_zmalloc_stat_alloc.
- The zfree function frees memory. Before releasing the memory, it adjusts used_memory via update_zmalloc_stat_free and then calls free to release it.
This mechanism ensures that Redis is able to accurately track memory allocations and releases and thus effectively manage memory usage.
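This pattern is easy to reproduce outside Redis. Below is a minimal sketch of the same bookkeeping, assuming glibc's malloc_usable_size() in place of zmalloc_size() (hypothetical demo code, not the actual zmalloc implementation):
#include <malloc.h>   /* malloc_usable_size (glibc-specific) */
#include <stdatomic.h>
#include <stdlib.h>

static _Atomic size_t used = 0;

/* Allocate memory and account for the block's real usable size,
 * which may be larger than the requested size. */
void *tracked_malloc(size_t size) {
    void *ptr = malloc(size);
    if (!ptr) return NULL;
    atomic_fetch_add(&used, malloc_usable_size(ptr));
    return ptr;
}

/* Subtract the block's usable size before handing it back to the allocator. */
void tracked_free(void *ptr) {
    if (ptr == NULL) return;
    atomic_fetch_sub(&used, malloc_usable_size(ptr));
    free(ptr);
}

/* Analogous to zmalloc_used_memory(). */
size_t tracked_used_memory(void) {
    return atomic_load(&used);
}
This is also why Redis tracks zmalloc_size(ptr) rather than the requested size: the allocator may round allocations up, and used_memory is meant to reflect what was actually handed out.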
What does used_memory typically consist of?
used_memory consists of two main parts:
- The data itself, corresponding to used_memory_dataset in INFO.
- The overhead of managing and maintaining the data structures internally, corresponding to used_memory_overhead in INFO.
Note that used_memory_dataset is not calculated from the number of keys and the memory they occupy; it is obtained by subtracting used_memory_overhead from used_memory.
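Using the INFO output at the beginning of this article as a worked example: 84716542624 (used_memory) - 75682545624 (used_memory_overhead) = 9033997000 bytes, which is exactly the reported used_memory_dataset, about 10.66% of the total (used_memory_dataset_perc). In other words, almost all of the 78.9G in this incident was overhead rather than data.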
Next, let's focus on where used_memory_overhead comes from. Redis actually provides a dedicated function, getMemoryOverheadData, to calculate this part of the memory overhead.
struct redisMemOverhead *getMemoryOverheadData(void) {
int j;
// mem_total is used to accumulate total memory overhead, which is eventually assigned to used_memory_overhead.
size_t mem_total = 0;
// mem is used to calculate memory usage for each section.
size_t mem = 0;
// Call zmalloc_used_memory() to get used_memory.
size_t zmalloc_used = zmalloc_used_memory();
// Allocate memory for a redisMemOverhead structure using zcalloc.
struct redisMemOverhead *mh = zcalloc(sizeof(*mh));
...
// Add the memory usage at Redis startup server.initial_memory_usage to the total memory overhead.
mem_total += server.initial_memory_usage;
mem = 0;
// Add the memory overhead of the replication backlog buffer to the total memory overhead.
if (server.repl_backlog)
mem += zmalloc_size(server.repl_backlog);
mh->repl_backlog = mem;
mem_total += mem;
/* Computing the memory used by the clients would be O(N) if done
* here online. We use our values computed incrementally by
* clientsCronTrackClientsMemUsage(). */
// Calculate client-side memory overhead
mh->clients_slaves = server.stat_clients_type_memory[CLIENT_TYPE_SLAVE];
mh->clients_normal = server.stat_clients_type_memory[CLIENT_TYPE_MASTER]+
server.stat_clients_type_memory[CLIENT_TYPE_PUBSUB]+
server.stat_clients_type_memory[CLIENT_TYPE_NORMAL];
mem_total += mh->clients_slaves;
mem_total += mh->clients_normal;
// Calculate memory overhead for AOF Buffer and AOF Rewrite Buffer
mem = 0;
if (server.aof_state != AOF_OFF) {
mem += sdsZmallocSize(server.aof_buf);
mem += aofRewriteBufferSize();
}
mh->aof_buffer = mem;
mem_total+=mem;
// Calculate the memory overhead of the Lua script cache
mem = server.lua_scripts_mem;
mem += dictSize(server.lua_scripts) * sizeof(dictEntry) +
dictSlots(server.lua_scripts) * sizeof(dictEntry*);
mem += dictSize(server.repl_scriptcache_dict) * sizeof(dictEntry) +
dictSlots(server.repl_scriptcache_dict) * sizeof(dictEntry*);
if (listLength(server.repl_scriptcache_fifo) > 0) {
mem += listLength(server.repl_scriptcache_fifo) * (sizeof(listNode) +
sdsZmallocSize(listNodeValue(listFirst(server.repl_scriptcache_fifo))));
}
mh->lua_caches = mem;
mem_total+=mem;
// Calculate the memory overhead of the databases: traverse all databases (server.dbnum). For each database, calculate the memory overhead of the main dictionary (db->dict) and the expires dictionary (db->expires).
for (j = 0; j < server.dbnum; j++) {
redisDb *db = server.db+j;
long long keyscount = dictSize(db->dict);
if (keyscount==0) continue;
mh->total_keys += keyscount;
mh->db = zrealloc(mh->db,sizeof(mh->db[0])*(mh->num_dbs+1));
mh->db[mh->num_dbs].dbid = j;
mem = dictSize(db->dict) * sizeof(dictEntry) +
dictSlots(db->dict) * sizeof(dictEntry*) +
dictSize(db->dict) * sizeof(robj);
mh->db[mh->num_dbs].overhead_ht_main = mem;
mem_total+=mem;
mem = dictSize(db->expires) * sizeof(dictEntry) +
dictSlots(db->expires) * sizeof(dictEntry*);
mh->db[mh->num_dbs].overhead_ht_expires = mem;
mem_total+=mem;
mh->num_dbs++;
}
// Assign the calculated mem_total to mh->overhead_total.
mh->overhead_total = mem_total;
// Calculate the memory overhead of the data (zmalloc_used - mem_total) and store it in mh->dataset.
mh->dataset = zmalloc_used - mem_total;
mh->peak_perc = (float)zmalloc_used*100/mh->peak_allocated;
/* Metrics computed after subtracting the startup memory from
* the total memory. */
size_t net_usage = 1;
if (zmalloc_used > mh->startup_allocated)
net_usage = zmalloc_used - mh->startup_allocated;
mh->dataset_perc = (float)mh->dataset*100/net_usage;
mh->bytes_per_key = mh->total_keys ? (net_usage / mh->total_keys) : 0;
return mh;
}
Based on the analysis of the code above, used_memory_overhead consists of the following parts:
- server.initial_memory_usage: memory usage at Redis startup, corresponding to used_memory_startup in INFO.
- mh->repl_backlog: memory overhead of the replication backlog buffer, corresponding to mem_replication_backlog in INFO.
- mh->clients_slaves: memory overhead of slave clients, corresponding to mem_clients_slaves in INFO.
- mh->clients_normal: memory overhead of other clients, corresponding to mem_clients_normal in INFO.
- mh->aof_buffer: memory overhead of the AOF buffer and the AOF rewrite buffer, corresponding to mem_aof_buffer in INFO. The AOF buffer is used to stage data before it is written to the AOF file; the AOF rewrite buffer holds new writes that arrive during an AOF rewrite.
- mh->lua_caches: memory overhead of the Lua script cache, corresponding to used_memory_scripts in INFO. Introduced in Redis 5.0.
- The memory overhead of the database dictionaries, which is not shown in INFO and must be viewed through MEMORY STATS (the field names match the overhead_ht_main and overhead_ht_expires values computed above):
17) "db.0"
18) 1) "overhead.hashtable.main"
    2) (integer) 2536870912
    3) "overhead.hashtable.expires"
    4) (integer) 0
Of these memory overheads, used_memory_startup is essentially constant, mem_replication_backlog is capped by repl-backlog-size, used_memory_scripts is generally insignificant, and the dictionary overhead is proportional to the amount of data.
So there are three main items to focus on: mem_clients_slaves, mem_clients_normal, and mem_aof_buffer.
- mem_aof_buffer: pay attention to the size of the buffers during AOF rewrites.
- mem_clients_slaves and mem_clients_normal: both are clients, and their memory composition is the same. Client memory overhead consists of the following three parts (the relevant configuration defaults are shown after this list):
  - The input buffer, used to temporarily store client commands; its size is capped by client-query-buffer-limit.
  - The output buffer, used to cache data to be sent to the client; its size is controlled by client-output-buffer-limit. If the output buffer exceeds the hard limit, or stays above the soft limit for a sustained period, the client is closed.
  - The memory occupied by the client object itself.
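For reference, these limits are configured in redis.conf. The defaults in recent versions look roughly like the following (the slave class was renamed replica in Redis 5.0; the three values are the hard limit, the soft limit, and the soft-limit duration in seconds):
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit replica 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
A hard limit of 0 disables the check, so normal clients are unrestricted by default, while a replica is disconnected once its output buffer exceeds 256mb, or stays above 64mb for 60 seconds.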
Redis 7 Changes in Memory Statistics
In Redis 7, memory overhead is also counted for the following items:
- mh->cluster_links: memory overhead of cluster links, corresponding to mem_cluster_links in INFO.
- mh->functions_caches: memory overhead of the Function cache, corresponding to used_memory_functions in INFO.
- The memory overhead of the key-to-slot mapping in cluster mode, corresponding to overhead.db.hashtable.slot-to-keys in MEMORY STATS.
In addition, Redis 7 introduces Multi-Part AOF, a feature that removes the AOF rewrite buffer.
Note that the memory calculations for mh->repl_backlog and mh->clients_slaves have also changed.
Prior to Redis 7, mh->repl_backlog counted the size of the replication backlog buffer, and mh->clients_slaves counted the memory usage of all slave node clients.
if (server.repl_backlog)
mem += zmalloc_size(server.repl_backlog);
mh->repl_backlog = mem;
mem_total += mem;
mem = 0;
// Iterate through all slave node clients, totaling up their output buffer, input buffer memory usage, and the memory footprint of the client objects themselves.
if (listLength(server.slaves)) {
listIter li;
listNode *ln;
listRewind(server.slaves,&li);
while((ln = listNext(&li))) {
client *c = listNodeValue(ln);
mem += getClientOutputBufferMemoryUsage(c);
mem += sdsAllocSize(c->querybuf);
mem += sizeof(client);
}
}
mh->clients_slaves = mem;
Because each slave node allocates a separate replication buffer (i.e., the output buffer of the slave's client connection), this implementation wastes memory as the number of slaves grows. What's more, when client-output-buffer-limit is set too large and there are too many slave nodes, the master node can easily run OOM.
To address this problem, Redis 7 introduces a global replication buffer. Both the replication backlog buffer (repl-backlog) and the replication buffers of slave nodes share this buffer.
The replBufBlock structure is used to store one block of the global replication buffer.
typedef struct replBufBlock {
int refcount; /* Number of replicas or repl backlog using. */
long long id; /* The unique incremental number. */
long long repl_offset; /* Start replication offset of the block. */
size_t size, used;
char buf[];
} replBufBlock;
Each replBufBlock contains a refcount field that records how many replication instances (the master's replication backlog buffer plus the slave nodes) reference the block. When a new slave node is added, Redis does not allocate a new replication buffer for it; instead, it increments the refcount of the existing replBufBlock, as sketched below.
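The following is hypothetical demo code, not actual Redis source: server.repl_buffer_blocks, ref_repl_buf_node, and ref_block_pos are real Redis 7 names, but the function itself is simplified for illustration.
/* Simplified sketch: a newly attached replica just references the tail
 * block of the shared global replication buffer instead of getting a copy. */
void attachReplicaToGlobalBuffer(client *replica) {
    listNode *ln = listLast(server.repl_buffer_blocks);
    replBufBlock *tail = ln ? listNodeValue(ln) : NULL;
    if (tail) {
        tail->refcount++;                 /* shared, not copied */
        replica->ref_repl_buf_node = ln;  /* where this replica starts reading */
        replica->ref_block_pos = tail->used;
    }
}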
Accordingly, the way mh->repl_backlog and mh->clients_slaves are calculated has changed in Redis 7.
if (listLength(server.slaves) &&
(long long)server.repl_buffer_mem > server.repl_backlog_size)
{
mh->clients_slaves = server.repl_buffer_mem - server.repl_backlog_size;
mh->repl_backlog = server.repl_backlog_size;
} else {
mh->clients_slaves = 0;
mh->repl_backlog = server.repl_buffer_mem;
}
if (server.repl_backlog) {
/* The approximate memory of rax tree for indexed blocks. */
mh->repl_backlog +=
server.repl_backlog->blocks_index->numnodes * sizeof(raxNode) +
raxSize(server.repl_backlog->blocks_index) * sizeof(void*);
}
mem_total += mh->repl_backlog;
mem_total += mh->clients_slaves;
Specifically, if the global replication buffer is larger than repl-backlog-size, then the replication backlog (mh->repl_backlog) is counted as repl-backlog-size and the remainder is counted as memory used by the slaves (mh->clients_slaves). If the global replication buffer is smaller than or equal to repl-backlog-size, mh->repl_backlog takes the size of the global replication buffer directly and mh->clients_slaves is counted as 0.
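As a hypothetical illustration: with repl-backlog-size set to 160MB, a global replication buffer of 620MB is reported as mem_replication_backlog = 160MB and mem_clients_slaves = 460MB, whereas a global buffer of 100MB is reported as mem_replication_backlog = 100MB and mem_clients_slaves = 0.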
In addition, since a Rax tree is introduced to index some of the nodes in the global replication buffer, replicating the backlog buffer also requires calculating the memory overhead of the Rax tree.
Trigger conditions for data eviction
Many people have the misconception that as long as used_memory is greater than maxmemory, data eviction will be triggered. This is not the case.
The following conditions must all be met for data eviction to be triggered:
- maxmemory must be greater than 0.
- maxmemory-policy must not be noeviction.
- Memory usage must meet a specific condition: not used_memory greater than maxmemory, but used_memory minus mem_not_counted_for_evict greater than maxmemory.
The value of mem_not_counted_for_evict can be obtained with the INFO command; it is calculated in the freeMemoryGetNotCountedMemory function.
size_t freeMemoryGetNotCountedMemory(void) {
size_t overhead = 0;
int slaves = listLength(server.slaves);
if (slaves) {
listIter li;
listNode *ln;
listRewind(server.slaves,&li);
while((ln = listNext(&li))) {
client *slave = listNodeValue(ln);
overhead += getClientOutputBufferMemoryUsage(slave);
}
}
if (server.aof_state != AOF_OFF) {
overhead += sdsalloc(server.aof_buf)+aofRewriteBufferSize();
}
return overhead;
}
The freeMemoryGetNotCountedMemory function totals the size of the replication buffers of all slave nodes, the AOF buffer, and the AOF rewrite buffer. In other words, when Redis decides whether data needs to be evicted, it first deducts from used_memory the memory occupied by the slaves' replication buffers, the AOF buffer, and the AOF rewrite buffer, as the sketch below shows.
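Putting the three conditions together, the decision looks roughly like this (a condensed sketch modeled on Redis's getMaxmemoryState() and performEvictions(), not verbatim source):
/* Returns 1 if eviction should run, 0 otherwise (simplified sketch). */
int evictionNeeded(void) {
    if (server.maxmemory == 0) return 0;                            /* condition 1 */
    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION) return 0; /* condition 2 */
    size_t mem_used = zmalloc_used_memory();
    size_t overhead = freeMemoryGetNotCountedMemory();              /* mem_not_counted_for_evict */
    mem_used = (mem_used > overhead) ? mem_used - overhead : 0;
    return mem_used > server.maxmemory;                             /* condition 3 */
}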
Redis Memory Analysis Script
Finally, let me share a script that can help us quickly analyze Redis memory usage. From its output, we can see at a glance how much memory each part of Redis consumes and identify which specific part is responsible for the growth of used_memory.
Script address: https://github.com/slowtech/dba-toolkit/blob/master/redis/redis_mem_usage_analyzer.py
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379
Metric(2024-09-12 04:52:42) Old Value New Value(+3s) Change per second
==========================================================================================
Summary
---------------------------------------------
used_memory 16.43G 16.44G 1.1M
used_memory_dataset 11.93G 11.93G 22.66K
used_memory_overhead 4.51G 4.51G 1.08M
Overhead(Total) 4.51G 4.51G 1.08M
---------------------------------------------
mem_clients_normal 440.57K 440.52K -18.67B
mem_clients_slaves 458.41M 461.63M 1.08M
mem_replication_backlog 160M 160M 0B
mem_aof_buffer 0B 0B 0B
used_memory_startup 793.17K 793.17K 0B
used_memory_scripts 0B 0B 0B
mem_hashtable 3.9G 3.9G 0B
Evict & Fragmentation
---------------------------------------------
maxmemory 20G 20G 0B
mem_not_counted_for_evict 458.45M 461.73M 1.1M
mem_counted_for_evict 15.99G 15.99G 2.62K
maxmemory_policy volatile-lru volatile-lru
used_memory_peak 16.43G 16.44G 1.1M
used_memory_rss 16.77G 16.77G 1.32M
mem_fragmentation_bytes 345.07M 345.75M 232.88K
Others
---------------------------------------------
keys 77860000 77860000 0.0
instantaneous_ops_per_sec 8339 8435
lazyfree_pending_objects 0 0 0.0
The script collects Redis memory data at regular intervals (set by the -i parameter; the default is 3 seconds). It then compares the newly collected data (New Value) with the previous data (Old Value) and calculates the per-second change (Change per second).
The output is divided into four main sections:
- Summary: a summary section; used_memory = used_memory_dataset + used_memory_overhead.
- Overhead(Total): shows the memory consumption of each item in used_memory_overhead. Overhead(Total) is the sum of all the items and should theoretically equal used_memory_overhead.
- Evict & Fragmentation: shows key metrics related to eviction and memory fragmentation, where mem_counted_for_evict = used_memory - mem_not_counted_for_evict. Data eviction is triggered only when mem_counted_for_evict exceeds maxmemory.
- Others: other important metrics, including keys (the total number of keys), instantaneous_ops_per_sec (operations per second), and lazyfree_pending_objects (the number of objects pending asynchronous deletion).
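Reading the sample output above with this in mind: used_memory is growing by about 1.1M per second, and virtually all of that growth comes from mem_clients_slaves (1.08M per second), so the slave output buffers are where to look first.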
If mem_clients_normal or mem_clients_slaves is large, you can specify the --client option to view the memory usage of each client:
# python3 redis_mem_usage_analyzer.py -host 10.0.1.182 -p 6379 --client
ID Address Name Age Command User Qbuf Omem Total Memory
----------------------------------------------------------------------------------------------------
216 10.0.1.75:37811 721 psync default 0B 232.83M 232.85M
217 10.0.1.22:35057 715 psync default 0B 232.11M 232.13M
453 10.0.0.198:51172 0 client default 26B 0B 60.03K
...
Among them:
- Qbuf: the size of the input buffer.
- Omem: the size of the output buffer.
- Total Memory: the total memory occupied by the connection.
The results are sorted by Total Memory in descending order.