
Redis big key analysis tool: supports TOP N, batch analysis, and slave-node priority


Background

Redis big key analysis tools are mainly divided into two categories:

1. Offline analysis

These parse RDB files offline. The most commonly used tool is redis-rdb-tools (/sripathikrishnan/redis-rdb-tools).

However, this tool has not been updated in nearly 5 years and does not support Redis 7. Moreover, because it is written in Python, parsing is slow.

A more actively maintained alternative is /HDT3213/rdb, which supports Redis 7 and is written in Go.

2. Online analysis

A common tool is redis-cli, which provides two analysis methods:

  1. --bigkeys: introduced in Redis 3.0.0; it counts the number of elements in each key.
  2. --memkeys: introduced in Redis 6.0.0; it uses the MEMORY USAGE command to measure the memory usage of each key.
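
To make the difference between the two methods concrete, here is a minimal Go sketch (assuming the go-redis v9 client, a reachable instance at 127.0.0.1:6379, and a hypothetical zset key) that queries both metrics for one key: the element count that --bigkeys reports and the MEMORY USAGE estimate that --memkeys reports.

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	// Hypothetical instance address; adjust as needed.
	rdb := redis.NewClient(&redis.Options{Addr: "127.0.0.1:6379"})
	defer rdb.Close()

	key := "mysortedset" // hypothetical zset key

	// What --bigkeys measures for a zset: the number of members (ZCARD).
	members, err := rdb.ZCard(ctx, key).Result()
	if err != nil {
		panic(err)
	}

	// What --memkeys measures: the estimated memory footprint in bytes
	// (MEMORY USAGE key SAMPLES 5).
	size, err := rdb.MemoryUsage(ctx, key, 5).Result()
	if err != nil {
		panic(err)
	}

	fmt.Printf("%s: %d members, ~%d bytes\n", key, members, size)
}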

The advantages and disadvantages of these two methods are as follows:

  • Offline analysis: parsing is based on RDB files, so it does not affect the online instance. The disadvantage is that the operation is relatively complicated, especially for many Redis cloud services: because the SYNC command is disabled, the RDB file cannot be downloaded directly with redis-cli --rdb <filename> and can only be downloaded manually from the console.
  • Online analysis: the operation is simple; as long as you can access the instance, you can run the analysis directly. The disadvantage is that the analysis may have some impact on the performance of the online instance.

The tool introduced in this article, redis-find-big-key, is also an online analysis tool. Its approach is similar to redis-cli --memkeys, but it is more powerful and practical, mainly in the following ways:

  1. Support TOP N function

    This tool can output the top N keys with the largest memory usage, while redis-cli can only output the single largest key of each type.

  2. Supports batch analysis

    This tool can analyze multiple Redis nodes at the same time. For a Redis Cluster in particular, after enabling cluster mode (-cluster-mode), every shard is analyzed automatically. redis-cli can only analyze a single node.

  3. Automatically select slave nodes for analysis

    To reduce the impact on instance performance, the tool automatically selects slave nodes for analysis; the master node is selected only when there is no slave node. redis-cli can only analyze the master node. A minimal sketch of this selection logic follows below.
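
To illustrate the slave-first idea (this is not the tool's actual source), one way is to parse INFO replication with the go-redis v9 client; the addresses and the pickAnalysisTarget helper below are hypothetical.

package main

import (
	"context"
	"fmt"
	"regexp"

	"github.com/redis/go-redis/v9"
)

// pickAnalysisTarget returns the address of one online replica of the given
// node, or the node itself if it has no replica. This only illustrates the
// "slave first" idea; it is not the tool's actual implementation.
func pickAnalysisTarget(ctx context.Context, addr, password string) (string, error) {
	rdb := redis.NewClient(&redis.Options{Addr: addr, Password: password})
	defer rdb.Close()

	info, err := rdb.Info(ctx, "replication").Result()
	if err != nil {
		return "", err
	}

	// INFO replication lists replicas as e.g.
	// slave0:ip=10.0.1.202,port=6380,state=online,offset=...,lag=0
	re := regexp.MustCompile(`slave\d+:ip=([^,]+),port=(\d+),state=online`)
	if m := re.FindStringSubmatch(info); m != nil {
		return fmt.Sprintf("%s:%s", m[1], m[2]), nil
	}
	// No online replica: fall back to the node itself (the master).
	return addr, nil
}

func main() {
	target, err := pickAnalysisTarget(context.Background(), "10.0.1.76:6379", "")
	if err != nil {
		panic(err)
	}
	fmt.Println("scanning keys from node:", target)
}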

Timing comparison

Test environment: Redis 6.2.17, a single instance with used_memory_human of 9.75G, 1 million keys, and an RDB file size of 3 GB.

Here is how long each of the four tools mentioned above takes to find the 100 keys with the largest memory usage:

Tool                    Time taken
redis-rdb-tools         25m38.68s
/HDT3213/rdb            50.68s
redis-cli --memkeys     40.22s
redis-find-big-key      29.12s

Sample output

# ./redis-find-big-key -addr 10.0.1.76:6379 -cluster-mode
Log file not specified, using default: /tmp/10.0.1.76:6379_20250222_043832.txt
Scanning keys from node: 10.0.1.76:6380 (slave)

Node: 10.0.1.76:6380
-------- Summary --------
Sampled 8 keys in the keyspace!
Total key length in bytes is 2.96 MB (avg len 379.43 KB)

Top biggest keys:
+------------------------------+--------+-----------+---------------------+
|             Key              |  Type  |   Size    | Number of elements  |
+------------------------------+--------+-----------+---------------------+
| mysortedset_20250222043729:1 |  zset  | 739.6 KB  |    8027 members     |
|   myhash_20250222043741:2    |  hash  | 648.12 KB |     9490 fields     |
| mysortedset_20250222043741:1 |  zset  | 536.44 KB |    5608 members     |
|    myset_20250222043729:1    |  set   | 399.66 KB |    8027 members     |
|    myset_20250222043741:1    |  set   | 328.36 KB |    5608 members     |
|   myhash_20250222043729:2    |  hash  | 222.65 KB |     3917 fields     |
|   mylist_20250222043729:1    |  list  | 160.54 KB |     8027 items      |
|    mykey_20250222043729:2    | string | 73 bytes  | 7 bytes (value len) |
+------------------------------+--------+-----------+---------------------+
Scanning keys from node: 10.0.1.202:6380 (slave)

Node: 10.0.1.202:6380
-------- Summary --------
Sampled 8 keys in the keyspace!
Total key length in bytes is 3.11 MB (avg len 398.23 KB)

Top biggest keys:
+------------------------------+--------+------------+---------------------+
|             Key              |  Type  |    Size    | Number of elements  |
+------------------------------+--------+------------+---------------------+
| mysortedset_20250222043741:2 |  zset  | 1020.13 KB |    9490 members     |
|    myset_20250222043741:2    |  set   | 588.81 KB  |    9490 members     |
|   myhash_20250222043729:1    |  hash  |  456.1 KB  |     8027 fields     |
| mysortedset_20250222043729:2 |  zset  |  404.5 KB  |    3917 members     |
|   myhash_20250222043741:1    |  hash  | 335.79 KB  |     5608 fields     |
|    myset_20250222043729:2    |  set   | 195.87 KB  |    3917 members     |
|   mylist_20250222043741:2    |  list  | 184.55 KB  |     9490 items      |
|    mykey_20250222043741:1    | string |  73 bytes  | 7 bytes (value len) |
+------------------------------+--------+------------+---------------------+
Scanning keys from node: 10.0.1.147:6380 (slave)

Node: 10.0.1.147:6380
-------- Summary --------
Sampled 4 keys in the keyspace!
Total key length in bytes is 192.9 KB (avg len 48.22 KB)

Top biggest keys:
+-------------------------+--------+-----------+---------------------+
|           Key           |  Type  |   Size    | Number of elements  |
+-------------------------+--------+-----------+---------------------+
| mylist_20250222043741:1 |  list  | 112.45 KB |     5608 items      |
| mylist_20250222043729:2 |  list  | 80.31 KB  |     3917 items      |
| mykey_20250222043729:1  | string | 73 bytes  | 7 bytes (value len) |
| mykey_20250222043741:2  | string | 73 bytes  | 7 bytes (value len) |
+-------------------------+--------+-----------+---------------------+

Tool address

Project address: /slowtech/redis-find-big-key

You can download the binary package directly or compile it from source.

Download the binary package directly

# wget /slowtech/redis-find-big-key/releases/download/v1.0.0/
# tar xvf  

After decompression, an executable file named redis-find-big-key is generated.

Compile from source

# wget /slowtech/redis-find-big-key/archive/refs/tags/v1.0.
# tar xvf v1.0. 
# cd redis-find-big-key-1.0.0
# go build

After compilation, an executable file named redis-find-big-key is generated.

Parameter description

# ./redis-find-big-key --help
Usage of ./redis-find-big-key:
  -addr string
        Redis server address in the format <hostname>:<port>
  -cluster-mode
        Enable cluster mode to get keys from all shards in the Redis cluster
  -concurrency int
        Maximum number of nodes to process concurrently (default 1)
  -direct
        Perform operation on the specified node. If not specified, the operation will default to executing on the slave node
  -log-file string
        Log file for saving progress and intermediate result
  -master-yes
        Execute even if the Redis role is master
  -password string
        Redis password
  -samples uint
        Samples for memory usage (default 5)
  -skip-lazyfree-check
        Skip check lazyfree-lazy-expire
  -sleep float
        Sleep duration (in seconds) after processing each batch
  -tls
        Enable TLS for Redis connection
  -top int
        Maximum number of biggest keys to display (default 100)

The specific meanings of each parameter are as follows:

  • -addr: Specifies the address of the Redis instance, in the format <hostname>:<port>, for example 10.0.0.108:6379. Note:

    • If cluster mode is not enabled (-cluster-mode), multiple addresses can be specified, separated by commas, for example 10.0.0.108:6379, 10.0.0.108:6380.
    • If cluster mode is enabled, only one address can be specified, and the tool will automatically discover other nodes in the cluster.
  • -cluster-mode: Enables cluster mode. The tool automatically analyzes every shard in the Redis Cluster, giving priority to slave nodes; the master node of a shard is selected only when that shard has no slave node.

  • -concurrency: Sets the concurrency level; the default is 1, i.e. nodes are analyzed one by one. If there are many nodes to analyze, you can increase the concurrency to speed up the analysis.

  • -direct: Performs the analysis directly on the node specified by -addr, skipping the default logic of automatically selecting slave nodes.

  • -log-file: Specifies the log file path, used to record progress and intermediate results during the analysis. If not specified, the default is /tmp/<firstNode>_<timestamp>.txt, for example /tmp/10.0.0.108:6379_20250218_125955.txt.

  • -master-yes: If any node to be analyzed is a master (common reasons: the node has no slave, or -direct was used to request analysis on the master), the tool reports the following error:

    Error: nodes 10.0.1.76:6379 are master. To execute, you must specify --master-yes

    If you are sure the analysis can be performed on the master node, you can specify -master-yes to skip this check.

  • -password: Specifies the password for the Redis instance.

  • -samples: Sets the number of samples for the MEMORY USAGE key [SAMPLES count] command. For data structures containing multiple elements (such as LIST, SET, ZSET, HASH, and STREAM), too small a sample count can make the memory estimate inaccurate, while too large a count increases computation time and resource consumption. If SAMPLES is not specified, the default is 5 (see the sketch after this parameter list).

  • -skip-lazyfree-check: If the analysis is performed on a master node, expired keys need special attention. The scan triggers the deletion of expired keys, and if lazy deletion (lazyfree-lazy-expire) is not enabled, the deletion is performed in the main thread; deleting a large expired key can then block the instance and affect normal business requests.

    Therefore, when the tool analyzes a master node, it automatically checks whether lazy deletion is enabled on that node. If it is not, the tool reports the following error and aborts, to avoid impacting online business:

    Error: nodes 10.0.1.76:6379 are master and lazyfree-lazy-expire is set to 'no'. Scanning might trigger large key expiration, which could block the main thread. Please set lazyfree-lazy-expire to 'yes' for better performance. To skip this check, you must specify --skip-lazyfree-check

    In this case, it is recommended to enable lazy deletion with CONFIG SET lazyfree-lazy-expire yes.

    If you are sure there are no large expired keys, you can specify -skip-lazyfree-check to skip this check (a sketch of this check follows the parameter list).

  • -sleep: Sets the sleep time (in seconds) after each batch of keys is scanned.

  • -tls: Enable TLS connection.

  • -top: Shows the top N keys that consume the most memory. The default is 100.
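
As referenced in the -samples and -skip-lazyfree-check descriptions above, here is a minimal Go sketch (go-redis v9, hypothetical addresses and key name, not the tool's actual code) of the two ideas: checking lazyfree-lazy-expire before scanning a master, and estimating a key's size with a non-default SAMPLES count.

package main

import (
	"context"
	"fmt"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	// Hypothetical master node; adjust the address as needed.
	rdb := redis.NewClient(&redis.Options{Addr: "10.0.1.76:6379"})
	defer rdb.Close()

	// The safety check behind -skip-lazyfree-check: on a master, make sure
	// expired keys are deleted asynchronously before scanning.
	cfg, err := rdb.ConfigGet(ctx, "lazyfree-lazy-expire").Result()
	if err != nil {
		panic(err)
	}
	if cfg["lazyfree-lazy-expire"] != "yes" {
		fmt.Println("lazyfree-lazy-expire is 'no': scanning a master may block on large expired keys")
		return
	}

	// The measurement behind -samples: MEMORY USAGE key SAMPLES <count>.
	// A larger sample count gives a more accurate estimate for multi-element
	// keys at the cost of more work on the server.
	size, err := rdb.MemoryUsage(ctx, "mysortedset_20250222043741:2", 10).Result()
	if err != nil {
		panic(err)
	}
	fmt.Printf("estimated size with SAMPLES 10: %d bytes\n", size)
}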

Common usage

Analyze a single node

./redis-find-big-key -addr 10.0.1.76:6379
Scanning keys from node: 10.0.1.202:6380 (slave)

Note that in the above example, the specified node is not the node that is actually scanned. This is because 10.0.1.76:6379 is a master node, and by default the tool chooses one of its slaves for analysis. The tool scans the master directly only when the specified master has no slave.

Analyze a single Redis cluster

./redis-find-big-key -addr 10.0.1.76:6379 -cluster-mode

Just provide the address of any node in the cluster and the tool will automatically discover the other nodes. Slave nodes are analyzed in preference; the master of a shard is analyzed only when that shard has no slave node.
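
For reference, node discovery of this kind can be sketched with the CLUSTER NODES command; the following go-redis v9 snippet (hypothetical address, not the tool's actual implementation) lists every node of the cluster together with its role.

package main

import (
	"context"
	"fmt"
	"strings"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	// Any node of the cluster will do; the rest are discovered via CLUSTER NODES.
	rdb := redis.NewClient(&redis.Options{Addr: "10.0.1.76:6379"})
	defer rdb.Close()

	nodes, err := rdb.ClusterNodes(ctx).Result()
	if err != nil {
		panic(err)
	}

	// Each line looks like:
	// <id> <ip:port@cport> <flags> <master-id> <ping> <pong> <epoch> <state> <slots...>
	for _, line := range strings.Split(strings.TrimSpace(nodes), "\n") {
		fields := strings.Fields(line)
		if len(fields) < 3 {
			continue
		}
		addr := strings.Split(fields[1], "@")[0] // strip the cluster bus port
		role := "master"
		if strings.Contains(fields[2], "slave") {
			role = "slave"
		}
		fmt.Printf("%s (%s)\n", addr, role)
	}
}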

Analyze multiple nodes

./redis-find-big-key -addr 10.0.1.76:6379,10.0.1.202:6379,10.0.1.147:6379

The nodes are independent of each other and can come from the same cluster or from different clusters. Note that if -addr specifies multiple node addresses, the -cluster-mode parameter can no longer be used.

Analyze the master node

If you need to analyze a master node, specify the master node's address and use the -direct parameter.

./redis-find-big-key -addr 10.0.1.76:6379 -direct -master-yes

Things to note

1. This tool only works with Redis 4.0 and above, because MEMORY USAGE and lazyfree-lazy-expire are only supported since Redis 4.0.

2. It is normal for the same key to show different sizes in redis-find-big-key and redis-cli. The reason is that redis-find-big-key analyzes the slave by default, so it usually shows the key's size on the slave, while redis-cli can only analyze the master and shows the key's size on the master. See the following example.

# ./redis-find-big-key -addr 10.0.1.76:6379 -top 1
Scanning keys from node: 10.0.1.202:6380 (slave)
...
Top biggest keys:
+------------------------------+------+------------+--------------------+
|             Key              | Type |    Size    | Number of elements |
+------------------------------+------+------------+--------------------+
| mysortedset_20250222043741:2 | zset | 1020.13 KB |    9490 members    |
+------------------------------+------+------------+--------------------+

# redis-cli -h 10.0.1.76 -p 6379 -c MEMORY USAGE mysortedset_20250222043741:2
(integer) 1014242

# echo "scale=2; 1014242 / 1024" | bc
990.47

One is 1020.13 KB and the other is 990.47 KB.

If you use redis-find-big-key to check the key's size on the master directly, the result is exactly the same as redis-cli:

# ./redis-find-big-key -addr 10.0.1.76:6379 -direct --master-yes -top 1 --skip-lazyfree-check
Scanning keys from node: 10.0.1.76:6379 (master)
...
Top biggest keys:
+------------------------------+------+-----------+--------------------+
|             Key              | Type |   Size    | Number of elements |
+------------------------------+------+-----------+--------------------+
| mysortedset_20250222043741:2 | zset | 990.47 KB |    9490 members    |
+------------------------------+------+-----------+--------------------+

Implementation principle

This tool is implemented with reference to redis-cli --memkeys.

In fact, both redis-cli --bigkeys and redis-cli --memkeys call the same findBigKeys function, just with different arguments.

/* Find big keys */
if (config.bigkeys) {
    if (cliConnect(0) == REDIS_ERR) exit(1);
    findBigKeys(0, 0);
}

/* Find large keys */
if (config.memkeys) {
    if (cliConnect(0) == REDIS_ERR) exit(1);
    findBigKeys(1, config.memkeys_samples);
}

Next, let’s take a look at the specific implementation logic of this function.

static void findBigKeys(int memkeys, unsigned memkeys_samples) {
    ...
    // Get the total number of keys through the DBSIZE command
    total_keys = getDbSize();

    /* Status message */
    printf("\n# Scanning the entire keyspace to find biggest keys as well as\n");
    printf("# average sizes per key type.  You can use -i 0.1 to sleep 0.1 sec\n");
    printf("# per 100 SCAN commands (not usually needed).\n\n");

    /* SCAN loop */
    do {
        /* Calculate approximate percentage completion */
        pct = 100 * (double)sampled/total_keys;

        // Scan keys with the SCAN command
        reply = sendScan(&it);
        scan_loops++;
        // Get the key names of the current batch
        keys  = reply->element[1];
        ...
        // Use a pipeline to send TYPE commands in batches to get the type of each key
        getKeyTypes(types_dict, keys, types);
        // Use a pipeline to send the corresponding commands in batches to get the size of each key
        getKeySizes(keys, types, sizes, memkeys, memkeys_samples);

        // Process keys one by one and update the statistics
        for(i=0;i<keys->elements;i++) {
            typeinfo *type = types[i];
            /* Skip keys that disappeared between SCAN and TYPE */
            if(!type)
                continue;

            type->totalsize += sizes[i];     // Total size of keys of this type
            type->count++;                   // Number of keys of this type
            totlen += keys->element[i]->len; // Accumulated length of key names
            sampled++;                       // Number of keys scanned so far
            // If the current key is larger than the biggest key of this type so far,
            // update the per-type maximum and print the statistics
            if(type->biggest<sizes[i]) {
                if (type->biggest_key)
                    sdsfree(type->biggest_key);
                type->biggest_key = sdscatrepr(sdsempty(), keys->element[i]->str, keys->element[i]->len);
                ...
                printf(
                   "[%05.2f%%] Biggest %-6s found so far '%s' with %llu %s\n",
                   pct, type->name, type->biggest_key, sizes[i],
                   !memkeys? type->sizeunit: "bytes");

                type->biggest = sizes[i];
            }

            // Print progress every 1,000,000 sampled keys
            if(sampled % 1000000 == 0) {
                printf("[%05.2f%%] Sampled %llu keys so far\n", pct, sampled);
            }
        }

        // If an interval is set, sleep for a while after every 100 SCAN commands
        if (config.interval && (scan_loops % 100) == 0) {
            usleep(config.interval);
        }

        freeReplyObject(reply);
    } while(force_cancel_loop == 0 && it != 0);
    ...
    // Output the overall statistics
    printf("\n-------- summary -------\n\n");
    if (force_cancel_loop) printf("[%05.2f%%] ", pct);        // If the loop was cancelled, show the progress percentage
    printf("Sampled %llu keys in the keyspace!\n", sampled);  // Number of keys scanned
    printf("Total key length in bytes is %llu (avg len %.2f)\n\n",
       totlen, totlen ? (double)totlen/sampled : 0);          // Total and average length of the key names

    // Output the biggest key of each type
    di = dictGetIterator(types_dict);
    while ((de = dictNext(di))) {
        typeinfo *type = dictGetVal(de);
        if(type->biggest_key) {
            // type->name is the key type, type->biggest_key is the name of the biggest key,
            // type->biggest is its size, and !memkeys? type->sizeunit: "bytes" is the size unit
            printf("Biggest %6s found '%s' has %llu %s\n", type->name, type->biggest_key,
               type->biggest, !memkeys? type->sizeunit: "bytes");
        }
    }
    ...
    // Output the statistics for each type
    di = dictGetIterator(types_dict);
    while ((de = dictNext(di))) {
        typeinfo *type = dictGetVal(de);
        // sampled ? 100 * (double)type->count/sampled : 0 is the share of this type among all sampled keys
        printf("%llu %ss with %llu %s (%05.2f%% of keys, avg size %.2f)\n",
           type->count, type->name, type->totalsize, !memkeys? type->sizeunit: "bytes",
           sampled ? 100 * (double)type->count/sampled : 0,
           type->count ? (double)type->totalsize/type->count : 0);
    }
    ...
    exit(0);
}

The implementation logic of this function is as follows:

  1. Use the DBSIZE command to get the total number of keys in the Redis database.

  2. Use the SCAN command to scan keys in batches and get the key names of the current batch.

  3. Use a pipeline to send TYPE commands in batches to get the type of each key.

  4. Use a pipeline to send the corresponding commands in batches to get the size of each key:

  • If --bigkeys is specified, a type-specific command is used to get the size: STRLEN (string), LLEN (list), SCARD (set), HLEN (hash), ZCARD (zset), XLEN (stream).
  • If --memkeys is specified, the MEMORY USAGE command is used to obtain the memory footprint of the key.

  5. Process keys one by one and update the statistics: if the size of a key exceeds the current maximum for its type, update the maximum and print the related statistics.

  6. Output summary information, showing the biggest key of each type and its related statistics.
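
Putting the pieces together, the same SCAN + pipelined TYPE/MEMORY USAGE flow, extended with a TOP N ranking, can be sketched in Go (the language redis-find-big-key is written in). The snippet below uses the go-redis v9 client with a hypothetical node address and is a simplified illustration, not the tool's actual source.

package main

import (
	"context"
	"fmt"
	"sort"

	"github.com/redis/go-redis/v9"
)

type keyInfo struct {
	Name string
	Type string
	Size int64 // bytes, as estimated by MEMORY USAGE
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "10.0.1.202:6380"}) // hypothetical slave node
	defer rdb.Close()

	const (
		batch   = 1000 // SCAN COUNT per iteration
		samples = 5    // MEMORY USAGE SAMPLES
		topN    = 100
	)

	var (
		cursor  uint64
		biggest []keyInfo
	)

	for {
		// 1. SCAN a batch of key names.
		keys, next, err := rdb.Scan(ctx, cursor, "*", batch).Result()
		if err != nil {
			panic(err)
		}

		if len(keys) > 0 {
			// 2. Pipeline TYPE and MEMORY USAGE for the whole batch.
			pipe := rdb.Pipeline()
			typeCmds := make([]*redis.StatusCmd, len(keys))
			sizeCmds := make([]*redis.IntCmd, len(keys))
			for i, k := range keys {
				typeCmds[i] = pipe.Type(ctx, k)
				sizeCmds[i] = pipe.MemoryUsage(ctx, k, samples)
			}
			if _, err := pipe.Exec(ctx); err != nil && err != redis.Nil {
				panic(err)
			}

			// 3. Update the running TOP N.
			for i, k := range keys {
				size, err := sizeCmds[i].Result()
				if err != nil {
					continue // key may have disappeared between SCAN and MEMORY USAGE
				}
				biggest = append(biggest, keyInfo{Name: k, Type: typeCmds[i].Val(), Size: size})
			}
			sort.Slice(biggest, func(a, b int) bool { return biggest[a].Size > biggest[b].Size })
			if len(biggest) > topN {
				biggest = biggest[:topN]
			}
		}

		cursor = next
		if cursor == 0 {
			break
		}
	}

	for _, k := range biggest {
		fmt.Printf("%-40s %-6s %d bytes\n", k.Name, k.Type, k.Size)
	}
}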