Redis provides a rich set of data types, each with its own storage structure and operations, to meet the needs of different business scenarios. This article introduces the main data types Redis supports and their underlying implementations, and illustrates their use with typical application scenarios.
1. String
Presentation:
- The most basic type of key-value pair in Redis. Both the key and the value can be strings, and a value is limited to a maximum of 512 MB.
- String is the most commonly used data type in Redis. It supports simple GET and SET operations, as well as increment, decrement, append, and other operations.
Typical application scenarios:
- Cached data: Stores user login status, tokens, configuration information, etc.
- Counters: INCR and DECR implement simple counters such as the number of site visits, the number of likes, etc.
- Distributed lock: With the SETNX command, a string can serve as a simple distributed lock.
Underlying Principle:
- Redis stores strings as Simple Dynamic Strings (SDS). An SDS wraps a C string and adds optimizations such as a stored length attribute and space preallocation. SDS is binary safe, so it can hold both text and binary data.
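The stored length and space preallocation described above can be modelled in a few lines of Python. This is only a toy sketch, not Redis's C implementation: the class name SimpleDynamicString and its attributes are made up for illustration, and the growth rule (double the new length below 1 MB, otherwise add 1 MB) is the commonly documented SDS policy, assumed here.

```python
SDS_MAX_PREALLOC = 1024 * 1024  # assumed 1 MB cap on preallocated space


class SimpleDynamicString:
    """Toy model of an SDS: a byte buffer plus explicit length and capacity."""

    def __init__(self, initial: bytes = b""):
        self.buf = bytearray(initial)  # bytes in use; may contain b"\x00"
        self.alloc = len(initial)      # capacity we pretend to have allocated

    def __len__(self) -> int:
        # O(1): the length is stored, not found by scanning for a NUL byte.
        return len(self.buf)

    @property
    def free(self) -> int:
        """Reserved but unused capacity."""
        return self.alloc - len(self.buf)

    def append(self, data: bytes) -> None:
        new_len = len(self.buf) + len(data)
        if new_len > self.alloc:
            # Space preallocation: reserve extra room so that repeated
            # appends do not need a new allocation every time.
            if new_len < SDS_MAX_PREALLOC:
                self.alloc = new_len * 2
            else:
                self.alloc = new_len + SDS_MAX_PREALLOC
        self.buf += data


s = SimpleDynamicString(b"hello")
s.append(b"\x00world")  # an embedded NUL byte is stored like any other byte
```

After the append, the string holds 11 bytes and has reserved capacity for 22, so a following short append fits without growing the buffer.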
2. Hash
Presentation:
- A hash is a collection of field-value pairs stored under one key, suitable for storing objects. Each key can have multiple fields, each with its own value.
- Operations include HSET, HGET, HDEL, etc.
Typical application scenarios:
- Storing user information: e.g. user ID as key and user's attributes (name, age, gender, etc.) as fields to avoid serializing the whole user object into a string.
- Configuration Item Management: Stores configuration items for quick access and updating of a configuration based on field names.
Underlying Principle:
- A hash uses one of two underlying data structures: a compressed list (ziplist; listpack since Redis 7.0) when the hash is small, and a hash table (hashtable) when it is large. The compressed list saves memory; as the hash grows, Redis automatically converts it to a hash table to keep lookups efficient.
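The automatic conversion can be sketched as follows. The class SmallHash is hypothetical, a plain Python list stands in for the compressed list, and the two thresholds are assumptions based on the usual defaults of the hash-max-ziplist-entries and hash-max-ziplist-value settings; real deployments may configure them differently.

```python
HASH_MAX_COMPACT_ENTRIES = 512  # assumed default for hash-max-ziplist-entries
HASH_MAX_COMPACT_VALUE = 64     # assumed default for hash-max-ziplist-value (bytes)


class SmallHash:
    """Toy hash that starts as a flat list of pairs and upgrades to a dict."""

    def __init__(self):
        self.encoding = "ziplist"
        self._pairs = []    # [(field, value), ...], scanned linearly
        self._table = None  # dict, created on conversion

    def hset(self, field: str, value: str) -> None:
        if self.encoding == "hashtable":
            self._table[field] = value
            return
        for i, (f, _) in enumerate(self._pairs):
            if f == field:
                self._pairs[i] = (field, value)
                break
        else:
            self._pairs.append((field, value))
        longest = max(len(field.encode()), len(value.encode()))
        if longest > HASH_MAX_COMPACT_VALUE or len(self._pairs) > HASH_MAX_COMPACT_ENTRIES:
            self._convert()

    def hget(self, field: str):
        if self.encoding == "ziplist":
            for f, v in self._pairs:   # O(n), acceptable while n is small
                if f == field:
                    return v
            return None
        return self._table.get(field)  # O(1) on average

    def _convert(self) -> None:
        # One-way upgrade: the compact form is not restored if the hash shrinks.
        self._table = dict(self._pairs)
        self._pairs = None
        self.encoding = "hashtable"


user = SmallHash()
user.hset("name", "alice")
encoding_small = user.encoding   # still the compact form
user.hset("bio", "x" * 100)      # a value longer than 64 bytes
encoding_large = user.encoding   # converted to a hash table
```

On a real server, the OBJECT ENCODING command reports which representation a key currently uses.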
3. List
Presentation:
- A list behaves like a doubly linked list: elements can be inserted and removed at either the head or the tail. Common commands include LPUSH, RPUSH, LPOP, RPOP, etc.
- Redis supports blocking operations such as BLPOP and BRPOP, which block and wait when the list has no elements.
Typical application scenarios:
- Message queue: A list can be used as a simple message queue. LPUSH puts a message into the queue, and RPOP or BRPOP pops it.
- Task scheduling: In an asynchronous task distribution system, tasks can be placed in a queue to be consumed by multiple consumers.
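The queue behaviour follows from pushing at one end and popping at the other. The sketch below uses an in-process collections.deque as a stand-in for a Redis list, so it shows only the ordering, not networking or blocking; the function names lpush and rpop are local helpers named after the commands they imitate.

```python
from collections import deque

queue = deque()  # stand-in for a Redis list; left end = head, right end = tail


def lpush(item: str) -> None:
    """Producer side: add at the head, as LPUSH does."""
    queue.appendleft(item)


def rpop():
    """Consumer side: remove from the tail, as RPOP does; None when empty."""
    return queue.pop() if queue else None


for task in ("task-1", "task-2", "task-3"):
    lpush(task)

# Tasks come out in the order they were pushed (first in, first out).
consumed = [rpop(), rpop(), rpop(), rpop()]
```

The final pop returns None because the queue is empty; BRPOP would instead wait until a producer pushes a new element or the timeout expires.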
Underlying Principle:
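- Since Redis 3.2, lists are implemented as a quicklist: a doubly linked list in which each node is a ziplist (listpack since Redis 7.0), which combines the memory savings of the compact encoding with cheap insertion and removal at both ends. Earlier versions used a ziplist for shorter lists and a plain doubly linked list for longer ones.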
4. Sets
Presentation:
- Sets are unordered collections of unique elements that provide operations similar to mathematical sets, supporting intersection, union, difference, etc.
- Common operations include SADD, SREM, SISMEMBER, SMEMBERS, SINTER, etc.
Typical application scenarios:
- Tagging system: Each tag is stored as a set whose members are the users carrying that tag, which makes set operations easy, such as finding the users who have two given tags at the same time.
- Deduplication: In some scenarios (e.g., popular search terms, de-duplication of access logs), the uniqueness property of a set prevents duplicate data.
Underlying Principle:
- Small sets made up only of integers use an integer set (intset), and large sets use a hash table (hashtable). The hash table's fast lookup gives O(1) time complexity for checking whether an element exists.
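A rough model of the two encodings: a sorted list of integers searched by binary search plays the role of the intset, and a Python set plays the role of the hash table. The class SmallSet is hypothetical, and the limit of 512 entries is an assumption based on the usual default of set-max-intset-entries.

```python
import bisect

SET_MAX_INTSET_ENTRIES = 512  # assumed default for set-max-intset-entries


class SmallSet:
    """Toy set: sorted int array while small and integer-only, hash set otherwise."""

    def __init__(self):
        self.encoding = "intset"
        self._ints = []     # sorted list of integers
        self._table = None  # Python set, created on conversion

    def sadd(self, member) -> bool:
        """Add a member; return True if it was not already present."""
        if self.encoding == "intset":
            if isinstance(member, int):
                i = bisect.bisect_left(self._ints, member)
                if i < len(self._ints) and self._ints[i] == member:
                    return False
                if len(self._ints) < SET_MAX_INTSET_ENTRIES:
                    self._ints.insert(i, member)  # keeps the array sorted
                    return True
            self._convert()  # non-integer member, or the array is full
        if member in self._table:
            return False
        self._table.add(member)
        return True

    def sismember(self, member) -> bool:
        if self.encoding == "intset":
            if not isinstance(member, int):
                return False
            i = bisect.bisect_left(self._ints, member)  # O(log n) binary search
            return i < len(self._ints) and self._ints[i] == member
        return member in self._table                    # O(1) on average

    def _convert(self) -> None:
        self._table = set(self._ints)
        self._ints = None
        self.encoding = "hashtable"


ids = SmallSet()
for n in (42, 7, 19, 7):
    ids.sadd(n)
encoding_before = ids.encoding
sorted_ints = list(ids._ints)  # duplicates dropped, order maintained
ids.sadd("guest")              # a non-integer member forces the conversion
encoding_after = ids.encoding
```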
5. Sorted Set
Presentation:
- A sorted set is similar to a set, but each element is associated with a score, and the elements are kept sorted by score. Supported operations include ZADD, ZRANGE, ZREM, ZREVRANGE, ZCOUNT, etc.
Typical application scenarios:
- Leaderboards: For example, a leaderboard in a game that ranks users by their scores. ZADD adds players and their scores, and ZRANGE or ZREVRANGE retrieves the ranking.
- Delayed tasks: Use the score to record when a task should run, then fetch the tasks that are due from the sorted set by time.
Underlying Principle:
- Underneath, a sorted set combines a skiplist with a hash table. The skiplist keeps elements ordered by score and supports range queries and insertions in O(log N) time, while the hash table maps each member directly to its score for fast lookup.
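The division of labour between the two structures can be illustrated with a small sketch. Note the substitution: a list kept sorted with bisect stands in for the skiplist, so insertion here costs O(N) rather than O(log N); only the pairing of an ordered structure with a member-to-score map is what this example is meant to show. The class SortedSetSketch is hypothetical, and its range methods accept non-negative indexes only.

```python
import bisect


class SortedSetSketch:
    """Toy sorted set: a dict for member -> score plus an ordered list of pairs."""

    def __init__(self):
        self._scores = {}   # hash-table half: member -> score
        self._ordered = []  # ordered half: sorted (score, member) pairs

    def zadd(self, member: str, score: float) -> None:
        old = self._scores.get(member)
        if old is not None:
            # The dict supplies the old score, so the stale entry can be
            # located in the ordered structure without scanning it.
            i = bisect.bisect_left(self._ordered, (old, member))
            del self._ordered[i]
        self._scores[member] = score
        bisect.insort(self._ordered, (score, member))

    def zscore(self, member: str):
        return self._scores.get(member)

    def zrange(self, start: int, stop: int):
        """Members by ascending score; stop is inclusive, as in ZRANGE."""
        return [m for _, m in self._ordered[start:stop + 1]]

    def zrevrange(self, start: int, stop: int):
        """Members by descending score; stop is inclusive."""
        return [m for _, m in self._ordered[::-1][start:stop + 1]]


board = SortedSetSketch()
board.zadd("alice", 120)
board.zadd("bob", 95)
board.zadd("carol", 150)
board.zadd("bob", 160)          # updating a score moves the member
top2 = board.zrevrange(0, 1)    # the two highest scores
```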
6. Bitmaps
Presentation:
- Bitmaps are actually an extension of the string type: a string is treated as a series of consecutive binary bits that can be manipulated individually. The supported commands are SETBIT, GETBIT, BITCOUNT, BITOP, etc.
Typical application scenarios:
- User check-in system: A bitmap is used to store the user's check-in record, each day corresponds to a bit, 0 means no check-in, 1 means checked-in.
- Active User Statistics: Quickly count how many active users there are on a given day by storing whether or not a user is active during a certain time period in a bitmap.
Underlying Principle:
- A bitmap is stored as an ordinary Redis string, but the bit commands address individual bits directly. This allows efficient operations in very compact storage, which suits scenarios with massive amounts of data.
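How bit offsets map onto the bytes of a string can be sketched with a bytearray. The class Bitmap is a hypothetical in-memory model; it follows the Redis convention that offset 0 is the most significant bit of the first byte, and it grows the buffer with zero bytes on demand, as SETBIT does with the underlying string.

```python
class Bitmap:
    """Toy bitmap over a growable byte buffer."""

    def __init__(self):
        self.data = bytearray()

    def setbit(self, offset: int, value: int) -> int:
        """Set the bit at offset; return the previous bit, as SETBIT does."""
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            # Grow with zero bytes so that every unset bit reads as 0.
            self.data.extend(b"\x00" * (byte_index + 1 - len(self.data)))
        mask = 0x80 >> bit_index  # offset 0 is the most significant bit of byte 0
        previous = 1 if self.data[byte_index] & mask else 0
        if value:
            self.data[byte_index] |= mask
        else:
            self.data[byte_index] &= ~mask & 0xFF
        return previous

    def getbit(self, offset: int) -> int:
        byte_index, bit_index = divmod(offset, 8)
        if byte_index >= len(self.data):
            return 0
        return 1 if self.data[byte_index] & (0x80 >> bit_index) else 0

    def bitcount(self) -> int:
        return sum(bin(b).count("1") for b in self.data)


# One user's check-ins for a month: day index -> bit.
checkins = Bitmap()
for day in (0, 1, 4, 30):
    checkins.setbit(day, 1)
days_checked_in = checkins.bitcount()
```

A full month of check-ins for one user fits in 4 bytes, which is why this layout scales to very large user bases.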
7. HyperLogLog
Presentation:
- HyperLogLog is a cardinality estimation algorithm: it estimates the number of distinct elements in a set with a very small memory footprint.
- Common commands are PFADD and PFCOUNT.
Typical application scenarios:
- Unique Visitor Statistics: To count unique visitors (UVs) in website analytics, simply add each visitor ID to a HyperLogLog to quickly get the approximate number of distinct users.
- Large-scale distinct counting: Used to estimate the number of distinct items in large-scale data, such as clicks, requests, visits, etc.
Underlying Principle:
- HyperLogLog hashes each element, uses part of the hash to select one of many small registers, and records in that register the longest run of leading zero bits seen in the rest of the hash. The cardinality is then estimated from the register values. The advantage is a very small memory footprint (about 12 KB per key in Redis); the disadvantage is that the result is only an estimate with a certain error (a standard error of about 0.81% in Redis).
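The estimation idea can be tried out with a simplified HyperLogLog. This is a textbook-style sketch under stated assumptions, not Redis's implementation: it uses SHA-1 from hashlib as the hash function, plain Python integers instead of packed 6-bit registers, and the standard small-range correction (linear counting); Redis's own hash function, register encoding, and bias corrections differ. The class name HyperLogLogSketch is made up for illustration.

```python
import hashlib
import math

P = 14      # index bits; 2**14 = 16384 registers
M = 1 << P


class HyperLogLogSketch:
    """Simplified HyperLogLog with 16384 registers."""

    def __init__(self):
        self.registers = [0] * M

    def add(self, item: str) -> None:
        digest = hashlib.sha1(item.encode()).digest()
        h = int.from_bytes(digest[:8], "big")    # 64-bit hash value
        index = h >> (64 - P)                    # first P bits choose a register
        rest = h & ((1 << (64 - P)) - 1)         # remaining 50 bits
        rank = (64 - P) - rest.bit_length() + 1  # leading zeros in rest, plus one
        if rank > self.registers[index]:
            self.registers[index] = rank         # keep the maximum seen so far

    def count(self) -> int:
        alpha = 0.7213 / (1 + 1.079 / M)
        raw = alpha * M * M / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * M and zeros:
            # Small-range correction: linear counting on the empty registers.
            return round(M * math.log(M / zeros))
        return round(raw)


hll = HyperLogLogSketch()
for i in range(10000):
    hll.add(f"user:{i}")
for i in range(5000):
    hll.add(f"user:{i}")  # repeats cannot raise any register
estimate = hll.count()    # close to 10000, but not exact
```

Adding an element that was already seen never changes a register, which is why duplicates do not affect the estimate.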
8. Geospatial
Presentation:
- Redis supports storing geolocation data and performing range queries and distance calculations based on this data. Common commands include GEOADD, GEODIST, GEORADIUS, GEOHASH, etc.
Typical application scenarios:
- LBS applications: For example, in taxi apps, the geolocation of drivers and passengers is stored, and the distance is calculated based on the location to match the nearest vehicle.
- Search Nearby Businesses: After the user inputs the location, it queries for nearby businesses and returns them sorted by distance.
Underlying Principle:
- Redis geospatial data is built on sorted sets. The GEOHASH algorithm encodes each pair of geographic coordinates as a 52-bit integer, which is stored as the member's score in a sorted set. A range query over these encodings enables fast spatial retrieval.
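The encoding step can be sketched as bit interleaving. The version below assumes 26 bits per coordinate (52 bits in total, which a double-precision sorted-set score can hold exactly) and the latitude limits accepted by GEOADD; the exact bit layout Redis uses internally may differ, and the function names geo_encode and _spread are hypothetical.

```python
LON_MIN, LON_MAX = -180.0, 180.0
LAT_MIN, LAT_MAX = -85.05112878, 85.05112878  # latitude limits accepted by GEOADD
STEP = 26                                      # bits per coordinate, 52 in total


def _spread(v: int) -> int:
    """Insert a zero bit between each of the low STEP bits of v."""
    out = 0
    for i in range(STEP):
        out |= ((v >> i) & 1) << (2 * i)
    return out


def geo_encode(lon: float, lat: float) -> int:
    """Map a coordinate pair to a 52-bit integer by interleaving cell indexes."""
    if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
        raise ValueError("coordinates out of range")
    cells = 1 << STEP
    # Scale each coordinate to a cell index in [0, 2**26 - 1].
    lon_cell = min(int((lon - LON_MIN) / (LON_MAX - LON_MIN) * cells), cells - 1)
    lat_cell = min(int((lat - LAT_MIN) / (LAT_MAX - LAT_MIN) * cells), cells - 1)
    # Assumed layout: latitude in even bit positions, longitude in odd ones.
    return _spread(lat_cell) | (_spread(lon_cell) << 1)


beijing = geo_encode(116.397, 39.909)
```

Because interleaving keeps the high-order bits of both coordinates together, points that fall in the same grid cell share a common prefix, which is what lets a score range query stand in for a spatial query.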
9. Streams
Presentation:
- Stream is a data type introduced in Redis 5.0. It supports message queuing in a style similar to Kafka or RabbitMQ, with features such as consumer groups, message persistence, and message acknowledgment. Common commands include XADD, XREAD, XGROUP, XACK, etc.
Typical application scenarios:
- messaging system: With the stream data type, multiple consumers can consume data from the same queue, enabling message distribution and processing.
- log system: Log messages can be stored in Redis streams for persistence and real-time consumption.
Underlying Principle:
- A stream is stored as a radix tree (rax) whose nodes hold listpacks, a compact list encoding, which lets Redis store large amounts of streaming data efficiently. Entries are ordered and managed by internally maintained IDs, which makes streams suitable for ordered, continuously generated data.
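Stream entry IDs have the form <milliseconds>-<sequence> and must always increase. A minimal sketch of how such IDs can be generated is shown below; the class StreamIdGenerator and the helper parse_id are hypothetical, and the handling of a clock that moves backwards (keep the last timestamp and bump the sequence) is the behaviour assumed here for auto-generated IDs.

```python
import time


class StreamIdGenerator:
    """Generates IDs of the form '<milliseconds>-<sequence>', always increasing."""

    def __init__(self):
        self.last_ms = 0
        self.last_seq = 0

    def next_id(self, now_ms=None) -> str:
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        if now_ms > self.last_ms:
            self.last_ms, self.last_seq = now_ms, 0
        else:
            # Same millisecond, or the clock moved backwards: keep the last
            # timestamp and bump the sequence so that IDs never decrease.
            self.last_seq += 1
        return f"{self.last_ms}-{self.last_seq}"


def parse_id(entry_id: str):
    """Split an ID into (milliseconds, sequence) for numeric comparison."""
    ms, seq = entry_id.split("-")
    return int(ms), int(seq)


gen = StreamIdGenerator()
# Timestamps are passed explicitly here to make the example deterministic.
ids = [gen.next_id(1000), gen.next_id(1000), gen.next_id(999), gen.next_id(1001)]
```

IDs should be compared as pairs of numbers, as parse_id does, because comparing them as plain strings would sort "999-0" after "1000-0".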
Summary
The variety of data types provided by Redis not only enriches its applicability in different business scenarios, but also ensures performance through memory-friendly data structures and efficient algorithms. When choosing a Redis data type, you usually need to match the appropriate data structure with your business requirements to maximize system performance and resource utilization.