Project background
The AI power trading competition platform needs a ranking list for the teams taking part in the trading competition. Rankings are kept per track: the price prediction track is ranked by several accuracy metrics, and the trading track is ranked by revenue. The specific requirements are as follows:
- Real-time: When market boundaries change, the rankings should be updated immediately.
- High concurrency: All participating teams (100 teams) must be able to check the ranking list at the same time.
- Ranking stability: The ranking is accurately calculated and can cope with a large number of updates in a short period of time.
Design ideas
When designing a ranking system, the key points are to clarify the real-time requirements and to choose appropriate data storage and update strategies, because these choices directly affect the performance and user experience of the system.
Real-time and storage timing
Ranking lists can be divided into real-time and non-real-time ranking lists. The main difference between the two is when the stored ranking is updated:
- Real-time ranking list: When market boundaries or related data change, the rankings need to be updated immediately. This requires the system to capture data changes in real time and quickly reflect them in storage to ensure that users can obtain the latest ranking information at any time. For example, in online competitive games, once the player's real-time points change, the game's rankings should be updated immediately.
- Non-real-time ranking list: Updates may lag behind data changes. A common practice is to use a scheduled task framework, such as xxl-job or PowerJob, to run statistical tasks at set intervals (hourly, daily, and so on). When the task runs, it queries the relevant data from the data source, computes the statistics, and stores the results in the corresponding storage medium. For example, an e-commerce platform's monthly product sales ranking can be refreshed by a scheduled task that runs early every morning and aggregates the previous day's sales data.
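The non-real-time approach can be sketched with the JDK's ScheduledExecutorService standing in for a job framework such as xxl-job. This is only an illustration: the class name SnapshotRanking, the loadScores supplier, and the tie-break by team ID are assumptions made for the example, not part of the platform.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Non-real-time ranking: readers see the last snapshot until the next scheduled rebuild. */
final class SnapshotRanking {
    // Published snapshot; volatile so readers always see a fully built list.
    private volatile List<String> snapshot = Collections.emptyList();

    /** Recomputes the ranking from source data: highest score first, ties by team ID (assumed rule). */
    void rebuild(Map<String, Double> scores) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> ranked = new ArrayList<>();
        for (Map.Entry<String, Double> e : entries) {
            ranked.add(e.getKey());
        }
        snapshot = Collections.unmodifiableList(ranked);
    }

    /** Returns the ranking as of the last rebuild. */
    List<String> current() {
        return snapshot;
    }

    /** Rebuilds at a fixed interval; loadScores is a hypothetical query against the data source. */
    ScheduledExecutorService start(Supplier<Map<String, Double>> loadScores, long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                () -> rebuild(loadScores.get()), 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Between two rebuilds, current() keeps returning the old snapshot, which is exactly the delay that a non-real-time ranking accepts.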
Data storage and update solutions
In ranking design, how to efficiently store and update ranking data is the core issue. The common solutions are as follows:
- Database solution
  - Principle: A traditional relational database (such as MySQL) produces the ranking with its sorting and indexing features. For example, suppose there is a table subject_info whose create_by field is the unique user identifier. The statement select count(1) as cnt, create_by from subject_info group by create_by order by cnt desc limit 0, 5; groups the rows by user, sorts the groups by row count, and returns the top 5 users.
  - Advantages: The database guarantees data integrity and consistency and persists data reliably. It suits scenarios that demand very high data accuracy and have small data volume and low concurrency.
  - Disadvantages: Under high concurrency the database comes under heavy query pressure. If indexes are used improperly, slow SQL is likely, which lowers query efficiency and makes real-time requirements hard to meet.
  - Optimization measures: A cache can be added in front of the database to reduce query pressure and improve response speed, at the cost of some delay.
- Cache solution
  - Principle: An in-memory database such as Redis stores the ranking data in a sorted set (Sorted Set). Each ranked item is a member of the sorted set, and its score is the sort key. In a game ranking list, for example, the player ID is the member and the game score is the score. The ZADD command updates a score and Redis reorders automatically; the ZRANGE or ZREVRANGE command queries the ranking.
  - Advantages: Operations run in memory, so performance is excellent and queries are extremely fast. This meets high concurrency and real-time requirements and reduces database load.
  - Disadvantages: A Redis environment has to be built and maintained, which adds system complexity; improper use easily leads to big key problems that hurt performance and stability; and because the data lives in memory it can be lost (this can be mitigated through persistence).
- Hybrid solution
  - Principle: Combine the database and the cache. The database stores data persistently to keep it safe and reliable; Redis handles real-time calculation and fast queries. For example, new data is first written to the database and then synchronized to Redis to update the rankings, and queries read directly from Redis.
  - Advantages: It balances data persistence with efficient queries, meeting real-time requirements while preserving data integrity. It suits scenarios with high requirements for both real-time behavior and data reliability.
  - Disadvantages: The system architecture is relatively complex, and data synchronization between the database and the cache has to be coordinated, which increases development and maintenance costs.
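The write order of the combined database and cache approach can be sketched as follows. The interfaces ScoreStore and RankCache and the class HybridRankingService are hypothetical names introduced for illustration; in a real system they would be backed by MySQL and a Redis sorted set respectively.

```java
import java.util.List;

/** Hypothetical persistence port, for example backed by a MySQL table. */
interface ScoreStore {
    void save(String teamId, double score);
}

/** Hypothetical cache port, for example backed by a Redis sorted set. */
interface RankCache {
    void put(String teamId, double score);

    List<String> top(int n);
}

/** Write path: database first, then cache. Read path: cache only. */
final class HybridRankingService {
    private final ScoreStore store;
    private final RankCache cache;

    HybridRankingService(ScoreStore store, RankCache cache) {
        this.store = store;
        this.cache = cache;
    }

    void updateScore(String teamId, double score) {
        store.save(teamId, score); // if this throws, the cache is left untouched
        cache.put(teamId, score);  // the cache is refreshed only after a successful save
    }

    List<String> top(int n) {
        return cache.top(n); // queries never touch the database
    }
}
```

Writing to the database first means a failed save never leaves a score in the cache that was not persisted. A failed cache update after a successful save still needs a retry or a later resynchronization, which is the coordination cost noted in the disadvantages.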
Solution selection
Given the system's real-time and high concurrency requirements, it is recommended to use Redis as the core storage of the ranking list, with MySQL for data persistence. The Redis sorted set implements sorting and rank lookup efficiently, and serving external requests directly from the cache greatly improves system performance. At the same time, MySQL stores the data safely, which meets the needs of most scenarios.
Specific implementation
Data structure design
In Redis, we can use a Sorted Set to implement rankings. A Sorted Set is a set whose members each carry a score: every member is unique, and each has one associated score. Sorting by score gives us the ranking.
Take the spot gain ranking as an example: Each participating team has a unique ID and a spot gain value (spotGain), and the following structure can be designed:
- Key: spot:rank:gain (the unique identifier of the ranking list)
- Member: teamId (participating team ID)
- Score: spotGain (spot gain value)
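To make the key, member, and score roles concrete, here is a small in-memory model of a single sorted-set key. It is an illustration only, not how Redis is implemented: the class SortedSetModel is hypothetical, rank lookup here is linear rather than logarithmic, and ties are broken by Java string order, which only approximates Redis's byte-wise ordering.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** In-memory model of one sorted-set key such as spot:rank:gain (illustration only). */
final class SortedSetModel {
    private static final class Node {
        final String member;
        final double score;

        Node(String member, double score) {
            this.member = member;
            this.score = score;
        }
    }

    private final Map<String, Node> byMember = new HashMap<>();
    // Highest score first; equal scores fall back to descending member order.
    private final TreeSet<Node> ordered = new TreeSet<>((Node a, Node b) -> {
        int byScore = Double.compare(b.score, a.score);
        return byScore != 0 ? byScore : b.member.compareTo(a.member);
    });

    /** Like ZADD: inserts the member, or replaces its score if it already exists. */
    void zadd(String member, double score) {
        Node old = byMember.remove(member);
        if (old != null) {
            ordered.remove(old);
        }
        Node node = new Node(member, score);
        byMember.put(member, node);
        ordered.add(node);
    }

    /** Like ZREVRANK: 0-based position from the top, or null if the member is absent. */
    Integer zrevrank(String member) {
        Node node = byMember.get(member);
        if (node == null) {
            return null;
        }
        return ordered.headSet(node).size();
    }

    /** Like ZREVRANGE start stop: inclusive bounds, non-negative indexes only. */
    List<String> zrevrange(int start, int stop) {
        List<String> out = new ArrayList<>();
        int i = 0;
        for (Node node : ordered) {
            if (i > stop) {
                break;
            }
            if (i >= start) {
                out.add(node.member);
            }
            i++;
        }
        return out;
    }
}
```

Because each member appears once, calling zadd again for the same teamId moves the team to its new position instead of adding a second entry.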
Implement basic operations
(1) Add or update a participating team's gain value
// jedis is assumed to be a connected Redis client exposing zadd, zrevrank, and zrevrange (for example a Jedis 3.x client)
String rankKey = "spot:rank:gain";
Long teamId = 2L;
double spotGain = 65.23;
// Update the spot gain value
jedis.zadd(rankKey, spotGain, String.valueOf(teamId));
When a team's gain value changes, call the zadd method to update the ranking list. If the team is already in the ranking, Redis updates its score automatically.
(2) Query a team's ranking
// Get the team's rank. Redis returns a rank starting from 0
Long rank = jedis.zrevrank(rankKey, String.valueOf(teamId));
if (rank != null) {
    // Requires: import java.text.MessageFormat;
    System.out.println(MessageFormat.format("The ranking of participating team [{0}] is {1}", teamId, rank + 1));
} else {
    System.out.println(MessageFormat.format("The participating team [{0}] is not ranked", teamId));
}
The zrevrank method returns the team's rank in reverse (descending) order, that is, the team with the highest score is ranked first.
(3) Get the top N rankings
// Get the top 20 rankings
Set<String> topRank = jedis.zrevrange(rankKey, 0, 19);
The zrevrange method returns the top N teams in the ranking list. To get their scores as well, use the WITHSCORES option of the ZREVRANGE command.
Persistence and data recovery
Although Redis provides efficient ranking operations, it is an in-memory database, and a power failure or server failure can cause data loss. To guard against this, we need to persist the data.
1. Data persistence
We can synchronize the ranking data in Redis to MySQL to make it persistent. Two approaches are available:
- Timed backup: A scheduled task exports the ranking data from Redis and writes it to MySQL.
- Synchronize during update: Whenever a participating team's gain value changes, both Redis and MySQL are updated.
2. Data recovery
When the Redis server restarts or its data is lost, the ranking data can be restored from MySQL, so the rankings keep working normally.
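The recovery step amounts to replaying the persisted scores into the empty cache. The sketch below assumes the rows have already been read from MySQL into a map of teamId to score; the class RankingRecovery and the zadd callback are hypothetical names, and the callback stands in for whatever call writes one member and score to Redis.

```java
import java.util.Map;
import java.util.function.ObjDoubleConsumer;

/** Rebuilds the cached ranking from persisted rows (illustrative sketch). */
final class RankingRecovery {
    /**
     * Replays persisted scores into the cache.
     *
     * @param persisted teamId to score, as read from the database (assumed query result)
     * @param zadd      callback that writes one member and its score to the cache
     * @return the number of members restored
     */
    static int restore(Map<String, Double> persisted, ObjDoubleConsumer<String> zadd) {
        int restored = 0;
        for (Map.Entry<String, Double> row : persisted.entrySet()) {
            zadd.accept(row.getKey(), row.getValue());
            restored++;
        }
        return restored;
    }
}
```

Since adding an existing member only overwrites its score, running the replay more than once leaves the ranking in the same state.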
Coping with high concurrency and performance optimization
In high concurrency scenarios, we need to consider Redis' performance optimization to ensure the stability and efficiency of the ranking system.
- Using clusters: When a single Redis server cannot handle the volume of concurrent requests, consider Redis Cluster to distribute data across multiple nodes and improve the scalability of the system.
- Rate limiting and degradation: During peak periods, limit the rate of query and update operations on the ranking list to avoid overloading the Redis server. When necessary, also consider degrading functionality, for example by delaying ranking updates or limiting query frequency.
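The limiting idea above can be sketched with a fixed-window counter. This is one simple technique among several (token bucket and sliding window are common alternatives); the class FixedWindowLimiter and its parameters are assumptions for illustration, and the clock is injected only so the behavior can be checked deterministically.

```java
import java.util.function.LongSupplier;

/** Fixed-window rate limiter: at most `limit` calls are admitted per window. */
final class FixedWindowLimiter {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // for example System::currentTimeMillis
    private long windowStart;
    private int used;

    FixedWindowLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the call may proceed; false means the caller should degrade. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins, so the counter resets
            used = 0;
        }
        if (used < limit) {
            used++;
            return true;
        }
        return false;
    }
}
```

When tryAcquire() returns false, the ranking endpoint could serve a slightly stale result or ask the client to retry later, which corresponds to the degradation described above.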
Redis Zset multi-field sorting scheme design
Scenario: The price prediction ranking is ordered by comprehensive accuracy; when two teams are equal, the price difference direction accuracy breaks the tie.
Idea: Use a Redis zset and split the score field into segments. For example, with a comprehensive accuracy of 10000 and a price difference direction accuracy of 20000, the complete score is 1000020000: the high-order digits hold the comprehensive accuracy and the low-order five digits hold the direction accuracy (10000 × 100000 + 20000). The two values are maintained as separate segments, while queries and sorting use the whole number as the score.
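// Read the packed score of member "2" under key "teamId"; jedis is assumed to be a connected Redis client (for example Jedis 3.x)
Double score = jedis.zscore("teamId", "2");
// First segment: comprehensive accuracy (high-order digits)
int scorePart1 = (int) (score.longValue() / 100000);
// Second segment: price difference direction accuracy (low-order five digits)
int scorePart2 = (int) (score.longValue() % 100000);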
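The encoding side of the segmented score is not shown above, so here is a minimal sketch of both directions. It assumes each accuracy has already been scaled to a non-negative integer and that the second field always fits in five digits; the class CompositeScore and its method names are hypothetical.

```java
/** Packs two ranking fields into one sorted-set score: primary * 100000 + secondary. */
final class CompositeScore {
    // Width of the low-order segment; the secondary field must stay in [0, 99999].
    static final long SEGMENT = 100_000L;

    /** Builds the score to store, with the primary field in the high-order digits. */
    static double encode(int primary, int secondary) {
        if (primary < 0 || secondary < 0 || secondary >= SEGMENT) {
            throw new IllegalArgumentException("field out of range");
        }
        return (double) (primary * SEGMENT + secondary);
    }

    /** Extracts the high-order segment (comprehensive accuracy in the example). */
    static int primary(double score) {
        return (int) ((long) score / SEGMENT);
    }

    /** Extracts the low-order segment (direction accuracy in the example). */
    static int secondary(double score) {
        return (int) ((long) score % SEGMENT);
    }
}
```

For example, CompositeScore.encode(10000, 20000) gives 1000020000, matching the score in the text. Redis stores scores as double-precision floats, which represent integers exactly only up to 2^53, so the packed value must stay below that bound; with an int primary field and a five-digit secondary field it does.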