
Database Series: Paging Performance Issues with Huge Data Tables

Popularity: 815 · 2024-07-24 09:16:55

1 Background

A while ago, while interviewing candidates, the topic of paging performance came up. I had run into exactly this problem not long before, so it is worth taking it out and discussing it again here.

2 Analysis

Paging performance, especially over large amounts of data, is a common problem. Typically, when we page through results with SQL statements using LIMIT and OFFSET, the performance problem becomes especially noticeable: as OFFSET grows, the database needs to skip over more and more rows before it can fetch the requested ones, so query time increases.

We found that the first few pages were very fast; for example, limit 200,25 came back in an instant. But the further back we page, the slower it gets, and after a million rows it grinds to a halt. Why is that? First, let's look at what the query SQL for a deep page looks like:

select * from t_name where c_name1='xxx' order by c_name2 limit 2000000,25;

The slowness of this kind of query is caused by the offset after LIMIT being too large. A query like limit 2000000,25 above means the database has to scan 2,000,025 rows, discard the first 2,000,000, and return the remaining 25 rows to the user. Scanning that much just to return one page is clearly unreasonable.


This issue is covered in detail in Chapter 6 (Query Performance Optimization) of High Performance MySQL:

Pagination is usually implemented with LIMIT plus an offset, together with an appropriate ORDER BY clause. A common problem, however, is that when the offset is very large, MySQL is forced to scan a large number of rows it does not need and then discard them.
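To make that cost concrete, here is a small sketch using Python's standard-library sqlite3 module (the table and column names follow the example above; the scale is reduced for illustration). The OFFSET query returns exactly what you would get by producing offset + page_size ordered rows and throwing the first offset of them away, which is effectively the work the engine does:

```python
import sqlite3

# In-memory stand-in for t_name, at a smaller scale.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t_name (id INTEGER PRIMARY KEY, c_name1 TEXT, c_name2 INTEGER)"
)
conn.executemany(
    "INSERT INTO t_name (c_name1, c_name2) VALUES (?, ?)",
    [("xxx", i) for i in range(10_000)],
)

offset, page_size = 5_000, 25

# The paginated query from the article.
page = conn.execute(
    "SELECT * FROM t_name WHERE c_name1 = ? ORDER BY c_name2 LIMIT ? OFFSET ?",
    ("xxx", page_size, offset),
).fetchall()

# What the engine effectively does: generate offset + page_size ordered
# rows, then discard the first `offset` of them.
scanned = conn.execute(
    "SELECT * FROM t_name WHERE c_name1 = ? ORDER BY c_name2 LIMIT ?",
    ("xxx", offset + page_size),
).fetchall()

assert page == scanned[offset:]
print(f"scanned {offset + page_size} rows to return {len(page)}")
```

The deeper the page, the more rows are generated and discarded, which is why response time grows with the offset.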

3 Optimization

Here are some strategies to optimize paging performance:
1. Use indexes + a subquery
Make sure the columns involved in the query (especially those used for sorting and filtering) are indexed; unindexed columns force the database into a full table scan, which drastically reduces query performance.
With the index in place, a subquery can locate the starting id directly in the index tree, and the outer query then fetches rows from that id onward.

[SQL]
SELECT a.id, a.name, b.dep_name   -- example columns
from emp a left join dep b on a.dep_id = b.id
where a.id >= (select id from emp order by id limit 100,1)
order by a.id limit 25;
Affected rows: 0
timing: 0.106s

2. Use more efficient paging techniques
Consider cursor-based (keyset) paging instead of OFFSET paging. For example, if you sort by timestamp or id, remember the timestamp or id of the last entry on the previous page and start the next page's query from there.

Remembering the primary key of the last result avoids the use of OFFSET altogether:

[SQL]
SELECT a.id, a.name, b.dep_name   -- example columns
from emp a left join dep b on a.dep_id = b.id
where a.id > 100 order by a.id limit 25;
Affected rows: 0
timing: 0.001s

[SQL]
SELECT a.id, a.name, b.dep_name   -- example columns
from emp a left join dep b on a.dep_id = b.id
where a.id > 4800000
order by a.id limit 25;
Affected rows: 0
timing: 0.000s
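The "remember the last id" idea fits naturally into a loop. A minimal sketch with Python's sqlite3 (the emp table and its columns are assumed for illustration only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO emp (name) VALUES (?)",
                 [(f"e{i}",) for i in range(100)])

def fetch_page(conn, last_id, page_size=25):
    """Keyset pagination: filter on the remembered id instead of OFFSET."""
    return conn.execute(
        "SELECT id, name FROM emp WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()

last_id = 0
while True:
    page = fetch_page(conn, last_id)
    if not page:
        break
    last_id = page[-1][0]   # remember the primary key of the last row
print("last id seen:", last_id)
```

Each page is an index range scan starting from the remembered key, so every page costs roughly the same regardless of how deep it is. The tradeoff is that you cannot jump to an arbitrary page number.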

3. Reduce the amount of data returned
Select only the columns you need instead of using SELECT *; shrinking the result set can significantly improve query speed.
This one is easy to understand: the leaner the data you fetch back, the better. Never pull everything; most MySQL coding standards mandate this as well.

4. Partitioned tables
For very large tables, consider using partitioning techniques. By distributing the data to different partitions, query performance can be improved because queries can operate on smaller data sets.

5. Use of caches
For frequently accessed pages, consider using a caching technology such as Redis or Memcached to store query results. Identical query requests can then be served directly from the cache instead of hitting the database every time.
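The cache-aside pattern this describes is straightforward. A minimal sketch, using a plain dict in place of Redis/Memcached and a placeholder query function (everything here is illustrative):

```python
import time

cache = {}   # stands in for Redis/Memcached: page -> (rows, cached_at)
TTL = 60     # seconds before a cached page is considered stale

def query_db(page, page_size=25):
    # Placeholder for the real (expensive) paginated SQL query.
    start = page * page_size
    return [f"row-{i}" for i in range(start, start + page_size)]

def get_page(page):
    entry = cache.get(page)
    if entry and time.monotonic() - entry[1] < TTL:
        return entry[0]                       # cache hit
    rows = query_db(page)                     # cache miss: go to the database
    cache[page] = (rows, time.monotonic())
    return rows

first = get_page(3)    # miss: queries the "database" and fills the cache
second = get_page(3)   # hit: served from the cache
assert first == second
```

With a real cache you would also need an invalidation strategy (expire on write, or rely on the TTL) so that pages do not serve stale data after the underlying table changes.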

6. Consideration of physical design
The physical design of the database, such as the speed and type of hard disk (SSD vs HDD), and the size of the server's memory, can also affect query performance.

4 Summary

By applying the strategies above, you can significantly improve database paging performance, especially when dealing with large amounts of data. Each method has its own applicable scenarios, so choose the optimization strategy that fits your specific needs and database environment.