Coincidentally, yesterday just wrote about the solution to the database continuity problem, I did not expect that today at two o'clock in the afternoon there are friends in Shanghai Jinduoduo interview non-technical positions encountered a similar problem. Here is the original question:
A maximum number of consecutive failed payments There is a payment flow table pay; the fields are as follows
id | uid | time | status |
---|---|---|---|
pay_01 | 1 | 2024-01-15 10:00:00 | fail |
pay_02 | 1 | 2024-01-15 10:00:20 | success |
Find the maximum number of consecutive failed payments per user
Output fields:uid,cnt (number of failures)
If my friend had read my post yesterday on database lookups for continuous data, then at least he would have had some ideas instead of writing a mess hahaha.
reasoning
As I wrote yesterday about the continuity problem, creating a continuous column to compare to the column that the target determines if it's continuous or not is one of the easier ways to do it.
For example, the question, need time continuous someone's maximum number of failed payments, then you need to sort according to the time, then how to determine the time continuous, the question does not specify the frequency of time, so directly and time to do a comparison to determine the continuous is not a good solution.
The easiest thing to do here is actually to generate the line numbers twice.
move
1. According to the user number to open the window, sorted by time, increase the line number, so that each user in accordance with the time of the order serial number.
2. Filter out the payment of orders successfully, and then according to the user number to open the window, sorted by time, increase the line number, so that each user in accordance with the time of the order of the order of the failure of the serial number.
3. When two serial numbers are subtracted from each other, equal numbers indicate continuity.
4. Grouped by user and the number obtained, or the largest group, is the largest number of consecutive failed payment orders.
WITH ContinueCustomer AS(
SELECT p1.*, - RANK() over(PARTITION by ORDER BY ) as diff FROM (
SELECT *,RANK() over(PARTITION by uid ORDER BY time) as rowNumber FROM pay
) p1 WHERE = 'fail'
),
EveryContinueCustomerCounts AS(
SELECT uid,count(1) as times FROM ContinueCustomer GROUP BY uid,diff
)
SELECT uid,max(times) as cnt FROM EveryContinueCustomerCounts
summarize
All things being equal, I hope the interview goes well for everyone.