I. Scenarios of use
Imagine a configuration service that stores many kinds of configuration: live stream info, likes, check-ins, red envelopes, and so on. This configuration data has two characteristics:
1. The concurrency can be extremely high. Imagine a live room with hundreds of thousands of viewers: a few seconds before the broadcast starts, users flood in all at once, and at that moment our system has to load this configuration. The requests hit the system like a flood.
2. The configurations are mostly read-only. They are set up on the B-side (admin backend) and are rarely modified once the live broadcast starts.
Faced with this business scenario, suppose our goal is to handle 30,000 QPS. What technical architecture and solution would you use?
1. Query the database directly, e.g. MySQL, Doris, or another relational database. This obviously will not hold up: a relational database typically tops out at a few thousand QPS.
2. Use standalone Redis. Theoretically this is possible: Tencent Cloud's documentation and some official Redis figures say a high-spec standalone Redis instance can handle 100,000+ QPS. But theory is theory. In practice, in the many load tests I have run against Redis, a standalone instance hit a performance bottleneck somewhere above 20,000 QPS and the numbers would not climb any further. (Of course, perhaps our Redis was simply not a top-spec instance.)
3. Use clustered Redis. This certainly solves the problem; it just costs more. If the company can afford it, this option works.
4. Use a local cache, the focus of this article, which solves this problem nicely. A local cache stores the needed data in the server's own memory. There is in theory no upper limit to how fast a server can read its local cache; it depends on the physical machine's configuration. But even the lower bound is several times higher than MySQL or standalone Redis: a 2-core 4 GB Linux server should manage at least 100,000 QPS. I once ran a load test on my local Windows machine (4 cores / 8 threads, 16 GB) and reached over 1,000,000 QPS; a Linux server with the same specs would do at least as well.
II. Technical Solution
Having chosen local caching as the strategy, how do we design the technical solution around it?
1. As shown in the figure above, a client that wants data first reads the local cache. If the local cache has no data, it reads Redis; if Redis has none either, it reads the DB.
2. Note that a Redis cache layer is generally placed between the local cache and the DB. This is because a local cache, once set, cannot be updated from outside (short of restarting the server), whereas the Redis cache can be updated at any time after the DB changes. This makes sense: Redis runs on its own server, while a local cache can only be set and updated on the machine it lives on. And in real projects, the system that writes the DB data backing the local cache and the system that uses the local cache are most likely not the same. That is why local cache TTLs are set very short, usually a few seconds and rarely more than a minute (e.g. 1 or 2 seconds), while the Redis TTL can obviously be much longer, say half an hour or an hour.
III. How to update the local cache
As mentioned above, the hardest part of local caching is the update problem: the system that writes the DB data backing the cache and the system that uses the cache are often not the same, so the local cache cannot be updated synchronously when the DB data changes. In practice, however, this is often exactly what is needed: update the local cache whenever the data source is updated. An example:
The system that manages the DB data for Configuration A is an API system. Now a script system needs to process behavioral data from the C-end according to Configuration A: determine whether each event satisfies it, then perform the corresponding business logic. The volume of C-end behavioral data is enormous, on average around 500,000 records per second pushed through Kafka, so we have to keep Configuration A in a local cache; it is the only way to withstand that flood of traffic. But the script system is a separate system, and Configuration A's DB record is updated in the API system. What then?
There are several ways to do this:
1. Run a job in the script system that reads the MySQL data at a fixed interval and refreshes the local cache. But you have to weigh the interval against MySQL's capacity, because you are scanning the table constantly.
2. Tail the MySQL binlog, push the changes into Kafka, and have the script listen for those Kafka messages and update the local cache of Configuration A whenever one arrives. One caveat: the script system usually runs many instances at once, so you need as many consumer groups as there are instances; every instance must consume the Kafka messages for the DB changes so that each machine updates its own local cache.
3. Use Redis's publish/subscribe feature. When the upstream API updates the configuration, it publishes a message; every script instance subscribes and updates the local cache on its own machine as soon as the message arrives. The drawback is that Redis pub/sub has no delivery acknowledgement, so a script instance can miss the message, fail to update its local cache, and end up with a bug. A demo follows:
(1) Publisher:
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/go-redis/redis/v8"
)

var ctx = context.Background()

// Publish/subscribe demo.
// The publisher publishes, and every subscriber receives the message. Note the
// difference from a message queue, where each published message is grabbed by
// only one consumer.
func main() {
	rdb := redis.NewClient(&redis.Options{
		Addr:     "localhost:16379",
		Password: "123456",
	})
	i := 0
	for {
		// Simulate publishing a message when the data is updated
		rdb.Publish(ctx, "money_updates", "New money value updated "+fmt.Sprintf("%d", i))
		fmt.Println("Message published " + fmt.Sprintf("%d", i))
		time.Sleep(5 * time.Second)
		i++
	}
}
(2) Subscriber:
package main

import (
	"context"
	"fmt"

	"github.com/go-redis/redis/v8"
)

var ctx = context.Background()

func main() {
	rdb := redis.NewClient(&redis.Options{
		Addr:     "localhost:16379",
		Password: "123456",
	})
	pubsub := rdb.Subscribe(ctx, "money_updates")
	defer pubsub.Close()
	// Wait for messages
	for {
		msg, err := pubsub.ReceiveMessage(ctx)
		if err != nil {
			fmt.Println("Error receiving message:", err)
			return
		}
		fmt.Println("Received message:", msg.Payload)
	}
}
4. Similar to method 1, except that the upstream pushes the modified Configuration A into Redis, and the script scans Redis for changes and updates the local cache when it finds any. In effect it is a delay queue. But this requires the upstream to write every add, update, and delete of Configuration A into that Redis structure; when changes come from many places, this is actually quite hard to implement.
As you can see, there is no perfect, universally efficient way to update a local cache; you can only pick the method that best fits your business scenario.
IV. Common libraries for local caching
How do you use a local cache in Go?
1. You can implement a local cache yourself, typically using the LRU (least recently used) algorithm. Here is a demo of a local cache I implemented myself:
package main

import (
	"container/list"
	"fmt"
	"sync"
	"time"
)

type Cache struct {
	capacity int
	cache    map[int]*list.Element
	lruList  *list.List
	mu       sync.Mutex // ensures thread safety
}

type entry struct {
	key        int
	value      int
	expiration time.Time // expiration time
}

// NewCache creates a new cache
func NewCache(capacity int) *Cache {
	return &Cache{
		capacity: capacity,
		cache:    make(map[int]*list.Element),
		lruList:  list.New(),
	}
}

// Get retrieves a value from the cache
func (c *Cache) Get(key int) (int, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if elem, found := c.cache[key]; found {
		// Check the expiration time
		if elem.Value.(entry).expiration.After(time.Now()) {
			// Move to the front of the list (most recently used)
			c.lruList.MoveToFront(elem)
			return elem.Value.(entry).value, true
		}
		// If expired, delete the cache entry
		c.removeElement(elem)
	}
	return 0, false
}

// Put inserts a value into the cache
func (c *Cache) Put(key int, value int, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if elem, found := c.cache[key]; found {
		// Update the existing value
		elem.Value = entry{key, value, time.Now().Add(ttl)}
		c.lruList.MoveToFront(elem)
	} else {
		// Add a new entry
		if c.lruList.Len() == c.capacity {
			// Evict the oldest entry
			oldest := c.lruList.Back()
			if oldest != nil {
				c.removeElement(oldest)
			}
		}
		newElem := c.lruList.PushFront(entry{key, value, time.Now().Add(ttl)})
		c.cache[key] = newElem
	}
}

// removeElement removes an element from the cache
func (c *Cache) removeElement(elem *list.Element) {
	c.lruList.Remove(elem)
	delete(c.cache, elem.Value.(entry).key)
}

// CleanUp removes expired entries
func (c *Cache) CleanUp() {
	c.mu.Lock()
	defer c.mu.Unlock()
	for e := c.lruList.Front(); e != nil; {
		next := e.Next()
		if !e.Value.(entry).expiration.After(time.Now()) {
			c.removeElement(e)
		}
		e = next
	}
}

func main() {
	cache := NewCache(2)
	cache.Put(1, 1, 5*time.Second) // set the expiration to 5 seconds
	cache.Put(2, 2, 5*time.Second)
	fmt.Println(cache.Get(1)) // output: 1 true
	time.Sleep(6 * time.Second)
	// Access after expiration
	fmt.Println(cache.Get(1)) // output: 0 false
	cache.CleanUp() // clean up expired entries
}
The code uses the LRU algorithm, implemented by moving the most recently used entry to the front of the list. There are some issues, though: the CleanUp method has to be called manually, as there is no mechanism that cleans up expired entries automatically on a schedule. That means the caller may need to invoke CleanUp frequently, or expired items can linger in the cache for a long time. The code also takes a lock on every operation, which can create consistency and performance problems under heavy concurrent access, and so on.
So a self-rolled demo like this is not recommended for production; there may be subtle problems. For example, a local cache library written by my department had a serious bug: the cache's memory was never released, so memory usage kept climbing, reaching 90% every few days, and our stopgap was to restart the script every few days...
2. That is why we still recommend using an open-source library; at least many predecessors have stepped on the landmines for us. Here are a few recommendations with relatively high star counts:
(1) go-cache: a simple in-memory cache library with expiration and automatic cleanup, suitable for simple key-value caching needs. (I use it a lot in my projects; convenient and simple, recommended.)
(2) bigcache: a high-performance in-memory cache library for caching large amounts of data, designed to reduce garbage-collection pressure.
(3) groupcache: a cache library developed by Google that supports both distributed and standalone caching, suitable for scenarios requiring high concurrency and high availability.
That is my personal experience with local caching. I have to say, it is genuinely nice to use: cheap, and it can take a beating. The only drawback is that a local cache is hard to update in real time, but as discussed above there are several workable solutions to that.