I: Background
- The story
In September, a project that our company had deployed on a certain platform was put through a stress test at 50 concurrent requests. During the test, the thread count and memory usage of one container rose abnormally, to the point where the application could no longer be used. Based on what I had learned, I set out to analyze both the thread problem and the memory problem. Either lldb or WinDbg can be used for this kind of analysis, but I personally prefer WinDbg's graphical interface, so WinDbg is what I ended up using.
II: WinDbg Analysis
- Where exactly is the leak?
On Windows, I believe many readers know to use the !address -summary command for this. However, that command is designed for the Windows platform and does not work well on a Linux dump, as the following output shows:
0:000> !address -summary
Mapping file section regions...
Mapping module regions...
Mapping heap regions...
--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
<unknown> 4062 ffffffff`f5638600 ( 16.000 EB) 100.00% 100.00%
Image 1282 0`09fc8a00 ( 159.784 MB) 0.00% 0.00%
--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
2431 fffffffe`2b813000 ( 16.000 EB) 100.00%
MEM_PRIVATE 2913 1`d3dee000 ( 7.310 GB) 0.00% 0.00%
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
2431 fffffffe`2b813000 ( 16.000 EB) 100.00% 100.00%
MEM_COMMIT 2913 1`d3dee000 ( 7.310 GB) 0.00% 0.00%
--- Protect Summary (for commit) - RgnCount ----------- Total Size -------- %ofBusy %ofTotal
PAGE_READWRITE 2115 1`cb683000 ( 7.178 GB) 0.00% 0.00%
PAGE_EXECUTE_READ 175 0`03d49000 ( 61.285 MB) 0.00% 0.00%
PAGE_READONLY 585 0`03ce9000 ( 60.910 MB) 0.00% 0.00%
PAGE_EXECUTE_WRITECOPY 38 0`00d39000 ( 13.223 MB) 0.00% 0.00%
--- Largest Region by Usage ----------- Base Address -------- Region Size ----------
<unknown> 7ffc`011fa000 ffff8003`fe406000 ( 16.000 EB)
Image 7f45`fe4e9000 0`01b16000 ( 27.086 MB)
The breakdown of memory regions in this output is neither useful nor informative, so what can we do? The coreclr team has in fact considered this situation and provides the maddress command, a cross-platform counterpart to !address. Switching to it gives the following output:
0:000> !sos maddress
Enumerating and tagging the entire address space and caching the result...
Subsequent runs of this command should be faster.
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Memory Kind | StartAddr | EndAddr-1 | Size | Type | State | Protect | Image |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Stack | 7f42d256e000 | 7f42d2d6e000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d3570000 | 7f42d3d70000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d3d71000 | 7f42d4571000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d4572000 | 7f42d4d72000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d4d73000 | 7f42d5573000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d5574000 | 7f42d5d74000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d5d75000 | 7f42d6575000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d6d77000 | 7f42d7577000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d7578000 | 7f42d7d78000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d7d79000 | 7f42d8579000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7f42d857a000 | 7f42d8d7a000 | 8.00mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
...
+-------------------------------------------------------------------------+
| Memory Type | Count | Size | Size (bytes) |
+-------------------------------------------------------------------------+
| Stack | 788 | 6.28gb | 6,743,269,376 |
| GCHeap | 48 | 688.98mb | 722,448,384 |
| PAGE_READWRITE | 930 | 180.22mb | 188,977,152 |
| Image | 1,278 | 159.69mb | 167,447,040 |
| HighFrequencyHeap | 327 | 20.35mb | 21,336,064 |
| LowFrequencyHeap | 259 | 18.31mb | 19,202,048 |
| LoaderCodeHeap | 15 | 17.53mb | 18,378,752 |
| HostCodeHeap | 11 | 1.51mb | 1,581,056 |
| ResolveHeap | 1 | 348.00kb | 356,352 |
| PAGE_READONLY | 123 | 261.50kb | 267,776 |
| DispatchHeap | 1 | 196.00kb | 200,704 |
| IndirectionCellHeap | 3 | 152.00kb | 155,648 |
| LookupHeap | 3 | 144.00kb | 147,456 |
| CacheEntryHeap | 2 | 100.00kb | 102,400 |
| PAGE_EXECUTE_WRITECOPY | 5 | 96.00kb | 98,304 |
| StubHeap | 2 | 76.00kb | 77,824 |
| PAGE_EXECUTE_READ | 2 | 8.00kb | 8,192 |
+-------------------------------------------------------------------------+
| [TOTAL] | 3,798 | 7.34gb | 7,884,054,528 |
+-------------------------------------------------------------------------+
From this output you can see that the process occupies 7.34 GB in total, and 6.28 GB of that is eaten by thread stacks. More surprising is that each thread stack takes up 8 MB, which is really rather large. Linux also has no notion of reserved memory the way Windows does, so these 8 MB are actually claimed up front. If you look at the memory of one of these 8 MB regions, you can see that it is still initialized to zero, so that space is set aside without being used:
0:000> dp 7f42d256e000 7f42d2d6e000
...
00007f42`d2d6dfa0 00000000`00000000 00000000`00000000
00007f42`d2d6dfb0 00000000`00000000 00000000`00000000
00007f42`d2d6dfc0 00000000`00000000 00000000`00000000
00007f42`d2d6dfd0 00000000`00000000 00000000`00000000
00007f42`d2d6dfe0 00000000`00000000 00000000`00000000
00007f42`d2d6dff0 00000000`00000000 00000000`00000000
00007f42`d2d6e000 ????????`????????
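As a quick sanity check on the maddress summary above, the following Python sketch parses a few rows copied from that table and computes how much of the tagged address space is stack. The helper name parse_summary is made up for illustration and is not part of SOS or WinDbg.

```python
# Rows copied from the maddress summary table above (abridged).
SUMMARY = """\
| Memory Type | Count | Size | Size (bytes) |
| Stack | 788 | 6.28gb | 6,743,269,376 |
| GCHeap | 48 | 688.98mb | 722,448,384 |
| PAGE_READWRITE | 930 | 180.22mb | 188,977,152 |
| Image | 1,278 | 159.69mb | 167,447,040 |
| [TOTAL] | 3,798 | 7.34gb | 7,884,054,528 |
"""


def parse_summary(text):
    """Return {memory type: (region count, size in bytes)} for each data row."""
    rows = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        kind, count, _pretty_size, size_bytes = cells
        try:
            rows[kind] = (int(count.replace(",", "")),
                          int(size_bytes.replace(",", "")))
        except ValueError:
            continue  # header row: its cells are not numbers
    return rows


rows = parse_summary(SUMMARY)
stack_count, stack_bytes = rows["Stack"]
_, total_bytes = rows["[TOTAL]"]

share = stack_bytes / total_bytes
avg_mib = stack_bytes / stack_count / 2**20
print(f"stack share of total: {share:.1%}")
print(f"average stack region: {avg_mib:.2f} MiB across {stack_count} regions")
```

The average comes out slightly above 8 MiB, which suggests that a few stack regions are larger than the 8.00mb ones listed in the detailed table.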
- How to modify the stack space size
Different operating system distributions generally ship with different default stack sizes. You can first find out which distribution the dump came from by searching memory for the main keyword of the operating system name.
0:000> s-a 0 L?0xffffffffffffffff "centos"
...
00005570`9cddbc18 63 65 6e 74 6f 73 2e 37-2d 78 36 34 00 00 00 00 centos.7-x64....
...
The output shows that the current operating system is centos7-x64. On Windows you change the stack size by modifying the PE header; on Linux there are two ways to do it.
Modify the ulimit -s parameter (not recommended)
root@ubuntu:/data# ulimit -s
8192
root@ubuntu:/data# ulimit -s 2048
root@ubuntu:/data# ulimit -s
2048
Modify the DOTNET_DefaultStackSize environment variable (recommended; set it in the environment variables of the affected container)
DOTNET_DefaultStackSize=180000
More details can be found in the article: /post/2023-10-18-til-dotnet-stack-size/
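One detail about the value above is easy to miss: it appears to be read as a hexadecimal number, so 180000 means 0x180000 bytes rather than 180,000 bytes. The sketch below does the conversion and cross-checks it against one of the 1.50mb stack regions that maddress reports after the change. The helper names are illustrative only.

```python
MIB = 2**20


def setting_to_bytes(value: str) -> int:
    """Interpret a DOTNET_DefaultStackSize value as hexadecimal bytes."""
    return int(value, 16)


def bytes_to_setting(size_bytes: int) -> str:
    """Format a stack size in bytes as the hex string used in the variable."""
    return format(size_bytes, "x")


size = setting_to_bytes("180000")
print(f"DOTNET_DefaultStackSize=180000 -> {size} bytes = {size / MIB} MiB")

# One stack region from the maddress output taken after the change.
start, end = 0x7FABE4E8C000, 0x7FABE500C000
print(f"region size from the dump: {end - start:#x} bytes")

# Going the other way, e.g. for a 1 MiB stack.
print(f"1 MiB would be written as {bytes_to_setting(1 * MIB)}")
```

The region size in the dump matches the configured value exactly, which is a simple way to confirm that the variable was picked up by the container.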
That covers the first direction for solving the problem. Next, let's look at the other direction: why were 788 threads created in total?
- Why are there so many threads?
To answer that, you need to see what each thread is doing at the moment, which can be done with a WinDbg-specific command.
0:000> ~*e !clrstack
...
OS Thread Id: 0x1b82 (225)
Child SP IP Call Site
00007F441B7FD660 00007f4cdbb69ad8 [HelperMethodFrame_1OBJ: 00007f441b7fd660] (Int32, )
00007F441B7FD790 00007f4c676318cd (Int32, ) [/_/src/libraries//src/System/Threading/ @ 570]
00007F441B7FD810 00007f4c676312e1 [[System.__Canon, ]](OperationQueue`1<System.__Canon> ByRef, System.__Canon, Int32, Int32) [/_/src/libraries//src/System/Net/Sockets/ @ 1330]
00007F441B7FD8A0 00007f4c67e26ff1 (`1, ByRef, Byte[], Int32 ByRef, Int32, Int32 ByRef) [/_/src/libraries//src/System/Net/Sockets/ @ 1557]
00007F441B7FD920 00007f4c67e2ea6b (, Byte[], Int32, Int32, , Int32 ByRef)
00007F441B7FD9A0 00007f4c67e26c37 (Byte[], Int32, Int32, , ByRef)
00007F441B7FDA20 00007f4c67e26929 (Byte[], Int32, Int32) [/_/src/libraries//src/System/Net/Sockets/ @ 231]
00007F441B7FDA70 00007f4c69b85757 () [/_/src/libraries//src/System/IO/ @ 771]
00007F441B7FDA90 00007f4c69b774e8 () [/_/src/libraries//src/System/IO/ @ 207]
00007F441B7FDAA0 00007f4c69b853ee ()
00007F441B7FDAF0 00007f4c69b852c6 ()
00007F441B7FDB10 00007f4c69b57068 ()
00007F441B7FDB50 00007f4c67590d19 (, , ) [/_/src/libraries//src/System/Threading/ @ 183]
00007F441B7FDCF0 00007f4cdb1e3aa7 [DebuggerU2MCatchHandlerFrame: 00007f441b7fdcf0]
...
The dump can be captured with the usual dotnet-dump or procdump. In the output above, many of the stacks involve the same linked library, so my guess is that a large number of threads are stuck inside it.
With this knowledge, my final advice to my friend was as follows:
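With hundreds of threads, reading every stack by eye gets tedious. One possible approach, sketched here in Python, is to save the ~*e !clrstack output to a text file and group threads by the source directory of their topmost frame that carries source information. In the sample below, thread 225 is abridged from the dump above, while threads 226 and 227 are hypothetical and exist only so that the grouping has something to count. The function names are illustrative.

```python
import re
from collections import Counter

# Thread 225 is abridged from the dump above; threads 226 and 227 are
# hypothetical, added only to make the grouping visible.
SAMPLE = """\
OS Thread Id: 0x1b82 (225)
Child SP IP Call Site
00007F441B7FD810 00007f4c676312e1 (...) [/_/src/libraries//src/System/Net/Sockets/ @ 1330]
00007F441B7FDA70 00007f4c69b85757 () [/_/src/libraries//src/System/IO/ @ 771]
OS Thread Id: 0x1b83 (226)
Child SP IP Call Site
00007F441B9FD810 00007f4c676312e1 (...) [/_/src/libraries//src/System/Net/Sockets/ @ 1330]
OS Thread Id: 0x1b84 (227)
Child SP IP Call Site
00007F441BBFDB50 00007f4c67590d19 (, , ) [/_/src/libraries//src/System/Threading/ @ 183]
"""

THREAD_HEADER = re.compile(r"^OS Thread Id: (0x[0-9a-fA-F]+)")
SOURCE_DIR = re.compile(r"\[(\S+?)/? @ \d+\]")


def split_threads(text):
    """Map each OS thread id to the list of frame lines that follow it."""
    threads, current = {}, None
    for line in text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            threads[current] = []
        elif current is not None and line.strip():
            threads[current].append(line)
    return threads


def top_source_dir(frames):
    """Return the source directory of the topmost frame that has one."""
    for frame in frames:
        found = SOURCE_DIR.search(frame)
        if found:
            return found.group(1)
    return "<no source info>"


threads = split_threads(SAMPLE)
counts = Counter(top_source_dir(frames) for frames in threads.values())
for source_dir, n in counts.most_common():
    print(f"{n:4d} thread(s) with top frame in {source_dir}")
```

A directory that dominates the count is a good candidate for where the threads are piling up, although the grouping key would need adjusting for stacks that carry no source information.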
Modify the DOTNET_DefaultStackSize parameter.
You can follow the 1.5 MB default stack size that .NET Core uses on Windows, because 8 MB is really too large to carry and is also at odds with Linux's reputation for low memory usage. After making the change, I took another dump and confirmed that the setting had taken effect.
0:000> !sos maddress
Enumerating and tagging the entire address space and caching the result...
Subsequent runs of this command should be faster.
*** WARNING: Unable to verify timestamp for lttng-ust-wait-8-0
*** WARNING: Unable to verify timestamp for lttng-ust-wait-8
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Memory Kind | StartAddr | EndAddr-1 | Size | Type | State | Protect | Image |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
.......
| Stack | 7fabe4e8c000 | 7fabe500c000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe500d000 | 7fabe518d000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe518e000 | 7fabe530e000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe530f000 | 7fabe548f000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5490000 | 7fabe5610000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5611000 | 7fabe5791000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5792000 | 7fabe5912000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5913000 | 7fabe5a93000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5a94000 | 7fabe5c14000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
| Stack | 7fabe5c15000 | 7fabe5d95000 | 1.50mb | MEM_PRIVATE | MEM_COMMIT | PAGE_READWRITE | |
.......
+-------------------------------------------------------------------------+
| Memory Type | Count | Size | Size (bytes) |
+-------------------------------------------------------------------------+
| Stack | 766 | 1.41gb | 1,518,571,520 |
| GCHeap | 48 | 702.39mb | 736,509,952 |
| PAGE_READWRITE | 931 | 186.31mb | 195,358,720 |
| Image | 1,283 | 158.77mb | 166,480,384 |
| HighFrequencyHeap | 336 | 20.97mb | 21,991,424 |
| LowFrequencyHeap | 256 | 18.32mb | 19,214,336 |
| LoaderCodeHeap | 15 | 17.53mb | 18,378,752 |
| HostCodeHeap | 11 | 1.63mb | 1,703,936 |
| ResolveHeap | 1 | 348.00kb | 356,352 |
| PAGE_READONLY | 123 | 261.50kb | 267,776 |
| DispatchHeap | 1 | 196.00kb | 200,704 |
| IndirectionCellHeap | 3 | 152.00kb | 155,648 |
| LookupHeap | 3 | 144.00kb | 147,456 |
| PAGE_EXECUTE_WRITECOPY | 5 | 132.00kb | 135,168 |
| CacheEntryHeap | 2 | 100.00kb | 102,400 |
| StubHeap | 2 | 76.00kb | 77,824 |
| PAGE_EXECUTE_READ | 2 | 8.00kb | 8,192 |
+-------------------------------------------------------------------------+
| [TOTAL] | 3,788 | 2.50gb | 2,679,660,544 |
+-------------------------------------------------------------------------+
Review the relevant logic in the project code.
It turned out that the reference to the library was not actually used anywhere in the code. After removing the reference and rerunning the stress test, the thread count returned to normal.
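To put numbers on the first piece of advice, the two maddress summaries above can be compared directly. The following sketch uses only the Stack and [TOTAL] rows from the tables taken before and after the stack-size change.

```python
GIB = 2**30

# Stack and [TOTAL] rows from the two maddress summaries above.
BEFORE = {"stack_regions": 788, "stack_bytes": 6_743_269_376, "total_bytes": 7_884_054_528}
AFTER = {"stack_regions": 766, "stack_bytes": 1_518_571_520, "total_bytes": 2_679_660_544}

saved_stack = BEFORE["stack_bytes"] - AFTER["stack_bytes"]
saved_total = BEFORE["total_bytes"] - AFTER["total_bytes"]
region_delta = BEFORE["stack_regions"] - AFTER["stack_regions"]

print(f"stack memory saved: {saved_stack / GIB:.2f} GiB")
print(f"total memory saved: {saved_total / GIB:.2f} GiB")
print(f"stack regions: {BEFORE['stack_regions']} -> {AFTER['stack_regions']} "
      f"(only {region_delta} fewer)")
```

The stack region count barely moves between the two dumps, which is why shrinking the stack size only treats the memory symptom and the thread count still had to be fixed in the project code.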
III: Summary
The .NET debugging ecosystem on Linux is getting richer by the day, which is very exciting. Finally, a like for my teacher, The Frontline Coder, and for WinDbg.