Overview
Reference material on generics is everywhere, so I won't go over the basics, such as how to use generic interfaces, delegates, and methods, or covariance and contravariance.
The benefits of generics are as follows:
- Code reuse
Algorithms are reusable: define an algorithm once (sorting, searching, swapping, comparing, etc.) and the same logic works for any type.
- Type safety
The compiler guarantees that an int cannot be passed where a string is expected.
- Simple and clear
Less type-conversion code.
- Better performance
Less boxing/unboxing, and the generic algorithms themselves are faster.
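The first two benefits fit in a few lines. This is a minimal sketch; GenericsSketch, Swap, and Max are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class GenericsSketch
{
    // One algorithm, written once, reused for any T.
    public static void Swap<T>(ref T left, ref T right)
    {
        T temp = left;
        left = right;
        right = temp;
    }

    // The constraint lets the compiler verify that T can be compared.
    public static T Max<T>(T first, T second) where T : IComparable<T>
    {
        return first.CompareTo(second) >= 0 ? first : second;
    }

    public static void Run()
    {
        int x = 1, y = 2;
        Swap(ref x, ref y);                 // T is inferred as int, no boxing
        string s = Max("apple", "pear");    // same code, T is string
        Console.WriteLine($"{x} {y} {s}");

        var names = new List<string>();
        // names.Add(42);                   // would not compile: int is not string
        names.Add("ok");
    }
}
```

Calling GenericsSketch.Run() exercises both methods; the commented-out line shows the kind of mistake the compiler rejects.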
Why do generics perform better?
Mainly because boxing costs extra managed-heap allocations and extra CPU time.
- Boxing value types takes up extra memory
var a = new List<int>()
{
    1, 2, 3, 4
};
var b = new ArrayList()
{
    1, 2, 3, 4
};
Variable a: 72 bytes
Variable b: 184 bytes
The 112-byte gap on a 64-bit runtime is four 24-byte boxed ints, plus four 8-byte references in place of four 4-byte ints.
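One way to check figures like these without a memory profiler is to read the per-thread allocation counter before and after building each collection. This is a rough sketch, not the method the numbers above were measured with; BoxingMemorySketch and its members are hypothetical names, and the exact byte counts depend on the runtime and bitness.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingMemorySketch
{
    // Returns the bytes allocated on the current thread while building one collection.
    public static long AllocatedBy(Func<object> build)
    {
        GC.KeepAlive(build());   // warm-up so JIT and type initialization are not measured
        long before = GC.GetAllocatedBytesForCurrentThread();
        object result = build();
        long after = GC.GetAllocatedBytesForCurrentThread();
        GC.KeepAlive(result);
        return after - before;
    }

    public static long GenericListBytes() =>
        AllocatedBy(() => new List<int> { 1, 2, 3, 4 });

    public static long ArrayListBytes() =>
        AllocatedBy(() => new ArrayList { 1, 2, 3, 4 });   // each int is boxed here

    public static void Run()
    {
        Console.WriteLine($"List<int>: {GenericListBytes()} bytes");
        Console.WriteLine($"ArrayList: {ArrayListBytes()} bytes");
    }
}
```

The ArrayList figure should come out larger, because every element is a separate heap object plus a reference to it.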
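- Boxing/unboxing consumes additional CPU
public void ArrayTest()
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    ArrayList arrayList = new ArrayList();
    for (int i = 0; i < 10000000; i++)
    {
        arrayList.Add(i);          // boxes i
        _ = (int)arrayList[i];     // unboxes it again
    }
    stopwatch.Stop();
    Console.WriteLine($"array time is {stopwatch.ElapsedMilliseconds}");
}
public void ListTest()
{
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();
    List<int> list = new List<int>();
    for (int i = 0; i < 10000000; i++)
    {
        list.Add(i);               // stored as an int, no boxing
        _ = list[i];
    }
    stopwatch.Stop();
    Console.WriteLine($"list time is {stopwatch.ElapsedMilliseconds}");
}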
A difference this large means more work for the GC as well as more CPU time.
A question to think about: if the type argument were a reference type, would the difference still be this large?
And if the gap is small, what is our reason for using generics?
Open and closed types
The CLR has many kinds of types, such as reference types, value types, interface types, delegate types, and generic types.
Based on whether they can be instantiated, they are further divided into open types and closed types.
Why does this matter? One of the advantages of generics is code reuse: define the algorithm once and fill in the rest later. For example, List<> is open to any type argument, so the same set of algorithms can be reused.
For example:
- An open type is one whose type parameters have not all been specified, such as List<> or Dictionary<,>. It cannot be instantiated, much as an interface cannot. It only lays out the basic framework and stays open to different type arguments.
Type it = typeof(ITest);
Activator.CreateInstance(it); // creation fails
Type di = typeof(Dictionary<,>);
Activator.CreateInstance(di); // creation fails
- A closed type is one whose type arguments have all been specified, so it can be instantiated. List<string> and String are closed types. They only accept the specific type arguments they were given.
Type li = typeof(List<string>);
Activator.CreateInstance(li); // created successfully
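The distinction can also be checked at run time through reflection. The sketch below uses Type.ContainsGenericParameters to detect an open type and Type.MakeGenericType to close it; OpenClosedSketch, IsOpen, and TryCreate are hypothetical helper names.

```csharp
using System;
using System.Collections.Generic;

public static class OpenClosedSketch
{
    // True when the type still has unspecified type parameters (an open type).
    public static bool IsOpen(Type type) => type.ContainsGenericParameters;

    // Tries to instantiate the type; returns false if the runtime refuses.
    public static bool TryCreate(Type type)
    {
        try
        {
            return Activator.CreateInstance(type) != null;
        }
        catch (ArgumentException)       // thrown for open generic types
        {
            return false;
        }
        catch (MemberAccessException)   // base class of the exception thrown for interfaces
        {
            return false;
        }
    }

    public static void Run()
    {
        Type open = typeof(Dictionary<,>);
        // Supplying the type arguments turns the open type into a closed one.
        Type closed = open.MakeGenericType(typeof(string), typeof(int));

        Console.WriteLine($"{open.Name}: open={IsOpen(open)}, created={TryCreate(open)}");
        Console.WriteLine($"{closed.Name}: open={IsOpen(closed)}, created={TryCreate(closed)}");
    }
}
```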
Code explosion
Using open types comes with a problem. During JIT compilation, the CLR takes the IL of the generic type, substitutes the supplied type arguments, and generates the corresponding native code.
The downside is that native code is generated for every distinct combination of generic type/method and type arguments. This can significantly increase the amount of native code the program carries and hurt performance.
The CLR has dedicated optimizations to mitigate this: code sharing.
- Identical type arguments share one set of methods
If List<Struct> is used in one assembly and another assembly also uses List<Struct>, the CLR generates only one set of native code.
- Reference-type arguments share one set of methods
For List<String> and List<Stream>, both type arguments are reference types, so their values are pointers to objects on the managed heap. The CLR can therefore operate on all such pointers in the same way.
This is not the case for value types such as int and long. One takes 4 bytes and the other 8 bytes. Because their sizes differ, the same native code cannot be reused for both.
Seeing is believing 1
Sample code:
internal class Program
{
    static void Main(string[] args)
    {
        var a = new Test<string>();
        var b = new Test<Stream>();
        Console.ReadLine(); // pause here so a and b can be inspected in a debugger
    }
}
public class Test<T>
{
    public void Add(T value)
    {
    }
    public void Remove(T value)
    {
    }
}
[Figure: method table of variable a]
[Figure: method table of variable b]
A closer look reveals that their EEClass is identical, and so are the MethodDesc entries of their Add/Remove methods. This confirms the statement above: reference-type arguments share one set of methods.
Seeing is believing 2
internal class Program
{
    static void Main(string[] args)
    {
        var a = new Test<int>();
        var b = new Test<long>();
        var c = new Test<MyStruct>();
        Console.ReadLine(); // pause here so a, b and c can be inspected in a debugger
    }
}
public class Test<T>
{
    public void Add(T value)
    {
    }
    public void Remove(T value)
    {
    }
}
public struct MyStruct
{
    public int Age;
}
Let's replace the reference types with value types and look at the method tables again.
[Figure: method table of variable a]
[Figure: method table of variable b]
[Figure: method table of variable c]
A quick look shows that their MethodDesc entries are completely different. This means the CLR generated three separate sets of methods for this one generic type.
Careful readers may notice that the reference-type argument shows up as a type called System.__Canon, which the CLR uses internally as a "placeholder" for all reference types.
Interested readers can refer to its source code: coreclr\\src\System__Canon.cs
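The two debugger experiments can be roughly approximated from managed code by comparing the method handles that reflection reports. This sketch assumes that RuntimeMethodHandle.Value reflects the runtime's internal method descriptor, which is an implementation detail, so treat what it prints as an observation on your runtime rather than a guarantee. Probe<T> and SharingProbe are hypothetical names.

```csharp
using System;
using System.IO;

public class Probe<T>
{
    public void Add(T value) { }
}

public static class SharingProbe
{
    // Compares the runtime handles reported for Probe<A>.Add and Probe<B>.Add.
    public static bool SameHandle<A, B>()
    {
        IntPtr first = typeof(Probe<A>).GetMethod("Add")!.MethodHandle.Value;
        IntPtr second = typeof(Probe<B>).GetMethod("Add")!.MethodHandle.Value;
        return first == second;
    }

    public static void Run()
    {
        // Two reference-type arguments: candidates for shared code.
        Console.WriteLine($"string vs Stream: {SameHandle<string, Stream>()}");
        // Two value-type arguments of different size: separate code expected.
        Console.WriteLine($"int vs long:      {SameHandle<int, long>()}");
    }
}
```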
Why can't value types share one set of methods?
The reason is simple: a reference is always one pointer wide (4 bytes on 32-bit, 8 bytes on 64-bit), whereas value types differ in size. The machine code generated for different value types therefore cannot be the same, so value types cannot reuse one set of methods.
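The size argument is easy to verify with Unsafe.SizeOf<T>(), which reports how many bytes one T occupies in a field, local, or array element. This is a small sketch; SlotSizeSketch and SlotSize are hypothetical names.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

public static class SlotSizeSketch
{
    // Size in bytes of one T as stored in a field, local, or array element.
    public static int SlotSize<T>() => Unsafe.SizeOf<T>();

    public static void Run()
    {
        Console.WriteLine($"int:    {SlotSize<int>()}");
        Console.WriteLine($"long:   {SlotSize<long>()}");
        Console.WriteLine($"string: {SlotSize<string>()}");   // size of the reference, not of the string
        Console.WriteLine($"Stream: {SlotSize<Stream>()}");   // also one pointer wide
    }
}
```

Every reference type reports the pointer size, while each value type reports its own size, which is why only the former can be handled by one copy of the code.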
Seeing is believing 3
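internal class Program
{
    static void Main(string[] args)
    {
        var a = new Test<int>();
        a.Add(1);
        var b = new Test<long>();
        b.Add(1);
        var c = new Test<string>();
        c.Add("");
        var d = new Test<Stream>();
        d.Add(null);
        Console.ReadLine(); // pause here so the JIT-compiled code can be inspected
    }
}
public class Test<T>
{
    public void Add(T value)
    {
        var s = value; // copy the argument so the size of the move is visible in the disassembly
    }
    public void Remove(T value)
    {
    }
}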
// variable a (Test<int>)
00007FFBAF7B7435 mov eax,dword ptr [rbp+58h]
00007FFBAF7B7438 mov dword ptr [rbp+2Ch],eax // int: 4-byte (dword) move into the slot at rbp+2Ch
// variable b (Test<long>)
00007FFBAF7B7FD7 mov rax,qword ptr [rbp+58h]
00007FFBAF7B7FDB mov qword ptr [rbp+28h],rax // long: 8-byte (qword) move into the slot at rbp+28h, so the code differs from int
// variable c (Test<string>)
00007FFBAF7B8087 mov rax,qword ptr [rbp+58h]
00007FFBAF7B808B mov qword ptr [rbp+28h],rax // reference: 8-byte move into rbp+28h
// variable d (Test<Stream>)
00007FFBAF7B8087 mov rax,qword ptr [rbp+58h]
00007FFBAF7B808B mov qword ptr [rbp+28h],rax // same instructions at the same addresses as c: reference types share one copy of the code
Generic math
Before .NET 7, generics could not be used for math operations directly. The only workaround was to go through dynamic.
.NET 7 introduces new math-related generic interfaces and provides implementations of them for the built-in numeric types.
/zh-cn/dotnet/standard/generics/math
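As a minimal sketch of what these interfaces allow, the method below sums any numeric type through the INumber<T> constraint. It needs .NET 7 or later; GenericMathSketch and Sum are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class GenericMathSketch
{
    // INumber<T> supplies T.Zero and the + operator for every numeric T.
    public static T Sum<T>(IEnumerable<T> values) where T : INumber<T>
    {
        T total = T.Zero;
        foreach (T value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Run()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3 }));     // T is int
        Console.WriteLine(Sum(new[] { 1.5, 2.5 }));    // T is double
    }
}
```

No casts and no dynamic are involved: the same source works for int, long, double, decimal, and any other type that implements INumber<T>.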
How the generic math interfaces work underneath
C# layer:
Addition relies mainly on the IAdditionOperators interface.
IL layer:
The + operator is compiled into a call to the abstract method op_Addition.
For int, the int implementation is called:
System.
For long, the long implementation is called:
System.
The principle is simple: the BCL implements the + - * / operators for all the basic value types, and as long as the generic constraints are set up correctly, the JIT calls the matching implementation automatically.
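The same dispatch applies to user-defined types. In the sketch below, a hypothetical Meters struct implements IAdditionOperators, and one generic Add method then works for int, long, and Meters alike. Meters, AdditionSketch, and Add are illustrative names, and .NET 7 or later is assumed.

```csharp
using System;
using System.Numerics;

// Hypothetical value type that opts into generic addition.
public readonly struct Meters : IAdditionOperators<Meters, Meters, Meters>
{
    public double Value { get; }
    public Meters(double value) => Value = value;

    // This operator is what the compiler emits as op_Addition.
    public static Meters operator +(Meters left, Meters right) =>
        new Meters(left.Value + right.Value);
}

public static class AdditionSketch
{
    // The constraint is all the generic code knows about T.
    public static T Add<T>(T left, T right) where T : IAdditionOperators<T, T, T>
    {
        return left + right;   // bound to T's own op_Addition
    }

    public static void Run()
    {
        Console.WriteLine(Add(1, 2));                                   // int implementation
        Console.WriteLine(Add(1L, 2L));                                 // long implementation
        Console.WriteLine(Add(new Meters(1.5), new Meters(2)).Value);   // Meters implementation
    }
}
```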
Conclusion
There were no words along the way, nothing but fighting.
Generics: just use them and be done with it. The one thing worth a little attention (though hard drives are much cheaper than programmers) is the code explosion caused by value-type instantiations of generics.