A few days ago, a reader asked me a question: why can't the target object be null when calling an instance method? It seems like a simple question, but it is hard to answer in one sentence. The premise is also not quite right: when we call an instance method defined on a type, the target object can in fact be null.
I. From ECMA-335 Spec
II. Call and Callvirt
III. Direct calls (C#)
IV. Static methods
V. Instance methods of value types
VI. The ?. operator
VII. Extension methods
I. From ECMA-335 Spec
A method that is associated with an instance of the type is either an instance method or a virtual
method (see §I.8.4.4). When they are invoked, instance and virtual methods are passed the
instance on which this invocation is to operate (known as this or a this pointer).
The fundamental difference between an instance method and a virtual method is in how the
implementation is located. An instance method is invoked by specifying a class and the instance
method within that class. Except in the case of instance methods of generic types, the object
passed as this can be null (a special value indicating that no instance is being specified) or an
instance of any type that inherits (see §I.8.9.8) from the class that defines the method. A virtual
method can also be called in this manner. This occurs, for example, when an implementation of a
virtual method wishes to call the implementation supplied by its base class. The CTS allows this
to be null inside the body of a virtual method.
A virtual or instance method can also be called by a different mechanism, a virtual call. Any
type that inherits from a type that defines a virtual method can provide its own implementation of
that method (this is known as overriding, see §I.8.10.4). It is the exact type of the object
(determined at runtime) that is used to decide which of the implementations to invoke.
The above passage is excerpted from the Common Language Infrastructure (CLI) specification. Let me briefly summarize:
- Methods associated with an instance of a type, which we usually lump together as instance methods, are in fact divided into instance methods and virtual methods. I think it is clearer to call them non-virtual instance methods and virtual instance methods;
- At the IL level there are two instructions for calling methods, Call and Callvirt. Two kinds of instance method and two call instructions give four calling scenarios in total;
- The Call instruction invokes the method of the declared type directly, so the target is fixed at compile time; the Callvirt instruction invokes the method of the actual type of the target object, which can only be determined at runtime. Because Call avoids dynamic dispatch of the target method, it performs better in principle;
- Call does not require the target object to be non-null, because the target method is already determined at compile time; Callvirt has to work out from the given object which type's method to invoke, so it requires the target object to be non-null.
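The difference in how the two instructions locate the implementation can be sketched with a small DynamicMethod experiment. This is an illustrative sketch, not part of the original demo: the Animal and Dog types and the DispatchDemo helper are hypothetical names invented here, and the code assumes a runtime where System.Reflection.Emit is available.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical types used only for this illustration.
public class Animal
{
    public virtual string Speak() => "Animal";
}

public class Dog : Animal
{
    public override string Speak() => "Dog";
}

public static class DispatchDemo
{
    // Emits: ldarg.0; <opcode> Animal::Speak(); ret
    public static Func<Animal, string> Build(OpCode opcode)
    {
        var method = new DynamicMethod(
            "InvokeSpeak",
            typeof(string),
            new[] { typeof(Animal) },
            typeof(Animal),
            skipVisibility: true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(opcode, typeof(Animal).GetMethod(nameof(Animal.Speak))!);
        il.Emit(OpCodes.Ret);

        return (Func<Animal, string>)method.CreateDelegate(typeof(Func<Animal, string>));
    }

    public static void Main()
    {
        Animal dog = new Dog();
        // Call binds to the declared type's implementation.
        Console.WriteLine(Build(OpCodes.Call)(dog));
        // Callvirt dispatches on the runtime type of the object.
        Console.WriteLine(Build(OpCodes.Callvirt)(dog));
    }
}
```

With a Dog instance as the target, the Call variant should return "Animal" (the implementation named in the instruction) and the Callvirt variant should return "Dog" (the override chosen from the object's actual type).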
Personally, I would add a few points:
- From the CLR's point of view, static and instance methods are closely related: an instance method simply receives the target object as an implicit first argument (this), whose type is the type that declares the method. A static method has no target object, which you can picture as that first argument always being null (or the default value, for a value type);
- Besides Call and Callvirt, method calls can also use the Calli instruction, which calls the target method through a supplied method pointer and argument list.
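For the Calli instruction mentioned in the last point, a minimal sketch might look like the following. The CalliDemo class and its Add method are hypothetical names made up for this illustration; the sketch loads a method pointer with ldftn and then invokes it indirectly with ILGenerator.EmitCalli.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical helper used only for this illustration.
public static class CalliDemo
{
    public static int Add(int x, int y) => x + y;

    // Emits: ldarg.0; ldarg.1; ldftn Add; calli int(int, int); ret
    public static Func<int, int, int> BuildCalliInvoker()
    {
        var method = new DynamicMethod(
            "AddViaCalli",
            typeof(int),
            new[] { typeof(int), typeof(int) },
            typeof(CalliDemo).Module,
            skipVisibility: true);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // first argument
        il.Emit(OpCodes.Ldarg_1);   // second argument
        il.Emit(OpCodes.Ldftn, typeof(CalliDemo).GetMethod(nameof(Add))!);   // method pointer
        il.EmitCalli(
            OpCodes.Calli,
            CallingConventions.Standard,
            typeof(int),
            new[] { typeof(int), typeof(int) },
            null);
        il.Emit(OpCodes.Ret);

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Console.WriteLine(BuildCalliInvoker()(2, 3));
    }
}
```

Unlike Call and Callvirt, the instruction itself names no method: it only carries the signature, and the method to run is whatever pointer sits on top of the evaluation stack.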
II. Call and Callvirt
Let's answer the question posed in the opening paragraph: whether or not a method is virtual, nothing requires the target object to be non-null as long as the method is invoked with the Call instruction; with the Callvirt instruction, however, we cannot call an instance method on a null target, again regardless of whether the method is virtual. The following example verifies this conclusion.
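using System;
using System.Reflection.Emit;

Invoke(CreateInvoker(OpCodes.Call, "Foo"));
Invoke(CreateInvoker(OpCodes.Call, "Bar"));
Invoke(CreateInvoker(OpCodes.Callvirt, "Foo"));
Invoke(CreateInvoker(OpCodes.Callvirt, "Bar"));

// Runs the delegate against a null target and reports any exception.
static void Invoke(Action<Foobar?> invoker)
{
    try
    {
        invoker(null);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}

// Emits: ldarg.0; <opcode> Foobar::<methodName>(); ret
static Action<Foobar?> CreateInvoker(OpCode opcode, string methodName)
{
    DynamicMethod foo = new DynamicMethod(
        name: "Invoke",
        returnType: typeof(void),
        parameterTypes: [typeof(Foobar)]);
    var il = foo.GetILGenerator();
    il.Emit(OpCodes.Ldarg_0);
    il.Emit(opcode, typeof(Foobar).GetMethod(methodName)!);
    il.Emit(OpCodes.Ret);
    return (Action<Foobar?>)foo.CreateDelegate(typeof(Action<Foobar?>));
}

public class Foobar
{
    public void Foo() => Console.WriteLine(this is null);
    public virtual void Bar() => Console.WriteLine(this is null);
}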
As shown in the code snippet above, the Foobar class defines two instance methods, Foo and Bar: the former is a regular method and the latter is a virtual method. The CreateInvoker method creates a DynamicMethod from the specified call instruction and method name, and from it an Action<Foobar> delegate that calls the specified method. The Invoke method executes that delegate inside a try/catch to determine whether the method call completes successfully. The demo program calls Invoke four times, covering regular and virtual methods with both the Call and Callvirt instructions, and its output confirms our conclusion: the calls made with Call go through even though the target is null, while the calls made with Callvirt fail.
III. Direct calls (C#)
So what happens when you call regular and virtual methods directly in C#? To find out, I defined the following two static methods, Foo and Bar, and passed them to the Invoke method as Action<Foobar> delegates.
using System;

Invoke(Foo);
Invoke(Bar);

static void Foo(Foobar? foobar) => foobar!.Foo();
static void Bar(Foobar? foobar) => foobar!.Bar();

// Runs the delegate against a null target and reports any exception.
static void Invoke(Action<Foobar?> invoker)
{
    try
    {
        invoker(null);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }
}

public class Foobar
{
    public void Foo() => Console.WriteLine(this is null);
    public virtual void Bar() => Console.WriteLine(this is null);
}
The output shows that the target object must not be null, regardless of whether the method being called is virtual.
According to our conclusion above, since the call performs null reference validation, the instruction used cannot be Call. The following is the IL code for the static methods Foo and Bar; both of them call the Foo and Bar methods of the Foobar object with the Callvirt instruction.
.method assembly hidebysig static
    void '<<Main>$>g__Foo|0_0' (
        class Foobar foobar
    ) cil managed
{
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.NullableContextAttribute::.ctor(uint8) = (
        01 00 02 00 00
    )
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    // Method begins at RVA 0x20b2
    // Header size: 1
    // Code size: 8 (0x8)
    .maxstack 8

    // foobar!.Foo();
    IL_0000: ldarg.0
    IL_0001: callvirt instance void Foobar::Foo()
    // }
    IL_0006: nop
    IL_0007: ret
} // end of method Program::'<<Main>$>g__Foo|0_0'

.method assembly hidebysig static
    void '<<Main>$>g__Bar|0_1' (
        class Foobar foobar
    ) cil managed
{
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.NullableContextAttribute::.ctor(uint8) = (
        01 00 02 00 00
    )
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    // Method begins at RVA 0x20bb
    // Header size: 1
    // Code size: 8 (0x8)
    .maxstack 8

    // foobar!.Bar();
    IL_0000: ldarg.0
    IL_0001: callvirt instance void Foobar::Bar()
    // }
    IL_0006: nop
    IL_0007: ret
} // end of method Program::'<<Main>$>g__Bar|0_1'
As far as I remember (and I could be wrong), early compilers used the Call instruction for regular non-virtual method calls, and I don't know in which version the compiler standardized on Callvirt. The choice is easy to understand, though: if a method does not involve the target object at all, it should be defined as a static method, so performing null reference validation on instance method calls is actually reasonable.
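If you would rather check the emitted instruction from code than from a disassembler, one rough way is to read the raw IL bytes through reflection. The sketch below is an assumption-laden illustration: Widget and OpcodeInspector are hypothetical names, and the byte inspection only works because the two caller methods are tiny and start with ldarg.0 followed directly by the call instruction.

```csharp
#nullable enable
using System;
using System.Reflection.Emit;

// Hypothetical type used only for this illustration.
public class Widget
{
    public void Plain() { }
    public virtual void Overridable() { }
}

public static class OpcodeInspector
{
    // Two tiny callers whose bodies are "load argument 0, call, return".
    public static void CallPlain(Widget w) => w.Plain();
    public static void CallOverridable(Widget w) => w.Overridable();

    // Reports which call instruction follows the initial ldarg.0.
    public static string CallOpcodeOf(string callerName)
    {
        byte[] il = typeof(OpcodeInspector).GetMethod(callerName)!
            .GetMethodBody()!
            .GetILAsByteArray()!;

        int i = 0;
        while (i < il.Length && il[i] == OpCodes.Nop.Value) i++;   // skip any leading nops
        if (i + 1 >= il.Length || il[i] != OpCodes.Ldarg_0.Value) return "unexpected IL";

        byte opcode = il[i + 1];
        if (opcode == OpCodes.Callvirt.Value) return "callvirt";
        if (opcode == OpCodes.Call.Value) return "call";
        return "other";
    }

    public static void Main()
    {
        Console.WriteLine(CallOpcodeOf(nameof(CallPlain)));
        Console.WriteLine(CallOpcodeOf(nameof(CallOverridable)));
    }
}
```

With current C# compilers both lines should read callvirt, matching the IL listed above: the non-virtual Plain call gets the same instruction as the virtual one.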
IV. Static methods
As noted above, a static method has no target object (conceptually, its first argument is always null), so the Callvirt instruction cannot be used for it; a static method can only be called with Call. The following is the IL code for the static method Invoke; inside it, the static method Console.WriteLine is called with the Call instruction.
.method assembly hidebysig static
    void '<<Main>$>g__Invoke|0_2' (
        class [System.Runtime]System.Action`1<class Foobar> invoker
    ) cil managed
{
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    .param [1]
        .custom instance void [System.Runtime]System.Runtime.CompilerServices.NullableAttribute::.ctor(uint8[]) = (
            01 00 02 00 00 00 01 02 00 00
        )
    // Method begins at RVA 0x20c4
    // Header size: 12
    // Code size: 31 (0x1f)
    .maxstack 2
    .locals init (
        [0] class [System.Runtime]System.Exception ex
    )

    // {
    IL_0000: nop
    .try
    {
        // {
        IL_0001: nop
        // invoker(null);
        IL_0002: ldarg.0
        IL_0003: ldnull
        IL_0004: callvirt instance void class [System.Runtime]System.Action`1<class Foobar>::Invoke(!0)
        // (no C# code)
        IL_0009: nop
        // }
        IL_000a: nop
        IL_000b: leave.s IL_001e
    } // end .try
    catch [System.Runtime]System.Exception
    {
        // catch (Exception ex)
        IL_000d: stloc.0
        // {
        IL_000e: nop
        // Console.WriteLine(ex.Message);
        IL_000f: ldloc.0
        IL_0010: callvirt instance string [System.Runtime]System.Exception::get_Message()
        IL_0015: call void [System.Console]System.Console::WriteLine(string)
        // (no C# code)
        IL_001a: nop
        // }
        IL_001b: nop
        IL_001c: leave.s IL_001e
    } // end handler

    IL_001e: ret
} // end of method Program::'<<Main>$>g__Invoke|0_2'
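The idea that an instance method is a method with the target object as its leading argument can be made visible from C# with an open instance delegate. This is only a sketch: Greeter and OpenDelegateDemo are hypothetical names, and the delegate is built with Delegate.CreateDelegate, which binds an instance method to a delegate type whose first parameter is the target.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
public class Greeter
{
    private readonly string _name;

    public Greeter(string name) => _name = name;

    public string Greet(string greeting) => $"{greeting}, {_name}!";
}

public static class OpenDelegateDemo
{
    // Exposes the instance method Greet as a function whose first
    // parameter is the target object ("this") and whose second is "greeting".
    public static Func<Greeter, string, string> BuildOpenGreet()
    {
        MethodInfo greet = typeof(Greeter).GetMethod(nameof(Greeter.Greet))!;
        return (Func<Greeter, string, string>)Delegate.CreateDelegate(
            typeof(Func<Greeter, string, string>), greet);
    }

    public static void Main()
    {
        Func<Greeter, string, string> greet = BuildOpenGreet();
        Console.WriteLine(greet(new Greeter("Ada"), "Hello"));   // Hello, Ada!
    }
}
```

The delegate takes two arguments although Greet declares only one; the extra leading argument is the object the method operates on.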
V. Instance methods of value types
For calls to instance methods of value types, the target object cannot be null, and since a value type cannot be inherited from there is no virtual dispatch to perform, so the instruction used should also be Call.
static void Do(Foobar foobar) => foobar.Do();

public struct Foobar
{
    public void Do() { }
}
The static method Do defined above has the following IL code, which shows that it calls the method of the same name on the struct Foobar with the Call instruction.
.method assembly hidebysig static
    void '<<Main>$>g__Do|0_0' (
        valuetype Foobar foobar
    ) cil managed
{
    .custom instance void [System.Runtime]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
        01 00 00 00
    )
    // Method begins at RVA 0x2064
    // Header size: 1
    // Code size: 9 (0x9)
    .maxstack 8

    // foobar.Do();
    IL_0000: ldarga.s foobar
    IL_0002: call instance void Foobar::Do()
    // }
    IL_0007: nop
    IL_0008: ret
} // end of method Program::'<<Main>$>g__Do|0_0'
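The closest a value type comes to a null target is Nullable<T>, which is itself a struct. The following sketch (NullableStructDemo is a hypothetical name, and the example goes slightly beyond what the text above covers) suggests that members defined on the struct can be called on an "empty" value, while a method inherited from object, which requires boxing first, sees a genuine null reference.

```csharp
using System;

// Hypothetical helper used only for this illustration.
public static class NullableStructDemo
{
    // HasValue is defined on the struct Nullable<int>, so the call has a
    // real target even when the value is "null".
    public static bool HasValue(int? value) => value.HasValue;

    // GetType is inherited from object and is not overridden by Nullable<T>,
    // so the value is boxed first; an empty Nullable<T> boxes to a null reference.
    public static bool GetTypeThrows(int? value)
    {
        try
        {
            _ = value.GetType();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(HasValue(null));        // False
        Console.WriteLine(GetTypeThrows(null));   // True
        Console.WriteLine(GetTypeThrows(5));      // False
    }
}
```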
VI. The ?. operator
When making a method call, if you are not sure whether the target object is null, you need the ?. operator.
static string ToString(object? instance) => instance?.ToString() ?? "N/A";
The ?. operator is just syntactic sugar; the compiler translates the code above into the following form:
static string ToString(object? instance) => ((instance != null) ? instance.ToString() : null) ?? "N/A";
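A short sketch of how the operator behaves for null and non-null targets follows. NullConditionalDemo and its two methods are hypothetical names chosen for this illustration.

```csharp
#nullable enable
using System;

// Hypothetical helper used only for this illustration.
public static class NullConditionalDemo
{
    // Falls back to "N/A" when the target (or its ToString result) is null.
    public static string Describe(object? instance) => instance?.ToString() ?? "N/A";

    // When the member returns a value type, the result of ?. becomes nullable.
    public static int? LengthOf(string? text) => text?.Length;

    public static void Main()
    {
        Console.WriteLine(Describe(null));            // N/A
        Console.WriteLine(Describe(42));              // 42
        Console.WriteLine(LengthOf(null) is null);    // True
        Console.WriteLine(LengthOf("abc"));           // 3
    }
}
```

In both methods the member access is skipped entirely when the target is null, so no NullReferenceException can arise from the call itself.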
VII. Extension methods
Extension methods are static methods, so calls to them involve no null reference validation. However, extension methods are called as if they were instance methods, so I recommend validating the first parameter for null when defining an extension method.
public static class FoobarExtensions
{
    public static void ExtendedMethod(this Foobar foobar)
    {
        ArgumentNullException.ThrowIfNull(foobar, nameof(foobar));
        ...
    }
}
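To see why the explicit check matters, here is a small sketch with two extension methods on string. The StringExtensions class and both method names are hypothetical; one method deliberately accepts a null target, the other validates it by hand.

```csharp
#nullable enable
using System;

// Hypothetical extension methods used only for this illustration.
public static class StringExtensions
{
    // Deliberately tolerant: the call site performs no null check, so the
    // method body runs even when "text" is null.
    public static bool IsBlank(this string? text) => string.IsNullOrWhiteSpace(text);

    // Validates its first parameter, so a null target fails with a clear
    // ArgumentNullException instead of a later NullReferenceException.
    public static string Shout(this string text)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));
        return text.ToUpperInvariant() + "!";
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string? missing = null;
        Console.WriteLine(missing.IsBlank());   // True
        try
        {
            missing!.Shout();
        }
        catch (ArgumentNullException ex)
        {
            Console.WriteLine(ex.ParamName);    // text
        }
    }
}
```

The call missing.IsBlank() looks like an instance call on a null reference, yet it succeeds because it is compiled as an ordinary static call with the target passed as the first argument.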