-8-
Methods
The code of a data type is
implemented by a method, which is executed by the Execution Engine. The CLR
offers a large number of services to support the execution of code.
Any code that uses these
services is called managed code. Managed code allows the CLR to provide a set
of features such as handling exceptions. It also makes sure that the code is
verifiable. Only managed code has access to managed data.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object {
.method public hidebysig static void vijay() il managed {
.entrypoint
call instance void a1()
ret
}
}
.method public instance void
a1() il managed
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
Output
hi
There is no rule in the IL book
that prevents a method from being global. It can certainly be written outside a
class.
a.il
.assembly mukhi {}
.method public instance void
a1() il managed
{
.entrypoint
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
Output
hi
In fact we can write the
smallest IL program without using the class directive. It is mandatory to have
a function with the entrypoint directive. Thus, had the designers of C# so
desired, they could have provided the facility of global functions, but they
chose not to. They decided, in their infinite wisdom, that all functions should
be placed within a class. There is no such restriction imposed by IL.
The CLR recognizes three types
of methods: static, instance and virtual. There are also some special functions
that are called automatically by the runtime: static constructors or type
initializers, named .cctor, and instance constructors, named .ctor.
A method in IL is uniquely
identified by its signature. A signature
consists of five parts:
• The name of the method
• The type or class that the method
resides in
• The calling convention used
• The return type
• The parameter types.
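To make the parts of a signature concrete, here is a small sketch of our own; it is not one of the original listings. It is written in the syntax accepted by current versions of ilasm (cil managed and [mscorlib]-qualified names), which differs slightly from the other listings in this chapter. The names sigdemo, Demo, Add and Main are hypothetical. The comments label each part of the signature as it appears in the call instruction.

```il
.assembly extern mscorlib {}
.assembly sigdemo {}

.class private auto ansi Demo extends [mscorlib]System.Object
{
  // A static method: no this pointer, so ldarg.0 is the first declared parameter.
  .method public static int32 Add(int32 a, int32 b) cil managed
  {
    ldarg.0
    ldarg.1
    add
    ret
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    ldc.i4.2
    ldc.i4.3
    // return type        : int32
    // type               : Demo
    // name               : Add
    // parameter types    : (int32, int32)
    // calling convention : the default one, because 'instance' is absent
    call int32 Demo::Add(int32, int32)
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }
}
```

Assuming it assembles as written, the program displays 5. Changing any one of the labelled parts in the call instruction makes it refer to a different method.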
a.il
.assembly mukhi {}
.method public instance void
a1() il managed
{
.entrypoint
call instance int32 a2()
pop
call instance void a2()
ret
}
.method public instance void
a2() il managed
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
.method public instance int32 a2() il managed
{
ldstr "hi1"
call void System.Console::WriteLine(class System.String)
ldc.i4.2
ret
}
Output
hi1
hi
For people like us, who are
familiar with the world of C, C++ and Java, the concept of a method signature
depending upon the return type of a function is alien.
Here, we have two functions,
both named a2, which differ only in the type
of their return value. This is perfectly
valid in IL. The reason is that when calling a method in IL, we always have to
state the return type. But what is allowed in IL may be taboo in C#.
Method overloading is a concept
where the same function name appears in a class more than once. You may not
have noticed that, in the above programs, the this pointer is not
passed to the global functions. Even then, things worked well.
The reason for this is that,
generally, global functions are static by default. Static functions
can be defined in classes, value types and interfaces. Static functions always
have a body associated with them.
The second type of method, very
commonly used, is the instance method. These are functions associated with an
instance of a class. In this version of the CLR, we cannot declare them in
interfaces. Unlike static methods, which are stand-alone methods and behave like
global functions, an instance function is always passed a pointer or reference
to the data associated with the object. Thus, it can use the this pointer to
access a different set of data each time.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object {
.method public hidebysig static void vijay() il managed {
.entrypoint
call void zzz::a1()
ret
}
.method public instance void
a1() il managed
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
}
Output
Exception occurred: System.MissingMethodException: Void
.zzz.a1()
at zzz.vijay()
A runtime exception is thrown
because the call expects the method to be static, whereas our method is an
instance method. To avoid this runtime error, replace the modifier instance with
static.
The this pointer is of the same
type as the class in which the method resides. We therefore have to create an
instance of a class before we can execute any instance method from the class.
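As a sketch of the other way out, the following program of our own keeps the method as an instance method and creates an object first. It uses the syntax of current ilasm rather than that of the other listings here, and the names instdemo, Demo and Hello are hypothetical. Note that the call instruction now carries the instance keyword, and that newobj has left the this pointer on the stack before the call.

```il
.assembly extern mscorlib {}
.assembly instdemo {}

.class private auto ansi Demo extends [mscorlib]System.Object
{
  // Instance constructor: ldarg.0 is the this pointer supplied by newobj.
  .method public specialname rtspecialname instance void .ctor() cil managed
  {
    ldarg.0
    call instance void [mscorlib]System.Object::.ctor()
    ret
  }

  .method public instance void Hello() cil managed
  {
    ldstr "hi"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    newobj instance void Demo::.ctor()   // creates the object, pushes its reference
    call instance void Demo::Hello()     // the reference is consumed as the this pointer
    ret
  }
}
```

Assuming it assembles as written, the program displays hi.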
As a rule, all instance
functions must have the this pointer as the first parameter. Therefore, it is
automatically added as a first hidden parameter. The this pointer can be a null
reference too.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.field int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj instance void zzz::.ctor()
ldnull
call instance void zzz::a1()
ret
}
.method public instance void
a1() il managed
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ldarg.0
ldc.i4.2
stfld int32 zzz::i
ret
}
}
Output
Exception occurred: System.ExecutionEngineException: An
exception of type System.ExecutionEngineException was thrown.
at zzz.vijay()
Whenever we refer to a field in
a type, through a function, the this pointer should first be available on the
stack. This facilitates access to the instance fields. This explains the above
error.
Here, we have placed a ldnull as
the this pointer, and thus, are unable to access the instance members. On
commenting the ldnull, no error is generated.
The instruction newobj places a
this pointer on the stack. Inside an instance method, the this pointer is
loaded with ldarg.0, and it is checked for NULL prior to being used. For a
value type, however, the this pointer is a managed pointer to the
value type. Unlike static or virtual, instance is not an attribute of a
method. It is part of the calling convention of a method.
There are three ways to call a
method in IL. These are: call, callvirt and calli. Two of these, call and
callvirt, have already been dealt with, in the past.
There are three other
instructions that can be used to call a method in a special way. These are jmp,
jmpi and newobj. Every method that we call has its own evaluation stack. The
parameters to the function are placed on this stack, and instructions also
obtain their arguments from the same stack.
On the execution of an
instruction, the result is also placed on the same stack. The runtime creates
and maintains this stack. When the method exits, the stack is released.
There is another stack that we
do not concern ourselves with. This stack keeps track of the methods being
called, and hence, is known as the call stack.
The last instruction
in any function is the ret instruction. This instruction is responsible for
returning control to the calling method. If a function returns a
value, it must be placed on the stack before ret is executed. When exiting a
method, the stack must not contain any value other than the value to be
returned.
We use the call instruction to
call static, instance or virtual functions. Before
the call instruction, all the parameters to the method must be placed on
the stack. The first argument to the function is placed first. The only
difference between calling a static and an instance method is that the
modifier instance is used for an instance method, whereas no modifier is
required for a static method.
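The order in which the parameters are placed on the stack matters, which a subtraction shows more clearly than an addition would. The following is a sketch of our own in the syntax of current ilasm; the names argdemo, Demo and Sub are hypothetical.

```il
.assembly extern mscorlib {}
.assembly argdemo {}

.class private auto ansi Demo extends [mscorlib]System.Object
{
  .method public static int32 Sub(int32 a, int32 b) cil managed
  {
    ldarg.0      // a, the argument that the caller pushed first
    ldarg.1      // b, the argument that the caller pushed second
    sub          // a - b
    ret          // the only value left on the stack is the return value
  }

  .method public static void Main() cil managed
  {
    .entrypoint
    ldc.i4.5     // first argument
    ldc.i4.2     // second argument
    call int32 Demo::Sub(int32, int32)
    call void [mscorlib]System.Console::WriteLine(int32)
    ret
  }
}
```

Assuming it assembles as written, the program displays 3. Swapping the two ldc instructions in Main would display -3 instead.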
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.field int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj instance void zzz::.ctor()
pop
ldnull
callvirt instance void zzz::a1()
ret
}
.method public virtual instance void a1() il managed
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
}
Output
Exception occurred: System.NullReferenceException: Attempted
to dereference a null object reference.
at zzz.vijay()
Virtual functions have to be
handled with care as they are runtime entities. With virtual functions, the
instruction callvirt is used in place of call. callvirt, unlike call, executes
the overriding version of the method.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (class yyy V_0)
newobj instance void xxx::.ctor()
stloc.0
ldloc.0
callvirt instance void yyy::abc()
ldloc.0
call instance void yyy::abc()
ret
}
}
.class private auto ansi yyy extends [mscorlib]System.Object
{
.method public hidebysig newslot virtual instance void abc() il managed
{
ldstr "yyy abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
}
.class private auto ansi xxx extends yyy
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "xxx abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void yyy::.ctor()
ret
}
}
Output
xxx abc
yyy abc
We have pulled out this program
from an earlier chapter, where we explained new, override and virtual
functions. The callvirt instruction calls the function abc from xxx, as it
overrides the one from the class yyy.
The reason is that, in the class
xxx, there is no modifier newslot for the function abc; hence it is not a
different abc from the one in the base class, but an override of it. With call,
however, the instruction simply calls abc from the class specified, as it does
not take modifiers like virtual and newslot into account. instance is used with
callvirt, as the this pointer can, under no circumstances, be NULL.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (class yyy V_0)
newobj instance void xxx::.ctor()
stloc.0
ldloc.0
callvirt instance void yyy::abc()
ret
}
}
.class private auto ansi yyy extends [mscorlib]System.Object
{
.method public hidebysig newslot virtual instance void abc() il managed
{
ldstr "yyy abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
}
.class private auto ansi xxx extends yyy
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "xxx abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ldarg.0
call instance void yyy::abc()
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void yyy::.ctor()
ret
}
}
Output
xxx abc
yyy abc
In the above example, the super
class function abc from the class yyy is called from the function abc of
class xxx. This facilitates reusing
code defined in the super class.
A virtual function may want to
call the code in the base class. In IL parlance, this is termed a super call.
Had we used callvirt for this call, we would foresee a problem,
as it would either call itself over and over again, or give us the
following exception:
Output
xxx abc
Exception occurred: System.NullReferenceException: Attempted
to dereference a null object reference.
at xxx.abc()
at zzz.vijay()
The reason for the above error
is that the this pointer refers to an object of class xxx and not of class yyy.
Thus, the instruction call is used and not callvirt.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj void zzz::.ctor()
call instance void zzz::abc()
ret
}
.method public instance void abc()
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
jmp instance void zzz::pqr()
ldstr "bye"
call void System.Console::WriteLine(class System.String)
ret
}
.method public instance void pqr()
{
ldstr "pqr"
call void System.Console::WriteLine(class System.String)
ret
}
}
Output
hi
pqr
We have created an object of type
zzz using newobj, which places a reference to a zzz on the stack. This reference
serves as the this pointer for the call to the instance function abc.
Here we have displayed
"hi" and then an instance method pqr is called using the jmp
instruction.
After the method pqr finishes
execution, control does not return to method abc. Instead, control returns
to vijay, which is the method that called abc. Thus the string
"bye", present in the method abc, does not get displayed.
The jmp instruction does not
return control to the method from where the program initially branched
out.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object {
.method public hidebysig static void vijay() il managed {
.entrypoint
newobj void zzz::.ctor()
call instance void zzz::abc()
ret
}
.method public instance void abc()
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ldftn instance void zzz::pqr()
jmpi
ldstr "bye"
call void System.Console::WriteLine(class System.String)
ret
}
.method public instance void pqr()
{
ldstr "pqr"
call void System.Console::WriteLine(class System.String)
ret
}
}
Output
hi
pqr
The above program is similar to
its predecessor, but it uses the instruction jmpi instead of jmp. This
instruction is similar to jmp, but differs in the following aspects:
• In the case of the jmp instruction, we
supplied the method signature as an operand of the instruction.
• In the case of the jmpi instruction, we
first use the instruction ldftn to load the address of the function pqr on the
stack, and then execute jmpi.
The jmp family of instructions
executes a jump or a branch across methods. We can only jump to the beginning
of a method, and not to anywhere inside it. The signature of the method that we
intend to jump to must be the same as that of the method we jump from.
Output
Exception occurred: System.ExecutionEngineException: An
exception of type System.ExecutionEngineException was thrown.
at zzz.abc()
at zzz.vijay()
If the signature of the method
being jumped to is not the same, the above exception is thrown. The jmp
instruction is not verifiable.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj void zzz::.ctor()
ldc.i4.1
ldc.i4.2
call instance void zzz::abc(int32,int32)
ret
}
.method public instance void abc(int32 i, int32 j)
{
ldc.i4.3
starg j
ldarg j
call void System.Console::WriteLine(int32)
jmp instance void zzz::pqr(int32,int32)
ret
}
.method public instance void pqr(int32 p,int32 q)
{
ldarg.1
call void System.Console::WriteLine(int32)
ldarg.2
call void System.Console::WriteLine(int32)
ret
}
}
Output
3
1
2
The method abc takes two ints as
parameters. We have placed the constant 3 on the stack, and then used the
instruction starg to change the parameter j. Then, ldarg is used to place the
new value on the stack. Thereafter, we have called the WriteLine function to
confirm that the new value is 3. The jmp
instruction is the next to be executed.
Here we have not placed any
parameters on the stack. The jmp instruction itself places the numbers 1 and 2
on the stack, and then calls the function pqr, which simply displays the
parameters that have been passed.
Even though we have changed the
parameter j, the change is not reflected in the called function pqr. This is
contrary to what the documentation states. The call instruction does not pass
the current method's parameters on to the next method. The instruction jmp does so.
If function pqr returns a value,
it will be passed to the function vijay and not to abc. We cannot place any
values on the stack before executing the jump. Jumps can be executed only
between methods that have the same signatures.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj void zzz::.ctor()
ldftn instance void zzz::abc()
calli instance void ()
ret
}
.method public instance void abc()
{
ldstr "hi"
call void System.Console::WriteLine(class System.String)
ret
}
}
Output
hi
We can call a method indirectly
by first placing its address on the stack, and then using the calli
instruction. Here, the instruction ldftn places the address of a
non-virtual function on the stack. As with any instance function, the
this pointer has to be placed on the stack first, followed by the parameters to
the function; the address loaded by ldftn goes on last. When we tried using
calli with the address of a virtual function, Windows generated an error.
We use the newobj instruction to
create a new instance, and also, call the constructor of a class, which is
nothing more than a special instance method.
The only difference between a
constructor call through newobj and an instance call is that we do not place
the this pointer on the stack ourselves. newobj first creates the object, and
then automatically passes it to the constructor as the this pointer.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.field int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals ( class zzz v)
newobj void zzz::.ctor()
// v is the only local declared, so its index is 0
stloc.0
ldloc.0
ldfld int32 zzz::i
call void System.Console::WriteLine(int32)
ldloc.0
ldc.i4.2
stfld int32 zzz::i
ldloc.0
ldfld int32 zzz::i
call void System.Console::WriteLine(int32)
ldloc.0
call instance void zzz::.ctor()
ldloc.0
ldfld int32 zzz::i
call void System.Console::WriteLine(int32)
ret
}
.method public hidebysig specialname rtspecialname instance void
.ctor() il managed
{
ldarg.0
ldc.i4.1
stfld int32 zzz::i
ret
}
}
Output
1
2
1
The newobj instruction places
the this pointer on the stack before calling the constructor. If we desire to
call the constructor ourselves, we too need to
place the this pointer on the stack.
In the above program, the constructor
sets the value of the field i to 1, which we display. We then change it to 2
using stfld and display this value. Thereafter, we call the constructor
explicitly, which changes the value back to 1 again. This proves that a
constructor is no different from any other function.
A method definition is called a
method head in IL. The head also functions as an interface to other methods.
The format of the head is as follows:
• It starts with a number of predefined
method attributes.
• These are followed by an optional
indication, specifying whether the method is an instance method or not.
• Thereafter, the calling convention is
specified.
• This is followed by the return type and
a few more optional parameters.
• Finally, we state the name and the
parameters to the method and the implementation attributes.
Methods are instance by default.
To change the default behavior, we use the modifiers static or virtual. As
of today, the return type cannot have any attributes, but who knows what
changes may take place tomorrow.
The code for the method is
written in the method body. It can incorporate a large number of directives.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.emitbyte 0x19
call void System.Console::WriteLine(int32)
ret
}
}
Output
3
The code that we write gets
converted into numbers. Every IL instruction is represented by a number. The
ldc.i4.3 instruction is known by the number 19 hex. This information is
available in the Instruction Set Reference. The directive .emitbyte emits an
unsigned 8 bit number directly into the code section of the method.
Thus, we can use the opcode of
an IL instruction directly in IL programs.
The return value of the
entrypoint function can either be void, int32 or unsigned int32. This value is
handed over to the Operating System. A value of ZERO normally indicates success
and any other value indicates an error. The entrypoint method is unique,
meaning, it can have private accessibility, and yet be accessed by the runtime.
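Both points can be tried out with a sketch like the one below, which is our own and uses the syntax of current ilasm. The names exitdemo, Demo and Main are hypothetical. The entrypoint method is declared private and returns an int32, which becomes the exit code of the process.

```il
.assembly extern mscorlib {}
.assembly exitdemo {}

.class private auto ansi Demo extends [mscorlib]System.Object
{
  // Private accessibility does not stop the runtime from starting here.
  .method private static int32 Main() cil managed
  {
    .entrypoint
    ldc.i4.3     // a non-zero value, conventionally signalling an error
    ret          // the value on the stack is handed over to the Operating System
  }
}
```

On Windows, the value handed over can be inspected from the command prompt with echo %ERRORLEVEL% immediately after running the program.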
The .locals directive is used to
create a local variable that can only be accessed from within that method.
Thus, it is used to store data that exists only for the duration of a method
call. After a method exits, all the memory allocated for its locals is reclaimed
by the runtime.
It is faster for the system to
allocate memory on the stack, where locals get stored, than to allocate memory
on the heap for fields. We cannot specify attributes for local variables,
as we do for parameters.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldc.i4.1
stloc.0
ldloc.0
call void System.Console::WriteLine(int32)
.locals ( int32 i)
ret
}
}
Output
1
The .locals directive can be
placed at the end of the code and does not have to be placed at the beginning.
Thus, in a sense, a forward reference is allowed here.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
//.zeroinit
.locals ( int32 i)
ldloc.0
call void System.Console::WriteLine(int32)
ret
}
}
Output
51380288
Remove the comment marks and a value
of zero will be displayed.
There is some overlap in IL. If
we use the modifier init in the locals directive, then all the variables will
be assigned their default values, depending upon their type. We have touched
upon this point earlier.
The same effect is seen when we
use the directive .zeroinit. This applies to all the locals in the method.
• If we leave the directive commented out, the variable
i holds whatever value happens to be present in that stack location.
• If we remove the comment marks, the runtime
initialises all the value types to ZERO and all the reference types to NULL.
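The init modifier can be tried out with a sketch like the following. It is our own program in the syntax of current ilasm, and the names initdemo, Demo and Main are hypothetical. One local is a value type and the other a reference type, so both default values can be observed.

```il
.assembly extern mscorlib {}
.assembly initdemo {}

.class private auto ansi Demo extends [mscorlib]System.Object
{
  .method public static void Main() cil managed
  {
    .entrypoint
    // init asks the runtime to clear the locals before the method body runs
    .locals init (int32 i, string s)
    ldloc.0
    call void [mscorlib]System.Console::WriteLine(int32)   // the value type local
    ldloc.1
    ldnull
    ceq                                                    // 1 if s is a null reference
    call void [mscorlib]System.Console::WriteLine(bool)
    ret
  }
}
```

Assuming it assembles as written, the program displays 0 followed by True.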
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.zeroinit
.method public hidebysig static void vijay() il managed
{
.entrypoint
ret
}
}
Error
a.il(4) : error : syntax error at token '.zeroinit' in:
.zeroinit
***** FAILURE *****
Some of the directives can only
be used within certain entities. The directive .zeroinit can only be used
within a method and not outside. The assembler checks whether the directive has
been used at the right place or not. If not, it generates an error message that
is hardly informative.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (class yyy V_0)
newobj instance void xxx::.ctor()
stloc.0
ldloc.0
callvirt instance void yyy::abc()
ret
}
}
.class private auto ansi yyy extends [mscorlib]System.Object
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "yyy abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void System.Object::.ctor()
ret
}
}
.class private auto ansi xxx extends yyy
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "xxx abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void yyy::.ctor()
ret
}
}
Output
xxx abc
You may accuse us of being
repetitive, but there is no harm in refreshing our memory.
Class yyy is a base class and
xxx the derived class. We have created a local of type yyy, which is the base
class, but initialized it to the class xxx, which is the derived class. A
better way to say it is, we are creating an object that looks like xxx, but
storing it in a yyy local.
callvirt calls the function abc
from the class xxx, despite the call naming the yyy class. This is
because the instruction callvirt is resolved at run time. In that environment,
the this pointer on the stack is of class xxx, and thus abc
from the class xxx is called. The virtual function has its own unique way of
deciding on the pointer to be placed on the stack.
If we remove the modifier
virtual from the function abc in class xxx, then the function abc will be called
from the yyy class. Changing the newobj to yyy then does not make a difference,
as the run time and compile time data types are now the same. For a virtual
function, the run time data type takes precedence over the compile time data type.
We add the modifier newslot to
function abc of class xxx as follows:
.method public hidebysig newslot virtual instance void abc() il managed
Here, from the point of view of
the run time, the function abc is treated as a new function. As there is no
connection with the abc of class yyy, they are now treated as two distinct
functions. The abc of class yyy is called. Placing the modifier newslot on
function abc of class yyy makes it a new function, distinct from any abc that
may be present in Object. As there is none, it makes no difference here.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (class yyy V_0 , class xxx V_1)
newobj instance void www::.ctor()
stloc.0
ldloc.0
callvirt instance void yyy::abc()
newobj instance void www::.ctor()
stloc.1
ldloc.1
callvirt instance void xxx::abc()
ret
}
}
.class private auto ansi yyy extends [mscorlib]System.Object
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "yyy abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void System.Object::.ctor()
ret
}
}
.class private auto ansi xxx extends yyy
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "xxx abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void yyy::.ctor()
ret
}
}
.class private auto ansi www extends xxx
{
.method public hidebysig virtual instance void abc() il managed
{
ldstr "www abc"
call void [mscorlib]System.Console::WriteLine(class System.String)
ret
}
.method public hidebysig specialname rtspecialname instance void .ctor() il managed
{
ldarg.0
call instance void xxx::.ctor()
ret
}
}
Output
www abc
www abc
The above program is pretty
large. The only difference between this program and its predecessor is that we
have added one more class www, derived from xxx. We have created two locals, one
each of the types yyy and xxx, but the run time data type of both the locals is
www.
The functions abc are virtual
throughout. When we call the functions abc through callvirt, even though we are
using the class prefixes yyy and xxx, the function gets called from www.
This is so because the this pointer
that has been passed has the run time data type www.
Then, we make our first small
change: We add a newslot to the function abc in class www.
The output now reads as follows:
xxx abc
xxx abc
This output has resulted as
shown above because, newslot dissociates the function abc of the class www,
from the earlier abc functions. Thus, since the abc of class xxx is the newest,
it gets called.
Next, we add the modifier
newslot to the function abc from class xxx and remove it from the class www.
The output now reads as follows:
yyy abc
www abc
Isn't the output fascinating?
Now you probably can understand, as to why we are revisiting virtual functions.
By adding the modifier newslot
to the function abc in class xxx, we are creating two families of abc:
• One that comprises only a single abc,
in class yyy
• Another made up of the abc functions from
classes xxx and www.
Thus, in every instance, the
last member of the family gets called and, since the first family has only one
member, this single member, i.e. the abc of class yyy, gets called.
In the second case, the abc of
class www gets called. Now let us add the newslot modifier to function abc
of class www, without removing the one from class xxx.
The output now reads as follows:
yyy abc
xxx abc
Now, we have three families of
abc functions. Each of them has only one function abc that has nothing to do
with the abc functions of the other families.
If we add the modifier newslot
to the function abc in class yyy, we will not see any change in the output.
This is because, we are cutting off abc from its root, from class yyy onwards.
There is no function abc in any of the classes that yyy derives from. Hence,
there is no change in the output.
If we remove virtual from the
function abc in class www, it has the same effect as adding the modifier
newslot. The virtual modifier signifies that the address of the function
to be called should be read from the vtable. If we remove the virtual modifier
from function abc of class xxx, the output will be as follows:
www abc
xxx abc
This output has resulted because
of the following:
The object created is a www
type.
• In the first case, the vtable has the
address of a www abc. The vtable stores a single address of every virtual
function. The runtime checks for the compile time data type of the pointer and
on examining, it looks like yyy. Within yyy, it discovers that function abc is
virtual. Thus it looks into the vtable for the address which turns out to be
that of www.
• In the second case, at the compile time
the type revealed is xxx. But within the class xxx, the function is not virtual
and thus, the vtable does not come into play.
Now we remove virtual from the
function abc of class yyy only. Remember, we are making only one change at a time.
The output now will be as follows:
yyy abc
www abc
The same explanation as given
earlier applies here too. We hope you
will remember us and our brilliant explanation of the concept of virtual. At
least, this is how we interpret it, and do not mind being the only ones to do
so in this manner.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 i)
ldc.i4.1
stloc i
ldloc i
call void System.Console::WriteLine(int32)
{
.locals (int32 i)
ldc.i4.2
stloc i
ldloc i
call void System.Console::WriteLine(int32)
}
ldloc.0
call void System.Console::WriteLine(int32)
ldloc.1
call void System.Console::WriteLine(int32)
ret
}
}
Output
1
2
1
2
In IL, the scoping levels do not
exhibit similar behavior to those found in traditional languages like C. Here,
a new variable i is created with each { brace, even though all the
variables are moulded together into one large locals directive.
Thus we refer to the individual
variables i by name in their respective blocks. The ldloc.0 stands for the first i
whereas ldloc.1 stands for the inner i, which remains accessible by index from
the outer block.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 i)
ldloca i
call void System.Console::WriteLine(int32)
{
.locals (int32 i)
ldloca i
call void System.Console::WriteLine(int32)
}
ret
}
}
Output
6552336
6552340
The above program displays
different values for the local variable i. The output proves that they are
created consecutively in memory.
Whenever you are in doubt,
display the value of the variables and clear up the cobwebs in your mind. Scope
blocks are thus merely syntactic sugar, used only to increase the readability
of code and to help in debugging code written by others.
Internally, for a variable name,
IL begins at the scope we are presently in, and recursively tries to resolve
the name of the variable in the enclosing scopes. Thus, even though a
declaration hides the name of a variable, we can access it using the index. The
scope does not change the lifetime of a variable. All the variables in a method
are created when we first enter the method, and die when we exit from it. A
variable is always accessible by its zero based index, which is allocated on a
"first come first served" basis.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldc.i4 4
call vararg void zzz::abc(..., int32)
ret
}
.method public static vararg void abc()
{
.locals init (value class System.ArgIterator it,int32 x)
ldloca it
initobj value class System.ArgIterator
ldloca it
arglist
call instance void System.ArgIterator::.ctor(value class System.RuntimeArgumentHandle)
ldloca x
ldloca it
call instance typedref System.ArgIterator::GetNextArg()
call class System.Object System.TypedReference::ToObject(typedref)
castclass System.Int32
unbox int32
cpobj int32
ldloc x
call void System.Console::WriteLine(int32)
ret
}
}
Output
4
The above program demonstrates
how a function accepts a variable number
of parameters.
Vararg is a calling convention
that allows a variable number of parameters to be passed to a function. We have
created a variable called it, of type System.ArgIterator. We have then loaded its
address on the stack using ldloca and then called arglist. This instruction
returns an opaque handle i.e. an unmanaged pointer which represents all the
arguments passed to the method. This handle can be passed to other methods but
is valid only during the lifetime of the current method. This opaque handle is
of the type RuntimeArgumentHandle.
The arglist instruction is valid
only in methods that take a variable number of arguments. The constructor of the
value class ArgIterator is called with this handle as a parameter.
Once the value class is
instantiated, we place the address of a local variable x on the stack. This
variable will receive the parameter passed to our function. Subsequently, the
address of variable it is put on the stack too. The function GetNextArg from
class ArgIterator is called; it places a typedref on the stack, which is then
passed to the function ToObject.
The resulting object is then cast to
Int32 and unboxed, as we need a value type. This value is copied to the
variable x. Vararg is a calling convention, and thus part of the signature
of the method. This is why we specify it as part of the call instruction. The
ellipsis denotes the end of the fixed parameters and the beginning of the
variable number of parameters. It is needed because a function may also want to
have a certain fixed number of parameters.
The other functions of the class
ArgIterator can also give us useful information, such as the number of
arguments that remain to be read.
We use method parameters to
enable a method to accept data from the caller. Method parameters are checked
for type safety. They make it mandatory for a method to be called with the
correct parameters. The Execution Engine enforces the contract between the
caller and the called methods.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals ( int32 i)
ldc.i4 4
stloc.0
ldloc.0
call void System.Console::WriteLine(int32)
newobj instance void zzz::.ctor()
ldloc.0
call instance void zzz::abc(int32)
ret
}
.method public instance void abc(int32 )
{
ldarg.1
call void System.Console::WriteLine(int32)
ret
}
}
Output
4
4
We are not compelled to assign
any name to the parameters. In the above program, the parameter of type int32
has no name or id, and the local, although named i, is referred to only by its
index. IL does not seem to care at all. However, unnamed variables can be
referenced only by their index.
Parameters can also have attributes, as we shall now see, but these attributes
have nothing to do with the signature.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj instance void zzz::.ctor()
ldc.i4 2
call instance void zzz::abc(int32)
ret
}
.method public instance void abc([opt] int32 i )
{
ldarg.1
call void System.Console::WriteLine(int32)
ret
}
}
Output
2
The first attribute for a
parameter is opt, which marks it as optional. This suggests that it is not
compulsory to pass a parameter to our function.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
newobj instance void zzz::.ctor()
call instance void zzz::abc()
ret
}
.method public instance void abc([opt] int32 i )
{
ret
}
}
Output
Exception occurred: System.MissingMethodException: Void
.zzz.abc()
at zzz.vijay()
Always read the fine print. The
opt attribute may indicate that the parameter is optional, but it is used for
documentation purposes only. The compiler may place the opt attribute on a
parameter, so that other tools make sense of it. As far as the runtime is
concerned, however, all the parameters are mandatory, and it simply ignores the
opt attribute. Thus, opt has no significance for the runtime.
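A C# compiler is one such tool. The sketch below assumes a modern C# compiler (version 4 or later), which is newer than the tools this chapter was written with; the names OptDemo, Abc and IsMarkedOptional are made up for this illustration. The [Optional] attribute from System.Runtime.InteropServices is emitted as the opt flag on the parameter. When the caller leaves the argument out, it is the compiler, not the runtime, that fills in default(int) at the call site, so the runtime still sees a call with one argument.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

public class OptDemo
{
    // [Optional] becomes the [opt] flag on the parameter in the emitted IL.
    public static int Abc([Optional] int i)
    {
        return i;
    }

    // Reads the flag back through reflection, as any other tool could.
    public static bool IsMarkedOptional()
    {
        ParameterInfo p = typeof(OptDemo).GetMethod("Abc").GetParameters()[0];
        return p.IsOptional;
    }

    public static void Main()
    {
        Console.WriteLine(Abc());              // compiler passes default(int)
        Console.WriteLine(Abc(2));             // explicit argument
        Console.WriteLine(IsMarkedOptional()); // the flag is visible as metadata
    }
}
```

Under these assumptions the program should print 0, 2 and True. The call Abc() compiles only because the compiler rewrites it as a one-argument call, which is consistent with the runtime treating every parameter as mandatory.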
Implementation attributes
provide a lot of information about the nature of the method to the runtime.
These attributes decide whether the method requires special handling at runtime
or not.
The Synchronized Attribute
a.il
.assembly mukhi {}
.class public auto ansi yyy extends [mscorlib]System.Object
{
.method public hidebysig instance void abc()synchronized
{
.locals (int32 V_0)
ldc.i4.0
stloc.0
br.s IL_0018
IL_0004: ldloc.0
call void [mscorlib]System.Console::WriteLine(int32)
ldc.i4 0x3e8
call void [mscorlib]System.Threading.Thread::Sleep(int32)
ldloc.0
ldc.i4.1
add
stloc.0
IL_0018: ldloc.0
ldc.i4.3
ble.s IL_0004
ret
}
}
.class public auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (class yyy V_0,class [mscorlib]System.Threading.Thread V_1,class [mscorlib]System.Threading.Thread V_2)
newobj instance void yyy::.ctor()
stloc.0
ldloc.0
ldftn instance void yyy::abc()
newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(class System.Object,int32)
newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)
stloc.1
ldloc.0
ldftn instance void yyy::abc()
newobj instance void [mscorlib]System.Threading.ThreadStart::.ctor(class System.Object,int32)
newobj instance void [mscorlib]System.Threading.Thread::.ctor(class [mscorlib]System.Threading.ThreadStart)
stloc.2
ldloc.1
call instance void [mscorlib]System.Threading.Thread::Start()
ldloc.2
call instance void [mscorlib]System.Threading.Thread::Start()
ret
}
}
Output with synchronized
0
1
2
3
0
1
2
3
Output without synchronized
0
0
1
1
2
2
3
3
You should run the above program
with and without the synchronized attribute to appreciate its significance.
The attribute il managed tells
the runtime that the method contains IL code that will run in the managed
world. We have created two threads, V_1 and V_2. Both execute the same
function abc on the same object of class yyy.
In the function abc, we display
the numbers from 0 to 3 using a loop. After displaying a number, the Sleep
function suspends the current thread for 1000 milliseconds. Without the
synchronized attribute, the first thread executes function abc, prints the
value 0 and then sleeps. The second thread takes advantage of the fact that the
first thread is sleeping, and it also displays 0 and falls asleep. This
continues until both threads have displayed the value 3 and exited from the
loop.
The synchronized attribute makes
the runtime take a lock on the object for the duration of the call, so abc
cannot begin executing on the second thread while the first thread is still
inside it. Thus, the second thread has no choice but to wait until the first
thread finishes executing the function. Try implementing the above in C#.
What we are trying to say is
that if C# does not expose a feature of
IL, there is no way you can use it in any .cs program.
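For this particular flag, C# does offer a route: the MethodImpl attribute with MethodImplOptions.Synchronized asks the compiler to mark the method as synchronized. The sketch below is one possible C# counterpart of the IL program, not the code the book's authors used. The class SyncDemo, the list seen and the method Run are names invented here, the sleep is shortened to 10 milliseconds, and the numbers are collected in a list rather than printed directly so that their order can be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;

public class SyncDemo
{
    private readonly List<int> seen = new List<int>();

    // Marked synchronized in the emitted IL: the runtime locks on this
    // object for the whole call.
    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Abc()
    {
        for (int i = 0; i <= 3; i++)
        {
            seen.Add(i);
            Thread.Sleep(10);   // shortened from the 1000 ms used in the IL
        }
    }

    // Starts two threads on the same object and waits for both to finish.
    public static List<int> Run()
    {
        SyncDemo d = new SyncDemo();
        Thread t1 = new Thread(d.Abc);
        Thread t2 = new Thread(d.Abc);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return d.seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Run()));
    }
}
```

Whichever thread enters Abc first holds the lock until its loop ends, so the list should read 0 1 2 3 0 1 2 3. Removing the attribute lets the two loops interleave, and also leaves the shared list unprotected.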
If a code implementation
attribute is not given, the default value is il managed. The other three
options are native, optil and runtime. These are mutually exclusive. The runtime
attribute specifies that the implementation of the code will be supplied by the
runtime, and not by the programmer. We cannot place any code in this type of
method. It is used for the methods of delegates, such as their constructors.
a.il
.assembly mukhi {}
.class public auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() optil
{
.entrypoint
ret
}
}
On running the ‘a.exe’
executable, three message boxes pop up with the following messages.
Unable to load OptJit Compiler (MSCOROJT.DLL). File may be
missing or corrupt. Please check your setup or rerun setup.
Failure to compile a method to native code. Most likely it is
a corrupt executable file.
Windows Protection Error
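The program reported the above
errors once the new attribute optil was introduced. The first message clearly
says that a particular dll could not be found. The attribute optil means that
the code is optimized IL that is meant to run faster, and the runtime evidently
tries to load a separate OptJit compiler for it. Since that compiler is
missing, the method cannot be compiled.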
We normally end all our
attributes for a method with the qualifier managed or unmanaged. The default
value is managed. This qualifier indicates who will manage the execution of the
method.
• Managed signifies that the CLR will manage it.
• Unmanaged signifies that someone else
will manage it.
a.il
.assembly mukhi {}
.class public auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il unmanaged
{
.entrypoint
ret
}
}
Output
Exception occurred: System.ExecutionEngineException: An
exception of type System.ExecutionEngineException was thrown.
at zzz.vijay()
If we use the unmanaged
attribute with pure IL code we get the above exception.
a.cs
using System;
using System.Runtime.InteropServices;
class zzz
{
[DllImport("user32.dll")]
public static extern int MessageBoxA(int h, string m, string c,
int type);
public static void Main()
{
MessageBoxA(0,"Hell","Bye",0);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static pinvokeimpl("user32.dll" winapi) int32 MessageBoxA(int32 h,class System.String m,class System.String c,int32 type) il managed
{
}
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldc.i4.0
ldstr "Hell"
ldstr "Bye"
ldc.i4.0
call int32 zzz::MessageBoxA(int32,class System.String,class System.String,int32)
pop
ret
}
}
A vast amount of code has
already been written in the programming language C, under the Windows
Operating System. This code resides in files called dlls or Dynamic Link
Libraries. To ensure that this code is also available to programs written in
IL, C# provides an attribute called DllImport.
To be technically accurate, code
in a dll has nothing to do with any particular programming language. Once we
obtain a dll, there is no way to detect which programming language it was
originally written in. The C# compiler converts the method carrying the
DllImport attribute into an IL method with an empty body. This implies that C#
understands attributes and, depending upon the attribute, generates the
relevant IL code. The method is called MessageBoxA and has the same parameters
that we specified in C#. The added attribute is pinvokeimpl, which is first
passed the name of the dll that contains the function.
Then we have a calling
convention, which settles three things. The first is the order in which the
parameters are pushed on the stack before the function gets called. The order
that IL follows is "first written first placed" i.e. from left to
right. The winapi calling convention follows the reverse order i.e. right to
left.
The second is whether the name of the function
gets a number added to it, specifying the size of the parameters on the stack.
The third is who restores the stack: the caller or the callee.
The function MessageBoxA can be
called in the same manner that any other static function of IL gets called.
There are two primary ways of calling unmanaged methods:
• One is using pinvokeimpl,
• The other is using IJW (It Just Works).
In IJW, the runtime stays out of
our way, and we have to write the code to handle everything ourselves. We stick
to pinvokeimpl, the one we can work with. The runtime automatically takes us
from managed to unmanaged code, converts data types and handles all the issues
of transition management. The documentation recommends the attributes native
and unmanaged for such a method. The C# compiler, however, does not seem to
have read the documentation, as it emits il managed instead.
Tail Calls
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed {
.entrypoint
ldc.i4 2
ldc.i4 3
call int32 zzz::abc(int32, int32)
call void System.Console::WriteLine(int32)
ret
}
.method static public int32 abc(int32 a, int32 r)
{
ldarg a
ldc.i4 0
bgt c
ldarg r
ret
c:
ldarg a
ldc.i4 1
sub
ldarg r
ldarg a
mul
tail.
call int32 zzz::abc(int32, int32)
ret
}
}
Output
6
The above example uses recursion
with an accumulator, in the style of a factorial function: abc(a, r) returns r
multiplied by the factorial of a, so abc(2, 3) yields 6. It uses the prefix
tail., which turns the call that follows it into a tail call. Languages like
Lisp or Prolog, which rely heavily on recursion, use tail calls extensively. In
a non-tail call, the current stack frame is kept intact, and a new frame is
allocated. This means that the stack grows with every call. In a
tail call, the current stack frame is replaced with a frame for the function
being called.
When a normal call terminates with a
ret, control returns to the calling function. In the case of a tail call,
control never comes back to the calling method: when the called method executes
ret, it returns directly to the caller's caller. Since a non-tail call needs to
store information about who the caller is, it uses up memory on the stack, and
may limit the depth of recursion that is possible. Thus, tail calls handle
recursion more efficiently than non-tail calls.
The above program works even
without the tail. prefix, since the recursion here is only three calls deep.
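C# gives us no way to write the tail. prefix ourselves, but the effect of reusing a frame can be shown by hand. In the sketch below, which uses the illustrative names TailDemo, Abc and AbcLoop, the first method is a direct C# rendering of the IL function abc, with the recursive call in tail position. The second is the loop that a frame-reusing tail call amounts to: the two arguments are overwritten in place and control jumps back to the top.

```csharp
using System;

public class TailDemo
{
    // Direct rendering of the IL function abc: returns r * a!
    public static int Abc(int a, int r)
    {
        if (a <= 0) return r;
        return Abc(a - 1, r * a);   // nothing is left to do after this call
    }

    // The same computation with the frame reused by hand.
    public static int AbcLoop(int a, int r)
    {
        while (a > 0)
        {
            r = r * a;   // new value for the second argument
            a = a - 1;   // new value for the first argument
        }
        return r;
    }

    public static void Main()
    {
        Console.WriteLine(Abc(2, 3));
        Console.WriteLine(AbcLoop(2, 3));
    }
}
```

Both calls should print 6, matching the IL program. The loop version uses a single frame however large a is, which is the saving a tail call aims for.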