Keywords and Operators

-4-


Code that is placed after a return statement never gets executed. In the first program given below, there is a WriteLine function call in the C# source that does not appear in our IL code. This is because the compiler is aware that statements after return are never executed, and hence it serves no purpose to convert them into IL.

a.cs

class zzz

{

public static void Main()

{

return;

System.Console.WriteLine("hi");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

br.s IL_0002

IL_0002: ret

}

The compiler does not waste time compiling code that will never get executed; instead, it generates a warning when it encounters such a situation.

a.cs

class zzz

{

public static void Main()

{

}

zzz( int i)

{

System.Console.WriteLine("hi");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

.method private hidebysig specialname rtspecialname instance void .ctor(int32 i) il managed {

ldarg.0

call instance void [mscorlib]System.Object::.ctor()

ldstr "hi"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

If a constructor is not present in the source code, a constructor with no parameters gets generated. If a constructor is present, as in the program above, the free constructor with no parameters is not generated.

Unless the constructor explicitly says otherwise, the base class constructor gets called without any parameters, and it gets called first, before any of the constructor's own code. The above IL code proves this fact.
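The order of the constructor calls can also be observed from C# itself. The following is only a minimal sketch: the class names Base, Derived and CtorOrderDemo and the Log field are our own inventions for illustration and do not appear in the programs above.

```csharp
class Base
{
    // Records the order in which the constructors run.
    public static string Log = "";

    public Base() { Log += "base;"; }
}

class Derived : Base
{
    // No explicit base(...) call is written here, so the compiler
    // inserts a call to the parameterless Base() before this body.
    public Derived(int i) { Log += "derived(" + i + ");"; }
}

class CtorOrderDemo
{
    public static string Run()
    {
        Base.Log = "";
        new Derived(5);
        return Base.Log;
    }

    public static void Main()
    {
        System.Console.WriteLine(Run()); // base;derived(5);
    }
}
```

The base class entry appears first in the log, which matches the IL, where the call to the base .ctor precedes the ldstr.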

a.cs

namespace vijay

{

namespace mukhi

{

class zzz

{

public static void Main()

{

}

}

}

}

a.il

.assembly mukhi {}

.namespace vijay.mukhi

{

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

We may write a namespace within a namespace, but the compiler converts it all into one namespace in the IL file. Thus, the two namespaces vijay and mukhi in the C# file get merged into a single namespace vijay.mukhi in the IL file.

a.il

.assembly mukhi {}

.namespace vijay

{

.namespace mukhi

{

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

In C#, one namespace can be present within another namespace, but the C# compiler prefers to emit only a single namespace; hence its IL output displays only one. The hand-written IL file above nests two .namespace directives to the same effect. The .namespace directive in IL is similar in concept to the namespace keyword in C#. Namespaces are thus supported directly by IL and are not peculiar to a programming language such as C#.

a.cs

namespace mukhi

{

class zzz

{

public static void Main()

{

}

}

}

namespace mukhi

{

class pqr

{

}

}

a.il

.assembly mukhi {}

.namespace mukhi

{

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

.class private auto ansi pqr extends [mscorlib]System.Object

{

}

We may have two namespaces called mukhi in the C# file, but they become one large namespace in the IL file and their contents get merged. This facility of merging namespaces is offered by the C# compiler.

Had the designers deemed it fit, they could have flagged the above program as an error instead.

a.cs

class zzz

{

public static void Main()

{

int i = 6;

zzz a = new zzz();

a.abc(ref i);

System.Console.WriteLine(i);

}

public void abc(ref int i)

{

i = 10;

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 V_0,class zzz V_1)

ldc.i4.6

stloc.0

newobj instance void zzz::.ctor()

stloc.1

ldloc.1

ldloca.s V_0

call instance void zzz::abc(int32&)

ldloc.0

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

.method public hidebysig instance void abc(int32& i) il managed

{

ldarg.1

ldc.i4.s 10

stind.i4

ret

}

Output

10

We will now explain how IL implements passing by reference. Unlike C#, it is very convenient to work with pointers in IL. It has three types of pointers.

When the function abc is called, the variable i is passed to it as a reference parameter. In IL, the instruction ldloca.s is used, which places the address of the variable on the stack. Had the instruction been ldloc instead, the value of the variable would have been placed on the stack.

In the call instruction and in the function's signature, we add the symbol & at the end of the type name to indicate the address of a variable. & suffixed to a data type indicates the memory location of a variable, and not the value contained in it.

In the function itself, ldarg.1 places parameter 1 on the stack (parameter 0 of an instance function is the this pointer). Since the parameter is of type int32&, what gets placed on the stack is the address of the variable i. Then, we place on the stack the number that we want to store there, i.e. 10.

The instruction stind.i4 stores the value that is present on top of the stack, i.e. 10, as a 4-byte integer in the variable whose address is the second item on the stack. In this case, as we have passed the address of the variable i on the stack, the variable i is assigned the value 10.

The instruction stind is used when an address is given on the stack. It fills up that memory location with the specified value.

If the word ref is replaced with the word out, IL uses exactly the same instructions because, in either case, the address of a variable is being put on the stack. Thus, the rules that distinguish ref from out, such as who must initialise the variable, are enforced by the C# compiler and have no equivalent in the IL instructions.

The IL instructions give no indication of whether the original program used ref or out. The only trace a compiler can leave is in the parameter's metadata: current C# compilers flag an out parameter as [out] int32&, whereas a ref parameter, as in the listing above, remains a plain int32&.
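The difference between ref and out that the C# compiler does enforce can be sketched as follows. The class RefOutDemo and its function names are hypothetical and are used only for this illustration.

```csharp
class RefOutDemo
{
    // ref: the caller must initialise the variable before the call.
    public static void SetRef(ref int i) { i = 10; }

    // out: the callee must assign the variable before it returns.
    public static void SetOut(out int i) { i = 10; }

    public static int ViaRef()
    {
        int i = 6;      // required, or the compiler reports an error
        SetRef(ref i);
        return i;
    }

    public static int ViaOut()
    {
        int i;          // no initial value is needed for out
        SetOut(out i);
        return i;
    }

    public static void Main()
    {
        System.Console.WriteLine(ViaRef()); // 10
        System.Console.WriteLine(ViaOut()); // 10
    }
}
```

In both functions the address of i is what reaches the callee, so both print 10.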

a.cs

class zzz

{

public static void Main()

{

string s = "hi" + "bye";

System.Console.WriteLine(s);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class System.String V_0)

ldstr "hibye"

stloc.0

ldloc.0

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

hibye

The next focus is on concatenating two strings. Since both strings are constants, the C# compiler does this itself by converting them into one string. This occurs due to the compiler's zest to optimise constants. The value is stored in a local variable and then placed on the stack. Thus, at compile time, the C# compiler optimises the code as far as possible.
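One way of confirming from C# that the joining happens during compilation is to use const, which accepts only values that the compiler can compute by itself. This is merely a sketch; the names FoldDemo and Folded are our own.

```csharp
class FoldDemo
{
    // const demands a compile-time constant, so this line compiles only
    // because the compiler evaluates "hi" + "bye" on its own.
    public const string Folded = "hi" + "bye";

    public static void Main()
    {
        System.Console.WriteLine(Folded);        // hibye
        System.Console.WriteLine(Folded.Length); // 5
    }
}
```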

a.cs

class zzz

{

public static void Main()

{

string s = "hi" ;

string t = s + "bye";

System.Console.WriteLine(t);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class System.String V_0,class System.String V_1)

ldstr "hi"

stloc.0

ldloc.0

ldstr "bye"

call class System.String [mscorlib]System.String::Concat(class System.String,class System.String)

stloc.1

ldloc.1

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

hibye

Whenever the compiler deals with variables, it is ignorant of their values at compile time, and so it cannot join the strings itself. The following steps are executed in the above program:

• Two variables s and t are converted into the local variables V_0 and V_1 respectively.

• The local variable V_0 is assigned the string "hi".

• This variable is then pushed onto the stack.

• Next, the constant string "bye" is put on the stack.

• Thereafter, the + operator is converted into a static function Concat, which belongs to the String class.

• This function concatenates the two strings and creates a new string on the stack.

• This concatenated string is stored in the variable V_1.

• The concatenated string is finally printed out.

There are two PLUS (+) operators in C#:

• One handles strings. This operator gets converted into the function Concat from the String class in IL.

• The other one handles numbers. This operator gets converted to the add instruction in IL.

Thus, knowledge of the String class and its functions is built into the C# compiler. We can therefore conclude that C# can understand and handle String operations.
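The two meanings of + can be placed side by side in C#. In this sketch, the class PlusDemo and its function names are assumptions of ours; String.Concat is the same function that the IL above calls.

```csharp
class PlusDemo
{
    // + on strings: the compiler turns this into a call to String.Concat.
    public static string JoinWithPlus(string s, string t) { return s + t; }

    // Calling Concat ourselves gives the same result.
    public static string JoinWithConcat(string s, string t) { return string.Concat(s, t); }

    // + on numbers: the compiler turns this into the add instruction.
    public static int Add(int x, int y) { return x + y; }

    public static void Main()
    {
        System.Console.WriteLine(JoinWithPlus("hi", "bye"));   // hibye
        System.Console.WriteLine(JoinWithConcat("hi", "bye")); // hibye
        System.Console.WriteLine(Add(1, 2));                   // 3
    }
}
```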

a.cs

class zzz

{

public static void Main()

{

string a = "bye";

string b = "bye";

System.Console.WriteLine(a == b);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class System.String V_0,class System.String V_1)

ldstr "bye"

stloc.0

ldstr "bye"

stloc.1

ldloc.0

ldloc.1

call bool [mscorlib]System.String::Equals(class System.String,class System.String)

call void [mscorlib]System.Console::WriteLine(bool)

ret

}

Output

True

Like the + operator, when the == operator is used with strings, the compiler converts it into the function Equals.

From the above examples, we can deduce that the C# compiler is totally at ease with strings. The next version will introduce many more of such classes which the compiler shall understand intuitively.

a.cs

class zzz

{

public static void Main()

{

System.Console.WriteLine((char)65);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.s 65

call void [mscorlib]System.Console::WriteLine(wchar)

ret

}

Output

A

Whenever we cast a constant, such as a numeric value to a character value, the compiler merely selects the function that accepts the data type of the cast. A cast does not modify the original value. What actually happens is that, instead of the WriteLine function being called with an int, it gets called with a wchar. Thus, a cast like this one does not incur any run-time overhead.

a.cs

class zzz

{

public static void Main()

{

char i = 'a';

System.Console.WriteLine((char)i);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (wchar V_0)

ldc.i4.s 97

stloc.0

ldloc.0

call void [mscorlib]System.Console::WriteLine(wchar)

ret

}

Output

a

The char data type of C# has a size of 16 bits. It is converted into a wchar on conversion to IL. The character 'a' gets converted into its character code, the number 97, which is the same as its ASCII value. This is placed on the stack and the variable V_0 is initialised to this value. Thereafter, the program displays the value 'a' on the screen.
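The size of a char and the number behind a character can be checked directly in C#. This is a small sketch; CharDemo and its function names are hypothetical.

```csharp
class CharDemo
{
    // sizeof reports the size in bytes: 2 bytes, i.e. 16 bits.
    public static int SizeInBytes() { return sizeof(char); }

    // A char converts implicitly to the number that represents it.
    public static int CodeOf(char c) { return c; }

    public static void Main()
    {
        System.Console.WriteLine(SizeInBytes());          // 2
        System.Console.WriteLine(CodeOf('a'));            // 97
        System.Console.WriteLine((char)65);               // A
        System.Console.WriteLine(CodeOf(char.MaxValue));  // 65535
    }
}
```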

a.cs

class zzz

{

public static void Main()

{

System.Console.WriteLine('\u0041');

System.Console.WriteLine(0x41);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.s 65

call void [mscorlib]System.Console::WriteLine(wchar)

ldc.i4.s 65

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

Output

A

65

The IL output shows neither Unicode escapes nor hexadecimal numbers; the compiler converts both into plain and simple decimals. The \u escape sequence is provided as a convenience to C# programmers, to enhance their productivity.

Incidentally, a function in IL may contain more than one ret instruction without any error being generated. The criterion is that at least one ret instruction should be present.

a.cs

class zzz

{

public static void Main()

{

int @int;

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 V_0)

ret

}

Variables created on the stack in C# are not given the same names on conversion to IL. So, the situation where a reserved word of C# could create a problem in IL, does not arise.

a.cs

class zzz

{

int @int;

public static void Main()

{

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field private int32 'int'

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

In the above program, the instance variable @int becomes a field named int, and the int data type is changed to int32. Since int is a reserved word in IL, the compiler writes the field name in single inverted commas. On conversion to IL, the @ sign simply disappears from the name of the variable.

a.cs

// hi this is comment

class zzz {

public static void Main() // allowed here

{

/* A comment over

two lines */

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

When you see the above code, you will realize why programmers the world over have an aversion to writing comments. All comments in C# are stripped off when the IL file is generated. Not a single comment is copied over into the IL code.

The compiler has scant respect for comments, and it throws all of them away. There is little wonder that programmers consider writing comments as an exercise in futility, and their frustration is well founded.

a.cs

class zzz

{

public static void Main()

{

System.Console.WriteLine("hi \nBye\tNo");

System.Console.WriteLine("\\");

System.Console.WriteLine(@"hi \nBye\tNo");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldstr "hi \nBye\tNo"

call void [mscorlib]System.Console::WriteLine(class System.String)

ldstr "\\"

call void [mscorlib]System.Console::WriteLine(class System.String)

ldstr "hi \\nBye\\tNo"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

hi

Bye     No

\

hi \nBye\tNo

The String handling capabilities of C# have been inherited from IL. The escape sequences like \n have been simply copied over.

The two backslashes (\\) result in a single backslash when displayed.

If a string is prefaced with an @ sign, the special meaning of the escape sequences in the string is ignored and they are displayed verbatim, as shown in the program above.
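The effect of the @ sign can be measured by comparing string lengths. The sketch below uses the same text as the program above; the class VerbatimDemo and its field names are our own.

```csharp
class VerbatimDemo
{
    // \n and \t are each one character in a regular string.
    public static string Regular = "hi \nBye\tNo";

    // In a verbatim string, the backslash and the letter are two characters.
    public static string Verbatim = @"hi \nBye\tNo";

    public static void Main()
    {
        System.Console.WriteLine(Regular.Length);               // 10
        System.Console.WriteLine(Verbatim.Length);              // 12
        System.Console.WriteLine(Verbatim == "hi \\nBye\\tNo"); // True
    }
}
```

The last line shows that the verbatim string is the same as a regular string with doubled backslashes, which is exactly how it appears in the ldstr instruction.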

Had IL not provided support for escape sequences in strings, it would have had a hard time accommodating most of the modern programming languages.

a.cs

#define vijay

class zzz {

public static void Main()

{

#if vijay

System.Console.WriteLine("1");

#else

System.Console.WriteLine("2");

#endif

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed {

.entrypoint

ldstr "1"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

1

The next series of programs deals with the pre-processor directives. These leave no trace in IL: C# has no separate pre-processor, so the compiler itself handles the directives before any code is generated.

In the above .cs program, the #define directive defines a symbol called vijay. The compiler therefore knows that the #if condition is true, and it ignores the code under #else. Thus, the IL file that is generated contains only the WriteLine call that has the parameter "1" and not the one that has the parameter "2".

This is the power of compile time knowledge. A large amount of code that is never going to be used is simply eliminated before conversion into IL.
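That the skipped branch is never compiled at all can be seen by placing code in it that could not possibly compile. In this sketch, IfDemo, Which and FunctionThatDoesNotExist are hypothetical names of ours.

```csharp
#define vijay

class IfDemo
{
    public static string Which()
    {
#if vijay
        return "1";
#else
        // This branch is skipped before compilation, so the call to a
        // function that exists nowhere causes no error.
        return FunctionThatDoesNotExist();
#endif
    }

    public static void Main()
    {
        System.Console.WriteLine(Which()); // 1
    }
}
```

If the #define line is removed, the #else branch becomes active and the compiler reports the missing function as an error.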

a.cs

#define vijay

#undef vijay

class zzz {

public static void Main() {

#if vijay

System.Console.WriteLine("1");

#endif

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

We can use as many #undef statements as we like. The compiler knows that the word 'vijay' has been undefined and therefore, it ignores the code in the #if statement.

There is no way the original pre-processor directives can be recovered on re-conversion of code from IL to C#.

a.cs

#warning We have a code red

class zzz

{

public static void Main()

{

}

}

The pre-processor directive #warning in C# is used to display warnings for the benefit of the programmer who runs the compiler.

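The pre-processor directives #line and #error also do not produce any executable output. #line merely changes the line number and file name that the compiler reports, while #error displays a message and stops the compilation with an error.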

Inheritance

a.cs

class zzz

{

public static void Main()

{

xxx a = new xxx();

a.abc();

}

}

class yyy

{

public void abc()

{

System.Console.WriteLine("yyy abc");

}

}

class xxx : yyy

{

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class xxx V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

call instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class private auto ansi xxx extends yyy

{

}

Output

yyy abc

The concept of inheritance is identical in all programming languages that support it. The word extends has originated in IL and Java and not in C#.

When we write a.abc(), the compiler decides on the abc function to call based on the following criteria:

• If the class xxx has a function abc, then the call in function vijay will have the prefix xxx.

• If the class yyy has a function abc, then the call in function vijay will have the prefix yyy.

Therefore, the intelligence that decides as to which function abc is to be called, resides in the compiler and not in the generated IL code.

a.cs

class zzz {

public static void Main()

{

yyy a = new xxx();

a.abc();

}

}

class yyy

{

public virtual void abc()

{

System.Console.WriteLine("yyy abc");

}

}

class xxx : yyy

{

public new void abc()

{

System.Console.WriteLine("xxx abc");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig newslot virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

yyy abc

In the context of the above program, a small explanation would not be out of place for the benefit of C# neophytes.

We can initialise an object a, declared as the base class yyy, to an instance of the derived class xxx. We have called the function a.abc(). The question that comes to the fore is: which of the following two versions of the function abc will be called?

• The function abc present in the base class yyy, which is the declared type of a.

• The function abc present in the class xxx, which is the type that a has been initialised to.

In other words, is the compile time type significant or the runtime type?

The base class function has a modifier called virtual, implying that the derived classes can override this function. The derived class, by adding the modifier new, informs the compiler that its function abc has nothing to do with the function abc of the base class. It is to treat them as separate entities.

First, the object reference is put on the stack using ldloc.0. Then, in place of a call instruction, there is a callvirt. This is because the function abc is virtual. Other than this, there exists no difference. The function abc in class yyy is declared virtual and is also tagged with newslot. This signifies that it is a new virtual function. In C#, the word new is placed in the derived class; in IL, the function in the derived class carries neither virtual nor newslot, so it takes no part in the virtual call and the version in yyy gets executed.

IL also uses a mechanism similar to that of C#, to figure out as to which version of abc is to be called.

a.cs

class zzz

{

public static void Main()

{

yyy a = new xxx();

a.abc();

}

}

class yyy

{

public virtual void abc()

{

System.Console.WriteLine("yyy abc");

}

}

class xxx : yyy

{

public override void abc()

{

System.Console.WriteLine("xxx abc");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void xxx::.ctor()

stloc.0

ldloc.0

callvirt instance void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.method public hidebysig newslot virtual instance void abc() il managed

{

ldstr "yyy abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig virtual instance void abc() il managed

{

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

Output

xxx abc

If the constructor of class xxx, which calls the base class constructor, is left out of the IL file, no output is displayed in the output window. As a rule, we have not included the free constructor code in our IL programs.

In the absence of the keywords new or override, the default keyword used is new. In the above function abc, in class xxx, we have used the override keyword, which implies that this function abc overrides the function of the base class.

By default, the call names the function through the class which the object looks like, i.e. its compile time type. In this case, it is yyy, which is why the instruction reads callvirt instance void yyy::abc().

The first change that occurs with override in the derived class is the addition of the word virtual to the function prototype. This was not supplied earlier with new because a new function got created altogether which isolated itself from the base class.

The use of override effectively results in the overriding of the base class function. This makes the function abc a virtual function in the class xxx. In other words, override becomes virtual whereas, new becomes nothing.

As there is a newslot modifier in the base class and a virtual function of the same name, without newslot, in the derived class, the derived class function takes over the same slot and gets called.

In a virtual function, the run time type of the object gets preference. The instruction callvirt resolves this issue at run-time and not at compile time.
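The two behaviours, new and override, can be compared in a single C# program. This is only a sketch: the class names Hides, Overrides and DispatchDemo are assumptions of ours, and the functions return strings rather than print them so that the results can be compared.

```csharp
class yyy
{
    public virtual string abc() { return "yyy abc"; }
}

class Hides : yyy
{
    // new: a separate function that merely shares the name.
    public new string abc() { return "Hides abc"; }
}

class Overrides : yyy
{
    // override: replaces the function in the slot created by yyy.
    public override string abc() { return "Overrides abc"; }
}

class DispatchDemo
{
    public static string ThroughBase(yyy a) { return a.abc(); }

    public static void Main()
    {
        System.Console.WriteLine(ThroughBase(new Hides()));     // yyy abc
        System.Console.WriteLine(ThroughBase(new Overrides())); // Overrides abc
        System.Console.WriteLine(new Hides().abc());            // Hides abc
    }
}
```

The call in ThroughBase is the same callvirt on yyy::abc in both cases; only the run time type of the object decides which function executes.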

a.cs

class zzz

{

public static void Main()

{

yyy a = new xxx();

a.abc();

}

}

class yyy

{

public virtual void abc()

{

System.Console.WriteLine("yyy abc");

}

}

class xxx : yyy

{

public override void abc()

{

base.abc();

System.Console.WriteLine("xxx abc");

}

}

a.il

.method public hidebysig virtual instance void abc() il managed

{

ldarg.0

call instance void yyy::abc()

ldstr "xxx abc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Only the code of the function abc in class xxx has been shown above. The rest of the IL code has been omitted. base.abc() calls the function abc from the base class, i.e. class yyy. The keyword base refers to the same object in memory as this, which is why ldarg.0 is placed on the stack. This keyword of C# is not understood by IL, as it is a compile time issue: the compiler simply emits a call, and not a callvirt, to yyy::abc. A base call therefore does not care whether the function is virtual or not.
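The behaviour of base.abc() can be confirmed with a short C# sketch. Here the functions return strings instead of printing them, and the class BaseCallDemo is a hypothetical name of ours.

```csharp
class yyy
{
    public virtual string abc() { return "yyy abc"; }
}

class xxx : yyy
{
    public override string abc()
    {
        // A plain call to yyy::abc. Had this call been dispatched
        // virtually, it would have called xxx::abc again without end.
        return base.abc() + ", xxx abc";
    }
}

class BaseCallDemo
{
    public static string Run()
    {
        yyy a = new xxx();
        return a.abc();
    }

    public static void Main()
    {
        System.Console.WriteLine(Run()); // yyy abc, xxx abc
    }
}
```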

Whenever we make a function virtual for the first time, it is a good idea to mark it as newslot, solely to signify a break from all the functions with the same name present in the superclasses.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

newobj instance void yyy::.ctor()

callvirt instance void iii::pqr()

ret

}

.class interface iii

{

.method public virtual abstract void pqr() il managed

{

}

.class public yyy implements iii

{

.override iii::pqr with instance void yyy::abc()

.method public virtual hidebysig newslot instance void abc() il managed

{

ldstr "yyy abc"

call void System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void [mscorlib]System.Object::.ctor()

ret

}

Output

yyy abc

We have created an interface iii with just one function called pqr. Then, the class yyy implements interface iii but does not provide a function called pqr. Instead, it adds a function called abc. In the entrypoint function vijay, function pqr is called off the interface iii.

The reason we get no errors is due to the presence of the override directive. This directive informs the assembler to redirect any call made to the function pqr off interface iii, to the class yyy function abc. The assembler is very serious about the override directive. This can be gauged from the fact that without the implements iii in the definition of class yyy we are given the following exception:

Output

Exception occurred: System.TypeLoadException: Class yyy tried to override method pqr but does not implement or inherit that methods.

at zzz.vijay()
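C# offers no syntax for mapping an interface function onto a function with a different name. The nearest equivalent that we are aware of is an explicit interface implementation, which compilers record with the same .override directive. The sketch below assumes that pqr returns a string so that the result can be inspected; OverrideDemo is a hypothetical class.

```csharp
interface iii
{
    string pqr();
}

class yyy : iii
{
    public string abc() { return "yyy abc"; }

    // Explicit interface implementation: reachable only through iii.
    // Here it simply forwards the call to abc.
    string iii.pqr() { return abc(); }
}

class OverrideDemo
{
    public static string CallThroughInterface()
    {
        iii a = new yyy();
        return a.pqr();
    }

    public static void Main()
    {
        System.Console.WriteLine(CallThroughInterface()); // yyy abc
    }
}
```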

Destructors

a.cs

class zzz

{

public static void Main()

{

}

~zzz()

{

System.Console.WriteLine("hi");

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

.method family hidebysig virtual instance void Finalize() il managed

{

ldstr "hi"

call void [mscorlib]System.Console::WriteLine(class System.String)

ldarg.0

call instance void [mscorlib]System.Object::Finalize()

ret

}

No output

A destructor gets converted into a function called Finalize. This piece of information is also laid down in the C# documentation. The Finalize function calls the original from Object after executing our code. The text "hi" does not get displayed because Main never creates a zzz object and, even if it did, Finalize is called as and when the runtime decides. All we know is that it gets called when the garbage collector reclaims the object. There is no way of explicitly destroying a .NET object.

a.cs

class zzz

{

public zzz()

{

}

public zzz(int i)

{

}

public static void Main()

{

}

~zzz()

{

System.Console.WriteLine("hi");

}

}

class yyy : zzz

{

}

a.il

.class private auto ansi yyy extends zzz

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void zzz::.ctor()

ret

}

In the above code, we’ve displayed only the yyy class. Even though the base class zzz has 2 constructors and 1 destructor, the class yyy only receives the free constructor with no parameters. Thus, derived classes do not inherit the constructors or destructors of the base class.
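The same fact can be verified at run time by counting the public constructors of each class through reflection. This is a sketch under our own naming: Parent, Child and CtorCountDemo are hypothetical, and Type.GetConstructors returns the public instance constructors of a type.

```csharp
class Parent
{
    public Parent() { }
    public Parent(int i) { }
}

class Child : Parent
{
    // Nothing is written here, so only the free constructor exists.
}

class CtorCountDemo
{
    public static int CountOf(System.Type t)
    {
        return t.GetConstructors().Length;
    }

    public static void Main()
    {
        System.Console.WriteLine(CountOf(typeof(Parent))); // 2
        System.Console.WriteLine(CountOf(typeof(Child)));  // 1
    }
}
```

Writing new Child(5) would be rejected by the compiler, since the constructor that takes an int belongs to Parent alone.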

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

call void yyy::abc()

ret

}

.class private auto ansi yyy extends [mscorlib]System.Array

{

.method public hidebysig static void abc() il managed

{

ldstr "hi"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

hi

In C#, we are not allowed to derive a class from certain classes like System.Array. However, in IL there is no such restriction. Thus, the above code does not generate any error.

We can safely conclude that the C# compiler has added the above restrictions and that IL is less restrictive. The rules of a language are decided by the compiler at compile time.

For your information, the other classes that we cannot derive from, in C#, are Delegate, Enum and ValueType.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class aa V_0)

newobj instance void aa::.ctor()

stloc.0

ret

}

.class public auto ansi aa extends bb

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void bb::.ctor()

ldstr "aa"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class public auto ansi bb extends cc

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void cc::.ctor()

ldstr "bb"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class public auto ansi cc extends aa

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void aa::.ctor()

ldstr "cc"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Error

Exception occurred: System.TypeLoadException: Could not load class 'aa' because the format is bad (too long?)

at zzz.vijay()

We are forbidden to have a circular reference in C#. The compiler checks for it and, if found, reports an error. The IL assembler, however, does not check for a circular reference, presumably because Microsoft does not expect many programmers to write pure IL. The error surfaces only at run time, when the class is loaded.

Here, class aa extends bb, class bb extends cc and finally class cc extends aa. This completes the circular reference. The exception that is thrown at runtime does not give any indication of a circular reference. Thus, if we had not unravelled this mystery for you here, the exception would have most probably left you baffled. We do not intend to boast that we have understood IL deeply, but there is no harm in giving oneself a pat on the back, once in a while.

a.cs

internal class zzz

{

public static void Main()

{

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

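Access modifiers such as the keyword internal belong to the C# lexicon; IL has its own, different set. The keyword internal signifies that the particular class can only be accessed from within the assembly in which it is present. As the IL shows, an internal class becomes a private class in IL, exactly like a class that has no modifier at all.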

Thus, by mastering IL, we are in a position to differentiate between the core belongings of .NET and features existing in the realms of C#.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

.class public auto ansi yyy extends xxx

{

}

.class private auto ansi xxx extends [mscorlib]System.Object

{

}

In C#, there is a rule: the base class has to be at least as accessible as the derived class. This rule is not adhered to in IL. Thus, even though the base class xxx is private and the derived class yyy is public, no error is generated in IL.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

In C#, a function is effectively never more accessible than the class within which it resides, whatever its own modifier says. In the IL above, the function vijay is public, whereas the class that it is located in is private. Thus, the class is more restrictive than the function contained in it. IL accepts this combination without complaint.

a.cs

class zzz

{

public static void Main()

{

yyy a = new yyy();

xxx b = new xxx();

a = b;

b = (xxx) a;

}

}

class yyy

{

}

class xxx : yyy

{

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0,class xxx V_1)

newobj instance void yyy::.ctor()

stloc.0

newobj instance void xxx::.ctor()

stloc.1

ldloc.1

stloc.0

ldloc.0

castclass xxx

stloc.1

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

}

.class private auto ansi xxx extends yyy

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void yyy::.ctor()

ret

}

Without a constructor in xxx, the following exception is thrown:

Output

Exception occurred: System.InvalidCastException: An exception of type System.InvalidCastException was thrown.

at zzz.vijay()

In the above example, we create two objects a and b, which are instances of the classes yyy and xxx respectively. The class xxx is the derived class and yyy is the base class. We can write a = b, because a derived object can always stand in for its base class. The reverse assignment, b = a, is rejected by the C# compiler, since a base class reference is not guaranteed to point to an xxx object. Thus, a cast operator is required.

A cast in C# gets converted to the instruction castclass, followed by the name of the class that the object has to be cast into. If the object cannot be cast, the above mentioned exception is raised.

The listing above contains a constructor in xxx and runs without an error. When that constructor is removed, the exception is generated.

Thus, IL has a number of higher level primitives that deal with objects and classes.
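The behaviour of castclass can also be observed from the C# side. The following is a minimal sketch, not taken from the original listings; the names Base, Derived, CastDemo and TryDowncast are hypothetical and chosen only for illustration. It shows that the cast succeeds when the object really is of the derived type and raises InvalidCastException otherwise.

```csharp
using System;

class Base { }
class Derived : Base { }

static class CastDemo
{
    // Returns true when the downcast succeeds and false when
    // the runtime raises InvalidCastException.
    public static bool TryDowncast(Base b)
    {
        try
        {
            Derived d = (Derived)b;   // an explicit cast; becomes castclass Derived
            return d != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    static void Main()
    {
        Base pointsToDerived = new Derived();  // derived to base needs no cast
        Base pointsToBase = new Base();
        Console.WriteLine(TryDowncast(pointsToDerived));  // True
        Console.WriteLine(TryDowncast(pointsToBase));     // False
    }
}
```

The check is made against the object that is actually on the stack at run time, not against the declared type of the variable.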

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0,class xxx V_1)

newobj instance void yyy::.ctor()

stloc.0

newobj instance void xxx::.ctor()

stloc.1

ldloc.1

stloc.0

ldloc.0

castclass xxx

stloc.1

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

}

.class private auto ansi xxx extends [mscorlib]System.Object

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void System.Object::.ctor()

ret

}

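In the above case, the class xxx does not derive from class yyy anymore. They both extend the Object class. Yet, we are allowed to cast the class yyy to class xxx. No error is generated with a constructor in the class xxx, but on removal of the constructor, an exception is generated. The castclass instruction checks the type of the object that is actually on the stack, which here is an xxx, and the assembler does not verify that an xxx may be stored in a local of type yyy. IL too has its own strange way of working.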

a.il

.assembly mukhi {}

.class private auto ansi sealed zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ret

}

.class private auto ansi yyy extends zzz

{

}

The documentation states very clearly that a sealed class cannot be extended or sub-classed any further. In this case, an error was expected but none was generated. We must remind you that we are working on a beta copy. The next version may generate an error.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void yyy::.ctor()

stloc.0

ret

}

.class private auto ansi abstract yyy

{

}

An abstract class cannot be used directly. It can only be derived from. The above code should have generated an error, but it does not.

a.cs

public class zzz

{

const int i = 10;

public static void Main()

{

System.Console.WriteLine(i);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.s 10

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

Output

10

A constant is an entity that only exists at compile time. It is not visible at run time. This proves that the compiler removes all traces of compile-time objects. On conversion to IL, all occurrences of the constant i in the C# code get replaced by the number 10.

a.cs

public class zzz

{

const int i = j + 4;

const int j = k - 1;

const int k = 3;

public static void Main() {

System.Console.WriteLine(k);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field private static literal int32 i = int32(0x00000006)

.field private static literal int32 j = int32(0x00000002)

.field private static literal int32 k = int32(0x00000003)

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.3

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

Output

3

All the constants are evaluated by the compiler and, even though they may refer to other constants, they are given absolute values. The runtime does not allocate any memory for literal fields. They fall in the realm of metadata, which we shall explain later.
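The chain of constants can be checked with a small program. This is a minimal sketch that mirrors the listing above; the class name ConstDemo is hypothetical, and the constants are made public only so that they are easy to inspect.

```csharp
using System;

class ConstDemo
{
    // The order of declaration does not matter. The compiler resolves
    // the whole chain and stores only the final values.
    public const int i = j + 4;   // 2 + 4 = 6
    public const int j = k - 1;   // 3 - 1 = 2
    public const int k = 3;

    static void Main()
    {
        Console.WriteLine(i);   // 6
        Console.WriteLine(j);   // 2
        Console.WriteLine(k);   // 3
    }
}
```

The values 6, 2 and 3 are the same ones that appear in the literal fields of the IL listing.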

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field private static literal int32 i = int32(0x00000006)

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.6

stsfld int32 zzz::i

ret

}

Output

Exception occurred: System.MissingFieldException: zzz.i

at zzz.vijay()

A literal field represents a constant value. In IL, we are not allowed to access any literal field. The assembler does not generate any error at the time of assembling, but an exception is thrown at run time. We expected an error from the assembler, since we have used a literal field in the instruction stsfld.

a.cs

public class zzz

{

public static readonly int i = 10;

public static void Main()

{

System.Console.WriteLine(i);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field public static initonly int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldsfld int32 zzz::i

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

.method public hidebysig specialname rtspecialname static void .cctor() il managed

{

ldc.i4.s 10

stsfld int32 zzz::i

ret

}

Output

10

A readonly field cannot be modified. In IL, we have a modifier called initonly which implements the same concept.

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field public static initonly int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.s 10

stsfld int32 zzz::i

ldsfld int32 zzz::i

call void [mscorlib]System.Console::WriteLine(int32)

ret

}

The documentation very clearly states that initonly fields can only be changed in the constructor, but the CLR (Common Language Runtime) does not strictly check this. Perhaps the next version will guard against such occurrences.

Thus, the entire series of restrictions on readonly has to be enforced by the programming language that converts the source code to IL. We are not trying to run down IL, but IL expects someone else to do the error checking in this situation.
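The difference between a literal field and an initonly field can be seen from C# through reflection. The following is a minimal sketch; the class name ReadonlyDemo and the field names C and R are hypothetical. It relies on the IsLiteral and IsInitOnly properties of System.Reflection.FieldInfo.

```csharp
using System;
using System.Reflection;

class ReadonlyDemo
{
    public const int C = 10;             // becomes a literal field
    public static readonly int R = 10;   // becomes an initonly field

    static void Main()
    {
        FieldInfo c = typeof(ReadonlyDemo).GetField("C");
        FieldInfo r = typeof(ReadonlyDemo).GetField("R");

        Console.WriteLine(c.IsLiteral);    // True
        Console.WriteLine(r.IsInitOnly);   // True
        Console.WriteLine(r.IsLiteral);    // False

        // R = 20;   // if uncommented, the C# compiler rejects this line
    }
}
```

The commented-out assignment is stopped by the C# compiler, which is the error checking that the text above says IL leaves to the language.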

a.cs

public class zzz

{

public static void Main()

{

zzz a = new zzz();

pqr();

a.abc();

}

public static void pqr()

{

}

public void abc()

{

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field public static initonly int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class zzz V_0)

newobj instance void zzz::.ctor()

stloc.0

call void zzz::pqr()

ldloc.0

call instance void zzz::abc()

ret

}

.method public hidebysig static void pqr() il managed

{

ret

}

.method public hidebysig instance void abc() il managed

{

ret

}

This example serves as a refresher. The static function pqr is not passed the this pointer on the stack, whereas the non-static function abc is passed the this pointer, i.e. a reference to where its variables are stored in memory.

Thus, before the call to function abc, the instruction ldloc.0 pushes the reference of zzz onto the stack.

a.cs

public class zzz

{

public static void Main()

{

pqr(10,20);

}

public static void pqr(int i , int j)

{

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.field public static initonly int32 i

.method public hidebysig static void vijay() il managed

{

.entrypoint

ldc.i4.s 10

ldc.i4.s 20

call void zzz::pqr(int32,int32)

ret

}

.method public hidebysig static void pqr(int32 i,int32 j) il managed

{

ret

}

The calling convention indicates the order in which the parameters should be pushed onto the stack. The default sequence in IL is the order in which they were written. Thus, the number 10 first goes onto the stack, followed by the number 20.

The native calling conventions used by Microsoft's C compilers, such as __cdecl and __stdcall, implement the reverse order. There, 20 would go on the stack first, followed by 10. IL does not share this idiosyncrasy.
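The left-to-right order can be made visible in C# by giving each argument a side effect. This is a minimal sketch written in present-day C#, which has generic collections that the beta did not; the names ArgOrderDemo, Note, Pqr and Run are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class ArgOrderDemo
{
    static readonly List<int> log = new List<int>();

    // Records the value at the moment the argument is evaluated.
    static int Note(int value)
    {
        log.Add(value);
        return value;
    }

    static void Pqr(int i, int j) { }

    // Returns the order in which the two arguments were evaluated.
    public static string Run()
    {
        log.Clear();
        Pqr(Note(10), Note(20));
        return string.Join(",", log);
    }

    static void Main()
    {
        Console.WriteLine(Run());   // 10,20
    }
}
```

C# evaluates the arguments from left to right, which matches the order of the two ldc instructions in the IL listing.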

a.cs

public class zzz

{

public static void Main()

{

bb a = new bb();

}

public class aa

{

public aa()

{

System.Console.WriteLine("in const aa");

}

public aa(int i)

{

System.Console.WriteLine("in const aa" + i);

}

public class bb : aa

{

public bb() : this(20)

{

System.Console.WriteLine("in const bb");

}

public bb(int i) : base(i)

{

System.Console.WriteLine("in const bb" + i);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class bb V_0)

newobj instance void bb::.ctor()

stloc.0

ret

}

.class public auto ansi aa extends [mscorlib]System.Object

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void [mscorlib]System.Object::.ctor()

ldstr "in const aa"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor(int32 i) il managed

{

ldarg.0

call instance void [mscorlib]System.Object::.ctor()

ldstr "in const aa"

ldarga.s i

box [mscorlib]System.Int32

call class System.String [mscorlib]System.String::Concat(class System.Object,class System.Object)

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.class public auto ansi bb extends aa

{

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

ldc.i4.s 20

call instance void bb::.ctor(int32)

ldstr "in const bb"

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

.method public hidebysig specialname rtspecialname instance void .ctor(int32 i) il managed

{

ldarg.0

ldarg.1

call instance void aa::.ctor(int32)

ldstr "in const bb"

ldarga.s i

box [mscorlib]System.Int32

call class System.String [mscorlib]System.String::Concat(class System.Object,class System.Object)

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

in const aa20

in const bb20

in const bb

We have created only one object, which is an instance of the class bb. Instead of two constructors, one from the base class and one from the derived class, three constructors are called.

• In IL, at first, a call is made to the constructor of bb with no parameters.

• Then, on reaching this constructor of bb, a call is made to another constructor of the same class, with a parameter value of 20. this(20) gets converted into an actual constructor call with one parameter.

• Now, we move on to the one-parameter constructor of bb. Here, a call to the one-parameter constructor of aa is made first, as the base class constructor needs to be called before anything else.

Luckily, the base class constructor of aa does not take us on another wild goose chase. It displays its string and returns, then the one-parameter constructor of bb displays its string, and finally, the rest of the constructor of bb that has no parameters gets executed.

Thus, base and this do not exist in IL and are compile time artefacts that get hard coded into the IL code.
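The sequence of the three constructors can be recorded rather than read off the console. The following is a minimal sketch in present-day C#; the classes Aa and Bb stand in for aa and bb, and the names Log, CtorDemo and Run are hypothetical.

```csharp
using System;
using System.Collections.Generic;

class Aa
{
    public static readonly List<string> Log = new List<string>();

    public Aa() { Log.Add("aa()"); }
    public Aa(int i) { Log.Add("aa(" + i + ")"); }
}

class Bb : Aa
{
    public Bb() : this(20) { Log.Add("bb()"); }
    public Bb(int i) : base(i) { Log.Add("bb(" + i + ")"); }
}

static class CtorDemo
{
    // Creates one Bb object and returns the constructor bodies in the
    // order in which they ran.
    public static string Run()
    {
        Aa.Log.Clear();
        new Bb();
        return string.Join(" -> ", Aa.Log);
    }

    static void Main()
    {
        Console.WriteLine(Run());   // aa(20) -> bb(20) -> bb()
    }
}
```

The constructor of Aa that takes no parameters never runs, since nothing in the chain calls it.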

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object {

.method public hidebysig static void vijay() il managed {

.entrypoint

.locals (class aa V_0)

newobj instance void aa::.ctor()

ret

}

.class public auto ansi aa extends [mscorlib]System.Object {

.method private hidebysig specialname rtspecialname instance void .ctor() il managed

{

ret

}

Output

Exception occurred: System.MethodAccessException: aa..ctor()

at zzz.vijay()

We cannot access a private member from outside the class. Thus, as we have made the only constructor in the class aa private, we are not allowed to create any object that looks like class aa. In C#, the same rules apply for the access modifiers.

a.cs

public class zzz

{

public static void Main()

{

yyy a = new yyy();

}

class yyy

{

public int i;

public bool j;

public yyy()

{

System.Console.WriteLine(i);

System.Console.WriteLine(j);

}

a.il

.assembly mukhi {}

.class public auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (class yyy V_0)

newobj instance void yyy::.ctor()

stloc.0

ret

}

.class private auto ansi yyy extends [mscorlib]System.Object

{

.field public int32 i

.field public bool j

.method public hidebysig specialname rtspecialname instance void .ctor() il managed

{

ldarg.0

call instance void [mscorlib]System.Object::.ctor()

ldarg.0

ldfld int32 yyy::i

call void [mscorlib]System.Console::WriteLine(int32)

ldarg.0

ldfld bool yyy::j

call void [mscorlib]System.Console::WriteLine(bool)

ret

}

Output

0

False

Here, the variables i and j are not initialized in the C# code. Thus, no code to initialize these fields is placed in the constructor of class yyy. Before any code in class yyy gets called, these variables are assigned their default values, which depend upon their data type. The runtime fills the memory of a newly created object with zeros, which is why the int shows up as 0 and the bool as False.
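The default values can be confirmed without writing any constructor at all. This is a minimal sketch; the class names Yyy and DefaultsDemo are hypothetical, and Yyy is made public only to keep the compiler from warning about fields that are never assigned.

```csharp
using System;

public class Yyy
{
    public int i;    // never assigned in the source
    public bool j;   // never assigned in the source
}

static class DefaultsDemo
{
    static void Main()
    {
        Yyy a = new Yyy();
        Console.WriteLine(a.i);                // 0
        Console.WriteLine(a.j);                // False
        Console.WriteLine(a.i == default(int));   // True
    }
}
```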

a.cs

class zzz

{

public static void Main()

{

int i = 10;

string j;

j = i >= 20 ? "hi" : "bye";

System.Console.WriteLine(j);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed {

.entrypoint

.locals (int32 V_0,class System.String V_1)

ldc.i4.s 10

stloc.0

ldloc.0

ldc.i4.s 20

bge.s IL_000f

ldstr "bye"

br.s IL_0014

IL_000f: ldstr "hi"

IL_0014: stloc.1

ldloc.1

call void [mscorlib]System.Console::WriteLine(class System.String)

ret

}

Output

bye

The ternary operator is a glorified if statement compressed into a single line. The variables i and j in C# become V_0 and V_1 on conversion to IL. We first initialize variable V_0 to 10 and then place its value, followed by the condition value 20, on the stack.

The instruction bge.s is equivalent to the instructions clt and brfalse taken together.

• If the condition is TRUE, bge.s executes a jump to the label IL_000f, where the string hi is placed on the stack.

• If the condition is FALSE, the program falls through, places the string bye on the stack and jumps to the label IL_0014 using br.s.

Then, the program proceeds to the WriteLine function and prints the appropriate text.

From the resultant IL code, there is no way of deciphering whether the original C# code had used an if statement or a ?: operator. A large number of operators in C#, such as the ternary operator, have been borrowed from the C programming language.
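The two spellings can be placed side by side. The following is a minimal sketch with hypothetical names (TernaryDemo, WithTernary, WithIf); it only demonstrates that both forms return the same string, and makes no claim about the exact IL a given compiler version emits for each.

```csharp
using System;

static class TernaryDemo
{
    public static string WithTernary(int i)
    {
        return i >= 20 ? "hi" : "bye";
    }

    public static string WithIf(int i)
    {
        if (i >= 20)
            return "hi";
        return "bye";
    }

    static void Main()
    {
        Console.WriteLine(WithTernary(10));   // bye
        Console.WriteLine(WithIf(10));        // bye
        Console.WriteLine(WithTernary(20));   // hi
        Console.WriteLine(WithIf(20));        // hi
    }
}
```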

a.cs

class zzz

{

public static void Main()

{

int i = 1, j= 2;

if ( i >= 4 & j > 1)

System.Console.WriteLine("& true");

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 V_0,int32 V_1)

ldc.i4.1

stloc.0

ldc.i4.2

stloc.1

ldloc.0

ldc.i4.4

clt

ldc.i4.0

ceq

ldloc.1

ldc.i4.1

cgt

and

brfalse.s IL_001c

ldstr "& true"

call void [mscorlib]System.Console::WriteLine(class System.String)

IL_001c: ret

}

The & operator in C# makes the if statement more complex. It only returns TRUE if both the conditions are TRUE. Otherwise, it returns FALSE. IL has no single instruction that evaluates both comparisons and combines them. Thus, it is implemented in a roundabout way as follows:

• First we use the ldc instruction to place a constant value on the stack.

• Next, the instruction stloc initializes variables i and j i.e. V_0 and V_1.

• Then, the value of V_0 is placed on the stack.

• Thereafter, the condition value 4 is placed on the stack.

• Then, the instruction clt is used to check if the first value is less than the second. If it is, as is the case in the above example, then the value 1 (TRUE) is put on the stack.

• The original expression in C# is i >= 4. In IL, a check for < or clt is made, so the result has to be inverted.

• To invert it, we place zero on the stack and check for equality using ceq. Since 1 is not equal to 0, this results in a FALSE, which is the value of i >= 4.

• Then we follow the same rules for j > 1. Here, we use cgt instead of clt and no inversion is needed. The result of the cgt instruction is TRUE.

• This result of TRUE is ANDED with the previous result of FALSE to finally give a FALSE value, and brfalse.s jumps over the call to WriteLine.

Note that the and instruction works on the bits of its two operands. As each operand here is either 0 or 1, it returns 1 if, and only if, both the conditions are TRUE. In all other cases, it returns 0 i.e. FALSE.

a.cs

class zzz

{

public static void Main()

{

int i = 1, j= 2;

if ( i >= 4 && j > 1)

System.Console.WriteLine("&& true");

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (int32 V_0,int32 V_1)

ldc.i4.1

stloc.0

ldc.i4.2

stloc.1

ldloc.0

ldc.i4.4

blt.s IL_0016

ldloc.1

ldc.i4.1

ble.s IL_0016

ldstr "&& true"

call void [mscorlib]System.Console::WriteLine(class System.String)

IL_0016: ret

}

Operators like the && operator are called short circuit operators as they evaluate the second condition only if the first condition is TRUE. The IL code starts out the same as earlier, but now the first condition is checked by the instruction blt.s, a combination of the clt and brtrue instructions.

If the first condition is FALSE, i.e. i is less than 4, a jump is made to the ret instruction at label IL_0016. Only if the condition is TRUE do we proceed further and check the second condition. For this, we use the instruction ble.s, which is a combination of cgt and brfalse. If the second condition is FALSE, we jump to the ret instruction as before, and for TRUE we execute the WriteLine function.

The && operator can execute faster than the & operator because it evaluates the second condition only if the first one results in TRUE. If the first condition is FALSE, the final outcome is already known to be FALSE.

The | and || operators also behave in a similar manner.
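The difference between the two operators shows up as soon as the second condition has a side effect. This is a minimal sketch; the names ShortCircuitDemo, Second, CountWithAnd and CountWithAndAlso are hypothetical. Each method reports how many times the second condition was evaluated.

```csharp
using System;

static class ShortCircuitDemo
{
    static int calls;

    // Stands in for the second condition and counts its evaluations.
    static bool Second()
    {
        calls++;
        return true;
    }

    public static int CountWithAnd(int i)
    {
        calls = 0;
        bool both = i >= 4 & Second();    // both sides are always evaluated
        Console.WriteLine("&  gives " + both);
        return calls;
    }

    public static int CountWithAndAlso(int i)
    {
        calls = 0;
        bool both = i >= 4 && Second();   // right side is skipped when left is false
        Console.WriteLine("&& gives " + both);
        return calls;
    }

    static void Main()
    {
        Console.WriteLine(CountWithAnd(1));       // 1
        Console.WriteLine(CountWithAndAlso(1));   // 0
    }
}
```

Both operators give the same answer here; only the amount of work done differs.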

a.cs

class zzz {

public static void Main()

{

bool x,y;

x = true;

y = false;

System.Console.WriteLine( x ^ y);

x = false;

System.Console.WriteLine( x ^ y);

}

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object {

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (bool V_0,bool V_1)

ldc.i4.1

stloc.0

ldc.i4.0

stloc.1

ldloc.0

ldloc.1

xor

call void [mscorlib]System.Console::WriteLine(bool)

ldc.i4.0

stloc.0

ldloc.0

ldloc.1

xor

call void [mscorlib]System.Console::WriteLine(bool)

ret

}

Output

True

False

The ^ sign is called an XOR operator. The XOR is like an OR statement, but there is a difference: An OR returns TRUE if any of its operands is TRUE, but an XOR will return TRUE if and only if one of its operands is TRUE and the other one is FALSE. Even if both operands are TRUE, it will return FALSE. xor is an IL instruction.

The != operator gets converted into the normal set of IL instructions i.e. a comparison is done and the program branches accordingly.
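For bool operands, ^ and != always give the same answer, which a full truth table makes plain. The following is a minimal sketch; the names XorDemo and Xor are hypothetical.

```csharp
using System;

static class XorDemo
{
    public static bool Xor(bool x, bool y)
    {
        return x ^ y;
    }

    static void Main()
    {
        bool[] values = { false, true };
        foreach (bool x in values)
        {
            foreach (bool y in values)
            {
                // Prints all four rows of the truth table.
                Console.WriteLine(x + " ^ " + y + " = " + Xor(x, y)
                    + ", != gives " + (x != y));
            }
        }
    }
}
```

Only the two rows in which the operands differ show True.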

a.cs

class zzz

{

public static void Main()

{

bool x = true;

System.Console.WriteLine(!x);

}

a.il

.assembly mukhi {}

.class private auto ansi zzz extends [mscorlib]System.Object

{

.method public hidebysig static void vijay() il managed

{

.entrypoint

.locals (bool V_0)

ldc.i4.1

stloc.0

ldloc.0

ldc.i4.0

ceq

call void [mscorlib]System.Console::WriteLine(bool)

ret

}

Output

False

The ! operator in C# converts a TRUE to a FALSE and vice versa. In IL, the instruction used is ceq. This instruction compares the top two values on the stack. If they are the same, it returns TRUE, otherwise it returns FALSE.

Since the variable x is TRUE, it gets initialized to 1. It is thereafter checked for equality with the value 0. As they are not equal, the final result is 0 or FALSE. This result is put on the stack. The same logic applies had x been FALSE. 0 would have been put on the stack and checked for equality with the other 0. Since they match, the final answer would be TRUE.
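The same trick can be written out in C# by comparing with false instead of using the ! operator. This is a minimal sketch; the names NotDemo and NotViaCompare are hypothetical.

```csharp
using System;

static class NotDemo
{
    // Spells out what ldc.i4.0 followed by ceq does: compare with FALSE.
    public static bool NotViaCompare(bool x)
    {
        return x == false;
    }

    static void Main()
    {
        Console.WriteLine(!true);                  // False
        Console.WriteLine(NotViaCompare(true));    // False
        Console.WriteLine(!false);                 // True
        Console.WriteLine(NotViaCompare(false));   // True
    }
}
```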