-3-
Selection and Repetition
In IL, a label is a name followed by a colon, i.e.
":". It marks a position in the code, which gives us the ability to jump
to it from another part of the code. We have constantly been seeing labels in the
IL code generated by the disassembler. For example:
IL_0000: ldstr "hi"
IL_0005: call void
[mscorlib]System.Console::WriteLine(class System.String)
IL_000a: call void zzz::abc()
IL_000f: ret
The words preceding the colon are labels. In the program given
below, we have created a label called a2 in the abc function. The instruction
br facilitates the jumping to any label in the program, whenever desired.
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 V_0,class zzz V_1)
newobj instance void zzz::.ctor()
stloc.1
call int32 zzz::abc()
stloc.0
ldloc.0
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
.method private hidebysig static int32 abc() il managed
{
.locals (int32 V_0)
ldc.i4.s 20
br.s a2
ldc.i4.s 30
a2: ret
}
}
Output
20
The function abc demonstrates this concept. In this function,
the br.s instruction makes the code bypass the instruction ldc.i4.s 30.
Therefore, the return value is displayed as 20, and not 30. Thus, IL uses
the br instruction to jump unconditionally to a label elsewhere in the
method. (The br instruction takes a 4-byte branch offset as its operand,
whereas its short form, br.s, takes a 1-byte offset. The same explanation
applies to every instruction tagged with .s: the short form takes a 1-byte
operand.)
The br instruction is one of the key pivots on which IL
revolves.
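The same control flow can be mimicked in C# with a goto, which may make the effect of br easier to see. The sketch below is a hand-written analogue of abc, not compiler output; the names BranchSketch, Abc and top are illustrative and do not appear in the original program.

```csharp
using System;

// Illustrative sketch: class and member names are assumptions, not from the text.
static class BranchSketch
{
    // Hand-written analogue of the IL function abc.
    public static int Abc()
    {
        int top = 20;   // ldc.i4.s 20
        goto a2;        // br.s a2
        top = 30;       // ldc.i4.s 30: never reached (the compiler warns about this)
    a2:
        return top;     // a2: ret
    }

    public static void Main()
    {
        Console.WriteLine(Abc());   // prints 20
    }
}
```

The assignment of 30 is skipped in exactly the way ldc.i4.s 30 is skipped in the IL listing.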
a.cs
class zzz
{
static bool i = true;
public static void Main()
{
if (i)
System.Console.WriteLine("hi");
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld bool zzz::i
brfalse.s IL_0011
ldstr "hi"
call void
[mscorlib]System.Console::WriteLine(class System.String)
IL_0011: ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.1
stsfld bool zzz::i
ret
}
}
Output
hi
We have initialized the static variable to the value true in
our C# program.
• Static variables, if they are fields,
are initialized in the static constructor .cctor. This is shown in the above
example.
• Local variables, on the other hand, are
initialized in the function that they are present in.
Here, surprisingly, the value 1 is placed on the stack in the
static constructor using the ldc.i4.1 instruction. Even though the field i is
defined to be of type bool in both C# and IL, there is no sign of the values
true or false.
Next, stsfld is used to initialize the static variable i to the
value 1, even though the variable is of the type bool. This shows that IL
supports the concept of a data type called bool, but does not recognise the
words true and false. Thus, in IL, the bool values true and false are simply
represented by the numbers 1 and 0 respectively.
The keywords true and false are artefacts introduced by
C# to make the life of programmers easier. Since IL does not support these
artefacts directly, it uses the numbers 1 and 0 instead.
The instruction ldsfld places the value of a static variable on
the stack. The brfalse instruction removes the value at the top of the stack
and examines it. A non-zero value such as 1 is interpreted as TRUE, and the
value 0 is interpreted as FALSE.
In this example, the value it finds on the stack is 1, or TRUE,
and hence it does not jump to the label IL_0011. The labels in the
disassembled code are generated by ildasm, which names each one IL_ followed
by the offset of the instruction in hexadecimal.
The instruction brfalse means "jump to the label if
FALSE". This differs from br, which always results in a jump. Thus,
brfalse is called a conditional jump instruction.
There is no instruction in IL that provides the functionality
of the if statement. The if statement of C# gets converted to branch
instructions in IL. None of the assemblers that we have worked with supports
high-level concepts like the if construct.
What we have just learnt shows why it is worth gaining mastery
over IL: it helps one to differentiate between the concepts that are a part
of IL and those that have been introduced by the designers of the
programming languages.
It is significant to note that if a feature cannot be expressed
in terms of what IL supports, it cannot be implemented in any .NET
programming language. Thus, the importance of familiarising oneself with the
various concepts that IL supports cannot be overemphasised.
a.cs
class zzz
{
static bool i = true;
public static void Main()
{
if (i)
System.Console.WriteLine("hi");
else
System.Console.WriteLine("false");
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld bool zzz::i
brfalse.s IL_0013
ldstr "hi"
call void [mscorlib]System.Console::WriteLine(class
System.String)
br.s IL_001d
IL_0013: ldstr "false"
call void [mscorlib]System.Console::WriteLine(class
System.String)
IL_001d: ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.1
stsfld bool zzz::i
ret
}
}
Output
hi
An if-else statement is extremely simple to comprehend in a
programming language, but it is considerably harder to follow in IL. The
brfalse instruction checks whether the value on the stack is 1 or 0.
• If the value on the stack is 1, as in
this case, it calls the WriteLine function with the parameter "hi",
and then jumps to the label IL_001d using the unconditional jump instruction
br.
• If the value on the stack is 0, the
code jumps to IL_0013 and the WriteLine function prints false.
Thus, to implement an if-else construct in IL, a conditional
and unconditional jump are required. The complexity of the IL code increases
dramatically if we use multiple if-else statements.
You can now appreciate the intelligence level of the people who
write compilers.
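To make the pairing of a conditional and an unconditional jump concrete, the if-else can be lowered by hand in C# using goto. This is only a sketch that mirrors the shape of the IL above; the names IfElseSketch, Choose and text are illustrative, and the method returns the string instead of printing it so that both paths are easy to check.

```csharp
using System;

// Illustrative sketch: class and member names are assumptions, not from the text.
static class IfElseSketch
{
    // Hand-lowered form of: if (i) "hi" else "false".
    public static string Choose(bool i)
    {
        string text = "";
        if (!i) goto IL_0013;   // brfalse.s IL_0013 (conditional jump)
        text = "hi";            // ldstr "hi"
        goto IL_001d;           // br.s IL_001d (unconditional jump over the else part)
    IL_0013:
        text = "false";         // ldstr "false"
    IL_001d:
        return text;            // the IL listing prints the string before its ret
    }

    public static void Main()
    {
        Console.WriteLine(Choose(true));    // prints hi
        Console.WriteLine(Choose(false));   // prints false
    }
}
```

Without the unconditional goto IL_001d, the true path would fall through into the else part as well.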
a.cs
class zzz
{
public static void Main()
{
}
void abc( bool a)
{
if (a)
{
int i = 0;
}
if ( a)
{
int i = 3;
}
}
}
a.il
.assembly mukhi {}
.class public auto ansi zzz extends [mscorlib]System.Object
{
.field private int32 x
.method public hidebysig static void vijay() il managed
{
.entrypoint
ret
}
.method private hidebysig instance void abc(bool a) il managed
{
.locals (int32 V_0,int32 V_1)
ldarg.1
brfalse.s IL_0005
ldc.i4.0
stloc.0
IL_0005: ldarg.1
brfalse.s IL_000a
ldc.i4.3
stloc.1
IL_000a: ret
}
}
The C# programming language can complicate life. In an inner
set of braces, we cannot create a variable that has already been created in
an outer set. The above C# program is nevertheless correct, since the two
sets of braces are at the same level.
In IL, life is comparatively hassle free. The two i's become
two separate variables, V_0 and V_1. Thus, IL does not impose any such
restrictions on variables.
a.cs
class zzz
{
static bool i = true;
public static void Main()
{
while (i)
{
System.Console.WriteLine("hi");
}
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.method public hidebysig static void vijay() il managed
{
.entrypoint
br.s IL_000c
IL_0002: ldstr "hi"
call void [mscorlib]System.Console::WriteLine(class
System.String)
IL_000c: ldsfld bool zzz::i
brtrue.s IL_0002
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.1
stsfld bool zzz::i
ret
}
}
On seeing the disassembled code, you will understand why
programmers do not write IL code for a
living. Even a simple while loop gets converted into IL code of considerable
complexity.
For a while construct, an unconditional jump is first made to the
label IL_000c, which is near the end of the function. Here, the value of
the static variable i is loaded on the stack.
The next instruction, brtrue, does the reverse of what the
instruction brfalse does. It is implemented as follows:
• If the uppermost value on the stack,
i.e. the value of the field i, is 1, it jumps to label IL_0002. Then the value
"hi" is put on the stack and the WriteLine function is called.
• If the stack value is 0, the program
will jump to the ret instruction.
The above program, as you may have noticed, does not intend to
stop. It continues to flow like a perennial stream of water originating from a
gigantic glacier.
a.cs
class zzz
{
static int i = 2;
public static void Main()
{
i = i + 3;
System.Console.WriteLine(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld int32 zzz::i
ldc.i4.3
add
stsfld int32 zzz::i
ldsfld int32 zzz::i
call void [mscorlib]System.Console::WriteLine(int32)
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.2
stsfld int32 zzz::i
ret
}
}
Output
5
IL does not have an operator symbol such as + for adding two numbers. The add
instruction has to be used instead.
The add instruction requires the two numbers to be added to be
first made available on the stack. Therefore, the ldsfld instruction places the
value of the static variable i on the stack, and the ldc.i4.3 instruction
places the constant value 3 on it. The add instruction then adds them up and
places the resultant sum on the stack. It also removes the two numbers that
were used in the addition from the stack.
Most instructions in IL remove from the stack the parameters that were
placed there for the instruction to operate upon, once the instruction
has been executed.
The instruction stsfld is used to store the resultant sum of the
addition in the static variable i. The rest of the code simply
displays the value of the variable i.
There is no equivalent for the ++ operator in IL. It gets
converted to the instruction ldc.i4.1 followed by add.
In the same vein, to multiply two numbers, the mul instruction is used, to
subtract, sub is used, and so on. Each arithmetic operator has its equivalent instruction in IL, and the rest of the code remains the same.
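The way add consumes its two operands and leaves only the sum behind can be traced with an explicit stack. The following is a minimal simulation in C#, assuming a Stack<int> stands in for the IL evaluation stack; the names AddSketch and AddThree are illustrative and are not part of the original program.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: a Stack<int> stands in for the IL evaluation stack.
static class AddSketch
{
    // Simulates: ldsfld i / ldc.i4.3 / add / stsfld i.
    public static int AddThree(int i)
    {
        var stack = new Stack<int>();
        stack.Push(i);              // ldsfld int32 zzz::i
        stack.Push(3);              // ldc.i4.3
        int right = stack.Pop();    // add removes both operands...
        int left = stack.Pop();
        stack.Push(left + right);   // ...and pushes their sum
        int result = stack.Pop();   // stsfld removes the sum and stores it in the field
        if (stack.Count != 0) throw new InvalidOperationException("stack should be empty");
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(AddThree(2));   // prints 5
    }
}
```

After the last step the simulated stack is empty again, which matches the observation that instructions remove the values they operate upon.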
a.cs
class zzz {
static bool i;
static int j = 19;
public static void Main() {
i = j > 16;
System.Console.WriteLine(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.field private static int32 j
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld int32 zzz::j
ldc.i4.s 16
cgt
stsfld bool zzz::i
ldsfld bool zzz::i
call void [mscorlib]System.Console::WriteLine(bool)
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 19
stsfld int32 zzz::j
ret
}
}
Output
True
We shall now delve into how IL handles the comparison
operators. Let us consider the expression j > 16 in C#. IL first pushes the
value of j on the stack, followed by the constant value 16. It then executes
the instruction cgt, which is being introduced for the first time in our
source code. This instruction checks whether the first value pushed on the
stack is larger than the second. If so, it puts the value 1 (TRUE) on the
stack, or else it puts the value 0 (FALSE) on the stack. This
value is then stored in the variable i. Since the WriteLine function that
accepts a bool is called, we see True displayed.
In the same vein, the < operator gets converted to the
instruction clt, which checks whether the first value pushed on the stack is
smaller than the second. Thus, we can see that IL
has its own set of comparison instructions to internally handle the basic
comparison operations.
a.cs
class zzz
{
static bool i;
static int j = 19;
public static void Main()
{
i = j == 16;
System.Console.WriteLine(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.field private static int32 j
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld int32 zzz::j
ldc.i4.s 16
ceq
stsfld bool zzz::i
ldsfld bool zzz::i
call void [mscorlib]System.Console::WriteLine(bool)
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 19
stsfld int32 zzz::j
ret
}
}
Output
False
The operator == is the equality operator. It also needs the two
operands that are to be checked for equality to be placed on the stack. The
ceq instruction is then used to check for equality. If they are equal, it
places the value 1 (TRUE) on the stack, and if they are not equal, it places
the value 0 (FALSE) on the stack. The ceq instruction is an integral part of
the comparison instruction set of IL.
a.cs
class zzz
{
static bool i;
static int j = 19;
public static void Main()
{
i = j <= 16;
System.Console.WriteLine(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.field private static int32 j
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld int32 zzz::j
ldc.i4.s 16
cgt
ldc.i4.0
ceq
stsfld bool zzz::i
ldsfld bool zzz::i
call void [mscorlib]System.Console::WriteLine(bool)
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 19
stsfld int32 zzz::j
ret
}
}
Output
False
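The implementation of the "less than or equal to"
(i.e. <=) and the "greater than or equal to" (i.e. >=) operators
is a little more complex. IL has no single comparison instruction for either
of them, so each one is computed as the reverse of the opposite comparison.
The sequence cgt, ldc.i4.0, ceq in the IL code above computes <=, which is
why the output is False for 19 and 16. IL first uses the cgt instruction to
check if the first number is greater than the second one. If so, it places
the value 1 on the stack, or else the value 0. The ceq instruction then
compares this result with 0, which reverses it: the final value is TRUE when
the first number is not greater than the second, i.e. when it is less than
or equal to it. The >= operator is handled in the same way, with clt in
place of cgt.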
Let us try to decipher the above IL code from a slightly
different perspective. We are comparing the value 19 with 16. In this case, the
instruction cgt will put the value 1 on the stack since 19 is greater than 16.
The value 0 is put on the stack using the instruction ldc.
The ceq will compare the value 1 returned by the instruction
cgt and the value 0 that was put on the stack by the instruction ldc. Since
these two values are not equal, ceq will return 0 or FALSE on the stack.
Let us change the value of the field j in the static
constructor to 1. Now, since the number 1 is not greater than 16, the cgt
instruction will place the value FALSE or 0 on the stack. Thereafter, another 0
is placed on the stack by the ldc instruction. Now, when the instruction ceq
compares the two values, since they are both 0, it returns TRUE.
Now, if we change the value of j to 16, the cgt instruction
will return a FALSE because 16 is not greater than 16. Thereafter, since the
value of 0 is placed on the stack by the instruction ldc, both the values
passed to the instruction ceq will be
0. Since a 0 is equal to a 0, the value returned will be 1 or TRUE.
If you have not understood the above explanation, remove the
lines ldc.i4.0 and ceq from the source code and observe the output.
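The three cases traced above (19, 1 and 16 compared with 16) can also be checked mechanically. The sketch below models cgt and ceq as small C# functions that return 1 or 0; the names LessOrEqualSketch, Cgt, Ceq and NotGreater are illustrative helpers and not part of IL or of the original program.

```csharp
using System;

// Illustrative sketch: Cgt and Ceq model the IL instructions as plain functions.
static class LessOrEqualSketch
{
    // cgt: 1 if the first value is greater than the second, otherwise 0.
    static int Cgt(int a, int b) { return a > b ? 1 : 0; }

    // ceq: 1 if the two values are equal, otherwise 0.
    static int Ceq(int a, int b) { return a == b ? 1 : 0; }

    // The sequence cgt, ldc.i4.0, ceq: 1 when j is not greater than limit.
    public static int NotGreater(int j, int limit)
    {
        return Ceq(Cgt(j, limit), 0);
    }

    public static void Main()
    {
        Console.WriteLine(NotGreater(19, 16));   // prints 0
        Console.WriteLine(NotGreater(1, 16));    // prints 1
        Console.WriteLine(NotGreater(16, 16));   // prints 1
    }
}
```

For integers, "not greater than" is the same as "less than or equal to", which is why comparing the result of cgt with 0 is enough.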
a.cs
class zzz
{
static bool i;
static int j = 19;
public static void Main()
{
i = j != 16;
System.Console.WriteLine(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static bool i
.field private static int32 j
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldsfld int32 zzz::j
ldc.i4.s 16
ceq
ldc.i4.0
ceq
stsfld bool zzz::i
ldsfld bool zzz::i
call void [mscorlib]System.Console::WriteLine(bool)
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 19
stsfld int32 zzz::j
ret
}
}
Output
True
The "not equal to" operator i.e. != is the reverse of
==. It uses two ceq instructions. The first ceq instruction is used to check
whether the values on the stack are equal. If they are equal, it returns TRUE;
if they are not equal, it returns FALSE.
The second ceq compares the result of the earlier ceq with a
FALSE. If the result of the first ceq is TRUE, the final answer is FALSE and
vice versa.
This is truly an ingenious way of negating a value !
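The double ceq can be tried out in the same spirit. This is a small sketch that models ceq as a C# function returning 1 or 0; NotEqualSketch, Ceq and NotEqual are illustrative names, not taken from the text.

```csharp
using System;

// Illustrative sketch: Ceq models the IL instruction as a plain function.
static class NotEqualSketch
{
    // ceq: 1 if the two values are equal, otherwise 0.
    static int Ceq(int a, int b) { return a == b ? 1 : 0; }

    // j != k expressed as: ceq, ldc.i4.0, ceq.
    public static int NotEqual(int j, int k)
    {
        int equal = Ceq(j, k);   // first ceq: 1 if the values are equal
        return Ceq(equal, 0);    // second ceq against 0 flips the result
    }

    public static void Main()
    {
        Console.WriteLine(NotEqual(19, 16));   // prints 1
        Console.WriteLine(NotEqual(16, 16));   // prints 0
    }
}
```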
a.cs
class zzz
{
static int i = 1;
public static void Main()
{
while ( i <= 2)
{
System.Console.WriteLine(i);
i++;
}
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
br.s IL_0018
IL_0002: ldsfld int32 zzz::i
call void [mscorlib]System.Console::WriteLine(int32)
ldsfld int32 zzz::i
ldc.i4.1
add
stsfld int32 zzz::i
IL_0018: ldsfld int32 zzz::i
ldc.i4.2
ble.s IL_0002
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 1
stsfld int32 zzz::i
ret
}
}
Output
1
2
We shall now refocus on the while loop after the slight
digression into comparison operators. This diversion was essential because we
use conditions in loops such as the while loop. A while loop
containing a condition is slightly more complex.
Let us go straight to label IL_0018, which is near the end of the
vijay function in the IL code. The condition is present here. The value of i
(i.e. 1) is loaded on the stack. Next, the constant 2 is placed on the stack.
If you revisit the C# code, the condition in the while
statement is i <= 2. The instruction ble.s does the work of the two
instructions cgt and brfalse. It checks whether the first value, i.e. the
variable i, is less than or equal to the second. If so, it instructs the
program to jump to the label IL_0002. If not, the program moves on to the next
instruction.
Thus, instructions like ble make our life simpler because we do
not have to use the instructions cgt and brfalse anymore.
In C#, the condition of a while construct is present at the top,
but in IL, the code of the condition is present at the bottom. On conversion to
IL, the code to be executed for the duration of the while construct is placed
above the code for the condition.
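This layout, with the body on top and the condition at the bottom, can be reproduced by hand in C# using labels and goto. The sketch below follows the shape of the IL listing but is not compiler output; the names WhileSketch, Run and printed are illustrative, and the values are collected in a list instead of being printed one by one.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: class and member names are assumptions, not from the text.
static class WhileSketch
{
    // Hand-lowered form of: while (i <= 2) { print i; i++; }
    public static List<int> Run(int i)
    {
        var printed = new List<int>();
        goto IL_0018;                 // br.s IL_0018: go straight to the condition
    IL_0002:
        printed.Add(i);               // call WriteLine(int32)
        i = i + 1;                    // ldc.i4.1, add, stsfld
    IL_0018:
        if (i <= 2) goto IL_0002;     // ble.s IL_0002
        return printed;               // ret
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run(1)));   // prints 1,2
    }
}
```

Starting with a value above 2 returns an empty list, because the initial jump reaches the condition before the body has run even once.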
a.cs
class zzz
{
static int i = 1;
public static void Main()
{
for ( i = 1; i <= 2 ; i++)
{
System.Console.WriteLine(i);
}
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends System.Object
{
.field private static int32 i
.method public hidebysig static void vijay() il managed
{
.entrypoint
ldc.i4.1
stsfld int32 zzz::i
br.s IL_001e
IL_0008: ldsfld int32 zzz::i
call void
[mscorlib]System.Console::WriteLine(int32)
ldsfld int32 zzz::i
ldc.i4.1
add
stsfld int32 zzz::i
IL_001e: ldsfld int32 zzz::i
ldc.i4.2
ble.s IL_0008
ret
}
.method public hidebysig specialname rtspecialname static void
.cctor() il managed
{
ldc.i4.s 1
stsfld int32 zzz::i
ret
}
}
Output
1
2
It has often been repeated that the while and the for constructs
provide the same functionality and can be interchanged.
In the for loop, the code up to the first semicolon is to be
executed only once. Hence, the code that initialises the variable i is placed
outside the loop. Then, we unconditionally jump to label IL_001e to check
whether the value of i is less than or equal to 2. If TRUE, the code jumps to
label IL_0008, which is the beginning of the body of the for statement.
The value of i is printed using the WriteLine function.
Thereafter, the value of the variable i is increased by one and the condition
is checked once again.
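The interchangeability of for and while can be checked at the C# level by writing the same loop both ways. This is a small sketch; ForVersusWhileSketch, WithFor and WithWhile are illustrative names, and the values are collected in a list so the two results can be compared.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: class and member names are assumptions, not from the text.
static class ForVersusWhileSketch
{
    public static List<int> WithFor()
    {
        var printed = new List<int>();
        for (int i = 1; i <= 2; i++)
        {
            printed.Add(i);
        }
        return printed;
    }

    public static List<int> WithWhile()
    {
        var printed = new List<int>();
        int i = 1;              // the part before the first semicolon, run once
        while (i <= 2)          // the condition
        {
            printed.Add(i);
            i++;                // the part after the second semicolon
        }
        return printed;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithFor()));     // prints 1,2
        Console.WriteLine(string.Join(",", WithWhile()));   // prints 1,2
    }
}
```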
a.cs
public class zzz
{
public static void Main()
{
int i;
i = 1;
while ( i <= 2)
{
System.Console.Write(i);
i++;
}
i = 1;
do
{
System.Console.Write(i);
i++;
} while ( i <= 2);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed {
.entrypoint
.locals (int32 V_0)
ldc.i4.1
stloc.0
br.s IL_000e
IL_0004: ldloc.0
call void
[mscorlib]System.Console::Write(int32)
ldloc.0
ldc.i4.1
add
stloc.0
IL_000e: ldloc.0
ldc.i4.2
ble.s IL_0004
ldc.i4.1
stloc.0
IL_0014: ldloc.0
call void
[mscorlib]System.Console::Write(int32)
ldloc.0
ldc.i4.1
add
stloc.0
ldloc.0
ldc.i4.2
ble.s IL_0014
ret
}
}
Output
1212
The difference between a do while and a while in a C# program
lies in the position at which the condition gets checked.
• In a do while, the condition gets
checked at the end of the loop. This means that the code contained in it will
get called at least once.
• In a while, the condition is checked at
the beginning of the loop. Hence, the code may never ever get executed.
In either case, we place the value 1 on the stack and
initialise the variable i, i.e. V_0.
• In the while loop, we first jump to
label IL_000e where the condition checked is whether the variable is "less
than or equal to 2". If TRUE, we
jump to Label IL_0004.
• In the do while loop, first the Write
function is called and then, the rest of the code contained in the {} braces is
executed. On reaching the last line of the code within the braces, the
condition is checked.
Thus, it is easier to write a do-while loop in IL than a while
loop, since the condition is a simple check at the end of the loop.
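The only structural difference between the two loops in IL is the initial jump to the condition. The sketch below lowers both forms by hand with goto and counts how often the body runs; LoopEntrySketch and its methods are illustrative names, and the starting value is passed in as a parameter so that the case where the condition is false from the start can be tried.

```csharp
using System;

// Illustrative sketch: class and member names are assumptions, not from the text.
static class LoopEntrySketch
{
    // while (i <= 2) { count++; i++; } lowered: starts with a jump to the condition.
    public static int WhileBodyRuns(int i)
    {
        int count = 0;
        goto condition;              // br.s to the condition
    body:
        count++;
        i++;
    condition:
        if (i <= 2) goto body;       // ble.s back to the body
        return count;
    }

    // do { count++; i++; } while (i <= 2); lowered: no initial jump.
    public static int DoWhileBodyRuns(int i)
    {
        int count = 0;
    body:
        count++;
        i++;
        if (i <= 2) goto body;       // ble.s back to the body
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(WhileBodyRuns(5));     // prints 0
        Console.WriteLine(DoWhileBodyRuns(5));   // prints 1
        Console.WriteLine(WhileBodyRuns(1));     // prints 2
        Console.WriteLine(DoWhileBodyRuns(1));   // prints 2
    }
}
```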
a.cs
public class zzz
{
public static void Main() {
int i ;
for ( i = 1; i<= 10 ; i++)
{
if ( i == 2)
break;
System.Console.WriteLine(i);
}
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 V_0)
ldc.i4.1
stloc.0
br.s IL_0014
IL_0004: ldloc.0
ldc.i4.2
bne.un.s IL_000a
br.s IL_0019
IL_000a: ldloc.0
call void
[mscorlib]System.Console::WriteLine(int32)
ldloc.0
ldc.i4.1
add
stloc.0
IL_0014: ldloc.0
ldc.i4.s 10
ble.s IL_0004
IL_0019: ret
}
}
Output
1
A break statement facilitates an exit from a for loop, while
loop, do-while loop etc.
As usual, we jump to the label IL_0014, where the value of the
variable V_0, i.e. i, is placed on the stack. Then, we place the limit value 10
on the stack and check whether i is less than or equal to 10, using the
instruction ble.s.
If it is, we get into the loop at label IL_0004. We
again place the value of the variable i on the stack, followed by the value 2
from the if statement. Then, we use the bne.un.s instruction, which does the
work of the ceq and the brfalse instructions: it jumps to label IL_000a if the
two values are not equal.
If the variable V_0 is equal to 2, no jump takes place, and the
next instruction, br.s, implements the break statement by jumping out of the
loop to the ret instruction at label IL_0019.
a.cs
public class zzz
{
public static void Main()
{
int i ;
for ( i = 1; i<= 10 ; i++)
{
if ( i == 2)
continue;
System.Console.WriteLine(i);
}
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 V_0)
ldc.i4.1
stloc.0
br.s IL_0014
IL_0004: ldloc.0
ldc.i4.2
bne.un.s IL_000a
br.s IL_0010
IL_000a: ldloc.0
call void
[mscorlib]System.Console::WriteLine(int32)
IL_0010: ldloc.0
ldc.i4.1
add
stloc.0
IL_0014: ldloc.0
ldc.i4.s 10
ble.s IL_0004
ret
}
}
A continue statement takes control to the end of the body of the for loop.
When the condition of the if statement is true, the program jumps to the end
of the loop body, bypassing the WriteLine function. The code then resumes
execution at label IL_0010, where the value of the variable V_0 is incremented by 1.
The main difference between the break and the continue
statements is as follows:
• In a break statement, the program
jumps out of the loop.
• In a continue statement, the program
jumps to the end of the loop, bypassing the remaining statements.
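The two jump targets can be compared side by side by lowering both loops by hand with goto. The sketch below mirrors the shape of the two IL listings but is not compiler output; the names BreakContinueSketch, WithBreak and WithContinue are illustrative, and the upper limit is a hypothetical parameter (the programs above use 10).

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: names and the limit parameter are assumptions, not from the text.
static class BreakContinueSketch
{
    // for (i = 1; i <= limit; i++) { if (i == 2) break; print i; }
    public static List<int> WithBreak(int limit)
    {
        var printed = new List<int>();
        int i = 1;
        goto condition;
    body:
        if (i != 2) goto print;      // bne.un.s: skip the break when i is not 2
        goto exit;                   // br.s: break jumps past the loop
    print:
        printed.Add(i);
        i++;
    condition:
        if (i <= limit) goto body;   // ble.s
    exit:
        return printed;
    }

    // The same loop with continue in place of break.
    public static List<int> WithContinue(int limit)
    {
        var printed = new List<int>();
        int i = 1;
        goto condition;
    body:
        if (i != 2) goto print;      // bne.un.s
        goto increment;              // br.s: continue jumps to the increment
    print:
        printed.Add(i);
    increment:
        i++;
    condition:
        if (i <= limit) goto body;   // ble.s
        return printed;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithBreak(4)));      // prints 1
        Console.WriteLine(string.Join(",", WithContinue(4)));   // prints 1,3,4
    }
}
```

The two methods differ only in the label that the unconditional jump targets: the end of the method for break, and the increment step for continue.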
A goto statement could also have been used to achieve the same
functionality. Thus, the break, continue and goto statements, on conversion to
IL, are all transformed into the same br instruction.
The following program demonstrates this for the goto statement.
a.cs
public class zzz {
public static void Main()
{
goto aa;
aa: ;
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
br.s IL_0002
IL_0002: ret
}
}
A simple goto statement in C# is translated into a br
instruction in IL. Using a goto is frowned upon in languages like
C#, but its equivalent br instruction in IL is extensively utilised for
implementing various constructs like the if statement, loops etc. Thus, what is
taboo in a programming language is extremely useful in IL.
a.cs
public class zzz
{
public static void Main()
{
int j;
for ( int i = 1; i <= 2 ; i++)
System.Console.Write(i);
}
}
a.il
.assembly mukhi {}
.class private auto ansi zzz extends [mscorlib]System.Object
{
.method public hidebysig static void vijay() il managed
{
.entrypoint
.locals (int32 V_0,int32 V_1)
ldc.i4.1
stloc.1
br.s IL_000e
IL_0004: ldloc.1
call void
[mscorlib]System.Console::Write(int32)
ldloc.1
ldc.i4.1
add
stloc.1
IL_000e: ldloc.1
ldc.i4.2
ble.s IL_0004
ret
}
}
Output
12
This example illustrates a for statement. We have created a
variable j in the function Main and a variable i in the for statement. In C#,
this variable i is visible only within the for loop. Thus, it has a
limited scope.
But on conversion to IL, all variables are given the same scope: j becomes
V_0 and i becomes V_1 in the same .locals directive. This is because the
concept of variable scoping is alien to IL. It is therefore up to the C#
compiler to enforce the rules of variable scoping. We can conclude that all
the local variables of a function have the same scope or visibility in IL.