Linux – ELF – File Format
To write an ELF file virus, one must
have a complete understanding of the ELF file format. ELF stands for Executable
and Linking Format. Linux uses this format for its executable files, in the
same way that Windows uses the Portable Executable (PE) file format.
Our very first program, b.c, has no main
function, only a simple function abc.
b.c
abc()
{
}
gcc -c b.c
ls -l b.o
-rw-r--r-- 1 root root 728 Nov 14 13:14 b.o
The -c option tells gcc to compile only and
not to call the linker, which it would otherwise do by default. Hence the
output is the object file b.o. Once this object file is created, we write a
program that reads the contents of this file.
a.c
#include <stdio.h>
main()
{
FILE *fp;
fp = fopen("b.o","r");
printf("%x ",fgetc(fp));
printf("%c",fgetc(fp));
printf("%c",fgetc(fp));
printf("%c",fgetc(fp));
printf("\n");
}
gcc a.c -o a
./a
7f ELF
The object file b.o is opened in
read mode using the trustworthy fopen function. The first byte retrieved
using the fgetc function is printed in hex, which gives 7f, and the next three
are printed as chars. The byte 0x7f followed by the chars ELF is what every ELF
file begins with; this is called the magic number.
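The magic number test can also be packaged as a small helper. The following is only a sketch: the function names has_elf_magic and file_is_elf are made up for this illustration and do not belong to any library.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if buf starts with the bytes 0x7f 'E' 'L' 'F', else 0.
   has_elf_magic is a hypothetical helper name. */
int has_elf_magic(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    if (len < sizeof magic)
        return 0;
    return memcmp(buf, magic, sizeof magic) == 0;
}

/* Reads the first four bytes of the named file and tests them.
   Returns 1 for an ELF file, 0 otherwise, -1 if the file cannot be opened. */
int file_is_elf(const char *path)
{
    unsigned char buf[4];
    size_t n;
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    n = fread(buf, 1, sizeof buf, fp);
    fclose(fp);
    return has_elf_magic(buf, n);
}
```

Calling file_is_elf("b.o") on the object file created above should return 1, while a plain text file should give 0.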
a.c
#include <stdio.h>
main()
{
FILE *fp;
fp = fopen("b.o","r");
fseek(fp,4,0);
printf("%d ",fgetc(fp));
printf("%d ",fgetc(fp));
printf("%d ",fgetc(fp));
printf("\n");
}
Output
1 1 1
This program displays the next 3 characters
following the magic number. The fseek function leaps 4 bytes ahead to jump to
the fifth byte to pick up the next 3 numbers. The output displays their values
as 1.
The first of the 3 numbers is the class
and can have one of two values, 1 for a 32-bit object and 2 for a 64-bit
object. As we are not rich enough to use 64 bits, the fifth byte is 1. The
sixth byte is for the endianness, where 1 specifies LSB or little endian and
2 specifies MSB or big endian. This byte decides whether the high byte is
stored first or last. On little endian machines like the Intel family, the low
byte comes first, followed by the high byte. The seventh byte is the version
number, which is always 1.
a.c
#include <stdio.h>
char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};
char *endian[] = {"Invalid", "Little Endian","Big Endian"};
char *version[] = {"Invalid Version","Current Version"};
FILE *fp;
main()
{
int i;
fp = fopen("b.o","r");
fseek(fp,4,0);
i = fgetc(fp);
printf("%s \n",class[i]);
i = fgetc(fp);
printf("%s \n",endian[i]);
i = fgetc(fp);
printf("%s \n",version[i]);
}
Output
32-bit objects
Little Endian
Current Version
This program puts the above numbers into
words. A byte read from disk is used as an index into an array of strings, and
the selected string is then printed out using the printf function.
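Since the byte comes straight from the file, a damaged file could hold a value larger than the array. A minimal sketch of a bounds-checked lookup follows; the names lookup_name, elf_class_name and class_names are invented for this example, and the "Unknown" fallback string is an assumption, not something the format defines.

```c
#include <stddef.h>

static const char *const class_names[] = {
    "Invalid Class", "32-bit objects", "64-bit objects"
};

/* Returns names[value] when value is a valid index, else a fallback string. */
const char *lookup_name(const char *const *names, size_t count, unsigned value)
{
    return value < count ? names[value] : "Unknown";
}

/* Translates the fifth byte of the file (the class) into a string. */
const char *elf_class_name(unsigned char ei_class)
{
    return lookup_name(class_names,
                       sizeof class_names / sizeof class_names[0],
                       ei_class);
}
```

The same lookup_name helper could serve the endian and version arrays as well.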
a.c
#include <stdio.h>
struct elfhdr
{
char e_ident[16];
short int e_type;
short int e_machine;
int e_version;
unsigned int e_entry; /* 4-byte address; a pointer would be 8 bytes on a 64-bit host */
int e_phoff;
int e_shoff;
int e_flags;
short int e_ehsize;
short int e_phentsize;
short int e_phnum;
short int e_shentsize;
short int e_shnum;
short int e_shstrndx;
};
struct elfhdr e;
char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};
char *endian[] = {"Invalid", "Little Endian","Big Endian"};
char *version[] = {"Invalid Version","Current Version"};
char *filetype[] =
{"None","Rel Obj","Executable","Dynamic","Core","Num"};
char *machine[] =
{"None","WE32100","SPARC","80386","MK68000","MK88000","80486","80860","MIPS"};
FILE *fp;
main()
{
fp = fopen("b.o","r");
fread(&e,52,1,fp);
printf("%x %c%c%c %s %s %s\n",e.e_ident[0],e.e_ident[1],e.e_ident[2],
e.e_ident[3],class[e.e_ident[4]],endian[e.e_ident[5]],version[e.e_ident[6]]);
printf("Type :%d %s \n",e.e_type,filetype[e.e_type]);
printf("Machine :%d %s \n",e.e_machine,machine[e.e_machine]);
printf("Version :%d %s \n",e.e_version,version[e.e_version]);
printf("Entry Point :%x\n",e.e_entry);
printf("Size of Initial Structure :%d\n",e.e_ehsize);
printf("Flags :%d\n",e.e_flags);
printf("Program Header Offset :%d\n",e.e_phoff);
printf("PHO : Structure Size %d No. of Entries %d\n",e.e_phentsize,e.e_phnum);
printf("Section Header Offset :%d\n",e.e_shoff);
printf("SHO: Structure Size %d No. of Entries %d\n",e.e_shentsize,e.e_shnum);
printf("String Table Section Number :%d\n",e.e_shstrndx);
}
Output
7f ELF 32-bit objects Little Endian Current Version
Type :1 Rel Obj
Machine :3 80386
Version :1 Current Version
Entry Point :0
Size of Initial Structure :52
Flags :0
Program Header Offset :0
PHO : Structure Size 0 No. of Entries 0
Section Header Offset :164
SHO: Structure Size 40 No. of Entries 8
String Table Section Number :5
The first 52 bytes are read into a
structure variable named e. This structure maps the initial bytes of an
ELF file. The first seven bytes have been explained and the next 9 are pad
bytes, hence all 0's. The 17th and 18th bytes are for the file type, where 1
stands for an object file, 2 for an executable file, 3 for a shared library
(the counterpart of a dll) and 4 for a core file. The next short is for the
machine, where 3 is for an Intel 386. The arrays of strings are used to produce
more readable output. This is followed by an int (four bytes) for the version;
ours is 1, the current version.
The next four bytes hold the address where the first executable
instruction begins in memory. As we are dealing with an object file, it is 0.
Then there are two important tables, the program header table and the section
header table. The next two ints give their offsets in the file: the program
header offset is 0, as an object file has none, and the section header offset
is 164. The next int is for processor-specific flags, which is 0. The next
short gives the size of the ELF header, which is 52 bytes.
The next two shorts are the size of one
program header entry and the number of entries. As there are no program
headers, both values are zero. They are followed by the size and number of
entries of the section header table.
Each section header is 40 bytes and there are 8 entries. The
last member is the section number of the string table, which will be attended
to in greater detail in a short while.
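The 52-byte layout described above depends on every member having an exact size. As a sketch only, the same header can be declared with the fixed-width types from stdint.h so that it stays 52 bytes on any host. The names elf32_header and parse_elf32_header are invented for this example, and the code assumes a little endian host reading a little endian file.

```c
#include <stdint.h>
#include <string.h>

/* 32-bit ELF header with fixed-width members: 16 + 2*2 + 5*4 + 6*2 = 52 bytes. */
struct elf32_header {
    unsigned char e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
};

/* Copies the header out of a raw buffer.
   Returns 0 on success, -1 if the buffer is too short, lacks the magic
   number, or is not a 32-bit (class 1) file.
   Assumption: host and file are both little endian. */
int parse_elf32_header(const unsigned char *buf, size_t len,
                       struct elf32_header *out)
{
    if (len < sizeof *out)
        return -1;
    if (memcmp(buf, "\x7f" "ELF", 4) != 0 || buf[4] != 1)
        return -1;
    memcpy(out, buf, sizeof *out);
    return 0;
}
```

Using memcpy instead of casting the buffer pointer avoids any alignment concerns with the raw bytes.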
c.c
#include <stdio.h>
#include <sys/stat.h>
FILE *fp;
struct stat st;
main()
{
fp = fopen("b.o","r");
printf("Fileno %d\n",fileno(fp));
fstat(fileno(fp),&st);
printf("Size %ld\n",(long)st.st_size);
}
gcc -o c c.c
./c
Fileno 3
Size 605
>ls -l b.o
-rw-r--r-- 1 root root 605 Nov 14 13:14 b.o
This program figures out the size of the
file b.o. At first, the fileno function converts the pointer fp into a file
descriptor. Since this is the first file that the program opens, it is given
the number 3; 0, 1 and 2 are taken up by standard input, standard output and
standard error.
The fstat function takes a file descriptor
and fills in the stat structure. This structure has a member st_size that
holds the file size, 605 bytes in our case.
a.c
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
struct elfhdr
{
char e_ident[16];
short int e_type;
short int e_machine;
int e_version;
unsigned int e_entry;
int e_phoff;
int e_shoff;
int e_flags;
short int e_ehsize;
short int e_phentsize;
short int e_phnum;
short int e_shentsize;
short int e_shnum;
short int e_shstrndx;
};
char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};
char *endian[] = {"Invalid", "Little Endian","Big Endian"};
char *version[] = {"Invalid Version","Current Version"};
char *filetype[] =
{"None","Rel Obj","Executable","Dynamic","Core","Num"};
char *machine[] = {
"None","WE32100","SPARC","80386","MK68000","MK88000","80486","80860","MIPS"};
FILE *fp;struct stat st;
struct elfhdr *e;
char *p;
main()
{
fp = fopen("b.o","r");
fstat(fileno(fp),&st);
p = (char *) malloc(st.st_size);
fread(p,st.st_size,1,fp);
e = (struct elfhdr *)p;
printf("%x %c%c%c %s %s %s\n",e->e_ident[0],e->e_ident[1],e->e_ident[2],
e->e_ident[3],class[e->e_ident[4]],endian[e->e_ident[5]],version[e->e_ident[6]]);
printf("Type :%s \n",filetype[e->e_type]);
printf("Machine :%s \n",machine[e->e_machine]);
printf("Version :%s \n",version[e->e_version]);
printf("Entry Point :%x\n",e->e_entry);
printf("Size of Initial Structure :%d\n",e->e_ehsize);
printf("Flags :%d\n",e->e_flags);
printf("Program Header Offset :%d\n",e->e_phoff);
printf("PHO : Structure Size %d No. of Entries %d\n",e->e_phentsize,e->e_phnum);
printf("Section Header Offset :%d\n",e->e_shoff);
printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);
printf("String Table Section Number :%d\n",e->e_shstrndx);
}
Output
7f ELF 32-bit objects Little Endian Current Version
Type :Rel Obj
Machine :80386
Version :Current Version
Entry Point :0
Size of Initial Structure :52
Flags :0
Program Header Offset :0
PHO : Structure Size 0 No. of Entries 0
Section Header Offset :164
SHO: Structure Size 40 No. of Entries 8
String Table Section Number :5
This example works on the same principle
as the earlier one but has more style. As before, we get the size of the file
using fstat and then use malloc to allocate an area of memory of that size.
Thereafter, using the fread function, the entire file is read into memory. The
variable e, which is a pointer to a structure, is then used to read the header.
Once the whole file is in memory, it is much easier from the programming
standpoint to deal with pointers to structures than with actual structures.
There is no other change.
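The program above does not check whether fopen, malloc or fread succeed. A minimal sketch of the same read-the-whole-file step with those checks added follows. The function name read_whole_file is made up for this illustration, and treating an empty file as an error is a choice made here, not a rule of the format.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Reads an entire file into a freshly allocated buffer.
   On success returns the buffer (the caller must free it) and stores the
   size in *size_out. Returns NULL if the file cannot be opened, is empty,
   or cannot be read completely. */
unsigned char *read_whole_file(const char *path, size_t *size_out)
{
    struct stat st;
    unsigned char *buf;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return NULL;
    if (fstat(fileno(fp), &st) != 0 || st.st_size <= 0) {
        fclose(fp);
        return NULL;
    }
    buf = malloc((size_t)st.st_size);
    if (buf != NULL &&
        fread(buf, 1, (size_t)st.st_size, fp) != (size_t)st.st_size) {
        free(buf);          /* short read: give up */
        buf = NULL;
    }
    fclose(fp);
    if (buf != NULL)
        *size_out = (size_t)st.st_size;
    return buf;
}
```

A caller would then cast the returned buffer to a header pointer exactly as the program above does with p.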
a.c
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
struct elfhdr
{
char e_ident[16];
short int e_type;
short int e_machine;
int e_version;
unsigned int e_entry;
int e_phoff;
int e_shoff;
int e_flags;
short int e_ehsize;
short int e_phentsize;
short int e_phnum;
short int e_shentsize;
short int e_shnum;
short int e_shstrndx;
};
struct elf_shdr
{
int name;
int type;
int flags;
int addr;
int offset;
int size;
int link;
int info;
int align;
int esize;
};
FILE *fp;struct stat st;
struct elfhdr *e;
struct elf_shdr *s;
char *p;
int i;
main()
{
fp = fopen("b.o","r");
fstat(fileno(fp),&st);
p = (char *) malloc(st.st_size);
fread(p,st.st_size,1,fp);
e = (struct elfhdr *)p;
printf("Section Header Offset :%d\n",e->e_shoff);
printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);
printf("String Table Section Number :%d\n",e->e_shstrndx);
s = (struct elf_shdr *)(p + e->e_shoff);
printf("\n");
printf("%-3s %-15s %-5s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-6s\n",
"No","Name","Type","Flags","Address","Offset","Size","Link","Info","Align","Esize");
for(i = 0; i < e->e_shnum; i++)
{
printf("%-3d %-15d %-5d %-5d %-8x %-8d %-5d %-5d %-5d %-5d %-6d\n",
i,s->name,s->type,s->flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);
s++;
}
}
Output
Section Header Offset :164
SHO: Structure Size 40 No. of Entries 8
String Table Section Number :5
No  Name            Type  Flags Address  Offset   Size  Link  Info  Align Esize
0   0               0     0     0        0        0     0     0     0     0
1   27              1     6     0        52       5     0     0     4     0
2   33              1     3     0        60       0     0     0     4     0
3   39              8     3     0        60       0     0     0     4     0
4   44              1     0     0        60       51    0     0     1     0
5   17              3     0     0        111      53    0     0     1     0
6   1               2     0     0        484      112   7     6     4     16
7   9               3     0     0        596      9     0     0     1     0
This program displays all the section
headers. The initial ELF header gives the offset of the section header table,
the size of each entry and the number of entries. The offset e->e_shoff is
added to p, the start of the file in memory, to give a pointer to where the
section headers begin. These section headers have a fixed layout whose values
are simply displayed in a loop.
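The pointer arithmetic can be written out with range checks, so that a wrong offset in a damaged file cannot take us outside the buffer. This is a sketch under stated assumptions: the names elf32_shdr and get_section_header are invented here, and host and file are assumed to be little endian.

```c
#include <stdint.h>
#include <string.h>

/* One 32-bit section header: ten 4-byte members, 40 bytes in all. */
struct elf32_shdr {
    uint32_t name, type, flags, addr, offset;
    uint32_t size, link, info, align, entsize;
};

/* Copies section header number idx out of an in-memory file image.
   shoff, shentsize and shnum are the e_shoff, e_shentsize and e_shnum
   members of the ELF header. Returns 0 on success, -1 if idx or the
   computed position falls outside the table or the image. */
int get_section_header(const unsigned char *image, size_t image_len,
                       uint32_t shoff, uint16_t shentsize, uint16_t shnum,
                       unsigned idx, struct elf32_shdr *out)
{
    size_t pos;

    if (idx >= shnum || shentsize < sizeof *out)
        return -1;
    pos = (size_t)shoff + (size_t)idx * shentsize;
    if (pos > image_len || image_len - pos < sizeof *out)
        return -1;
    memcpy(out, image + pos, sizeof *out);
    return 0;
}
```

For b.o the call would use the values printed above: offset 164, entry size 40 and 8 entries.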
a.c
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
struct elfhdr
{
char e_ident[16];
short int e_type;
short int e_machine;
int e_version;
unsigned int e_entry;
int e_phoff;
int e_shoff;
int e_flags;
short int e_ehsize;
short int e_phentsize;
short int e_phnum;
short int e_shentsize;
short int e_shnum;
short int e_shstrndx;
};
struct elf_shdr
{
int name;
int type;
int flags;
int addr;
int offset;
int size;
int link;
int info;
int align;
int esize;
};
FILE *fp;struct stat st;
struct elfhdr *e;
struct elf_shdr *s,*sh;
char *p,*str;
int i;
main()
{
fp = fopen("b.o","r");
fstat(fileno(fp),&st);
p = (char *) malloc(st.st_size);
fread(p,st.st_size,1,fp);
e = (struct elfhdr *)p;
printf("Section Header Offset :%d\n",e->e_shoff);
printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);
printf("String Table Section Number :%d\n",e->e_shstrndx);
sh = (struct elf_shdr *)(p + e->e_shoff);
s= sh;
printf("\n");
printf("%-3s %-15s %-5s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-6s\n",
"No","Name","Type","Flags","Address","Offset","Size","Link","Info","Align","Esize");
for(i = 0; i < e->e_shnum; i++)
{
printf("%-3d %-15d %-5d %-5d %-8x %-8d %-5d %-5d %-5d %-5d %-6d\n",
i,s->name,s->type,s->flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);
s++;
}
s = sh + e->e_shstrndx;
str = p + s->offset;
for(i=0; i < s->size; i++)
{
if ( i%6 == 0)
printf("\n");
if (str[i] == 0)
printf("%2d) %3d 0 ",i,str[i]);
else
printf("%2d) %3d %2c ",i,str[i],str[i]);
}
printf("\n");
}
Output
Section Header Offset :164
SHO: Structure Size 40 No. of Entries 8
String Table Section Number :5
No  Name            Type  Flags Address  Offset   Size  Link  Info  Align Esize
0   0               0     0     0        0        0     0     0     0     0
1   27              1     6     0        52       5     0     0     4     0
2   33              1     3     0        60       0     0     0     4     0
3   39              8     3     0        60       0     0     0     4     0
4   44              1     0     0        60       51    0     0     1     0
5   17              3     0     0        111      53    0     0     1     0
6   1               2     0     0        484      112   7     6     4     16
7   9               3     0     0        596      9     0     0     1     0
 0)   0  0   1)  46  .   2) 115  s   3) 121  y   4) 109  m   5) 116  t
 6)  97  a   7)  98  b   8)   0  0   9)  46  .  10) 115  s  11) 116  t
12) 114  r  13) 116  t  14)  97  a  15)  98  b  16)   0  0  17)  46  .
18) 115  s  19) 104  h  20) 115  s  21) 116  t  22) 114  r  23) 116  t
24)  97  a  25)  98  b  26)   0  0  27)  46  .  28) 116  t  29) 101  e
30) 120  x  31) 116  t  32)   0  0  33)  46  .  34) 100  d  35)  97  a
36) 116  t  37)  97  a  38)   0  0  39)  46  .  40)  98  b  41) 115  s
42) 115  s  43)   0  0  44)  46  .  45)  99  c  46) 111  o  47) 109  m
48) 109  m  49) 101  e  50) 110  n  51) 116  t  52)   0  0
This example is similar to the earlier
one but at the end it displays the contents of one section, the string table.
In the ELF header the last member gave the section number of the string table,
which was 5 in our case.
The variable sh points to the start of
the section headers. Adding 5 makes it point to the header with index 5, the
sixth structure. The offset member gives the beginning of this section from the
start of the file and the size member is its total size. Then, using a for
loop, each byte of the section is displayed as its relative offset within the
section, its numeric value and its character.
A look at the string table simply shows
strings terminated by a null. The first byte, at offset 0, is also a null.
Let's look at the values of the name
field. Instead of storing the name of the section, an offset into the string
table is stored. Thus the last section's name is the offset 9 into the string
table and its actual name is .strtab. Section names conventionally start with
a dot.
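Turning a name offset into a string is just pointer addition, but it is worth checking that the offset lies inside the table and that a terminating null follows. A small sketch, with strtab_name as a made-up helper name:

```c
#include <stddef.h>
#include <string.h>

/* Returns the null-terminated string that starts at name_off inside a
   string table of strtab_size bytes, or NULL if the offset is outside
   the table or no terminating null is found before the table ends. */
const char *strtab_name(const char *strtab, size_t strtab_size,
                        size_t name_off)
{
    if (name_off >= strtab_size)
        return NULL;
    if (memchr(strtab + name_off, '\0', strtab_size - name_off) == NULL)
        return NULL;
    return strtab + name_off;
}
```

With the 53-byte table dumped above, offset 9 yields .strtab, offset 17 yields .shstrtab and offset 27 yields .text.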
a.c
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
struct elfhdr
{
char e_ident[16];
short int e_type;
short int e_machine;
int e_version;
unsigned int e_entry;
int e_phoff;
int e_shoff;
int e_flags;
short int e_ehsize;
short int e_phentsize;
short int e_phnum;
short int e_shentsize;
short int e_shnum;
short int e_shstrndx;
};
struct elf_shdr
{
int name;
int type;
int flags;
int addr;
int offset;
int size;
int link;
int info;
int align;
int esize;
};
FILE *fp;struct stat st;
struct elfhdr *e;
struct elf_shdr *s,*sh;
char *p,*str;
int i;
char flags[4];
char *stype[] =
{"NULL","PROGBITS","SYMTAB","STRTAB","RELA","HASH","DYNAMIC",
"NOTE","NOBITS","REL","SHLIB","DYNSYM","NUM"};
main()
{
fp = fopen("b.o","r");
fstat(fileno(fp),&st);
p = (char *) malloc(st.st_size);
fread(p,st.st_size,1,fp);
e = (struct elfhdr *)p;
printf("Section Header Offset :%d\n",e->e_shoff);
printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);
printf("String Table Section Number :%d\n",e->e_shstrndx);
sh = (struct elf_shdr *)(p + e->e_shoff);
s = sh + e->e_shstrndx;
str = p + s->offset;
s= sh;
printf("\n");
printf("%-2s %-12s %-10s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-5s\n",
"No","Name","Type","Flg","Address","Offset","Size","Link","Info","Align","Esize");
for(i = 0; i < e->e_shnum; i++)
{
printf("%-2d %-12s ",i,str+s->name);
if(s->type <= 12)
printf("%-10s",stype[s->type]);
else
printf("%10x",s->type);
strcpy(flags,"");
if((s->flags & 0x01) == 1)
strcat(flags,"W");
if((s->flags & 0x02) == 2)
strcat(flags,"A");
if((s->flags & 0x04) == 4)
strcat(flags,"E");
printf(" %-5s %-8x %-8d %-5d %-5d %-5d %-5d %-5d\n",
flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);
s++;
}
}
Output
Section Header Offset :164
SHO: Structure Size 40 No. of Entries 8
String Table Section Number :5
No Name         Type       Flg   Address  Offset   Size  Link  Info  Align Esize
0               NULL             0        0        0     0     0     0     0
1  .text        PROGBITS   AE    0        52       5     0     0     4     0
2  .data        PROGBITS   WA    0        60       0     0     0     4     0
3  .bss         NOBITS     WA    0        60       0     0     0     4     0
4  .comment     PROGBITS         0        60       51    0     0     1     0
5  .shstrtab    STRTAB           0        111      53    0     0     1     0
6  .symtab      SYMTAB           0        484      112   7     6     4     16
7  .strtab      STRTAB           0        596      9     0     0     1     0
This program performs the same task as
before, but it prints out the name of each section. The str variable points to
the start of the string table section, to which the name offset is added. These
names are null terminated. The values of the flags member are also printed in a
readable form: W, A and E stand for writable, allocated (the section occupies
memory at run time) and executable.
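The flag decoding can be pulled out into a function of its own. The sketch below uses the same three bit values as the program above (1, 2 and 4); the name section_flags_string is invented for this example.

```c
#include <stdint.h>

/* Writes the letters for the section flag bits into out:
   bit 0x1 = W (writable), bit 0x2 = A (allocated), bit 0x4 = E (executable).
   out must have room for at least 4 bytes ("WAE" plus the null). */
void section_flags_string(uint32_t flags, char out[4])
{
    int n = 0;

    if (flags & 0x1)
        out[n++] = 'W';
    if (flags & 0x2)
        out[n++] = 'A';
    if (flags & 0x4)
        out[n++] = 'E';
    out[n] = '\0';
}
```

A flags value of 6, as seen for .text, gives AE, and 3, as seen for .data, gives WA.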
e.c
abc()
{
printf("hi\n");
printf("bye\n");
printf("Good\n");
fopen(0,0);
}
//gcc -c e.c
//ld -o e e.o -lc -e abc
//objdump -d e
We now create an executable e that calls two functions, printf and fopen.
With the -c option, gcc only creates the object file e.o and does not call the
linker, so we run the linker ld ourselves to create the executable. The -lc
option links in the shared library libc.so and, since there is no main
function, the -e option makes abc the entry point of the program, that is, the
first function to be called.
a.c
#include <stdio.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>
struct elf_hdr
{
char e_ident[16];
short int type;
short int machine;
int version;
unsigned int entry;
int phoff;
int shoff;
int flags;
short int ehsize;
short int phentsize;
short int phnum;
short int shentsize;
short int shnum;
short int shstrndx;
};
struct elf_phdr
{
int type;
int offset;
int vaddr;
int paddr;
int filesz;
int memsz;
int flags;
int align;
};
struct elf_shdr
{
int name;
int type;
int flags;
int addr;
int offset;
int size;
int link;
int info;
int aalign;
int entsize;
};
struct elf_hdr *h;
struct elf_phdr *p;
struct elf_shdr *s, *s1,*u;
struct sym {
int name;
unsigned int value;
int size;
unsigned char info;
unsigned char other ;
short int shndx;
};
struct rel {
unsigned int offset; /* 4 bytes in a 32-bit ELF file; long is 8 bytes on a 64-bit host */
int info;
};
struct rel *r;
struct dyn {
int tag;
int addr;
};
char *dtype[] =
{
"NULL",
"NEEDED","PLTRELSZ","PLTGOT","HASH","STRTAB","SYMTAB","RELA",
"RELASZ","RELAENT","STRSZ","SYMENT","INIT","FINI","SONAME","RPATH",
"SYMBOLIC","REL","RELSZ","RELENT","PLTREL","DEBUG","TEXTREL","JMPREL"
};
struct dyn *d;
char *bind[] =
{"LOCAL","GLOBAL","WEAK"};
char *type[] = {"NOTYPE",
"OBJECT","FUNC", "SECTION","FILE"};
struct sym *t; int no,j,k;
FILE *fpi;
char *e, *e1;int i;
char *class[] =
{"Null","32-bits Object","64-bits Object"};
char *endian[] =
{"Null","Little Endian","Big Endian"};
char *filetype[] =
{"None","Rel Obj","Executable",
"Dynamic","Core","Num"};
char *machine[] = {"None",
"WE32100","Sparc","80386","MK68000",
"MK88000","80486","80860","MIPS"};
char *version[] = {"Invalid Version", "Current Version"};
struct stat st;
int i,y;
long *hs;
char *ptype[] = {
"Null","Load","Dynamic",
"Interp","Note", "Shared lib",
"PHDR","Num"};
char *stype[] ={"NULL","PROGBITS","SYMTAB","STRTAB",
"RELA", "HASH",
"DYNAMIC","NOTE","NOBITS","REL","SHLIB",
"DYNSYM", "NUM"};
char strname[100];
unsigned long h1;
char flg[5];
unsigned char *so;
int ii;
long bucket, chain;
int *buck, *ch;
char * symstr;
struct sym *t;int k;
unsigned long abc(unsigned char *name)
{
unsigned long h=0,g=0;
while(*name)
{
h = ( h << 4 ) + *name++;
if(g = h & 0xf0000000)
h ^= g >> 24;
h &= ~g;
}
return h;
}
void rel()
{
no = s->size / s->entsize;
printf("Relocation entries..%d for %s\n",no,strname);
r = (struct rel *) (e + s->offset);
for (k = 1; k <= no; k++)
{
printf("%08x %08x ",r->offset, r->info);
j = r->info; j = j &0x000000ff;
printf("%03d \n",j);
r++;
}
}
void strtab()
{
for(ii=0; ii< s->size; ii++)
{
if(so[ii] == 0)
printf("\n");
else
printf("%c", so[ii]);
}
printf("\n");
}
void dynamic() {
no = s->size / s->entsize;
printf("Number of Entries %d\n",no);
d = (struct dyn *) so;
for(j =1 ; j <= no; j++)
{
if (d->tag <= 23)
printf("%2d %-8s %d\n",j, dtype[d->tag], d->addr);
else
printf("%2d %08x %d\n",j, d->tag, d->addr);
d++;
}
}
void dynsym() {
u = (struct elf_shdr *)( e + h->shoff );
u = u + s->link;
symstr = e + u->offset;
t = (struct sym *)(e + s->offset);
no = s->size / s->entsize;
printf("No. of Symbols %d\n", no);
printf(" Name Value Size O SInd Info \n");
for(ii = 1; ii <= no ; ii++)
{
printf("%03d %-25s %08x %04d ", t->name, symstr+t->name,
t->value, t->size);
printf("%01d %04d %04d ", t->other, t->shndx, t->info);
j = t->info;
j = j >> 4;
printf("%-6s ", bind[j]);
k = t->info ; k = k&0xF;
printf(" %-7s ", type[k]);
h1 = abc(e1 + t->name);
printf("\n");
//printf("%02d ", h1%3);
//y = buck[h1%3];
//printf("%02d %02d\n",y,ch[y]);
t++;
}
}
void symtab() {
u = (struct elf_shdr *)( e + h->shoff );
u = u + s->link;
symstr = e + u->offset;
t = (struct sym *)(e + s->offset);
no = s->size / s->entsize;
printf("No. of symbols %d\n",no);
printf(" Name Value Size O SInd Info\n");
for(ii = 1; ii <=no;ii++)
{
printf("%3d %03d %-33s %08x %04d ",ii, t->name, symstr+t->name,
t->value, t->size);
printf("%01d %04d %04d ", t->other, t->shndx, t->info);
j = t->info;
j = j >> 4;
printf("%-6s", bind[j]);
k = t->info ; k = k & 0xF;
printf("%-7s \n",type[k]);
t++;
}
}
void interp() {
for(ii = 0 ; ii <s->size-1; ii++)
printf("%c",so[ii]);
printf("\n");
}
void plt() {
int j=0;
for(ii = 0 ; ii <s->size; ii++,j++){
if(j == 16){
j = 0;
printf("\n");
}
printf("%02x ",so[ii]);
}
printf("\n");
}
void text() {
int j=0;
for(ii = 0 ; ii <s->size; ii++,j++){
if( j == 16){
j = 0;
printf("\n");
}
printf("%02x ",so[ii]);
}
printf("\n");
}
void got() {
unsigned int * k; /* got entries are 4 bytes each; long is 8 bytes on a 64-bit host */
k = (unsigned int *)so;
for(ii = 0 ; ii <s->size; ii+=4)
{
printf("%08x: %08x\n",s->addr+ii,*k);
k++;
}
printf("\n");
}
main(int argc, char **argv)
{
fpi = fopen(argv[1],"r");
fstat(fileno(fpi),&st);
e = (char *)malloc(st.st_size);
fread(e,st.st_size,1,fpi);
h=(struct elf_hdr *) e;
printf("Header ..\n");
printf("%2x %c %c %c %-15s %-15s %d\n",h->e_ident[0],
h->e_ident[1],h->e_ident[2],h->e_ident[3],class[h->e_ident[4]],
endian[h->e_ident[5]], h->e_ident[6]);
printf("FileType .. %s\n",filetype[h->type]);
printf("Machine .. %s\n",machine[h->machine]);
printf("Version .. %s\n",version[h->version]);
printf("Entry Point %x\n",h->entry);
printf("Program Header Offset %d\n", h->phoff);
printf("Section Header Offset %d\n", h->shoff);
printf("Flags %d\n",h->flags);
printf("Header Size %d\n",h->ehsize);
printf("Program Header Entry size %d\n",h->phentsize);
printf("No. of Program Headers %d\n",h->phnum);
printf("Section Header Entry size %d\n",h->shentsize);
printf("No. of Section Headers %d\n",h->shnum);
printf("String Section %d\n",h->shstrndx);
p = (struct elf_phdr *)(e + h->phoff);
printf("\nProgram Header \n\n");
printf("Type Offset Vaddr Paddr Filesz Memsz Flags Align\n");
for(i = 0; i<h->phnum; i++)
{
printf("%-10s %4d %8x %8x",ptype[p->type],p->offset,p->vaddr,p->paddr);
printf(" %4d %4d %4d %d",p->filesz,
p->memsz,p->flags,p->align);
printf("\n");
p++;
}
printf("\nSection Header \n\n");
s = (struct elf_shdr *)(e + h->shoff);
s1 = s + h->shstrndx ;
e1 = e + s1->offset;
printf(" Name Type Flags Addr Offset Size Link Info aalign entsize\n");
for(i = 0; i < h->shnum; i++)
{
printf("%02d ",i);
strcpy(strname,(e1+s->name));
printf("%-15s ",strname);
if(s->type <= 12)
printf("%10s ",stype[s->type]);
else
printf("%10x ",s->type);
flg[0]=0;
if((s->flags & 0x01) == 1)
strcat(flg,"W");
if((s->flags & 0x02) == 2)
strcat(flg,"A");
if((s->flags & 0x04) == 4)
strcat(flg,"E");
printf("%-4s ",flg);
printf("%8x %8x %4d %4d %4d %4d %4d \n",
s->addr,s->offset,s->size,s->link,s->info,s->aalign,s->entsize);
so = (unsigned char *)(e + s->offset);
switch(s->type)
{
case 1 :
if(strcmp(".interp",strname) == 0)
interp();
if(strcmp(".text",strname) == 0)
text();
if(strcmp(".plt",strname) == 0)
plt();
if(strcmp(".got",strname) == 0)
got();
break;
case 2 :
printf("SYMTAB\n");
symtab();
break;
case 3 :
printf("STRTAB\n");
break;
case 4 :
printf("RELA");
break;
case 5 :
printf("HASH\n");
break;
case 6 :
printf("\nDYNAMIC\n");
dynamic();
break;
case 7 :
printf("NOTES");
break;
case 8 :
printf("NOBITS");
break;
case 9 :
printf("REL\n");
rel();
break;
case 10 :
printf("SHLIB");
break;
case 11 :
printf("DYNSYM\n");
dynsym();
break;
case 12 :
printf("NUM");
break;
default:
printf("");
}
printf("\n");
s++;
}
}
Let us run
objdump -d e
and then our own program on the file e. The output of objdump is as follows.
e:     file format elf32-i386
Disassembly of section .plt:
080481ac <.plt>:
 80481ac: ff 35 d8 92 04 08    pushl  0x80492d8
 80481b2: ff 25 dc 92 04 08    jmp    *0x80492dc
 80481b8: 00 00                add    %al,(%eax)
 80481ba: 00 00                add    %al,(%eax)
 80481bc: ff 25 e0 92 04 08    jmp    *0x80492e0
 80481c2: 68 00 00 00 00       push   $0x0
 80481c7: e9 e0 ff ff ff       jmp    80481ac <abc-0x30>
 80481cc: ff 25 e4 92 04 08    jmp    *0x80492e4
 80481d2: 68 08 00 00 00       push   $0x8
 80481d7: e9 d0 ff ff ff       jmp    80481ac <abc-0x30>
Disassembly of section .text:
080481dc <abc>:
 80481dc: 55                   push   %ebp
 80481dd: 89 e5                mov    %esp,%ebp
 80481df: 83 ec 08             sub    $0x8,%esp
 80481e2: 83 ec 0c             sub    $0xc,%esp
 80481e5: 68 23 82 04 08       push   $0x8048223
 80481ea: e8 cd ff ff ff       call   80481bc <abc-0x20>
 80481ef: 83 c4 10             add    $0x10,%esp
 80481f2: 83 ec 0c             sub    $0xc,%esp
 80481f5: 68 27 82 04 08       push   $0x8048227
 80481fa: e8 bd ff ff ff       call   80481bc <abc-0x20>
 80481ff: 83 c4 10             add    $0x10,%esp
 8048202: 83 ec 0c             sub    $0xc,%esp
 8048205: 68 2c 82 04 08       push   $0x804822c
 804820a: e8 ad ff ff ff       call   80481bc <abc-0x20>
 804820f: 83 c4 10             add    $0x10,%esp
 8048212: 83 ec 08             sub    $0x8,%esp
 8048215: 6a 00                push   $0x0
 8048217: 6a 00                push   $0x0
 8048219: e8 ae ff ff ff       call   80481cc <abc-0x10>
 804821e: 83 c4 10             add    $0x10,%esp
 8048221: c9                   leave
 8048222: c3                   ret
Header ..
7f E L F 32-bits Object  Little Endian   1
FileType .. Executable
Machine .. 80386
Version .. Current Version
Entry Point 80481dc
Program Header Offset 52
Section Header Offset 952
Flags 0
Header Size 52
Program Header Entry size 32
No. of Program Headers 5
Section Header Entry size 40
No. of Section Headers 19
String Section 16
Program Header
Type       Offset Vaddr    Paddr    Filesz Memsz Flags Align
PHDR       52     8048034  8048034  160    160   5     4
Interp     212    80480d4  80480d4  19     19    4     1
Load       0      8048000  8048000  574    574   5     4096
Load       576    8049240  8049240  180    180   6     4096
Dynamic    576    8049240  8049240  160    160   6     4
Section Header
   Name            Type       Flags Addr     Offset Size Link Info aalign entsize
00                 NULL             0        0      0    0    0    0      0
01 .interp         PROGBITS   A     80480d4  d4     19   0    0    1      0
/usr/lib/libc.so.1
02 .hash           HASH       A     80480e8  e8     32   3    0    4      4
HASH
03 .dynsym         DYNSYM     A     8048108  108    48   4    1    4      16
DYNSYM
No. of Symbols 3
Name                          Value    Size O SInd Info
000                           00000000 0000 0 0000 0000 LOCAL  NOTYPE
011 printf                    080481bc 0057 0 0000 0018 GLOBAL FUNC
018 fopen                     080481cc 0053 0 0000 0018 GLOBAL FUNC
04 .dynstr         STRTAB     A     8048138  138    44   0    0    1      0
STRTAB
05 .gnu.version    6fffffff   A     8048164  164    6    3    0    2      2
06 .gnu.version_r  6ffffffe   A     804816c  16c    48   4    1    4      0
07 .rel.plt        REL        A     804819c  19c    16   3    8    4      8
REL
Relocation entries..2 for .rel.plt
080492ec 00000107 007
080492f0 00000207 007
08 .plt            PROGBITS   AE    80481ac  1ac    48   0    0    4      4
ff 35 e4 92 04 08 ff 25 e8 92 04 08 00 00 00 00
ff 25 ec 92 04 08 68 00 00 00 00 e9 e0 ff ff ff
ff 25 f0 92 04 08 68 08 00 00 00 e9 d0 ff ff ff
09 .text           PROGBITS   AE    80481dc  1dc    77   0    0    4      0
55 89 e5 83 ec 08 83 ec 0c 68 29 82 04 08 e8 cd
ff ff ff 83 c4 10 83 ec 0c 68 2d 82 04 08 e8 bd
ff ff ff 83 c4 10 83 ec 0c 68 32 82 04 08 e8 ad
ff ff ff 83 c4 10 83 ec 08 68 38 82 04 08 68 3a
82 04 08 e8 a8 ff ff ff 83 c4 10 c9 c3
10 .rodata         PROGBITS   A     8048229  229    21   0    0    1      0
11 .data           PROGBITS   WA    8049240  240    0    0    0    4      0
12 .dynamic        DYNAMIC    WA    8049240  240    160  4    0    4      8
DYNAMIC
Number of Entries 20
 1 NEEDED   1
 2 HASH     134512872
 3 STRTAB   134512952
 4 SYMTAB   134512904
 5 STRSZ    44
 6 SYMENT   16
 7 DEBUG    0
 8 PLTGOT   134517472
 9 PLTRELSZ 16
10 PLTREL   17
11 JMPREL   134513052
12 6ffffffe 134513004
13 6fffffff 1
14 6ffffff0 134512996
15 NULL     0
16 NULL     0
17 NULL     0
18 NULL     0
19 NULL     0
20 NULL     0
13 .got            PROGBITS   WA    80492e0  2e0    20   0    0    4      4
080492e0: 08049240
080492e4: 00000000
080492e8: 00000000
080492ec: 080481c2
080492f0: 080481d2
14 .bss            NOBITS     WA    80492f4  2f4    0    0    0    4      0
NOBITS
15 .comment        PROGBITS         0        2f4    51   0    0    1      0
16 .shstrtab       STRTAB           0        327    142  0    0    1      0
STRTAB
17 .symtab         SYMTAB           0        6b0    448  18   20   4      16
SYMTAB
No. of symbols 28
    Name                       Value    Size O SInd Info
  1 000                        00000000 0000 0 0000 0000 LOCAL NOTYPE
  2 000                        080480d4 0000 0 0001 0003 LOCAL SECTION
  3 000                        080480e8 0000 0 0002 0003 LOCAL SECTION
  4 000                        08048108 0000 0 0003 0003 LOCAL SECTION
  5 000                        08048138 0000 0 0004 0003 LOCAL SECTION
  6 000                        08048164 0000 0 0005 0003 LOCAL SECTION
  7 000                        0804816c 0000 0 0006 0003 LOCAL SECTION
  8 000                        0804819c 0000 0 0007 0003 LOCAL SECTION
  9 000                        080481ac 0000 0 0008 0003 LOCAL SECTION
 10 000                        080481dc 0000 0 0009 0003 LOCAL SECTION
 11 000                        08048229 0000 0 0010 0003 LOCAL SECTION
 12 000                        08049240 0000 0 0011 0003 LOCAL SECTION
 13 000                        08049240 0000 0 0012 0003 LOCAL SECTION
 14 000                        080492e0 0000 0 0013 0003 LOCAL SECTION
 15 000                        080492f4 0000 0 0014 0003 LOCAL SECTION
 16 000                        00000000 0000 0 0015 0003 LOCAL SECTION
 17 000                        00000000 0000 0 0016 0003 LOCAL SECTION
 18 000                        00000000 0000 0 0017 0003 LOCAL SECTION
 19 000                        00000000 0000 0 0018 0003 LOCAL SECTION
 20 001 e.c                    00000000 0000 0 -015 0004 LOCAL FILE
 21 005 _DYNAMIC               08049240 0000 0 0012 0017 GLOBALOBJECT
 22 014 abc                    080481dc 0077 0 0009 0018 GLOBALFUNC
 23 018 __bss_start            080492f4 0000 0 -015 0016 GLOBALNOTYPE
 24 030 printf@@GLIBC_2.0      080481bc 0057 0 0000 0018 GLOBALFUNC
 25 048 _edata                 080492f4 0000 0 -015 0016 GLOBALNOTYPE
 26 055 _GLOBAL_OFFSET_TABLE_  080492e0 0000 0 0013 0017 GLOBALOBJECT
 27 077 _end                   080492f4 0000 0 -015 0016 GLOBALNOTYPE
 28 082 fopen@@GLIBC_2.1       080481cc 0053 0 0000 0018 GLOBALFUNC
18 .strtab         STRTAB           0        870    99   0    0    1      0
STRTAB
/*
Example case - printf("hi\n") in abc
text
call 80481bc
This call lands in the plt. At 80481bc the instruction is
plt
80481bc : jmp *0x8049230
The address 8049230 belongs to the got section, and *0x8049230 yields 080481c2.
got
8049230 : 80481c2
80481c2 is again in the plt section, where the instruction is
plt
80481c2: 68 00 00 00 00 - pushl $0x0
The push instruction carries with it the
offset of the entry within the relocation area. Since the offset is 0, the
first relocation entry is the one that will be looked at. In other words, the
offset of the relocation entry is pushed on the stack. The next instruction in
the plt is as follows
80481c7: e9 e0 ff ff ff - jmp 80481ac
This jump instruction points to the
beginning of the plt section
80481ac: ff 35 28 92 04 08 - pushl 0x8049228
The pushl instruction pushes the second
entry in the got. While the file is lying idle on disk this value is 0, but
when the executable is being executed, the loader replaces this second member
in the got with its address location, that is, where in memory the loader has
been loaded. Normally the address is 40000000. After the push, the next
instruction looked at is
80481b2 : ff 25 2c 92 04 08 - jmp *0x804922c
This entry is again in the got,
pointing to the third member. When this code gets executed, the loader replaces
this value of all 0s with the address of fixup, a function which finally
relocates the addresses of the external symbols. The fixup function, when
called, has two values on the stack. The first push is that of the relocation
offset and the second is that of the loader. The relocation entry gives the
details: its first word is the address in the got that has to be relocated and
its second word gives information on the symbol to be relocated.
Relocation entry :
8049230 00000107
The first word is the address in the got.
00000107 >> 8 = 00000001 = entry 1 in dynsym = printf
in got value = 80481bc
Example case - fopen
text
fopen(0,0)
804820a e8 bd ff ff ff call 80481cc
plt
80481cc : ff 25 34 92 04 08 jmp *0x8049234
got
8049234 : 80481d2
plt
80481d2 : 68 08 00 00 00 pushl $0x8
relocation entry at offset 8:
8049234 00000207
The first word is the address in the got.
00000207 >> 8 = 00000002 = entry 2 in dynsym = fopen
in got value 80481cc
plt
80481d7 : e9 d0 ff ff ff jmp 80481ac
80481ac : ff 35 28 92 04 08 pushl 0x8049228
got
8049228 00000000 (address of loader at run time)
plt
80481b2 : ff 25 2c 92 04 08 jmp *0x804922c
got
804922c 00000000 (address of fixup)
*/
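The shifts and masks used in the walkthrough, and in the rel and symtab functions of the program, can be collected into a few helpers. This is a sketch; the function names are invented for this example. In the output above every relocation type is 7, which on the 386 is the jump slot relocation used for plt entries.

```c
#include <stdint.h>

/* One 32-bit relocation entry: 8 bytes. */
struct elf32_rel {
    uint32_t offset;   /* address to be fixed up, here a slot in the got */
    uint32_t info;     /* symbol index in the upper 24 bits, type in the low 8 */
};

/* Index into the symbol table of the symbol this relocation refers to. */
uint32_t rel_symbol_index(uint32_t info) { return info >> 8; }

/* Relocation type, the low byte of info. */
unsigned rel_type(uint32_t info) { return info & 0xff; }

/* A symbol's one-byte info field: binding in the high nibble (0 LOCAL,
   1 GLOBAL, 2 WEAK), type in the low nibble (0 NOTYPE, 1 OBJECT, 2 FUNC,
   3 SECTION, 4 FILE). */
unsigned sym_bind(unsigned char info) { return info >> 4; }
unsigned sym_type(unsigned char info) { return info & 0xf; }
```

For the entry 00000107, rel_symbol_index gives 1 (printf in .dynsym) and rel_type gives 7. For a symbol info value of 18, as printed for printf and fopen, sym_bind gives 1 (GLOBAL) and sym_type gives 2 (FUNC).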
First, the file passed as the parameter, e,
is opened. Then the file size is determined. Thereafter, memory is allocated
for the entire file using the malloc function and the file is read into memory.
The pointer h points to the initial 52
bytes and is used to display this initial header. Since the file is an
executable file and not an object file, there are 5 program headers of size 32
bytes each. Every program header is then displayed in a manner similar to the
section headers.
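The fields that locate the program headers can be read straight out of the copy of the file in memory. What follows is only a sketch and not the program discussed above: the helper names rd16, rd32 and phdr_table are made up for illustration, and a 32 bit little endian file is assumed, as everywhere in this chapter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little endian values byte by byte, so the host byte order does not matter. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: locate the program header table.
 * Returns 0 on success, -1 if the buffer is not a usable 32 bit ELF image.
 */
static int phdr_table(const unsigned char *file, size_t len,
                      uint32_t *off, uint16_t *entsize, uint16_t *num)
{
    assert(file != NULL);
    if (len < 52 || memcmp(file, "\x7f" "ELF", 4) != 0)
        return -1;                  /* too short, or magic number missing */
    if (file[4] != 1 || file[5] != 1)
        return -1;                  /* not 32 bit, or not little endian */
    *off = rd32(file + 28);         /* e_phoff: file offset of the table */
    *entsize = rd16(file + 42);     /* e_phentsize: 32 in this format */
    *num = rd16(file + 44);         /* e_phnum: 5 in the executable above */
    /* the whole table must lie inside the file */
    if (*off > len || (size_t)*entsize * *num > len - *off)
        return -1;
    return 0;
}
```

With the whole file read into a buffer, a call such as phdr_table(buf, size, &off, &entsize, &num) gives the position of the first program header; header number i then starts at buf + off + i * entsize.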
A program header is used by the dynamic
linker to load the executable into
memory. The first member, type, can have up to 9 different values. The value
PT_LOAD specifies that this segment must be loaded in memory. The type PT_INTERP
gives the path name of the interpreter to be used. The type PT_DYNAMIC has the
information required by the dynamic linker and PT_PHDR describes the program
header table itself.
The offset member holds the value as to
where in the file the segment begins and vaddr is its address in memory. The
p_paddr member is ignored here as it is meant for systems that use physical
addressing. The file size and mem size members are the sizes of the segment on
disk and in memory.
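To make the layout of the eight members concrete, here is a small sketch that decodes one 32 byte program header. The struct phdr32 and the functions read_phdr and pt_name are names invented for this example; the numeric type values are the ones from the ELF specification, and only the first few are named.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the 32 byte program header: eight 4 byte members. */
struct phdr32 {
    uint32_t type, offset, vaddr, paddr, filesz, memsz, flags, align;
};

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one program header from its 32 raw little endian bytes. */
static struct phdr32 read_phdr(const unsigned char *raw)
{
    struct phdr32 ph;

    assert(raw != NULL);
    ph.type   = rd32(raw + 0);
    ph.offset = rd32(raw + 4);   /* where in the file the segment begins */
    ph.vaddr  = rd32(raw + 8);   /* its address in memory */
    ph.paddr  = rd32(raw + 12);  /* ignored here */
    ph.filesz = rd32(raw + 16);  /* size on disk */
    ph.memsz  = rd32(raw + 20);  /* size in memory */
    ph.flags  = rd32(raw + 24);
    ph.align  = rd32(raw + 28);
    return ph;
}

/* Name the common segment types. */
static const char *pt_name(uint32_t type)
{
    switch (type) {
    case 0: return "NULL";
    case 1: return "LOAD";
    case 2: return "DYNAMIC";
    case 3: return "INTERP";
    case 4: return "NOTE";
    case 5: return "SHLIB";
    case 6: return "PHDR";
    default: return "OTHER";
    }
}
```

When memsz is larger than filesz, the difference is the part of the segment that exists only in memory and is filled with zeroes.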
The alignment decides the byte
boundary on which this segment can begin. Next, the pointer s is made to point
to the start of the section headers. Then there is the string table that gives
the names of entities. So, at first, using the shstrndx member of the main
header, the section header for the string table is looked at. Thus, s1 points to
the section header responsible for the string table, and from there the offset
of the string table is retrieved and stored in the e1 variable.
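The lookup of a section name through shstrndx can be sketched as below. The pointer names s and s1 follow the text, but the function section_name itself, its bounds checks and the helpers rd16 and rd32 are assumptions of this example and not the code of the program being described.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: return the name of section number idx, or NULL
 * if the headers point outside the buffer. Assumes 32 bit little endian.
 */
static const char *section_name(const unsigned char *file, size_t len, unsigned idx)
{
    uint32_t shoff, stroff, strsize, name;
    uint16_t shentsize, shnum, shstrndx;
    const unsigned char *s, *s1;

    assert(file != NULL);
    if (len < 52)
        return NULL;
    shoff = rd32(file + 32);        /* e_shoff */
    shentsize = rd16(file + 46);    /* e_shentsize, 40 in this format */
    shnum = rd16(file + 48);        /* e_shnum */
    shstrndx = rd16(file + 50);     /* e_shstrndx */
    if (shentsize < 40 || idx >= shnum || shstrndx >= shnum)
        return NULL;
    if (shoff > len || (size_t)shentsize * shnum > len - shoff)
        return NULL;

    s = file + shoff;                         /* start of the section headers */
    s1 = s + (size_t)shstrndx * shentsize;    /* header of the string table */
    stroff = rd32(s1 + 16);                   /* sh_offset of the string table */
    strsize = rd32(s1 + 20);                  /* sh_size of the string table */
    if (stroff > len || strsize > len - stroff)
        return NULL;

    name = rd32(s + (size_t)idx * shentsize); /* sh_name: offset into the table */
    /* the name must start inside the table and end with a 0 byte inside it */
    if (name >= strsize || memchr(file + stroff + name, 0, strsize - name) == NULL)
        return NULL;
    return (const char *)(file + stroff + name);
}
```

The value called stroff here plays the role of the e1 variable in the text: every section name is simply a 0 terminated string found at that offset plus the sh_name member of the section.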
All the section headers are displayed
using a loop along with their contents for some sections only. The type is
displayed as a string followed by the flags and then the rest of the fields.
Then comes a huge switch statement that displays further information of the
section.
Firstly, the sections whose type is 1 or
PROGBITS are looked at. Such a section holds information that is defined by the
program, and no one but the program can make sense of it. There is no ELF
specification on the nature of information stored in such sections.
Type 2 is for a symbol table and
type 3 is a string table, 4 is for relocation entries with explicit addends and
5 for a hash table. Type 6 is for dynamic linking and 7 for leaving notes
behind. Type 8, NOBITS, occupies no space in the file but otherwise resembles
PROGBITS. Type 9 is also for relocations but without explicit addends, and type
10 is reserved and when used makes the ELF file non-conforming. Type 11 is for
dynamic symbols and the rest are reserved.
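A compact way of turning the type number into a string is a table indexed by the type. The sketch below uses the numeric values defined in the ELF specification (the same ones found in elf.h); the function name sht_name is made up here and is not necessarily how the program above does it.

```c
#include <stdint.h>

/* Map a section type number to its name; values follow the ELF specification. */
static const char *sht_name(uint32_t type)
{
    static const char *const names[] = {
        "NULL",     /* 0  unused entry */
        "PROGBITS", /* 1  defined by the program */
        "SYMTAB",   /* 2  symbol table */
        "STRTAB",   /* 3  string table */
        "RELA",     /* 4  relocations with explicit addends */
        "HASH",     /* 5  symbol hash table */
        "DYNAMIC",  /* 6  dynamic linking information */
        "NOTE",     /* 7  notes */
        "NOBITS",   /* 8  like PROGBITS, but no space in the file */
        "REL",      /* 9  relocations without explicit addends */
        "SHLIB",    /* 10 reserved */
        "DYNSYM"    /* 11 dynamic symbols */
    };

    if (type < sizeof names / sizeof names[0])
        return names[type];
    return "OTHER";
}
```

A call such as printf("%s ", sht_name(type)) can then replace a long chain of comparisons when the section headers are displayed.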
Let's first display some PROGBITS
sections. A section with the name .interp holds the path name of the program
interpreter, which is libc.so.1 in our case; the function interp simply displays
this name. The section .text holds the actual code that will be executed. The
text function simply displays the raw hex bytes, with a newline after every
16 numbers.
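A dump of raw bytes, 16 to a line, takes only a few lines of C. This is a minimal sketch of the idea; the name hex_dump and the choice to return the number of lines written are ours and are not taken from the text function of the program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Print n bytes as two digit hex numbers, 16 to a line.
 * Returns the number of lines written.
 */
static size_t hex_dump(FILE *out, const unsigned char *p, size_t n)
{
    size_t i, lines = 0;

    assert(out != NULL && (p != NULL || n == 0));
    for (i = 0; i < n; i++) {
        fprintf(out, "%02x", p[i]);
        if (i % 16 == 15 || i + 1 == n) {
            fputc('\n', out);   /* end of a full line, or of the data */
            lines++;
        } else {
            fputc(' ', out);    /* separator inside a line */
        }
    }
    return lines;
}
```

For a section, the call would be along the lines of hex_dump(stdout, buf + offset, size), where offset and size come from the section header.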
The next section in focus is the one
that contains dynamic symbols. For example, functions like printf and fopen are
found in the .so files and have not been written by us. The dynsym section
gives a list of functions whose code is not supplied by us. The section symtab
contains the rest of the symbols, like the function abc. Finally, the .dynamic
section contains information for dynamic linking. In the code there is an
instruction call 80481bc, which is nothing but a call to the printf function.
Since the printf function is called
thrice, the call instruction too is seen thrice in the output. The only concern
is that the code of the printf function is in libc.so whereas the address
80481bc is in our code.
The objdump disassembly for the plt
section clearly shows that this address falls within the plt area. The code
there starts with a jump instruction, jmp *0x80492e0. This means that a call to
the printf function actually runs code in the plt, beginning with this jump.
The address 0x80492e0 lies in the GOT or the Global Offset Table.
The GOT section is nothing but a series
of addresses. The value stored at 0x80492e0 is 0x80481c2. This address happens
to be that of the instruction right after the jmp at 0x80481bc in the PLT. The
star in the jmp makes it an indirect jump: the next instruction executed is at
0x80481c2, the address stored at 0x80492e0, and not at 0x80492e0 itself.
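The addresses seen in these plt instructions can be worked out from the raw bytes. The sketch below assumes 32 bit x86 encodings only; the function names rel32_target and abs32_operand are invented for this example. A relative call (e8) or jmp (e9) stores a displacement counted from the next instruction, whereas jmp * (ff 25) and pushl (ff 35) store the address of a memory slot, here a slot in the GOT.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Target of a 5 byte relative call (e8) or jmp (e9) located at addr. */
static uint32_t rel32_target(uint32_t addr, const unsigned char *insn)
{
    assert(insn[0] == 0xe8 || insn[0] == 0xe9);
    /* the displacement is relative to the next instruction;
       the unsigned result wraps around, which handles backward jumps */
    return (uint32_t)(addr + 5 + rd32(insn + 1));
}

/* Address of the slot used by "jmp *addr" (ff 25) or "pushl addr" (ff 35). */
static uint32_t abs32_operand(const unsigned char *insn)
{
    assert(insn[0] == 0xff && (insn[1] == 0x25 || insn[1] == 0x35));
    return rd32(insn + 2);
}
```

Feeding it the bytes quoted in the fopen walk-through, e8 bd ff ff ff at 804820a gives 80481cc, e9 d0 ff ff ff at 80481d7 gives 80481ac, and ff 25 34 92 04 08 names the GOT slot at 8049234.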
As it is the first entry in the PLT, a 0
is pushed on the stack. This 0 is also the offset into the relocation table.
Then comes a jump to 80481ac, which means that we are now executing code at the
start of the plt section. The code at the beginning of the plt simply pushes
the value stored at 0x80492d8 on the stack. This is the second entry in the
GOT, which is zero as of now.
On running the program, this second
entry will be an address like 40000000, filled in by the loader when our
program is loaded in memory. After this push comes an indirect jump through
0x80492dc. This location is the third entry in the GOT and its value is zero as
of now. The dynamic loader puts here the address of a fixup function that
figures out the address of the printf function. This code will also place the
actual address of printf in the GOT, so that the above complex process does not
happen on later calls. This function has two values on the stack, the first one
an offset of 0 and the second the address filled in by the loader.
As the offset is zero, the first relocation entry is
picked up, whose info int is 00000107. A right shift by 8 bits gives 1, which
is the first entry in the dynamic symbol table, the entry for printf. Its value
is 80481bc, which is the address used to call the printf function in the PLT.
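The arithmetic on the relocation entry is short enough to write down in full. In this sketch the struct rel32 and the three helper functions are names of our own; what is standard is that a 32 bit relocation entry without an addend is 8 bytes long, with the symbol number in the upper 24 bits of the info member and the relocation type in the lowest 8 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of an 8 byte relocation entry (no explicit addend). */
struct rel32 {
    uint32_t offset;  /* address of the GOT slot to be patched */
    uint32_t info;    /* symbol number and relocation type */
};

/* Symbol number: index into the dynamic symbol table. */
static uint32_t rel_sym(uint32_t info)
{
    return info >> 8;
}

/* Relocation type: the lowest byte. */
static uint32_t rel_type(uint32_t info)
{
    return info & 0xff;
}

/* The value pushed by a plt entry is a byte offset into the relocation table. */
static uint32_t rel_index(uint32_t pushed_offset)
{
    assert(pushed_offset % sizeof(struct rel32) == 0);
    return pushed_offset / (uint32_t)sizeof(struct rel32);
}
```

For the entry 00000107 this gives symbol 1 (printf) and type 7, which on the 386 is the jump slot relocation used for plt entries. The pushed offsets 0 and 8 select relocation entries 0 and 1.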
e.c
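#include <stdio.h>
int main(void)
{
int **p;
printf("hi\n");
printf("bye\n");
printf("Good\n");
fopen("e.c","r");
/* 0x8049658 is where the GOT starts in this particular 32 bit build */
p = (int **)0x8049658;
/* print five consecutive GOT entries as address=value */
printf("%x=%x\n",(unsigned)p,(unsigned)*p);
p++;
printf("%x=%x\n",(unsigned)p,(unsigned)*p);
p++;
printf("%x=%x\n",(unsigned)p,(unsigned)*p);
p++;
printf("%x=%x\n",(unsigned)p,(unsigned)*p);
p++;
printf("%x=%x\n",(unsigned)p,(unsigned)*p);
/* addresses of the plt entries for printf and fopen */
printf("printf=%p\n",(void *)printf);
printf("fopen=%p\n",(void *)fopen);
return 0;
}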
Output
hi
bye
Good
8049658=804957c
804965c=40015a38
8049660=4000bcb0
8049664=42015490
8049668=4204f0e0
printf=0x80482a4
fopen=0x80482b4
08049658: 0804957c
0804965c: 00000000
08049660: 00000000
08049664: 0804829a
08049668: 080482aa
0804966c: 080482ba
08049670: 00000000
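The program e.c is an extension of the above. The
abc function is replaced with main. The last block of output comes from running
our program a with e, and shows the GOT as it is stored in the file. The GOT in
this case starts at 8049658; its first int is 804957c whereas the next two ints
are zeroes on disk. At run time these two entries are filled in, as the values
40015a38 and 4000bcb0 printed by the program itself show.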