Linux – ELF – File Format

 

 

To write an ELF file virus, one must have a complete understanding of the ELF file format. ELF stands for Executable and Linking Format. Linux uses this format for its executable files, whereas Windows uses the Portable Executable (PE) file format.

Our very first program, b.c, has no main function, only a simple function called abc.

 

b.c

abc()

{

}

gcc -c b.c

 

ls -l b.o

-rw-r--r--   1 root     root          728 Nov 14 13:14 b.o

 

The -c option tells gcc to compile only and not to call the linker, which it would otherwise do by default. Hence the output is the object file b.o. Once this object file is created, we write a program that will read the contents of this file.

 

a.c

#include <stdio.h>

main()

{

FILE *fp;

fp = fopen("b.o","r");

printf("%x ",fgetc(fp));

printf("%c",fgetc(fp));

printf("%c",fgetc(fp));

printf("%c",fgetc(fp));

printf("\n");

}

 

gcc a.c -o a

./a

7f ELF

 

The object file b.o is opened in read mode using the trustworthy fopen function. The first byte retrieved using the fgetc function is printed in hex, which is 0x7f, and the next three are printed as chars. The byte 7f followed by the chars ELF is what every ELF file begins with; this is called the magic number.

 

a.c

#include <stdio.h>

main()

{

FILE *fp;

fp = fopen("b.o","r");

fseek(fp,4,0);

printf("%d ",fgetc(fp));

printf("%d ",fgetc(fp));

printf("%d ",fgetc(fp));

printf("\n");

}

 

Output

1 1 1

 

This program displays the 3 bytes that follow the magic number. The fseek function skips the first 4 bytes so that reading starts at the fifth byte. The output shows that each of the three bytes has the value 1.

 

The first of the 3 numbers is the class and can have one of two values, 1 for a 32 bit object and 2 for a 64 bit object. As we are not rich enough to use 64 bits, the fifth byte is 1. The sixth byte is for the endianness: 1 specifies LSB or little endian and 2 specifies MSB or big endian. This byte decides whether the high byte of a multi-byte value is stored first or last. On little endian machines like the Intel family, the low byte comes first followed by the high byte. The seventh byte is for the version number and it is always 1.

 

a.c

#include <stdio.h>

char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};

char *endian[] = {"Invalid", "Little Endian","Big Endian"};

char *version[] = {"Invalid Version","Current Version"};

FILE *fp;

main()

{

int i;

fp = fopen("b.o","r");

fseek(fp,4,0);

i = fgetc(fp);

printf("%s \n",class[i]);

i = fgetc(fp);

printf("%s \n",endian[i]);

i = fgetc(fp);

printf("%s \n",version[i]);

}

 

Output

32-bit objects

Little Endian

Current Version

 

This program puts the above numbers into words. A byte is read from disk and then used as an index into an array of strings. The string found there is then printed out using the printf function.

 

a.c

#include <stdio.h>

struct elfhdr 

{

char e_ident[16];

short int e_type;

short int e_machine;

int e_version;

char *e_entry;

int e_phoff;

int e_shoff;

int e_flags;

short int e_ehsize;

short int e_phentsize;

short int e_phnum;

short int e_shentsize;

short int e_shnum;

short int e_shstrndx;

};

struct elfhdr e;

char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};

char *endian[] = {"Invalid", "Little Endian","Big Endian"};

char *version[] = {"Invalid Version","Current Version"};

char *filetype[] = {"None","Rel Obj","Executable","Dynamic","Core","Num"};

char *machine[] = {"None","WE32100","SPARC","80386","MK68000","MK88000","80486","80860","MIPS"};

FILE *fp;

main()

{

fp = fopen("b.o","r");

fread(&e,52,1,fp);

printf("%x %c%c%c %s  %s  %s\n",e.e_ident[0],e.e_ident[1],e.e_ident[2],

e.e_ident[3],class[e.e_ident[4]],endian[e.e_ident[5]],version[e.e_ident[6]]);

printf("Type         :%d  %s \n",e.e_type,filetype[e.e_type]);

printf("Machine     :%d  %s \n",e.e_machine,machine[e.e_machine]);

printf("Version     :%d  %s \n",e.e_version,version[e.e_version]);

printf("Entry Point     :%x\n",e.e_entry);

printf("Size of Initial Structure     :%d\n",e.e_ehsize);

printf("Flags                 :%d\n",e.e_flags);

printf("Program Header Offset         :%d\n",e.e_phoff);

printf("PHO : Structure Size %d No. of Entries %d\n",e.e_phentsize,e.e_phnum);

printf("Section Header Offset         :%d\n",e.e_shoff);

printf("SHO: Structure Size %d No. of Entries %d\n",e.e_shentsize,e.e_shnum);

printf("String Table Section Number     :%d\n",e.e_shstrndx);

}

 

Output

7f ELF 32-bit objects  Little Endian  Current Version

Type         :1  Rel Obj

Machine     :3  80386

Version     :1  Current Version

Entry Point     :0

Size of Initial Structure     :52

Flags                 :0

Program Header Offset         :0

PHO : Structure Size 0 No. of Entries 0

Section Header Offset         :164

SHO: Structure Size 40 No. of Entries 8

String Table Section Number     :5

 

The first 52 bytes are read into e, a variable of type struct elfhdr. This structure maps the initial bytes of an ELF file. The first seven bytes have been explained and the remaining nine bytes of e_ident are pad bytes, hence all 0's. The 17th and 18th bytes are for the file type, where 1 stands for an object file, 2 for an exe file, 3 for a dll or shared library and 4 for a core file. The next short is for the machine, where 3 is for an Intel 386. The arrays of strings are used to give a more readable output. This is followed by an int, or four bytes, for the version; ours is 1 or current version.

 

The next four bytes are a pointer to where the first executable instruction begins in memory. As we are dealing with an object file, it is 0. Then come the offsets of two important tables, the program header table and the section header table. The next two ints tell us that the program header offset is 0, as there is no such table here, and that the section header offset is 164. The next int is for processor specific flags, which is 0. The short after that gives the size of the ELF header, which is 52 bytes.

 

The next two shorts are the size of one program header entry and the number of entries. As there are no program headers, both values are zero. These are followed by the size of one section header entry and the number of entries.

Each section header is 40 bytes long and there are 8 of them. The last member is the string table section number, which will be attended to in greater detail in a short while.

 

c.c

#include <stdio.h>

#include <sys/stat.h>

FILE *fp;

struct stat st;

main()

{

fp = fopen("b.o","r");

printf("Fileno %d\n",fileno(fp));

fstat(fileno(fp),&st);

printf("Size  %ld\n",(long)st.st_size);

}

 

gcc -o c c.c

./c

Fileno 3

Size  605

 

ls -l b.o

-rw-r--r--   1 root     root          605 Nov 14 13:14 b.o

 

This program figures out the size of the file b.o. At first, the fileno function converts the pointer fp into a file handle. Since this is the first file that the program opens, it is given the number 3; the numbers 0, 1 and 2 are taken up by standard input, standard output and standard error.

 

The fstat function takes a file handle and fills in the stat structure. This structure has a member st_size that holds the file size, 605 bytes in our case.

 

a.c

#include <stdio.h>

#include <stdlib.h>

#include <sys/stat.h>

struct elfhdr 

{

char e_ident[16];

short int e_type;

short int e_machine;

int e_version;

char *e_entry;

int e_phoff;

int e_shoff;

int e_flags;

short int e_ehsize;

short int e_phentsize;

short int e_phnum;

short int e_shentsize;

short int e_shnum;

short int e_shstrndx;

};

char *class[]={"Invalid Class","32-bit objects", "64-bit objects"};

char *endian[] = {"Invalid", "Little Endian","Big Endian"};

char *version[] = {"Invalid Version","Current Version"};

char *filetype[] = {"None","Rel Obj","Executable","Dynamic","Core","Num"};

char *machine[] = {

"None","WE32100","SPARC","80386","MK68000","MK88000","80486","80860","MIPS"};

FILE *fp;struct stat st;

struct elfhdr *e;

char *p;

main()

{

fp = fopen("b.o","r");

fstat(fileno(fp),&st);

p = (char *) malloc(st.st_size);

fread(p,st.st_size,1,fp);

e = (struct elfhdr *)p;

printf("%x %c%c%c %s  %s  %s\n",e->e_ident[0],e->e_ident[1],e->e_ident[2],

e->e_ident[3],class[e->e_ident[4]],endian[e->e_ident[5]],version[e->e_ident[6]]);

printf("Type         :%s \n",filetype[e->e_type]);

printf("Machine     :%s \n",machine[e->e_machine]);

printf("Version     :%s \n",version[e->e_version]);

printf("Entry Point     :%x\n",e->e_entry);

printf("Size of Initial Structure     :%d\n",e->e_ehsize);

printf("Flags                 :%d\n",e->e_flags);

printf("Program Header Offset         :%d\n",e->e_phoff);

printf("PHO : Structure Size %d No. of Entries %d\n",e->e_phentsize,e->e_phnum);

printf("Section Header Offset         :%d\n",e->e_shoff);

printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);

printf("String Table Section Number     :%d\n",e->e_shstrndx);

}

 

Output

7f ELF 32-bit objects  Little Endian  Current Version

Type         :Rel Obj

Machine     :80386

Version     :Current Version

Entry Point     :0

Size of Initial Structure     :52

Flags                 :0

Program Header Offset         :0

PHO : Structure Size 0 No. of Entries 0

Section Header Offset         :164

SHO: Structure Size 40 No. of Entries 8

String Table Section Number     :5

 

This example works on the same principle as the earlier one but has more style. As before, we get the size of the file using fstat and then use malloc to allocate an area of memory of that size. Thereafter, using the fread function, the entire file is read into memory. The variable e, which is a pointer to a structure, is then set to the start of this memory and used to read the header. From the programming standpoint it is much easier to deal with pointers to structures than with actual structures. There is no other change.

 

a.c

#include <stdio.h>

#include <stdlib.h>

#include <sys/stat.h>

struct elfhdr  

{

char e_ident[16];

short int e_type;

short int e_machine;

int e_version;

char *e_entry;

int e_phoff;

int e_shoff;

int e_flags;

short int e_ehsize;

short int e_phentsize;

short int e_phnum;

short int e_shentsize;

short int e_shnum;

short int e_shstrndx;

};

struct elf_shdr 

{

int name;

int type;

int flags;

int addr;

int offset;

int size;

int link;

int info;

int align;

int esize;

};

FILE *fp;struct stat st;

struct elfhdr *e;

struct elf_shdr *s;

char *p;

int i;

main()

{

fp = fopen("b.o","r");

fstat(fileno(fp),&st);

p = (char *) malloc(st.st_size);

fread(p,st.st_size,1,fp);

e = (struct elfhdr *)p;

printf("Section Header Offset         :%d\n",e->e_shoff);

printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);

printf("String Table Section Number     :%d\n",e->e_shstrndx);

s = (struct elf_shdr *)(p + e->e_shoff);

printf("\n");

printf("%-3s %-15s %-5s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-6s\n",

"No","Name","Type","Flags","Address","Offset","Size","Link","Info","Align","Esize");

for(i = 0; i < e->e_shnum; i++)

{

printf("%-3d %-15d %-5d %-5d %-8x %-8d %-5d %-5d %-5d %-5d %-6d\n",

i,s->name,s->type,s->flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);

s++;

}

}

 

Output

Section Header Offset         :164

SHO: Structure Size 40 No. of Entries 8

String Table Section Number     :5

 

No  Name            Type  Flags Address  Offset   Size  Link  Info  Align Esize

0   0               0     0     0        0        0     0     0     0     0

1   27              1     6     0        52       5     0     0     4     0

2   33              1     3     0        60       0     0     0     4     0

3   39              8     3     0        60       0     0     0     4     0

4   44              1     0     0        60       51    0     0     1     0

5   17              3     0     0        111      53    0     0     1     0

6   1               2     0     0        484      112   7     6     4     16

7   9               3     0     0        596      9     0     0     1     0

 

This program displays all the section headers. The initial ELF header gives the offset of the section header table, the size of each entry and the number of entries. The offset e->e_shoff is added to p, the start of the file in memory, to give a pointer to where the section headers begin. These section headers have a fixed layout, and their values are simply displayed in a loop.

 

a.c

#include <stdio.h>

#include <stdlib.h>

#include <sys/stat.h>

struct elfhdr 

{

char e_ident[16];

short int e_type;

short int e_machine;

int e_version;

char *e_entry;

int e_phoff;

int e_shoff;

int e_flags;

short int e_ehsize;

short int e_phentsize;

short int e_phnum;

short int e_shentsize;

short int e_shnum;

short int e_shstrndx;

};

struct elf_shdr 

{

int name;

int type;

int flags;

int addr;

int offset;

int size;

int link;

int info;

int align;

int esize;

};

FILE *fp;struct stat st;

struct elfhdr *e;

struct elf_shdr *s,*sh;

char *p,*str;

int i;

main()

{

fp = fopen("b.o","r");

fstat(fileno(fp),&st);

p = (char *) malloc(st.st_size);

fread(p,st.st_size,1,fp);

e = (struct elfhdr *)p;

printf("Section Header Offset         :%d\n",e->e_shoff);

printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);

printf("String Table Section Number     :%d\n",e->e_shstrndx);

sh = (struct elf_shdr *)(p + e->e_shoff);

s= sh;

printf("\n");

printf("%-3s %-15s %-5s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-6s\n",

"No","Name","Type","Flags","Address","Offset","Size","Link","Info","Align","Esize");

for(i = 0; i < e->e_shnum; i++)

{

printf("%-3d %-15d %-5d %-5d %-8x %-8d %-5d %-5d %-5d %-5d %-6d\n",

i,s->name,s->type,s->flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);

s++;

}

s = sh + e->e_shstrndx;

str = p + s->offset;

for(i=0; i < s->size; i++)

{

if ( i%6 == 0)

printf("\n");

if (str[i] == 0)

printf("%2d) %3d  0 ",i,str[i]);

else

printf("%2d) %3d %2c ",i,str[i],str[i]);

}

printf("\n");

}

 

Output

Section Header Offset         :164

SHO: Structure Size 40 No. of Entries 8

String Table Section Number     :5

 

 

No  Name            Type  Flags Address  Offset   Size  Link  Info  Align Esize

0   0               0     0     0        0        0     0     0     0     0

1   27              1     6     0        52       5     0     0     4     0

2   33              1     3     0        60       0     0     0     4     0

3   39              8     3     0        60       0     0     0     4     0

4   44              1     0     0        60       51    0     0     1     0

5   17              3     0     0        111      53    0     0     1     0

6   1               2     0     0        484      112   7     6     4     16

7   9               3     0     0        596      9     0     0     1     0

 

 0)   0  0  1)  46  .  2) 115  s  3) 121  y  4) 109  m  5) 116  t

 6)  97  a  7)  98  b  8)   0  0  9)  46  . 10) 115  s 11) 116  t

12) 114  r 13) 116  t 14)  97  a 15)  98  b 16)   0  0 17)  46  .

18) 115  s 19) 104  h 20) 115  s 21) 116  t 22) 114  r 23) 116  t

24)  97  a 25)  98  b 26)   0  0 27)  46  . 28) 116  t 29) 101  e

30) 120  x 31) 116  t 32)   0  0 33)  46  . 34) 100  d 35)  97  a

36) 116  t 37)  97  a 38)   0  0 39)  46  . 40)  98  b 41) 115  s

42) 115  s 43)   0  0 44)  46  . 45)  99  c 46) 111  o 47) 109  m

48) 109  m 49) 101  e 50) 110  n 51) 116  t 52)   0  0

 

This example is similar to the earlier one, but at the end it displays the contents of one section, the string table. In the ELF header, the last member holds the section number of the string table, which is 5 in our case.

 

The variable sh points to the start of the section header table. Adding 5 to it makes it point to section header number 5, which is the sixth structure as counting starts at 0. The offset member gives the beginning of this section from the start of the file and the size member is its total size. Then, using a for loop, each byte of the section is displayed as a number and as a char, preceded by its relative offset within the section.

A look at the string table simply shows strings terminated by a null. The byte at offset 0 is itself a 0.

 

Let's look at the values of the name field. Instead of storing the name of the section, an offset into the string table is stored. Thus the last section's name is at offset 9 in the string table and its actual name is .strtab. Section names always start with a dot.

 

a.c

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <sys/stat.h>

struct elfhdr   

{

char e_ident[16];

short int e_type;

short int e_machine;

int e_version;

char *e_entry;

int e_phoff;

int e_shoff;

int e_flags;

short int e_ehsize;

short int e_phentsize;

short int e_phnum;

short int e_shentsize;

short int e_shnum;

short int e_shstrndx;

};

struct elf_shdr  

{

int name;

int type;

int flags;

int addr;

int offset;

int size;

int link;

int info;

int align;

int esize;

};

FILE *fp;struct stat st;

struct elfhdr *e;

struct elf_shdr *s,*sh;

char *p,*str;

int i;

char flags[4];

char *stype[] = {"NULL","PROGBITS","SYMTAB","STRTAB","RELA","HASH","DYNAMIC",

"NOTE","NOBITS","REL","SHLIB","DYNSYM","NUM"};

main()

{

fp = fopen("b.o","r");

fstat(fileno(fp),&st);

p = (char *) malloc(st.st_size);

fread(p,st.st_size,1,fp);

e = (struct elfhdr *)p;

printf("Section Header Offset         :%d\n",e->e_shoff);

printf("SHO: Structure Size %d No. of Entries %d\n",e->e_shentsize,e->e_shnum);

printf("String Table Section Number     :%d\n",e->e_shstrndx);

sh = (struct elf_shdr *)(p + e->e_shoff);

s = sh + e->e_shstrndx;

str = p + s->offset;

s= sh;

printf("\n");

printf("%-2s %-12s %-10s %-5s %-8s %-8s %-5s %-5s %-5s %-5s %-5s\n",

"No","Name","Type","Flg","Address","Offset","Size","Link","Info","Align","Esize");

for(i = 0; i < e->e_shnum; i++)

{

printf("%-2d %-12s ",i,str+s->name);

if(s->type <= 12)

printf("%-10s",stype[s->type]);

else

printf("%10x",s->type);

strcpy(flags,"");

if((s->flags & 0x01) == 1)

strcat(flags,"W");

if((s->flags & 0x02) == 2)

strcat(flags,"A");

if((s->flags & 0x04) == 4)

strcat(flags,"E");

printf(" %-5s %-8x %-8d %-5d %-5d %-5d %-5d %-5d\n",

flags,s->addr,s->offset,s->size,s->link,s->info,s->align,s->esize);

s++;

}

}

 

Output

Section Header Offset         :164

SHO: Structure Size 40 No. of Entries 8

String Table Section Number     :5

 

No Name         Type       Flg   Address  Offset   Size  Link  Info  Align Esize

0               NULL             0        0        0     0     0     0     0

1  .text        PROGBITS   AE    0        52       5     0     0     4     0

2  .data        PROGBITS   WA    0        60       0     0     0     4     0

3  .bss         NOBITS     WA    0        60       0     0     0     4     0

4  .comment     PROGBITS         0        60       51    0     0     1     0

5  .shstrtab    STRTAB           0        111      53    0     0     1     0

6  .symtab      SYMTAB           0        484      112   7     6     4     16

7  .strtab      STRTAB           0        596      9     0     0     1     0

 

This program performs the same task as before, but it also prints out the name of each section. The str variable points to the start of the string table section, and the name offset is added to it. These names are null terminated. The values of the flags member are also printed in a readable form.

W, A and E stand for writable, allocated (the section occupies memory when the program runs) and executable.

 

e.c

abc()  

{

printf("hi\n");

printf("bye\n");

printf("Good\n");

fopen(0,0);

}

 

//gcc -c e.c

//ld -o e e.o -lc -e abc

//objdump -d e

 

We now create an executable e that calls two functions, printf and fopen. The gcc compiler creates the object file e.o and the linker ld creates the executable. Normally gcc calls the linker ld for us; here we run ld ourselves. The -lc option links the shared library libc.so and, since there is no main function, the -e option makes abc the entry point of the program, that is, the first function to be called.

 

a.c

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <sys/stat.h>

struct elf_hdr 

{

char e_ident[16];

short int type;

short int machine;

int version;

char *entry;

int phoff;

int shoff;

int flags;

short int ehsize;

short int phentsize;

short int phnum;

short int shentsize;

short int shnum;

short int shstrndx;

};

struct elf_phdr  

{

int type;

int offset;

int vaddr;

int paddr;

int filesz;

int memsz;

int flags;

int align;

};

struct elf_shdr  

{

int name;

int type;

int flags;

int addr;

int offset;

int size;

int link;

int info;

int aalign;

int entsize;

};

struct elf_hdr *h;

struct elf_phdr *p;

struct elf_shdr *s, *s1,*u;

struct sym  {

int name;

unsigned int value;

int size;

unsigned char info;

unsigned char other ;

short int shndx;

};

struct rel   {

long offset;

int info;

};

struct rel *r;

struct dyn  {

long tag;

long addr;

};

char *dtype[] =

{

"NULL", "NEEDED","PLTRELSZ","PLTGOT","HASH","STRTAB","SYMTAB","RELA",

"RELASZ","RELAENT","STRSZ","SYMENT","INIT","FINI","SONAME","RPATH",

"SYMBOLIC","REL","RELSZ","RELENT","PLTREL","DEBUG","TEXTREL","JMPREL"

};

struct dyn *d;

char *bind[] = {"LOCAL","GLOBAL","WEAK"};

char *type[] = {"NOTYPE", "OBJECT","FUNC", "SECTION","FILE"};

struct sym *t; int no,j,k;

FILE *fpi;

char *e, *e1;int i;

char *class[] = {"Null","32-bits Object","64-bits Object"};

char *endian[] = {"Null","Little Endian","Big Endian"};

char *filetype[] = {"None","Rel Obj","Executable", "Dynamic","Core","Num"};

char *machine[] = {"None", "WE32100","Sparc","80386","MK68000",

           "MK88000","80486","80860","MIPS"};

char *version[] = {"Invalid Version", "Current Version"};

struct stat st;

int i,y;

long *hs;

char *ptype[] = { "Null","Load","Dynamic", "Interp","Note", "Shared lib", "PHDR","Num"};

char *stype[] ={"NULL","PROGBITS","SYMTAB","STRTAB", "RELA", "HASH",

"DYNAMIC","NOTE","NOBITS","REL","SHLIB", "DYNSYM", "NUM"};

char  strname[100];

unsigned long h1;

char flg[5];

unsigned char *so;

int ii;

long bucket, chain;

int *buck, *ch;

char * symstr;

struct sym *t;int k;

unsigned long abc(unsigned char *name) 

{

unsigned long h=0,g=0;

while(*name)

{

h = ( h << 4 ) + *name++;

if(g = h & 0xf0000000)

h ^= g >> 24;

h &= ~g;

}

return h;

}

void rel()

{

no = s->size / s->entsize;

printf("Relocation entries..%d for %s\n",no,strname);

r = (struct rel *) (e + s->offset);

for (k = 1; k <= no; k++)

{

printf("%08x %08x ",r->offset, r->info);

j = r->info; j = j &0x000000ff;

printf("%03d \n",j);

r++;

}

}

void strtab() 

{

for(ii=0; ii< s->size; ii++)

{

if(so[ii] == 0)

printf("\n");

else

printf("%c", so[ii]);

}

printf("\n");

}

void dynamic()  {

no = s->size / s->entsize;

printf("Number of Entries %d\n",no);

d = (struct dyn *) so;

for(j =1 ; j <= no; j++)

{

if (d->tag <= 23)

printf("%2d  %-8s %d\n",j, dtype[d->tag], d->addr);

else

printf("%2d  %08x %d\n",j, d->tag, d->addr);

d++;

}

}

void dynsym()  {

u = (struct elf_shdr *)( e + h->shoff );

u = u + s->link;

symstr = e + u->offset;

t = (struct sym *)(e + s->offset);

no = s->size / s->entsize;

printf("No. of Symbols %d\n", no);

printf("    Name                  Value       Size     O  SInd  Info         \n");

for(ii = 1; ii <= no ; ii++)

{

printf("%03d  %-25s  %08x   %04d  ", t->name, symstr+t->name, t->value,

t->size);

printf("%01d  %04d  %04d  ", t->other, t->shndx, t->info);

j = t->info;

j = j >> 4;

printf("%-6s  ", bind[j]);

k = t->info ; k = k&0xF;

printf(" %-7s ", type[k]);

h1 = abc(e1 + t->name);

printf("\n");

//printf("%02d ", h1%3);

//y = buck[h1%3];

//printf("%02d  %02d\n",y,ch[y]);

t++;

}

}

void symtab()  {

u = (struct elf_shdr *)( e + h->shoff );

u = u + s->link;

symstr = e + u->offset;

t = (struct sym *)(e + s->offset);

no = s->size / s->entsize;

printf("No. of symbols %d\n",no);

printf("    Name        Value     Size   O  SInd   Info\n");

for(ii = 1; ii <=no;ii++)

{

printf("%3d   %03d   %-33s   %08x  %04d  ",ii, t->name, symstr+t->name,

t->value, t->size);

printf("%01d  %04d  %04d  ", t->other, t->shndx, t->info);

j = t->info;

j = j >> 4;

printf("%-6s", bind[j]);

k = t->info ; k = k & 0xF;

printf("%-7s  \n",type[k]);

t++;

}

}

void interp()    {

for(ii = 0 ; ii <s->size-1; ii++)

printf("%c",so[ii]);

printf("\n");

}

void plt()       {

int j=0;

for(ii = 0 ; ii <s->size; ii++,j++){

if(j == 16){

j = 0;

printf("\n");

}

printf("%02x ",so[ii]);

}

printf("\n");

}

void text()     {

int j=0;

for(ii = 0 ; ii <s->size; ii++,j++){

if( j == 16){

j = 0;

printf("\n");

}

printf("%02x ",so[ii]);

}

printf("\n");

}

void got()    {

long * k;

k = (long *)so;

for(ii = 0 ; ii <s->size; ii+=4)

{

printf("%08x: %08x\n",s->addr+ii,*k);

k++;

}

printf("\n");

}

main(int argc, char **argv)

{

fpi = fopen(argv[1],"r");

fstat(fileno(fpi),&st);

e = (char *)malloc(st.st_size);

fread(e,st.st_size,1,fpi);

h=(struct elf_hdr *) e;

printf("Header ..\n");

printf("%2x  %c  %c  %c  %-15s  %-15s  %d\n",h->e_ident[0],

h->e_ident[1],h->e_ident[2],h->e_ident[3],class[h->e_ident[4]],

endian[h->e_ident[5]], h->e_ident[6]);

printf("FileType .. %s\n",filetype[h->type]);

printf("Machine  .. %s\n",machine[h->machine]);

printf("Version  .. %s\n",version[h->version]);

printf("Entry Point %x\n",h->entry);

printf("Program Header Offset %d\n", h->phoff);

printf("Section Header Offset %d\n", h->shoff);

printf("Flags                %d\n",h->flags);

printf("Header Size           %d\n",h->ehsize);

printf("Program Header Entry size %d\n",h->phentsize);

printf("No. of Program Headers    %d\n",h->phnum);

printf("Section Header Entry size %d\n",h->shentsize);

printf("No. of Section Headers    %d\n",h->shnum);

printf("String Section            %d\n",h->shstrndx);

p = (struct elf_phdr *)(e + h->phoff);

printf("\nProgram Header \n\n");

printf("Type      Offset    Vaddr    Paddr   Filesz    Memsz    Flags   Align\n");

for(i = 0; i<h->phnum; i++)

{

printf("%-10s  %4d %8x %8x",ptype[p->type],p->offset,p->vaddr,p->paddr);

printf("   %4d    %4d    %4d    %d",p->filesz,

p->memsz,p->flags,p->align);

printf("\n");

p++;

}

printf("\nSection Header \n\n");

s = (struct elf_shdr *)(e + h->shoff);

s1 = s + h->shstrndx ;

e1 = e + s1->offset;

printf("       Name                Type  Flags   Addr      Offset   Size   Link  Info   aalign  entsize\n");

for(i = 0; i < h->shnum; i++)

{

printf("%02d  ",i);

strcpy(strname,(e1+s->name));

printf("%-15s   ",strname);

if(s->type <= 12)

printf("%10s   ",stype[s->type]);

else

printf("%10x   ",s->type);

flg[0]=0;

if((s->flags & 0x01) == 1)

strcat(flg,"W");

if((s->flags & 0x02) == 2)

strcat(flg,"A");

if((s->flags & 0x04) == 4)

strcat(flg,"E");

printf("%-4s ",flg);

printf("%8x  %8x  %4d   %4d  %4d    %4d   %4d \n",

 s->addr,s->offset,s->size,s->link,s->info,s->aalign,s->entsize);

so = e + s->offset;

switch(s->type)

{

case 1 :

if(strcmp(".interp",strname) == 0)

interp();

if(strcmp(".text",strname) == 0)

text();

if(strcmp(".plt",strname) == 0)

plt();

if(strcmp(".got",strname) == 0)

got();

break;

case 2 :

printf("SYMTAB\n");

symtab();

break;

case 3 :

printf("STRTAB\n");

break;

case 4 :

printf("RELA");

break;

case 5 :

printf("HASH\n");

break;

case 6 :

printf("\nDYNAMIC\n");

dynamic();

break;

case 7 :

printf("NOTES");

break;

case 8 :

printf("NOBITS");

break;

case 9 :

printf("REL\n");

rel();

break;

case 10 :

printf("SHLIB");

break;

case 11 :

printf("DYNSYM\n");

dynsym();

break;

case 12 :

printf("NUM");

break;

default:

printf("");

}

printf("\n");

s++;

}

}

 

Let us first disassemble the executable e with objdump:

objdump -d e

Its output is as follows.

e:     file format elf32-i386

 

Disassembly of section .plt:

080481ac <.plt>:

 80481ac:    ff 35 d8 92 04 08        pushl  0x80492d8

 80481b2:    ff 25 dc 92 04 08        jmp    *0x80492dc

 80481b8:    00 00                    add    %al,(%eax)

 80481ba:    00 00                    add    %al,(%eax)

 80481bc:    ff 25 e0 92 04 08        jmp    *0x80492e0

 80481c2:    68 00 00 00 00           push   $0x0

 80481c7:    e9 e0 ff ff ff           jmp    80481ac <abc-0x30>

 80481cc:    ff 25 e4 92 04 08        jmp    *0x80492e4

 80481d2:    68 08 00 00 00           push   $0x8

 80481d7:    e9 d0 ff ff ff           jmp    80481ac <abc-0x30>

Disassembly of section .text:

 

080481dc <abc>:

 80481dc:    55                       push   %ebp

 80481dd:    89 e5                    mov    %esp,%ebp

 80481df:    83 ec 08                 sub    $0x8,%esp

 80481e2:    83 ec 0c                 sub    $0xc,%esp

 80481e5:    68 23 82 04 08           push   $0x8048223

 80481ea:    e8 cd ff ff ff           call   80481bc <abc-0x20>

 80481ef:    83 c4 10                 add    $0x10,%esp

 80481f2:    83 ec 0c                 sub    $0xc,%esp

 80481f5:    68 27 82 04 08           push   $0x8048227

 80481fa:    e8 bd ff ff ff           call   80481bc <abc-0x20>

 80481ff:    83 c4 10                 add    $0x10,%esp

 8048202:    83 ec 0c                 sub    $0xc,%esp

 8048205:    68 2c 82 04 08           push   $0x804822c

 804820a:    e8 ad ff ff ff           call   80481bc <abc-0x20>

 804820f:    83 c4 10                 add    $0x10,%esp

 8048212:    83 ec 08                 sub    $0x8,%esp

 8048215:    6a 00                    push   $0x0

 8048217:    6a 00                    push   $0x0

 8048219:    e8 ae ff ff ff           call   80481cc <abc-0x10>

 804821e:    83 c4 10                 add    $0x10,%esp

 8048221:    c9                       leave 

 8048222:    c3                       ret   

Running our own program on e produces the following output.

 

Header ..

7f  E  L  F  32-bits Object   Little Endian    1

FileType .. Executable

Machine  .. 80386

Version  .. Current Version

Entry Point 80481dc

Program Header Offset 52

Section Header Offset 952

Flags                0

Header Size           52

Program Header Entry size 32

No. of Program Headers    5

Section Header Entry size 40

No. of Section Headers    19

String Section            16

 

Program Header

 

Type      Offset    Vaddr    Paddr   Filesz    Memsz    Flags   Align

PHDR          52  8048034  8048034    160     160       5    4

Interp       212  80480d4  80480d4     19      19       4    1

Load           0  8048000  8048000    574     574       5    4096

Load         576  8049240  8049240    180     180       6    4096

Dynamic      576  8049240  8049240    160     160       6    4

 

Section Header

 

       Name                Type  Flags   Addr      Offset   Size   Link  Info   aalign  entsize

00                          NULL               0         0     0      0     0       0      0

 

01  .interp             PROGBITS   A     80480d4        d4    19      0     0       1      0

/usr/lib/libc.so.1

 

02  .hash                   HASH   A     80480e8        e8    32      3     0       4      4

HASH

 

03  .dynsym               DYNSYM   A     8048108       108    48      4     1       4     16

DYNSYM

No. of Symbols 3

    Name                  Value       Size     O  SInd  Info        

000                             00000000   0000  0  0000  0000  LOCAL    NOTYPE

011  printf                     080481bc   0057  0  0000  0018  GLOBAL   FUNC

018  fopen                      080481cc   0053  0  0000  0018  GLOBAL   FUNC

 

04  .dynstr               STRTAB   A     8048138       138    44      0     0       1      0

STRTAB

 

05  .gnu.version        6fffffff   A     8048164       164     6      3     0       2      2

 

06  .gnu.version_r      6ffffffe   A     804816c       16c    48      4     1       4      0

 

07  .rel.plt                 REL   A     804819c       19c    16      3     8       4      8

REL

Relocation entries..2 for .rel.plt

080492ec 00000107 007

080492f0 00000207 007

 

08  .plt                PROGBITS   AE    80481ac       1ac    48      0     0       4      4

ff 35 e4 92 04 08 ff 25 e8 92 04 08 00 00 00 00

ff 25 ec 92 04 08 68 00 00 00 00 e9 e0 ff ff ff

ff 25 f0 92 04 08 68 08 00 00 00 e9 d0 ff ff ff

 

09  .text               PROGBITS   AE    80481dc       1dc    77      0     0       4      0

55 89 e5 83 ec 08 83 ec 0c 68 29 82 04 08 e8 cd

ff ff ff 83 c4 10 83 ec 0c 68 2d 82 04 08 e8 bd

ff ff ff 83 c4 10 83 ec 0c 68 32 82 04 08 e8 ad

ff ff ff 83 c4 10 83 ec 08 68 38 82 04 08 68 3a

82 04 08 e8 a8 ff ff ff 83 c4 10 c9 c3

 

10  .rodata             PROGBITS   A     8048229       229    21     

0     0       1      0

 

11  .data               PROGBITS   WA    8049240       240     0     

0     0       4      0

 

12  .dynamic             DYNAMIC   WA    8049240       240   160     

4     0       4      8

 

DYNAMIC

Number of Entries 20

 1  NEEDED   1

 2  HASH     134512872

 3  STRTAB   134512952

 4  SYMTAB   134512904

 5  STRSZ    44

 6  SYMENT   16

 7  DEBUG    0

 8  PLTGOT   134517472

 9  PLTRELSZ 16

10  PLTREL   17

11  JMPREL   134513052

12  6ffffffe 134513004

13  6fffffff 1

14  6ffffff0 134512996

15  NULL     0

16  NULL     0

17  NULL     0

18  NULL     0

19  NULL     0

20  NULL     0

 

13  .got                PROGBITS   WA    80492e0       2e0    20     

0     0       4      4

080492e0: 08049240

080492e4: 00000000

080492e8: 00000000

080492ec: 080481c2

080492f0: 080481d2

 

14  .bss                  NOBITS   WA    80492f4       2f4     0     

0     0       4      0

NOBITS

15  .comment            PROGBITS               0       2f4    51     

0     0       1      0

 

16  .shstrtab             STRTAB               0       327   142     

0     0       1      0

STRTAB

 

17  .symtab               SYMTAB               0       6b0   448    

18    20       4     16

SYMTAB

No. of symbols 28

    Name                                 Value     Size   O  SInd  

Info

  1   000                                       00000000  0000  0  0000 

0000  LOCAL NOTYPE  

  2   000                                       080480d4  0000  0  0001 

0003  LOCAL SECTION 

  3   000                                       080480e8  0000  0  0002 

0003  LOCAL SECTION 

  4   000                                       08048108  0000  0  0003 

0003  LOCAL SECTION 

  5   000                                       08048138  0000  0  0004 

0003  LOCAL SECTION 

  6   000                                       08048164  0000  0  0005 

0003  LOCAL SECTION 

  7   000                                       0804816c  0000  0  0006 

0003  LOCAL SECTION 

  8   000                                       0804819c  0000  0  0007 

0003  LOCAL SECTION 

  9   000                                       080481ac  0000  0  0008 

0003  LOCAL SECTION 

 10   000                                       080481dc  0000  0  0009 

0003  LOCAL SECTION 

 11   000                                       08048229  0000  0  0010 

0003  LOCAL SECTION 

 12   000                                       08049240  0000  0  0011 

0003  LOCAL SECTION 

 13   000                                       08049240  0000  0  0012 

0003  LOCAL SECTION 

 14   000                                       080492e0  0000  0  0013 

0003  LOCAL SECTION 

 15   000                                       080492f4  0000  0  0014 

0003  LOCAL SECTION 

 16   000                                       00000000  0000  0  0015 

0003  LOCAL SECTION 

 17   000                                       00000000  0000  0  0016 

0003  LOCAL SECTION 

 18   000                                       00000000  0000  0  0017 

0003  LOCAL SECTION 

 19   000                                       00000000  0000  0  0018 

0003  LOCAL SECTION 

 20   001   e.c                                 00000000  0000  0  -015 

0004  LOCAL FILE    

 21   005   _DYNAMIC                            08049240  0000  0  0012 

0017  GLOBALOBJECT  

 22   014   abc                                 080481dc  0077  0  0009 

0018  GLOBALFUNC    

 23   018   __bss_start                         080492f4  0000  0  -015 

0016  GLOBALNOTYPE  

 24   030   printf@@GLIBC_2.0                   080481bc  0057  0  0000 

0018  GLOBALFUNC    

 25   048   _edata                              080492f4  0000  0  -015 

0016  GLOBALNOTYPE  

 26   055   _GLOBAL_OFFSET_TABLE_               080492e0  0000  0  0013 

0017  GLOBALOBJECT  

 27   077   _end                                080492f4  0000  0  -015 

0016  GLOBALNOTYPE  

 28   082   fopen@@GLIBC_2.1                    080481cc  0053  0  0000 

0018  GLOBALFUNC    

 

18  .strtab               STRTAB               0       870    99     

0     0       1      0

STRTAB

 

/*

printf("hi\n")

abc

call 80481bc

This call lands in the plt. The instruction at 80481bc is

plt

80481bc : jmp *0x80492ec

80492ec belongs to the got section; *0x80492ec yields 080481c2

got

80492ec : 80481c2

80481c2 is again in the plt section, where the instruction is

plt

80481c2: 68 00 00 00 00   - pushl $0x0

The push instruction carries the offset into the relocation area. Since the offset is 0, the first relocation entry is the one that will be looked at. In other words, the position of the relocation entry is pushed on the stack. The next instruction in the plt is as follows

80481c7: e9 e0 ff ff ff     - jmp 80481ac

This jump instruction points to the beginning of the plt section

80481ac: ff 35 e4 92 04 08  - pushl 0x80492e4

This pushl instruction pushes the second entry of the got. While the file is lying idle on disk this entry is 0, but when the executable is running the loader replaces it with a value of its own that identifies our program; the value lies in the memory occupied by the loader, which normally begins at 40000000. After the push, the next instruction is

80481b2 : ff 25 e8 92 04 08 - jmp *0x80492e8

This jump goes through the third entry of the got. When the program is executed, the loader replaces the zeros stored there with the address of fixup, the function that finally resolves the addresses of external symbols. When fixup is called there are two values on the stack: the first one pushed is the relocation offset and the second is the value supplied by the loader. The relocation entry found at that offset has two ints: the first gives the location in the got that has to be patched and the second identifies the symbol to be relocated.

Relocation entry :

80492ec  00000107

address  00000107 >> 8 = 00000001 = 1 entry in dynsym = printf

in got                        value = 80481bc

Example case - fopen

text

fopen(0,0)

804821f         e8  a8 ff ff ff    call 80481cc

plt

80481cc :       ff 25 f0 92 04 08  jmp *0x80492f0

got

80492f0 : 80481d2

plt

80481d2 : 68  08 00 00 00    pushl $0x8

relocation entry

8:

80492f0 :  00000207

addr        00000207 >> 8 = 00000002 = 2 entry in dynsym = fopen

in got                    value 80481cc

plt

80481d7 : e9 d0 ff ff ff       jmp 80481ac

80481ac : ff 35 e4 92 04 08    pushl 80492e4

got

80492e4      00000000         (filled in by the loader at run time)

plt

80481b2  : ff 25 e8 92 04 08   jmp *0x80492e8

got

80492e8      00000000          (address of fixup)

*/

 

First, the file passed as the parameter e is opened and its size is determined. Memory for the entire file is then allocated with the malloc function and the file is read into it.
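
The steps above can be sketched as one small function. This is only an illustration of the approach and not the listing used in this chapter; the name read_whole_file and its error handling are our own assumptions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole file at `path` into a malloc'd buffer.
   On success return the buffer and store its length in *size;
   on failure return NULL. The caller frees the buffer. */
unsigned char *read_whole_file(const char *path, long *size)
{
    FILE *fp = fopen(path, "rb");
    unsigned char *buf;
    long n;

    if (fp == NULL)
        return NULL;

    /* Seek to the end to learn the file size, then go back to the start. */
    if (fseek(fp, 0, SEEK_END) != 0 || (n = ftell(fp)) < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    /* Allocate at least one byte so an empty file still gives a valid buffer. */
    buf = malloc(n > 0 ? (size_t)n : 1);
    if (buf != NULL && fread(buf, 1, (size_t)n, fp) != (size_t)n) {
        free(buf);
        buf = NULL;
    }
    fclose(fp);

    if (buf != NULL)
        *size = n;
    return buf;
}
```

A caller would pass the name of the executable, for instance read_whole_file("e", &size), and then treat the first 52 bytes of the buffer as the ELF header.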

 

The pointer h points to the initial 52 bytes and is used to display this initial header. Since the file is an executable and not an object file, it has 5 program headers of 32 bytes each. Every program header is then displayed in the same manner as a section header.

 

A program header is used when loading the executable into memory. Its first member, the type, can have up to 9 different values. The value PT_LOAD specifies that the segment must be loaded into memory. The type Interp gives the path name of the interpreter to be used, the type Dynamic holds the information required by the dynamic linker, and PHDR describes the program header table itself.
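
As a small sketch, the type values 0 to 6 defined by the ELF specification can be turned into the labels seen in the dump with a lookup table. The function name segment_type_name and the label "Other" are our own choices for illustration.

```c
/* Map a program header p_type value to a label like the ones in the dump. */
const char *segment_type_name(unsigned long type)
{
    static const char *names[] = {
        "Null",     /* 0 unused entry */
        "Load",     /* 1 PT_LOAD: segment is loaded into memory */
        "Dynamic",  /* 2 dynamic linking information */
        "Interp",   /* 3 path name of the interpreter */
        "Note",     /* 4 auxiliary information */
        "Shlib",    /* 5 reserved */
        "Phdr"      /* 6 the program header table itself */
    };

    if (type < sizeof names / sizeof names[0])
        return names[type];
    return "Other";  /* processor-specific or OS-specific values */
}
```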

 

The offset member holds the position in the file at which the segment begins and vaddr is its address in memory. The p_paddr member is ignored here as it matters only on systems that use physical addressing. The file size and mem size members give the size of the segment on disk and in memory respectively.
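
The offset and vaddr members together let us translate a memory address back to a position in the file. The sketch below assumes a cut-down structure of our own, struct segment, holding just the three fields needed; it is not the real program header layout.

```c
#include <stddef.h>

/* The three program header fields needed to map an address to the file. */
struct segment {
    unsigned long offset;  /* where the segment starts in the file */
    unsigned long vaddr;   /* where the segment is placed in memory */
    unsigned long filesz;  /* how many bytes of it are present in the file */
};

/* Return the file offset that holds virtual address `addr`,
   or -1 if no segment's file contents cover that address. */
long vaddr_to_offset(const struct segment *seg, size_t count, unsigned long addr)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (addr >= seg[i].vaddr && addr - seg[i].vaddr < seg[i].filesz)
            return (long)(seg[i].offset + (addr - seg[i].vaddr));
    }
    return -1;
}
```

With the two Load segments from the dump, (0, 8048000, 574) and (576, 8049240, 180), the address 80492e0 of the .got section maps to offset 2e0, which agrees with the Offset column of the section header.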

 

The alignment decides the byte boundary on which the segment can begin. Next, the pointer s is set to the start of the section headers. The names of the sections are kept in a string table. So, using the shstrndx member of the main header, the section header for this string table is looked up first. Thus s1 points to the section header of the string table, and the offset of the string table itself is retrieved from it and stored in the e1 variable.
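
The lookup of a section name can be sketched as follows. This version reads the fields by their byte offsets in a 32-bit little endian file instead of using the structures of the real program, so the helper names rd16, rd32 and section_name are assumptions made for this example only.

```c
/* Little endian readers, so the sketch does not depend on the host byte order. */
static unsigned long rd16(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8);
}

static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Return the name of section `index` in a 32-bit ELF image held in memory. */
const char *section_name(const unsigned char *image, unsigned index)
{
    unsigned long shoff     = rd32(image + 32);  /* e_shoff: start of section headers */
    unsigned long shentsize = rd16(image + 46);  /* e_shentsize: 40 bytes in ELF32 */
    unsigned long shstrndx  = rd16(image + 50);  /* e_shstrndx */

    /* Section header of the name string table (s1 in the text). */
    const unsigned char *s1 = image + shoff + shstrndx * shentsize;
    unsigned long strtab = rd32(s1 + 16);        /* its sh_offset (e1 in the text) */

    /* sh_name, the first field of a section header, is an offset into that table. */
    const unsigned char *s = image + shoff + index * shentsize;
    return (const char *)(image + strtab + rd32(s));
}
```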

 

All the section headers are displayed in a loop, along with the contents of some of the sections. The type is displayed as a string, followed by the flags and then the rest of the fields. Then comes a huge switch statement that displays further information about the section.

 

First, the sections whose type is 1, or PROGBITS, are looked at. Such a section holds information that is defined by the program, and only the program can make sense of it. The ELF specification says nothing about the nature of the information stored in such sections.

 

Type 2 is a symbol table, type 3 a string table, type 4 relocation entries with explicit addends and type 5 a hash table. Type 6 is for dynamic linking and type 7 for leaving notes behind. Type 8, NOBITS, occupies no space in the file but otherwise resembles PROGBITS. Type 9 is also for relocations but without explicit addends, and type 10 is reserved and when used makes the ELF file non-conforming. Type 11 is for dynamic symbols and the rest are reserved.
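
The section type numbers, as given in the ELF specification, can be kept in a table so that the type is displayed as a string. This is a sketch of that idea; the function name section_type_name and the fallback label "OTHER" are ours.

```c
/* Map an sh_type value to its name in the ELF specification. */
const char *section_type_name(unsigned long type)
{
    static const char *names[] = {
        "NULL",      /*  0 unused section header */
        "PROGBITS",  /*  1 contents defined by the program */
        "SYMTAB",    /*  2 symbol table */
        "STRTAB",    /*  3 string table */
        "RELA",      /*  4 relocations with explicit addends */
        "HASH",      /*  5 symbol hash table */
        "DYNAMIC",   /*  6 dynamic linking information */
        "NOTE",      /*  7 notes */
        "NOBITS",    /*  8 occupies no space in the file */
        "REL",       /*  9 relocations without explicit addends */
        "SHLIB",     /* 10 reserved */
        "DYNSYM"     /* 11 dynamic symbol table */
    };

    if (type < sizeof names / sizeof names[0])
        return names[type];
    return "OTHER";  /* for example 6fffffff, shown for .gnu.version in the dump */
}
```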

 

Let's first display some PROGBITS sections. The section named .interp holds the path name of the program interpreter, which is /usr/lib/libc.so.1 in our case; the function interp simply displays this name. The section .text holds the actual code that will be executed. The text function simply displays the raw hex bytes, with a newline after every 16 numbers.
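
A display routine of this kind might look like the sketch below. It is not the text function of our program; the name hex_dump and the returned line count are additions made so that the behaviour is easy to check.

```c
#include <stdio.h>

/* Print `n` bytes as two-digit hex numbers, 16 per line.
   Returns the number of lines printed. */
int hex_dump(FILE *out, const unsigned char *p, size_t n)
{
    size_t i;
    int lines = 0;

    for (i = 0; i < n; i++) {
        fprintf(out, "%02x", p[i]);
        if (i % 16 == 15 || i + 1 == n) {
            fputc('\n', out);   /* newline after the 16th byte or the last byte */
            lines++;
        } else {
            fputc(' ', out);
        }
    }
    return lines;
}
```

For the 77 bytes of the .text section this gives four full lines and a fifth line of 13 bytes, as in the dump.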

 

The next section in focus is the one that contains dynamic symbols. For example, functions like printf and fopen are found in the .so files and have not been written by us. The .dynsym section lists the functions whose code is not supplied by us. The section .symtab holds the full list of symbols, including our own function abc. Finally, the .dynamic section contains information for dynamic linking. In the .text section there is the instruction call 80481bc, which is nothing but a call to the printf function.
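
Each symbol carries an info byte that packs the binding and the type together. The sketch below splits it, assuming that the Info column of the dump is printed in decimal (so 0018 is the number 18); the function names are ours.

```c
/* High four bits of the info byte: 0 LOCAL, 1 GLOBAL, 2 WEAK. */
unsigned sym_bind(unsigned info)
{
    return info >> 4;
}

/* Low four bits of the info byte: 0 NOTYPE, 1 OBJECT, 2 FUNC, 3 SECTION, 4 FILE. */
unsigned sym_type(unsigned info)
{
    return info & 0xf;
}
```

For printf the info value 18 splits into binding 1 and type 2, which is the GLOBAL FUNC shown next to it in the dump.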

 

Since the printf function is called thrice, this call instruction is seen thrice in the output. The catch is that the code of the printf function lives in libc.so, whereas the address 80481bc lies within our own file.
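
The address 80481bc is not stored in the call instruction as such. The bytes e8 cd ff ff ff hold a displacement relative to the instruction that follows the call. A minimal sketch of the arithmetic, with a function name of our own:

```c
#include <stdint.h>

/* Target of a 5-byte `call rel32` instruction (opcode e8) located at `addr`. */
uint32_t call_target(uint32_t addr, const unsigned char *insn)
{
    /* The little endian 32-bit displacement follows the opcode byte. */
    uint32_t rel = (uint32_t)insn[1] | ((uint32_t)insn[2] << 8) |
                   ((uint32_t)insn[3] << 16) | ((uint32_t)insn[4] << 24);

    /* The displacement counts from the next instruction. Unsigned
       arithmetic wraps around, which takes care of negative displacements. */
    return addr + 5 + rel;
}
```

The first call in the .text dump sits at 80481ea, so the target is 80481ef minus 33 hex, which is 80481bc.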

 

The objdump disassembly of the plt section clearly shows that this address falls within the plt area. The code there starts with a jump instruction, jmp *0x80492ec. This means that the call to the printf function first runs code in the plt, beginning with this jump. The address 0x80492ec lies in the GOT, or Global Offset Table.

 

The GOT section is nothing but a series of addresses. The address stored at 0x80492ec is 0x80481c2, which happens to be the instruction that follows the jmp at 0x80481bc in the PLT. The star in the jmp ensures that execution continues at 0x80481c2, the value stored at 0x80492ec, and not at 0x80492ec itself.
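
The GOT slot used by a PLT entry can be read straight from the bytes of the .plt section shown in the dump. The sketch below does this for the ff 25 form of the jump; the function name plt_got_slot is hypothetical.

```c
#include <stdint.h>

/* For a PLT entry that starts with `jmp *abs32` (bytes ff 25), return the
   address of the GOT slot it jumps through, or 0 if the bytes differ. */
uint32_t plt_got_slot(const unsigned char *entry)
{
    if (entry[0] != 0xff || entry[1] != 0x25)
        return 0;

    /* The four bytes after the opcode are the slot address, low byte first. */
    return (uint32_t)entry[2] | ((uint32_t)entry[3] << 8) |
           ((uint32_t)entry[4] << 16) | ((uint32_t)entry[5] << 24);
}
```

The entry at 80481bc begins with ff 25 ec 92 04 08, so its slot is 80492ec, and the value stored there on disk, 80481c2, is simply the entry address plus the 6 bytes of the jump.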

 

As this is the first entry in the PLT, a 0 is pushed on the stack. This 0 is also the offset into the relocation table. The jmp that follows goes to 80481ac, which is the start of the plt section. The code at the beginning of the plt pushes the value stored at 0x80492e4 on the stack. This is the second entry in the GOT, which is zero as of now.

 

On running the program, the dynamic loader fills in this second entry with a value of its own that identifies our program; it lies around 0x40000000, where the loader is mapped in memory. After this push comes an indirect jump through 0x80492e8. This location is the third entry in the GOT and its value is also zero as of now. The dynamic loader stores here the address of a fix-up function that figures out the address of the printf function. This function then places the actual address of printf in the GOT so that the above complex process does not happen again. When it is called it has two values on the stack, the relocation offset of 0 and the second GOT entry.

 

As the offset is zero, the first relocation entry is picked up, whose second int is 00000107. A right shift by 8 bits gives 1, which is the entry for printf in the dynamic symbol table, the first one after the empty entry 0. Its value is 80481bc, the address used to call the printf function in the PLT.
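
The decoding of a relocation entry can be sketched in a few lines. The structure below mirrors the 8-byte entries of .rel.plt but uses names of our own, so treat it as an illustration and not as the declaration from the system headers.

```c
#include <stdint.h>

/* One entry of .rel.plt in a 32-bit file: two ints, 8 bytes in all. */
struct rel_entry {
    uint32_t offset;  /* address of the GOT slot to be patched */
    uint32_t info;    /* symbol index in the high 24 bits, type in the low 8 */
};

/* The value pushed by the PLT code is a byte offset into .rel.plt. */
unsigned rel_index_from_push(uint32_t pushed)
{
    return pushed / 8;
}

/* Index of the symbol in .dynsym. */
unsigned rel_symbol(const struct rel_entry *r)
{
    return r->info >> 8;
}

/* Relocation type; 7 is the jump slot type used for PLT entries on i386. */
unsigned rel_type(const struct rel_entry *r)
{
    return r->info & 0xff;
}
```

With the two entries from the dump, a pushed value of 0 selects 00000107, giving symbol 1 (printf), and a pushed value of 8 selects 00000207, giving symbol 2 (fopen).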

 

e.c

#include <stdio.h>

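int main(void)

{

int **p;

printf("hi\n");

printf("bye\n");

printf("Good\n");

fopen("e.c","r");

/* start of the GOT in this build; the address changes from build to build */

p = (int **)0x8049658;

/* display the first five GOT entries as address=value */

printf("%x=%x\n",(unsigned)p,(unsigned)*p);

p++;

printf("%x=%x\n",(unsigned)p,(unsigned)*p);

p++;

printf("%x=%x\n",(unsigned)p,(unsigned)*p);

p++;

printf("%x=%x\n",(unsigned)p,(unsigned)*p);

p++;

printf("%x=%x\n",(unsigned)p,(unsigned)*p);

/* the addresses of printf and fopen as seen by our own code */

printf("printf=%p\n",(void *)printf);

printf("fopen=%p\n",(void *)fopen);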

}

 

Output

hi

bye

Good

8049658=804957c

804965c=40015a38

8049660=4000bcb0

8049664=42015490

8049668=4204f0e0

printf=0x80482a4

fopen=0x80482b4

 

08049658: 0804957c

0804965c: 00000000

08049660: 00000000

08049664: 0804829a

08049668: 080482aa

0804966c: 080482ba

08049670: 00000000

 

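The program e.c is an extension of the above, with the abc function replaced by main. The first block of output comes from running it; the last block is the GOT of its executable e as stored on disk, displayed by our program a. The GOT in this case starts at 8049658. On disk the next two ints are zeroes, whereas in the running program the loader has filled them in with 40015a38 and 4000bcb0. The entry for printf at 8049668 has likewise changed from 080482aa, an address in the PLT, to 4204f0e0, the real address of printf in the library.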

 
