Using Binary Ninja’s HLIL for Config Extraction

emotet, binary ninja, malware
Extracting Emotet C2 data using Binary Ninja’s HLIL.
Published

February 1, 2022

Static Emotet Configuration Extraction

The goal here is to reproduce this technique from VMRay’s post using Binary Ninja. This post from Open Analysis was also very helpful. With those posts as the foundation I was able to focus on the Binary Ninja API.

This turned out to be much easier than anticipated. Binary Ninja’s High Level Intermediate Language (HLIL) did most of the work once I figured out how to access it.

Sample used: c688e079a16b3345c83a285ac2ae8dd48680298085421c225680f26ceae73eb7

Why HLIL

The VMRay post linked above does an excellent job of explaining why an Intermediate Representation is a good way to tackle configuration extraction for samples like this. I highly suggest you read it before continuing.

As you can see below there is a ton of garbage happening in this function.

Figure: unmodified disassembly

Switching to HLIL makes it clear that most of the instructions in this function are junk and can be ignored. This makes it relatively easy to extract the data we need.

Figure: HLIL

from ipaddress import ip_address, IPv4Address
import struct
from typing import Tuple

import binaryninja

Sample

sha256: c688e079a16b3345c83a285ac2ae8dd48680298085421c225680f26ceae73eb7

In this sample, the functions containing C2 data write the IP and port to two distinct variables.

view = binaryninja.open_view("./c688e079a16b3345c83a285ac2ae8dd48680298085421c225680f26ceae73eb7")

Create Target Function List

I used the following criteria for selecting functions to process.

  • function signature int64_t sub_XXXXXXX(int32_t*, int32_t*)
  • must have exactly one basic block

I thought more criteria would be needed, but this worked just fine. Granted, it was only used on a single sample.

def check_parameter_types(params):
    for param in params:
        # bail out as soon as a parameter is untyped or is not an int32_t pointer
        if not (param.type and param.type.get_string() == 'int32_t*'):
            return False

    return True

c2_funcs = []

for f in view.functions:
    if len(f.parameter_vars.vars) == 2 and len(f.basic_blocks) == 1:
        if check_parameter_types(f.parameter_vars.vars):
            c2_funcs.append(f)

display(c2_funcs)
[<func: x86_64@0x180001000>,
 <func: x86_64@0x180001528>,
 <func: x86_64@0x180001d48>,
 <func: x86_64@0x180001f58>,
 <func: x86_64@0x180002f64>,
 <func: x86_64@0x180003bd0>,
 <func: x86_64@0x180004868>,
 <func: x86_64@0x180004c38>,
 <func: x86_64@0x180004d58>,
 <func: x86_64@0x180004e50>,
 <func: x86_64@0x180006f38>,
 <func: x86_64@0x1800074a0>,
 <func: x86_64@0x1800081e4>,
 <func: x86_64@0x180008bc8>,
 <func: x86_64@0x180008dcc>,
 <func: x86_64@0x180008ee0>,
 <func: x86_64@0x180009820>,
 <func: x86_64@0x18000b610>,
 <func: x86_64@0x18000b9e4>,
 <func: x86_64@0x18000bcf8>,
 <func: x86_64@0x18000dab0>,
 <func: x86_64@0x18000dc28>,
 <func: x86_64@0x18000e70c>,
 <func: x86_64@0x18000f454>,
 <func: x86_64@0x18000fbc4>,
 <func: x86_64@0x18000fd58>,
 <func: x86_64@0x1800101bc>,
 <func: x86_64@0x18001092c>,
 <func: x86_64@0x180010d88>,
 <func: x86_64@0x180013278>,
 <func: x86_64@0x180014644>,
 <func: x86_64@0x18001519c>,
 <func: x86_64@0x180015598>,
 <func: x86_64@0x180015690>,
 <func: x86_64@0x180015c7c>,
 <func: x86_64@0x180016a10>,
 <func: x86_64@0x18001975c>,
 <func: x86_64@0x18001c5b4>,
 <func: x86_64@0x18001c9f8>,
 <func: x86_64@0x18001cf80>,
 <func: x86_64@0x18001d5c8>,
 <func: x86_64@0x18001e274>,
 <func: x86_64@0x18001e360>,
 <func: x86_64@0x18002103c>,
 <func: x86_64@0x18002339c>,
 <func: x86_64@0x180023584>,
 <func: x86_64@0x180025264>,
 <func: x86_64@0x180025498>,
 <func: x86_64@0x18002557c>,
 <func: x86_64@0x180025678>,
 <func: x86_64@0x1800264fc>,
 <func: x86_64@0x1800268c0>,
 <func: x86_64@0x1800269d0>,
 <func: x86_64@0x1800294d8>,
 <func: x86_64@0x180029c14>,
 <func: x86_64@0x18002a0f8>,
 <func: x86_64@0x18002a340>,
 <func: x86_64@0x18002a4cc>,
 <func: x86_64@0x18002aa74>,
 <func: x86_64@0x18002b054>,
 <func: x86_64@0x18002bf80>,
 <func: x86_64@0x18002c2b8>,
 <func: x86_64@0x18002c8b0>,
 <func: x86_64@0x18002caf8>]

Extract C2 values from Function Body

  1. Get the address where each parameter value is set.
  2. Iterate over the HLIL instructions to find our reference address.
  3. Extract constant and convert.

def extract_values(func) -> Tuple[IPv4Address, int]:
    # Get references to parameters
    ip_arg = func.parameter_vars.vars[0]
    port_arg = func.parameter_vars.vars[1]

    # Address of the first HLIL reference to each parameter
    ip_ref = func.get_hlil_var_refs(ip_arg)[0]
    port_ref = func.get_hlil_var_refs(port_arg)[0]

    ip = port = None
    for instr in func.hlil.instructions:
        # Both references are expected to be assignments (HLIL_ASSIGN);
        # operands[1] is the constant being assigned.
        if instr.address == ip_ref.address:
            ip = IPv4Address(instr.operands[1].value.value.to_bytes(4, byteorder='little'))
        elif instr.address == port_ref.address:
            # The port sits in the upper 16 bits of the 32-bit constant
            port_bytes = instr.operands[1].value.value.to_bytes(4, byteorder='little')
            port = struct.unpack('<H', port_bytes[2:4])[0]

    # Fail explicitly instead of returning unbound names
    if ip is None or port is None:
        raise ValueError(f'no C2 assignment found in {func.name}')
    return (ip, port)

for func in c2_funcs:
    try:
        ip, port = extract_values(func)
        print(f'{ip}:{port}')
    except (IndexError, TypeError, ValueError):
        print(f'{func.name} failed')
206.189.28.199:8080
164.68.99.3:8080
51.91.76.89:8080
185.4.135.165:8080
58.227.42.236:80
213.241.20.155:443
1.234.21.73:7080
159.65.88.10:8080
160.16.142.56:8080
91.207.28.33:8080
216.158.226.206:443
63.142.250.212:443
150.95.66.124:8080
51.91.7.5:8080
188.44.20.25:443
94.23.45.86:4143
82.165.152.127:8080
0.0.0.0:0
163.44.196.120:8080
5.9.116.246:8080
153.126.146.25:7080
103.75.201.2:443
172.104.251.154:8080
185.157.82.211:8080
79.137.35.198:8080
201.94.166.162:443
45.176.232.124:443
189.126.111.200:7080
209.250.246.206:443
51.254.140.238:7080
209.126.98.206:8080
103.43.46.182:443
167.99.115.35:8080
131.100.24.231:80
72.15.201.15:8080
151.106.112.196:8080
45.235.8.30:8080
27.54.89.58:8080
103.132.242.26:8080
146.59.226.45:443
101.50.0.91:8080
102.222.215.74:443
1.234.2.232:8080
183.111.227.137:8080
45.118.115.99:8080
77.81.247.144:8080
149.56.131.28:8080
196.218.30.83:443
103.70.28.102:8080
134.122.66.193:8080
203.114.109.124:443
173.212.193.249:8080
46.55.222.11:443
197.242.150.244:8080
209.97.163.214:443
212.24.98.99:8080
167.172.253.162:8080
119.193.124.41:7080
185.8.212.130:7080
110.232.117.186:8080
107.182.225.142:8080
212.237.17.99:8080
129.232.188.93:443
158.69.222.101:443
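
The conversion step can be checked without Binary Ninja. The sketch below is a standalone illustration, assuming (as the extraction code above does) that the IP constant holds the four octets in little-endian order and that the port sits in the upper 16 bits of its constant. The helper names decode_ip and decode_port and the two constants are hypothetical, built by hand to reproduce the first line of output.

```python
from ipaddress import IPv4Address
import struct


def decode_ip(value: int) -> IPv4Address:
    # Little-endian bytes of the constant, read as a packed IPv4 address
    return IPv4Address(value.to_bytes(4, byteorder='little'))


def decode_port(value: int) -> int:
    # The port occupies the upper 16 bits of the 32-bit constant
    port_bytes = value.to_bytes(4, byteorder='little')
    return struct.unpack('<H', port_bytes[2:4])[0]


# Hypothetical constants, built by hand for illustration
print(f'{decode_ip(0xC71CBDCE)}:{decode_port(0x1F900000)}')
```

For unsigned 32-bit values, decode_port gives the same result as (value >> 16) & 0xFFFF. The byte-slicing form is kept here only because it mirrors the code above.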

Struct Example

sha256: 9a4d2c6776f97e0afb4a0a99bfd40b34ac7ac2932693e587161536bb9acc9497

In this slightly more complex example, C2 data is referenced via a struct, which requires a bit more setup than the first example. Other than the struct, everything is similar.

Figure: struct example

view2 = binaryninja.open_view("./9a4d2c67-emotet.dll")

Create Target Function List

Same idea as above with the appropriate function signature.

c2_funcs_2 = []

for f in view2.functions:
    if len(f.parameter_vars.vars) == 1 and len(f.basic_blocks) == 1:
        if check_parameter_types(f.parameter_vars.vars):
            c2_funcs_2.append(f)
            
display(c2_funcs_2)
[<func: x86_64@0x180001648>,
 <func: x86_64@0x180001e70>,
 <func: x86_64@0x180001f44>,
 <func: x86_64@0x180002054>,
 <func: x86_64@0x180002150>,
 <func: x86_64@0x180002288>,
 <func: x86_64@0x1800027e4>,
 <func: x86_64@0x1800028e8>,
 <func: x86_64@0x180003db8>,
 <func: x86_64@0x1800054d4>,
 <func: x86_64@0x180006684>,
 <func: x86_64@0x180006bf8>,
 <func: x86_64@0x18000700c>,
 <func: x86_64@0x180009594>,
 <func: x86_64@0x180009fc4>,
 <func: x86_64@0x18000a790>,
 <func: x86_64@0x18000c858>,
 <func: x86_64@0x18000cb6c>,
 <func: x86_64@0x18000cc6c>,
 <func: x86_64@0x18000e3c8>,
 <func: x86_64@0x18000ea8c>,
 <func: x86_64@0x18000eb84>,
 <func: x86_64@0x18000f18c>,
 <func: x86_64@0x18001045c>,
 <func: x86_64@0x180010564>,
 <func: x86_64@0x1800106f8>,
 <func: x86_64@0x180011798>,
 <func: x86_64@0x1800118ac>,
 <func: x86_64@0x180011a30>,
 <func: x86_64@0x180011c20>,
 <func: x86_64@0x180011d1c>,
 <func: x86_64@0x180011ec4>,
 <func: x86_64@0x180011f94>,
 <func: x86_64@0x180012368>,
 <func: x86_64@0x18001268c>,
 <func: x86_64@0x180012780>,
 <func: x86_64@0x180012a98>,
 <func: x86_64@0x180012cc8>,
 <func: x86_64@0x180012e98>,
 <func: x86_64@0x1800134b4>,
 <func: x86_64@0x180013f3c>,
 <func: x86_64@0x1800141b0>,
 <func: x86_64@0x1800156fc>,
 <func: x86_64@0x1800157e8>,
 <func: x86_64@0x180015f74>,
 <func: x86_64@0x1800161a0>,
 <func: x86_64@0x1800162b8>,
 <func: x86_64@0x180016444>,
 <func: x86_64@0x180016a04>,
 <func: x86_64@0x180016cc4>,
 <func: x86_64@0x18001721c>,
 <func: x86_64@0x180017664>,
 <func: x86_64@0x180019f80>,
 <func: x86_64@0x18001cbb0>,
 <func: x86_64@0x18001dd04>,
 <func: x86_64@0x18001debc>,
 <func: x86_64@0x18001e498>,
 <func: x86_64@0x18001f784>,
 <func: x86_64@0x180020178>,
 <func: x86_64@0x180021e88>,
 <func: x86_64@0x180021f94>,
 <func: x86_64@0x1800239dc>,
 <func: x86_64@0x180023ad4>,
 <func: x86_64@0x180023cdc>]
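
Since the two selection loops differ only in the expected parameter count, they could be folded into one helper. This is a sketch rather than the code used for this post: select_c2_funcs is a hypothetical name, and it relies only on the attributes already used above (parameter_vars.vars, type.get_string() and basic_blocks). The objects at the bottom are fakes built with SimpleNamespace so the logic can be run without Binary Ninja.

```python
from types import SimpleNamespace


def select_c2_funcs(functions, n_params, param_type='int32_t*'):
    """Return single-block functions taking n_params parameters of param_type."""
    selected = []
    for f in functions:
        params = f.parameter_vars.vars
        if len(params) != n_params or len(f.basic_blocks) != 1:
            continue
        # Every parameter needs a known type that matches param_type
        if all(p.type and p.type.get_string() == param_type for p in params):
            selected.append(f)
    return selected


# Fake objects mimicking the attributes used above (not Binary Ninja types)
def fake_func(name, types, n_blocks):
    params = [SimpleNamespace(type=SimpleNamespace(get_string=lambda t=t: t))
              for t in types]
    return SimpleNamespace(name=name,
                           parameter_vars=SimpleNamespace(vars=params),
                           basic_blocks=[object()] * n_blocks)


funcs = [
    fake_func('two_ptrs', ['int32_t*', 'int32_t*'], 1),
    fake_func('one_ptr', ['int32_t*'], 1),
    fake_func('branchy', ['int32_t*'], 3),
    fake_func('wrong_type', ['char*'], 1),
]
print([f.name for f in select_c2_funcs(funcs, 1)])
```

With real views this would presumably be called as select_c2_funcs(view.functions, 2) for the first sample and select_c2_funcs(view2.functions, 1) for this one.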

Create the struct

The structure for c2 data in C is defined below:

struct C2 {
    int32_t ip;
    int32_t port;
};

Creating this in Binary Ninja is relatively simple with the proper documentation. As the documentation says, a structure and its name are defined separately, and the mapping between them is held in the view. That is why the struct, once created, has to be registered with the binary view using define_user_type.

# Type is not covered by the bare "import binaryninja" above
from binaryninja import Type

# Create the struct
c2_struct = Type.structure(members=[Type.int(4, False, 'ip'), Type.int(4, False, 'port')])
# Register with view
view2.define_user_type('c2_struct', c2_struct)
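
The extraction function in the next section depends on the field offsets of this struct: 0 for ip and 4 for port. As a quick sanity check outside Binary Ninja, the same layout can be modelled with Python's ctypes. This is only an illustration. The C2 class here is a stand-in for the Binary Ninja type, and unsigned 32-bit fields are assumed in order to match Type.int(4, False).

```python
import ctypes


class C2(ctypes.Structure):
    # Two unsigned 32-bit fields, mirroring the struct defined above
    _fields_ = [('ip', ctypes.c_uint32),
                ('port', ctypes.c_uint32)]


print(C2.ip.offset, C2.port.offset, ctypes.sizeof(C2))
```

The two offsets printed here are the values the offset comparisons in the extraction code test against.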

Extract Values

This function is very similar to the first extract values function with a few key changes.

  • Instead of getting the reference to each parameter, we first set the type to a pointer to the c2_struct type and get references to that.
  • Use offsets to determine which value we are dealing with. There might be a better way to do this but I’m relatively new to Binary Ninja and haven’t figured it out yet.
  • UPDATE: Use instr = func.get_low_level_il_at(ref).high_level_il to get our instruction directly rather than iterating over all instructions in the function. I have not been able to get this to work on the original sample.

def extract_values_2(arch, func) -> Tuple[IPv4Address, int]:
    c2 = func.parameter_vars.vars[0]
    c2.name = 'c2_struct'
    # set the type to a pointer to the struct
    c2.type = Type.pointer(arch, c2_struct)
    # addresses where the parameter is referenced
    refs = [ref.address for ref in func.get_hlil_var_refs(c2)]

    ip = port = None
    for ref in refs:
        # go through low level IL to get high level
        instr = func.get_low_level_il_at(ref).high_level_il
        # operands[0] is the struct field being written, operands[1] the constant
        if instr.operands[0].offset == 0:
            ip = IPv4Address(instr.operands[1].value.value.to_bytes(4, byteorder='little'))
        elif instr.operands[0].offset == 4:
            # the port sits in the upper 16 bits of the 32-bit constant
            port_bytes = instr.operands[1].value.value.to_bytes(4, byteorder='little')
            port = struct.unpack('<H', port_bytes[2:4])[0]

    # fail explicitly instead of returning unbound names
    if ip is None or port is None:
        raise ValueError(f'no C2 assignment found in {func.name}')
    return (ip, port)

Process Functions

Once setup is complete, just process the list of functions and print the C2 data.

for func in c2_funcs_2:
    try:
        ip, port = extract_values_2(view2.arch, func)
        print(f'{ip}:{port}')
    except (IndexError, TypeError, ValueError):
        print(f'{func.name} failed')
73.255.26.122:12165
65.117.236.252:35278
108.164.161.77:11218
202.29.237.114:8080
110.251.57.20:56526
29.57.112.54:60474
25.75.78.170:23341
112.93.218.223:49565
60.130.68.172:14401
58.147.102.233:8030
90.243.35.105:28965
115.94.169.146:41714
30.56.33.4:32895
17.10.201.147:48369
69.124.172.227:64455
100.94.178.152:50457
198.211.118.165:443
62.15.215.195:32086
142.93.76.76:7080
108.43.184.195:16490
102.37.33.173:31347
165.227.153.100:8080
51.247.86.38:970
0.0.0.0:0
73.204.243.234:58720
30.130.55.57:48205
44.90.202.236:30791
86.1.1.244:52762
89.26.150.193:51072
118.22.245.177:14480
57.238.78.1:61990
51.0.140.144:52538
37.76.233.246:43490
86.156.61.105:10249
109.14.95.27:9268
75.202.215.95:76
68.22.90.86:60712
90.115.43.9:24380
98.99.175.223:18253
29.176.5.126:1313
20.85.29.7:33464
159.65.163.220:443
99.249.232.6:54384
106.162.95.165:42569
20.242.46.35:11211
47.130.199.198:40935
203.217.140.239:8080
110.34.174.151:33232
37.213.70.226:5112
100.163.237.26:50860
52.235.208.107:57114
91.143.35.250:37303
128.199.93.156:7080
114.26.45.46:56163
99.17.231.243:39482
90.188.179.131:63754
198.27.67.35:8080
37.8.5.45:42542
41.190.19.230:11714
38.218.214.127:24861
76.202.173.217:13855
75.55.249.32:60920
87.180.53.36:33400
116.125.120.88:443