Many Trojans employ anti-analysis techniques. Some of these, such as VM detection, can be cleanly defeated with our automated countermeasures. Others require a little elbow grease. Among the latter is the use of obfuscated strings, as seen in the recently discovered banking Trojan “Shifu”. In this post, we will follow a particular sample (MD5 371cdeb618d2170419f02fc3d644ef43) and shed some light on the process we used to deobfuscate these strings.

The first hurdle was to obtain a snapshot of the running process without triggering any anti-debugging or anti-virtualization traps the sample might contain. Fortunately, our True Bare Metal (TBM) analysis system automates this chore, allowing our human reverse-engineers to go straight to the part that matters: unraveling the functionality of Shifu’s payload.

Once we had the process dump, we noticed tell-tale signs of string obfuscation: in particular, blocks of garbage repeatedly referred to in the disassembly:

[Image: blobs]

We identified the decryptor routine:

[Image: xor_pre]

Shifu’s author made use of ordinary XOR. The key consists of one byte, given as an argument to the decryptor on each invocation:

[Images: encoded1, encoded2]

More of the same:

[Image: bitcoin_post]
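The decryption step on its own is small enough to sketch outside IDA. The following is a minimal illustration of single-byte XOR decoding, not the sample's actual routine; the function name `xor_decode` and the test string are hypothetical and not taken from the sample.

```python
def xor_decode(blob, key):
    """Decode a blob obfuscated with a single-byte XOR key."""
    if not 0 < key <= 0xFF:
        raise ValueError("key must be a non-zero single byte")
    # XOR every byte of the blob with the same one-byte key
    return bytes(bytearray(b ^ key for b in bytearray(blob)))


# Hypothetical example data, not taken from the sample: obfuscate a string
# with key 0x5A, then recover it. XOR is its own inverse, so the same
# function both encodes and decodes.
encoded = xor_decode(b"wallet.dat", 0x5A)
print(xor_decode(encoded, 0x5A).decode("ascii"))
```

Because the key is a single byte supplied at each call site, recovering the key and the blob length from the disassembly is all that is needed to decode a string.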

This information proved sufficient for crafting a simple IDA Pro script, which decrypts the strings and annotates the disassembly. All we had to do was to walk the invocations of the decryptor and parse out the arguments, afterwards applying the XOR:

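# Written against the legacy IDAPython API (IDA 6.x, Python 2).
import idaapi
import idc
from idc import Byte, LocByName, MakeComm
from idautils import *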

# max steps backward
max_search_depth = 5

# code offset factor
delta = 0x37000000

# table of irregular places. we did these by hand.
irreg = {0x37215CC : [0x0C, 0x80],
         0x3723768 : [0x97, 0x97],
         0x37229FC : [0x0A, 0x93],
}

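def find_xor_args(ad):
    # Walk backward from the call at `ad`, collecting the blob address
    # (loaded into edx) and the two pushed arguments (length, key).
    plaintext = ""
    edx = False
    params = []
    a = ad
    for i in range(0, max_search_depth):
        a = idaapi.prev_head(a, idc.MinEA())
        op = idc.GetMnem(a)
        val0 = idc.GetOperandValue(a, 0)
        val1 = idc.GetOperandValue(a, 1)
        # operand value 2 is the register number of edx
        if (op == 'mov') and (val0 == 2) and (edx == False):
            edx = val1
            special = irreg.get(edx)
            if special:
                # irregular call site: take length and key from the table
                params = list(special)
                print("!")
        elif (op == 'push') and (len(params) < 2):
            params += [val0]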
    # done?
    if edx and (len(params) == 2):
        blob_len = params[0]
        xor_key = params[1]
        if (blob_len == 0) or (xor_key == 0) or (xor_key > 0xFF):
            plaintext = "NOT DECODED: 0x%X : is sad" % ad
        else:
            # apply the code offset, then XOR each blob byte with the key
            edx += delta
            s = "".join(chr(Byte(edx + j) ^ xor_key) for j in range(blob_len))
            plaintext = "DECODED = '%s'" % s
    else:
        plaintext = "NOT DECODED: 0x%X : is extra-sad" % ad
    # annotate the call site and echo the result to the output window
    MakeComm(ad, plaintext)
    print(plaintext)
        
## Decryptor function
kd_ea = LocByName("xor_decryptor")

# Find all code references to decryptor
for ref in CodeRefsTo(kd_ea, 1):
    find_xor_args(ref)
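To make the backward walk concrete without an IDA database, here is a minimal sketch that runs the same search over a hand-written instruction list. The tuple format, the `find_args` name, and every address and constant in `listing` are made up for illustration; only the search logic (up to five steps back, first `mov` into edx, first two pushes) mirrors the script above.

```python
MAX_SEARCH_DEPTH = 5  # same backward limit as the IDA script


def find_args(insns, call_index):
    """Walk backward from a call and return (blob_addr, blob_len, xor_key).

    `insns` is a hypothetical list of (mnemonic, operand0, operand1) tuples
    standing in for the disassembly. Returns None if an argument is missing.
    """
    blob_addr = None
    params = []
    first = max(call_index - MAX_SEARCH_DEPTH, 0)
    for mnem, op0, op1 in reversed(insns[first:call_index]):
        if mnem == "mov" and op0 == "edx" and blob_addr is None:
            blob_addr = op1          # address of the encrypted blob
        elif mnem == "push" and len(params) < 2:
            params.append(op0)       # nearest push first: length, then key
    if blob_addr is None or len(params) != 2:
        return None
    return blob_addr, params[0], params[1]


# Made-up call site: addresses and constants are illustrative only.
listing = [
    ("push", 0x93, None),             # XOR key
    ("push", 0x0A, None),             # blob length
    ("mov", "edx", 0x21000),          # blob address
    ("call", "xor_decryptor", None),
]
print(find_args(listing, 3))
```

A call site whose arguments are computed at run time rather than pushed as constants would yield bogus or missing values here, which is the case the 'sad' branches of the script flag.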

Certain locations in Shifu invoked the decryptor with programmatically generated arguments, triggering the ‘sad’ message in the above script. The simplest possible approach was to decode the constants in question and enumerate these cases manually (the ‘irreg’ dictionary).
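An alternative to enumerating such cases by hand, and not the approach used for the table above, is to exploit the tiny key space: with a one-byte key there are only 255 candidates to try. The sketch below lists every key that turns a blob into printable ASCII and leaves the final choice to the analyst. The `candidate_keys` name and the test blob are hypothetical, and it assumes the blob length is already known.

```python
def candidate_keys(blob):
    """List (key, plaintext) pairs whose single-byte XOR output is printable ASCII."""
    data = bytearray(blob)
    hits = []
    for key in range(1, 0x100):
        decoded = bytearray(b ^ key for b in data)
        # keep the key only if every decoded byte is printable ASCII
        if all(0x20 <= b <= 0x7E for b in decoded):
            hits.append((key, bytes(decoded)))
    return hits


# Hypothetical blob: a made-up string obfuscated with key 0x97.
secret = bytes(bytearray(b ^ 0x97 for b in bytearray(b"keylog.txt")))
for key, text in candidate_keys(secret):
    print("0x%02X %r" % (key, text))
```

Several keys usually survive the printable-ASCII filter, since keys that differ only in a low bit tend to map text to other printable characters, so the output is a shortlist to review rather than a definitive answer.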

The following is an excerpt of the strings found by the script:

[Image: blobs]

From these annotations, it became possible to walk the disassembled process image and shed some light on the functionality of Shifu’s payload. In this fragment, you can see some hints of keylogging and communications with a command-and-control server. Other strings (not shown for the sake of brevity) suggest a wide variety of other potential features in this sample such as anti-analysis techniques, generating non-malicious network traffic as a smokescreen, searching for unencrypted Bitcoin and Litecoin wallets, and stealing data from various financial applications running on the machine.


Perhaps reflecting Shifu’s patchwork nature, we also found other encrypted strings that did not use the decryptor above. We used similar techniques to handle these. One set of these strings was a list of security products, presumably accompanied by mechanisms to defeat them. However, the most interesting one was this list of strings related to Italian banks:

[Image: banking_post]

Of course, decrypting strings is only the tip of the iceberg. Keep an eye on our blog for future posts as we delve deeper into this and other malware.