Cerber is a popular ransomware family that is still active. In this blog post, we will analyze Cerber's configuration handling and dump its config using the Cuckoo Sandbox.

Prior analyses of Cerber already exist (like this one by Hasherezade).
As stated by Hasherezade, Cerber stores its configuration in an RCDATA resource bundled in the PE file's resource section. This RCDATA resource is encrypted, and Cerber uses a dedicated function to decrypt it.

We will begin by analyzing the following binary:

CRC32: EF4C42F6  
MD5: 9A7F87C91BF7E602055A5503E80E2313  
SHA-1: 193F407A2F0C7E1EAA65C54CD9115C418881DE42  

If we analyze the function whose call leaves a clear-text configuration loaded in memory, we can see that it uses the RC4 cryptographic algorithm.

char __usercall rc4@<al>(_BYTE *key@<eax>, unsigned __int16 len, void *configReadAddr, unsigned int configLen, void *configWriteAddr)  
{
  _BYTE *K; // edi@1
  char result; // al@1
  int n; // ecx@6
  char *SboxI; // eax@6
  char *Sbox_iterator; // ecx@8
  signed int keyLenght; // esi@8
  char S_i; // dl@9
  char *swapAux; // eax@9
  unsigned __int16 v13; // kr02_2@12
  _BYTE *config; // esi@13
  char *S_j; // eax@14
  char S__i; // dl@14
  char *aux; // ecx@14
  char Sbox[256]; // [sp+4h] [bp-A4h]@6
  unsigned __int16 v19; // [sp+104h] [bp+5Ch]@6
  unsigned int bytesLeft; // [sp+108h] [bp+60h]@13
  unsigned __int8 i; // [sp+10Eh] [bp+66h]@6
  unsigned __int8 j; // [sp+10Fh] [bp+67h]@6
  char PRGA_i_index; // [sp+11Bh] [bp+73h]@12

  K = key;                                      // seed for RC4, length should be 1 < len < 256
  result = canRead(key, len);
  if ( result )
  {
    result = canRead(configReadAddr, configLen);
    if ( result )
    {
      if ( !configWriteAddr )
        configWriteAddr = configReadAddr;
      result = canWrite(configWriteAddr, configLen);
      if ( result )
      {                                         // RC4 Crypto
        i = 0;
        j = 0;
        v19 = 0;
        n = 0;
        SboxI = Sbox;
        do                                      // Fill S-box with sequential values
          *SboxI++ = n++;
        while ( (unsigned __int16)n < 0x100u ); // First loop until 0x100 (256)
                                                // 
                                                // KSA block
        Sbox_iterator = Sbox;
        keyLenght = 256;
        do
        {
          S_i = *Sbox_iterator;
          j += *Sbox_iterator + K[i++];         // j = (j + S[i] + K[i])
          swapAux = &Sbox[j];                   // Swap S[i] and S[j]
          *Sbox_iterator = *swapAux;
          *swapAux = S_i;
          if ( i == len )
            i = 0;
          ++Sbox_iterator;
          --keyLenght;
        }
        while ( keyLenght );                    // Second loop, 256 iterations - KSA
                                                // 
                                                // PRGA block
        result = HIBYTE(v19);
        v13 = v19;
        j = v13 >> 8;
        PRGA_i_index = v13;
        if ( configLen > 0 )
        {
          config = configWriteAddr;
          bytesLeft = configLen;
          do
          {
            S_j = &Sbox[(unsigned __int8)++PRGA_i_index];
            S__i = *S_j;
            j += *S_j;                          // j = j + S[i]
            aux = &Sbox[j];                     // Swap S[i] and S[j]
            *S_j = *aux;
            *aux = S__i;
            result = config[(_BYTE *)configReadAddr - (_BYTE *)configWriteAddr] ^ Sbox[(unsigned __int8)(*S_j + S__i)];// t = (S[i] + S[j])
            *config++ = result;
            --bytesLeft;
          }
          while ( bytesLeft );                  // Enc / dec loop - PRGA
        }
      }
    }
  }
  return result;
}

The key for the RC4 (de/en)cryption is passed as the first parameter.

This means that the configuration is a raw (RCDATA) resource encrypted using RC4.
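To experiment with this outside a debugger, the decompiled routine can be approximated in a few lines of Python. This is a minimal sketch of textbook RC4 (S-box fill, KSA, PRGA) rather than a byte-exact port of the sample's function; the name `rc4` and its signature are my own choice.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """RC4 over `data` with `key`; the same call encrypts and decrypts."""
    if not 1 <= len(key) <= 256:
        raise ValueError("RC4 key must be 1..256 bytes long")

    # Fill the S-box with sequential values 0..255.
    sbox = list(range(256))

    # KSA: mix the key into the S-box (256 iterations, key index wraps).
    j = 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]

    # PRGA: generate the keystream and XOR it with the input.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + sbox[i]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]
        out.append(byte ^ sbox[(sbox[i] + sbox[j]) & 0xFF])
    return bytes(out)
```

Since RC4 is symmetric, calling `rc4(key, rc4(key, data))` returns `data` again, which is a handy sanity check before feeding it a resource dumped from a sample.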

If we follow the call graph to this function, we end up in this interesting-looking function:

int decrypt_config()  
{
  int v0; // edi@1
  HRSRC config; // eax@1
  HRSRC _config; // ebx@1
  HGLOBAL configRes; // eax@2
  int rscSize; // eax@4
  void *configWriteAddr; // ebx@5
  _BYTE *key; // esi@6
  unsigned __int16 keyLen; // ax@6
  int v8; // ST08_4@6
  char v10; // [sp+Ch] [bp-14h]@6
  HGLOBAL hResData; // [sp+10h] [bp-10h]@2
  LPCSTR lpString; // [sp+14h] [bp-Ch]@6
  void *readAddr; // [sp+18h] [bp-8h]@3
  unsigned int configLen; // [sp+1Ch] [bp-4h]@4

  v0 = 0;
  config = FindResourceW((HMODULE)0x400000, (LPCWSTR)0x8894, (LPCWSTR)0xA);
  _config = config;
  if ( config )
  {
    configRes = LoadResource((HMODULE)0x400000, config);
    hResData = configRes;
    if ( configRes )
    {
      readAddr = LockResource(configRes);
      if ( readAddr )
      {
        rscSize = SizeofResource((HMODULE)0x400000, _config);
        configLen = rscSize;
        if ( rscSize )
        {
          configWriteAddr = (void *)customHeapAlloc(rscSize);
          if ( configWriteAddr )
          {
            lpString = (LPCSTR)decrypt_string(&configKey, 6u, 24, 0);
            key = (_BYTE *)decrypt_string(&configKey, 6u, 24, 0);
            keyLen = lstrlenA(lpString);
            rc4(key, keyLen, readAddr, configLen, configWriteAddr);
            v0 = sub_413DC7(configWriteAddr, &v10);
            sub_414804(configWriteAddr, v8);
          }
        }
      }
      FreeResource(hResData);
    }
  }
  return v0;
}

It is clear that it loads a resource (and we know the config is stored as a resource; type 0xA is RT_RCDATA), then allocates some memory, decrypts a string (the string decryption function was identified by Hasherezade in her analysis) and uses that decrypted string as the key to decrypt the config resource. The configKey is a global variable at a fixed location but, as the code suggests, it is stored encrypted. We can also set a breakpoint on the call to rc4 and observe the decrypted key value (which, in this case, is "cerber" - big surprise).

Again, thanks to Hasherezade we know the decrypt_string function has the following parameters:

decrypt_string(char* input_buffer, DWORD input_lenght, DWORD key, BOOL is_unicode)  

And, as we can observe from the decompilation, they are all hardcoded (i.e., the string length is 6 and the decryption key is 24). The only thing left to know is the decryption algorithm. Here it is:

int __cdecl decrypt_string(void *input_buffer, unsigned int input_lenght, char key, char is_unicode)  
{
  unsigned int v5; // edi@3
  int zeroedMem; // esi@3
  int v7; // eax@8
  int v8; // ecx@9
  int *i; // eax@9
  void *heapAddr; // eax@14
  int _ret; // eax@19
  int ret; // [sp+4h] [bp-4h]@3

  if ( !isString(input_lenght, (const CHAR *)input_buffer) )
    return 0;
  ret = 0;
  getProcId();
  v5 = sub_40C656((char *)input_buffer, input_lenght);
  enterCritical(&decryptionMutex);
  zeroedMem = dword_41BF6C;
  if ( dword_41BF6C )
  {
    do
    {
      if ( *(_DWORD *)zeroedMem == v5 )
        break;
      zeroedMem = *(_DWORD *)(zeroedMem + 16);
    }
    while ( zeroedMem );
    if ( zeroedMem )
      goto LABEL_16;
  }
  zeroedMem = heapAllocZero(20);
  if ( zeroedMem )
  {
    v7 = dword_41BF6C;
    *(_DWORD *)zeroedMem = v5;
    if ( v7 )
    {
      v8 = v7;
      for ( i = (int *)(v7 + 16); *i; i = (int *)(*i + 16) )
        v8 = *i;
      *(_DWORD *)(v8 + 16) = zeroedMem;
    }
    else
    {
      dword_41BF6C = zeroedMem;
    }
    heapAddr = (void *)customHeapAlloc(input_lenght + 1);
    *(_DWORD *)(zeroedMem + 8) = heapAddr;
    if ( heapAddr )
    {
      rc4(&key, 4u, input_buffer, input_lenght, heapAddr);
      *(_BYTE *)(input_lenght + *(_DWORD *)(zeroedMem + 8)) = 0;
    }
LABEL_16:  
    if ( is_unicode )
    {
      if ( !*(_DWORD *)(zeroedMem + 12) )
        *(_DWORD *)(zeroedMem + 12) = sub_40BD2D(input_lenght, *(LPCSTR *)(zeroedMem + 8));
      _ret = *(_DWORD *)(zeroedMem + 12);
    }
    else
    {
      _ret = *(_DWORD *)(zeroedMem + 8);
    }
    ret = _ret;
  }
  leaveCritial(&decryptionMutex);
  return ret;
}

We can gather that Cerber checks some memory locations and, if certain conditions are met, calls the rc4 function on the string. Further dynamic analysis shows that when Cerber decrypts a string, it stores the result in a dedicated memory zone so it does not need to decrypt it again. In the end, though, strings are also RC4 encrypted.
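The decrypt-once-then-cache behaviour can be sketched as follows. Several things here are assumptions: the listing calls `rc4(&key, 4u, ...)`, so I assume the RC4 key is the 4 bytes of the `key` argument as pushed on the stack (little-endian, so 24 becomes `18 00 00 00`); `zlib.crc32` is only a hypothetical stand-in for the unidentified sub_40C656; and a dict replaces the linked list rooted at dword_41BF6C. The class and method names are mine.

```python
import struct
import zlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (KSA followed by PRGA)."""
    sbox, j = list(range(256)), 0
    for i in range(256):
        j = (j + sbox[i] + key[i % len(key)]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + sbox[i]) & 0xFF
        sbox[i], sbox[j] = sbox[j], sbox[i]
        out.append(byte ^ sbox[(sbox[i] + sbox[j]) & 0xFF])
    return bytes(out)


class StringDecryptor:
    """Decrypts each encrypted string once and serves repeats from a cache."""

    def __init__(self) -> None:
        # checksum of the encrypted bytes -> decrypted bytes
        self._cache = {}

    def decrypt(self, encrypted: bytes, key: int) -> bytes:
        tag = zlib.crc32(encrypted)  # hypothetical stand-in for sub_40C656
        if tag not in self._cache:
            # Assumption: the 4 key bytes are the DWORD argument, little-endian.
            rc4_key = struct.pack("<I", key)
            self._cache[tag] = rc4(rc4_key, encrypted)
        return self._cache[tag]
```

With the 6 encrypted bytes of configKey copied out of the binary, `StringDecryptor().decrypt(blob, 24)` should yield the same value seen at the breakpoint, provided the key-layout assumption holds for the sample at hand.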

Other variants

Let's explore other Cerber 1 samples and see how much they change, if at all.

CRC32: 70A6E916  
MD5: 17FCD7A7162298225B06D85D1D5A90EA  
SHA-1: 656E1E318B63FEDCF8B9D6A9FD907365A6A68AF6  

The sample is packed using the Nullsoft Scriptable Install System (NSIS), which is quite common for Cerber samples. Unpacking instructions are available here. Once unpacked, we are left with the payload.

Hashes for the payload:

CRC32: DBCA4B33  
MD5: 30E502B80E15F200A29762035374ABDA  
SHA-1: 57A6835FC55F5FD88162820FB64A2F9A848EFDBF  

Now we need to identify something that looks like the configuration decryption function in the new binary. I tried to use Diaphora for it, but the config decrypt function had a very bad match. Upon examination, we can conclude that Diaphora's match is not what we are after, as it does not use any resource-loading functions.

With that in mind, we can look for usages of FindResource in order to identify the configuration decryption function. It is called in two different places (sub_403742 and sub_401000). Both usages look somewhat like the first sample (they fetch a resource, lock it and measure its size, then call unnamed functions). We can set a breakpoint on both and see if either of them does configuration handling.

After setting the breakpoints, we can conclude that sub_403742 does not get called (at least not in normal execution) and sub_401000 gets called once. As reading the configuration seems like something you should always do, let's go for the second one.

Pseudo-code:

void *__cdecl sub_401000(HMODULE hModule, int a2)  
{
  HRSRC rsc; // eax@1
  HRSRC _rsc; // esi@1
  HGLOBAL loadedRsc; // eax@2
  void *_loadedRsc; // ebx@2
  void *rscAddr; // edi@3
  DWORD rscSize; // eax@4
  void *res; // [sp+4h] [bp-4h]@1

  res = 0;
  rsc = FindResourceW(hModule, (LPCWSTR)0x8894, (LPCWSTR)0xA);
  _rsc = rsc;
  if ( rsc )
  {
    loadedRsc = LoadResource(hModule, rsc);
    _loadedRsc = loadedRsc;
    if ( loadedRsc )
    {
      rscAddr = LockResource(loadedRsc);
      if ( rscAddr )
      {
        rscSize = SizeofResource(hModule, _rsc);
        *(_DWORD *)a2 = rscSize;
        res = sub_4088BD(rscAddr, rscSize);
      }
      FreeResource(_loadedRsc);
    }
  }
  return res;
}

Everything is pretty clear except sub_4088BD. Upon initial examination, it is a function that allocates heap memory and copies whatever it is given into it. We can conclude that sub_401000 is a function that copies a resource to the heap; not much to do with decryption, though. We'll name it rscCopy.

The configuration decryption must be related to this function in some way, as it contains the only call to FindResource that is actually executed. If we check for calling functions, we get only one: sub_401068.

Pseudo-code:

void *sub_401068()  
{
  void *newAddr; // eax@1
  int _newAddr; // ebx@1
  UINT _ucb; // edi@2
  const CHAR *heapPtr; // esi@2
  UINT ucb; // [sp+10h] [bp-8h]@1
  void *err; // [sp+14h] [bp-4h]@1

  err = 0;
  newAddr = rscCopy(hModule, (int)&ucb);
  _newAddr = (int)newAddr;
  if ( newAddr )
  {
    _ucb = ucb;
    heapPtr = XORsStuff(newAddr, ucb);
    if ( heapPtr )
    {
      err = sub_413892(_ucb, heapPtr, &ucb);
      heapFree((int)heapPtr);
    }
    heapFree(_newAddr);
  }
  return err;
}

This function copies a resource into a new memory zone and then calls a function that performs a lot of XOR, AND and shift operations over that zone (hence the name XORsStuff). This is beginning to look a lot more like config decryption. Let's set a breakpoint on XORsStuff.

Bingo! We can observe that XORsStuff returns a pointer to the beginning of the decrypted configuration.

Common ground

Although the decryption mechanisms differ between the samples, both load the configuration onto the heap and then free it using HeapFree (this was pointed out by Hugo Gascón), so we can hook the HeapFree API call to dump the configuration.

One caveat, though: if we check the pointer given to HeapFree, we will notice that there are 8 weird bytes at the beginning and end of the heap block. This is because HeapFree and HeapAlloc are wrapped in custom functions that insert / check this padding.

int __usercall _HeapAlloc@<eax>(int n@<eax>, DWORD dwFlags)  
{
  int payloadSize; // esi@1
  DWORD lastErr; // ebx@1
  void *heap; // ecx@1
  int ret; // edi@3
  int size; // esi@5
  char *heapPtr; // eax@7

  payloadSize = n;
  lastErr = GetLastError();
  if ( !checkCurrentPID() )
    heapCreate(heap);
  ret = 0;
  if ( payloadSize )
  {
    if ( payloadSize == -2 )
      size = 0;
    else
      size = 4 * payloadSize + 4;
    heapPtr = (char *)HeapAlloc(hHeap, dwFlags, size + 12);
    if ( heapPtr )
    {
      ret = (int)(heapPtr + 8);
      *(_DWORD *)heapPtr = size;
      *((_DWORD *)heapPtr + 1) = 0xABBABABA;
      *(_DWORD *)&heapPtr[size + 8] = 0xABBABABA;
      if ( heapPtr != (char *)-8 )
        SetLastError(lastErr);
    }
  }
  return ret;
}

In this case, the 0xABBABABA constant gets inserted at the beginning and end of every allocated heap block. This is probably for overflow/integrity checking, but we don't care much, as all we need to do is start reading at ptr + 8 and read size - 16 bytes.

This is the case for sample 17FCD7A7162298225B06D85D1D5A90EA (the second one).
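When post-processing bytes captured at HeapFree, stripping the wrapper can be sketched like this. The layout is read off the _HeapAlloc listing above, which writes the size at offset 0, the marker at offset 4 and the marker again right after the payload; the function name `unwrap_block` and the decision to trust the stored size field are my own assumptions, so adjust them if a sample behaves differently.

```python
import struct

MARKER = 0xABBABABA
HEADER_LEN = 8   # size DWORD followed by the marker DWORD
TRAILER_LEN = 4  # marker DWORD written right after the payload


def unwrap_block(raw: bytes) -> bytes:
    """Return the payload of a block laid out as [size][marker][payload][marker].

    `raw` must start at the pointer handed to the real HeapFree, i.e. at the
    size field, and may extend past the end of the block.
    """
    if len(raw) < HEADER_LEN + TRAILER_LEN:
        raise ValueError("buffer too short to hold a wrapped block")
    size, head = struct.unpack_from("<II", raw, 0)
    if head != MARKER:
        raise ValueError("leading marker not found")
    end = HEADER_LEN + size
    if end + TRAILER_LEN > len(raw):
        raise ValueError("size field points past the captured buffer")
    (tail,) = struct.unpack_from("<I", raw, end)
    if tail != MARKER:
        raise ValueError("trailing marker not found")
    return raw[HEADER_LEN:end]
```

Checking both markers before returning anything keeps unrelated HeapFree calls from being mistaken for the configuration. Depending on how the wrapper rounds the requested size, the returned payload may carry a few slack bytes at the end.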

Moar samples

I had another Cerber 1 sample with slightly different behaviour.

CRC32: 9EED2315  
MD5: D34A409B09A8B08813A4975885E11A0D  
SHA-1: B28BA986B6A70FEC5705239F613E7F0B5EA64B2D  

This sample is also NSIS-packed, but it could not be extracted by the same methods I used to unpack other NSIS-packed samples. It seems to be packed by a modified NSIS. Anyhow, Hasherezade helped me unpack it (credits to her, once again) and it turns out to be an exact copy of sample #2 (as far as the configuration handling goes).

Payload hashes:

CRC32: BCB3E6FC  
MD5: 483620759C5F07F14A3A3AA5CDE70A1C  
SHA-1: 1709721736CA25F464B81ADE5FE3C8689AC62366  

Yara signature

Given the previous analysis, I came up with the following Yara rules:

rule Cerber1  
{
    meta:
        author = "FDD"
        description = "Cerber v1"
    strings:
        $c0 = { 53 56 57 8B F0 FF 15 ?? ?? ?? ?? 8B D8 E8 ?? ?? ?? ?? 84 C0 }
        $c1 = { 50 FF 74 24 ?? FF 35 ?? ?? ?? ?? FF 15 ?? ?? ?? ?? 85 C0 74 1C B9 BA BA BA AB 8D 78 08 89 30 89 48 04 89 4C 30 08 85 FF }
        $s0 = "CryptBinaryToStringA"
        $s1 = "CryptImportPublicKeyInfo"
        $s2 = "CRYPT32.dll"
        $s3 = "FindResourceW"
    condition:
        all of them
}

rule Cerber1_A  
{
    meta:
        author = "FDD"
        description = "Cerber 1 A variant"
    strings:
        $s1 = "cerber"
    condition:
        Cerber1 and all of them 
}

rule Cerber1_RC4  
{
    meta:
        author = "FDD"
        description = "Cerber 1 variant with RC4 encryption"
    strings:
        $c0 = { 33 C9 8D 85 ?? ?? ?? ?? BA 00 01 00 00 88 08 41 40 66 3B CA 72 F7 8D 8D ?? ?? ?? ?? 8B F2 0F B6 ?? ?? 8A 04 38 8A 11 02 C2 00 ?? ?? 0F B6 ?? ?? FE ?? ?? }
    condition:
        Cerber1 and all of them
}
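To quickly check a hex pattern against a dumped payload without a Yara install, the wildcard strings can be approximated with a byte regex. This is a rough sketch that only understands full-byte `??` wildcards (no nibble wildcards, jumps or alternatives), and the helper name `hex_pattern_to_regex` is hypothetical; it does not replace running the rules with Yara itself.

```python
import re

# The $c1 pattern from the Cerber1 rule; it contains mov ecx, 0xABBABABA.
C1 = (
    "50 FF 74 24 ?? FF 35 ?? ?? ?? ?? FF 15 ?? ?? ?? ?? 85 C0 74 1C "
    "B9 BA BA BA AB 8D 78 08 89 30 89 48 04 89 4C 30 08 85 FF"
)


def hex_pattern_to_regex(pattern: str):
    """Compile a Yara-style hex string with `??` wildcards into a bytes regex."""
    parts = []
    for token in pattern.split():
        if token == "??":
            parts.append(b".")  # any single byte (DOTALL also covers 0x0A)
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)
```

A call such as `hex_pattern_to_regex(C1).search(data)` returns a match object whose `start()` gives the offset of the heap wrapper code inside `data`, or `None` when the pattern is absent.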

Cuckoo integration

Soon