Introduction

It’s often surprising how many malware programmers forget to do the simplest things. Mostly because many are so concerned with functionality, stealthiness and other production concerns, that details slip easily of their minds – a clear advantage to forensics. One of these details is the Program DataBase (PDB) information added by Visual Studio, which most malware authors used for Windows development. While it may seem innocuous, this string reveals a lot about the operating system used by the author, its user name and most notably, symbols that can be used by IDA and ease understanding of the disassembly. This information allows to link multiple pieces of malware together, by using the username for example. Of course, this also allows for the creation of signatures. Thus, removing this information will add a hurdle to the analysts.

Contents

The Program Database File

The Program Database (PDB) is a binary file used to store debugging information about DLL and EXE files. The PDB file is created when you build your project and stores a list of symbols  their addresses along with the name of the file and the line number on which the symbol was declared. PDB files is also used for services collecting crash data to send it to developers for resolution.

Debugging Information

In Visual Studio, you can select to build your project in Debug or Release mode. In Debug mode, VS will include debugging information with your executable. In Release mode, no debug information is included by default, but in some cases is enabled so that if the program crashes, information can be retrieved and sent to the author for fixing. However for some reasons, some developers don’t really bother to use the Release mode, and simply use the executable generated by the Debug mode. Generally, you don’t want that if you are making malware (or any program really!). If left within the executable, a path to the PDB file will be included and can be extracted:

Path to the PDB file
Path to the Program Database (PDB) file used by Visual Studio for debugging purposes, extracted using the “strings” program.

Within the strings, you can determine that:

  1. The program was developed on Windows 7+ (because of C:\Users folder),
  2. The username of the developer is SUPPORT_23e45RT
  3. The source, or part of it, can be found on Github
  4. The original name of the program is CaitSithTest

These indicators can be useful to link this specific program with others and provide a common link between multiple malware. Additionally, the username could potentially be used to conduct open source research and find linked accounts or forum posts. But wait, there’s more…

If you leave the debugging information, you may be able to restore all the original names of variables and functions of the source code using IDA. IDA will first detect debugging information and ask the analyst if he wants to retrieve it, either via Microsoft – http://msdl.microsoft.com/download/symbols (not browseable)- or by looking locally.

IDA detected that debugging information is available and ask if the user wishes to retrieve it.
IDA detected that debugging information is available and ask if the user wishes to retrieve it.

If for some reason, the user is able to retrieve the information, he will have access to the names of the original symbols, which will make reverse engineering much more easier.

Since symbol information is available, the original names of the variables are displayed.
Since symbol information is available, the original names of the variables are displayed.

Compare the information from the figure above to the figure below, in which debugging information has been stripped at build time:

IDA could not find any debugging information and thus used its own labelling system to identify variables.
IDA could not find any debugging information and thus used its own labelling system to identify variables.

You can see that the variables defined in the first figure, such as ClipboardData, isProcessElevated and isDebugged have been preserved. By keeping information about the symbols, reverse engineering is much more easier compared to figure 2, in which information about the code is lost.

Disabling Debugging Information

To prevent VC from including this information in your executable, right click on your project, go to Project Properties > Linker > Debugging configuration menu. Select No in the Generate Debug Info option.

Removing debugging information in Visual Studio.
Removing debugging information in Visual Studio.

After doing this, rebuild your project and rerun the string extraction program against your binary, the path to the PDB file should not be present in the executable anymore.

no_debug_info
The path to the PDB file is not included in the executable once debugging information has been omitted.

Doing so makes it a bit more difficult to fingerprint the malware and hides information about the author’s system.

Conclusion

This is a simple tactic that is often omitted not only by malware author, but penetration testers, which are often Google programmers, i.e. copy-pasting code snippets from Stack Overflow or googling functions 😉 If you attempt to hide your malware into the System32 folder, looking for this information in the EXE or DLL files will quickly tell you which files are bad, since legitimate files will rarely have this info, or have legitimate looking one. As such, if you want to make sure, create a legitimate-Microsoft-looking user (Bill.Gates) on your machine and put your code into a Microsoft-looking project and path (C:\users\Bill Gates\Documents\HTA\Release\).