My friends at CyberBlog decided to analyze the GM Bot Android malware as an exercise, aiming to receive feedback and suggestions from the security community.
The sample explored is confirmed as a variant of the GM Bot Android malware, whose source code was released publicly in early 2016. The code appears to have been forked by a second author and has additions that target the Danske Bank MobilePay application and the popular Danish Nem ID two-factor authentication (2FA) system.
This article walks through the static and dynamic analysis used to unpack the packed code of the malware.
We see how even with basic static analysis a full picture of the intent of the malware can be readily assembled, and with a little debugging we can quickly get to readable source code.
As part of my journey into Cyber Security I thought it would be interesting to see how modern mobile malware operates. I chose the following sample at random based on an article here.
A quick Google search for these hashes will lead you to the file used, if you would also like to explore this sample.
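If you do download a copy, it is worth confirming that its hashes match the published ones before going any further. The following is a minimal sketch of how that check could be done; the function name and chunk size are my own choices, and the file name in the comment is simply the one passed to Apktool later in this post.

```python
import hashlib


def file_hashes(path, chunk_size=65536):
    """Return the MD5 and SHA-256 hex digests of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}


# Example (path is an assumption about where the sample was saved):
# print(file_hashes("da88bdcb3d53d3ce7ab9f81d15be8497.apk"))
```

Comparing both digests against the published values guards against analysing a different build by mistake.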
The article above demonstrates that the analyst has gone from sample to source code, but it is not clear how this is achieved. There are references to suggest that the code has been packed, but again no information on how it was unpacked for analysis.
This post will break down the process I used to analyse this sample, hopefully with enough detail to provide some tips and guidance for others wishing to attempt something similar. The process I followed can be logically broken into the following stages:
Public Analysis – What can we find out using existing public sources of information? What analysis has already been performed (automated or manual)?
Static Analysis – What can we determine from the sample without actually running it in an emulated environment?
Packer Debugging – Assuming the sample is packed (to frustrate analysis), how do we debug the unpacker to understand what is being loaded /run?
DEX Extraction and De-compilation – Once we have mapped out the function of the unpacker, how do we then recover the main code for the malware and reverse it?
Functional & Dynamic Analysis – Once we have the extracted and reversed code, what do we see and how does this correlate with behavior in a safe emulated environment?
Stage 1 – Public Analysis
First off let’s see what we can find about this in the public domain. Searching for the file hashes on VirusTotal, we see that approximately 50% of AV products have identified it as malicious:
However, we also note that all of them classify it heuristically as a generic strain of malware: a Trojan, Dropper, Fake Installer and so on. There is nothing to suggest it is in fact GM Bot Android, or any specific type of malware. Other than this we don’t see much from Google with either the SHA256 or MD5 hashes.
The original Security Intelligence article references IBM X-Force research, so this is the next stop – but again nothing immediately obvious with regards to this sample could be located.
A wider search of the internet reveals some history of GM Bot, originally built and sold by GanjaMan on dark web forums. Following a dispute, the source code for both the client APK and the C2 server was released publicly. A copy is hosted here on GitHub and will prove useful for cross-referencing with this sample later in the analysis.
Stage 2 – Static Analysis
First up we are going to unpack the APK file using Apktool. This will unzip the contents, as well as providing a disassembly of the DEX code into Smali:
apktool d da88bdcb3d53d3ce7ab9f81d15be8497.apk
The results of this can be seen below and the tool has also provided a human readable version of the AndroidManifest.xml file.
First stop is to take a look at the Android Manifest file, which should provide an overview of the components of the application and the permissions requested.
Manifest Analysis – AndroidManifest.xml
Initial analysis shows a broad range of permissions that indicate malicious behavior including permissions to:
control all SMS messages (send, receive, read, write, delete)
list running applications
read the phone’s state, contacts, SD card data
request to be a device administrator enabling remote wiping of the device with no warning to the user
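A review like this can be partly automated against the AndroidManifest.xml that Apktool decodes. The sketch below is only an illustration: the watch-list of permissions is my own assumption about what is worth flagging, not a list copied from this sample's manifest.

```python
import xml.etree.ElementTree as ET

# Attribute names in a decoded manifest live in the Android XML namespace.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"

# Illustrative watch-list only (an assumption, not taken from the sample).
SENSITIVE = {
    "android.permission.SEND_SMS",
    "android.permission.RECEIVE_SMS",
    "android.permission.READ_SMS",
    "android.permission.WRITE_SMS",
    "android.permission.GET_TASKS",
    "android.permission.READ_PHONE_STATE",
    "android.permission.READ_CONTACTS",
    "android.permission.READ_EXTERNAL_STORAGE",
}


def requested_permissions(manifest_xml):
    """Return every permission named in a <uses-permission> element, in order."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return [name for name in names if name]


def flag_sensitive(permissions):
    """Return the requested permissions that appear on the watch-list, sorted."""
    return sorted(p for p in permissions if p in SENSITIVE)
```

Note that device administrator rights are not requested through a uses-permission element, so that item still has to be spotted by looking at the receiver declarations.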
A summarized view of referenced class files for the main application, activities (15) and services (2) can be seen below:
In addition, we see 4 further classes mapped as Broadcast Receivers which will process event messages (Android system Intents) as shown below:
From this we can see the application is capable of:
Executing code when the phone is powered on (starting the application automatically)
Receiving notification when Device Admin is granted, requested or a request to disable admin is received (and hence interfering, or nagging the user to enable it)
Receiving notification of a new inbound SMS, with a high priority flag to ensure the code can intercept it first and potentially stop any further alerts (can be used to steal 2FA tokens)
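The receiver declarations and their intent filters can also be pulled out of the decoded manifest programmatically. This is a rough sketch with hypothetical function and key names; it sorts the filters by android:priority so that a receiver trying to get in first on inbound SMS stands out at the top.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def receivers_with_filters(manifest_xml):
    """List each receiver's intent filters, highest android:priority first."""
    root = ET.fromstring(manifest_xml)
    results = []
    for receiver in root.iter("receiver"):
        name = receiver.get(ANDROID_NS + "name")
        for intent_filter in receiver.iter("intent-filter"):
            # A filter with no explicit priority defaults to 0.
            priority = int(intent_filter.get(ANDROID_NS + "priority", "0"))
            actions = [
                action.get(ANDROID_NS + "name")
                for action in intent_filter.iter("action")
            ]
            results.append(
                {"receiver": name, "priority": priority, "actions": actions}
            )
    return sorted(results, key=lambda entry: entry["priority"], reverse=True)
```

An entry pairing android.provider.Telephony.SMS_RECEIVED with a very high priority is the pattern to look for when assessing SMS interception.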
Before proceeding with any reverse engineering of the code, the next step is to explore the other files in the APK for clues.
Files of interest
The following files were noted as of interest:
A binary file with no immediately obvious format. Possibly code to be decrypted / unpacked at run time?
English language strings for the application, as shown below:
The strings clearly indicate that this malware is designed to capture victims’ credit card information. It is interesting to note that:
The resource keys here are all in English, suggesting the original developer may be English speaking
There are specific strings that are in Danish, despite this resource file being intended for English language
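One quick way to surface the out-of-place strings is to scan the resource file for characters that do not normally occur in English. The sketch below is a crude heuristic of my own: it flags values containing æ, ø or å, which would also match Norwegian and would miss Danish text that happens not to use those letters.

```python
import xml.etree.ElementTree as ET

# Letters used in Danish (and Norwegian) but not in English.
DANISH_LETTERS = set("æøåÆØÅ")


def strings_with_danish_letters(strings_xml):
    """Map resource key -> value for <string> entries containing æ, ø or å."""
    root = ET.fromstring(strings_xml)
    hits = {}
    for element in root.iter("string"):
        # itertext() also collects text nested inside inline markup such as <b>.
        text = "".join(element.itertext())
        if DANISH_LETTERS & set(text):
            hits[element.get("name")] = text
    return hits
```

Running this over res/values/strings.xml from the Apktool output gives a short list of keys to review by hand.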
In addition to English language strings we also see several other targeted countries:
This file contains a list of country codes and specifically a group that are “non vbv”. This is understood to mean that they do not use the “Verified by Visa” process which is used to enforce additional verification checks during online purchases. It is likely that the attackers would seek to obtain additional VBV credentials via the malware in order to allow online purchases with the card details (or avoid these countries).
Images and icons/logos including:
Sample photo of Danish “Nem Id” – https://en.wikipedia.org/wiki/NemID
Icon for Danske Bank mobile pay
Mastercard secure code
Icon for verified by visa
Flash icon (main application icon)
Additionally there are png images prefixed “overlay_”, indicating a possible use in fraudulent overlay activity.
Decompiling to Java source code
Next we attempt to reverse engineer the DEX file back to original Java source code. For this we use dex2jar as follows to translate the DEX file (in the APK) into a Java Class file archive:
The resulting jar file can then be decompiled using JD-GUI as follows:
The resulting classes that we see in JD-GUI show that there are only 4 Java classes contained in the application. This is in direct contrast to the 16 different classes we saw declared in the application manifest. This confirms that there must be additional code that is loaded dynamically at run time, and it is most likely that these four classes are in fact an unpacker.
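The same mismatch can be confirmed from the Apktool output without decompiling anything, by comparing the component classes named in the manifest with the .smali files actually present. This is a sketch under the assumption that the Smali sources sit in a single directory tree (typically the smali folder); the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
COMPONENT_TAGS = ("application", "activity", "service", "receiver", "provider")


def declared_classes(manifest_xml):
    """Return fully qualified class names declared as manifest components."""
    root = ET.fromstring(manifest_xml)
    package = root.get("package", "")
    names = set()
    for tag in COMPONENT_TAGS:
        for element in root.iter(tag):
            name = element.get(ANDROID_NS + "name")
            if not name:
                continue
            # Relative names (".Foo" or "Foo") are resolved against the package.
            if name.startswith("."):
                name = package + name
            elif "." not in name:
                name = package + "." + name
            names.add(name)
    return names


def classes_in_smali(smali_root):
    """Return class names implied by the .smali file paths under smali_root."""
    root = Path(smali_root)
    return {
        ".".join(path.relative_to(root).with_suffix("").parts)
        for path in root.rglob("*.smali")
    }


def missing_classes(manifest_xml, smali_root):
    """Classes the manifest declares but the shipped DEX does not contain."""
    return sorted(declared_classes(manifest_xml) - classes_in_smali(smali_root))
```

A long list of missing classes is a strong hint that the real code arrives some other way at run time.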
Examining the code we see that it is heavily obfuscated and has been crafted in a way to prevent clean decompiling of the code. This aside, we can start to get an understanding of the function of these four classes by examining the system classes that are imported (and therefore used) when the application is first executed.
After exporting the java source from JD-GUI and unzipping to a new folder, we can extract the imported classes from these files:
Essentially we have a very small set of libraries that are being imported and used. These consist of functionality for:
General Android application and context classes (expected and needed for all android apps)
File related classes (in red) – for access, reading and writing local files
Java reflection classes (in green) – for creating new classes and instances and invoking methods dynamically
This confirms the hypothesis that we are most likely dealing with an unpacker that unpacks its executable code from a local file resource (as opposed to pulling it dynamically from the network, for example).
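For anyone wanting to repeat the import extraction, something along these lines would do the job on the exported source folder. It is a minimal sketch: the regular expression only handles ordinary single-line import statements, and the grouping depth is an arbitrary choice of mine.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "import a.b.C;", "import a.b.*;" and "import static a.b.C.d;".
IMPORT_RE = re.compile(
    r"^\s*import\s+(?:static\s+)?([\w.]+(?:\.\*)?)\s*;", re.MULTILINE
)


def collect_imports(source_root):
    """Count how many source files import each class under source_root."""
    counts = Counter()
    for path in Path(source_root).rglob("*.java"):
        text = path.read_text(encoding="utf-8", errors="replace")
        counts.update(set(IMPORT_RE.findall(text)))
    return counts


def group_by_package(counts, depth=2):
    """Group imported names by their leading package components."""
    groups = {}
    for name in counts:
        prefix = ".".join(name.split(".")[:depth])
        groups.setdefault(prefix, []).append(name)
    return {prefix: sorted(names) for prefix, names in groups.items()}
```

Grouping by prefix makes it easy to see at a glance how much of the import list is file handling (java.io) and how much is reflection (java.lang.reflect).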
Stage 3 – Unpacker Debugging
As the Java code cannot be readily decompiled (due to protections injected by the malware author) we will instead debug the executable against the Smali assembly code. Smali is a disassembly of the DEX code used by the Dalvik Virtual Machine.
The Smali/Baksmali plugin for Android Studio is required, and then the output from Apktool is imported as a new project. We next set the breakpoints as required across the three classes that we are interested in (a,b,c):
We will initially debug the calls to interesting reflection methods identified, which are as below:
a.smali (a line that creates a new instance of a class based on a java.lang.reflect.Constructor instance)
b.smali (a line that invokes a method on an object via reflection)
c.smali (a line similar to that described above for a.smali)
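Candidate lines for breakpoints like these can be located by searching the Smali output for invocations of the reflection API. The sketch below assumes the standard Smali method reference syntax; the list of targets is my own selection and the function name is hypothetical.

```python
from pathlib import Path

# Smali-style references to the reflection methods of interest.
REFLECTION_TARGETS = (
    "Ljava/lang/reflect/Method;->invoke(",
    "Ljava/lang/reflect/Constructor;->newInstance(",
    "Ljava/lang/Class;->forName(",
)


def reflection_call_sites(smali_root):
    """Return (file name, line number, target) for each reflective invoke."""
    sites = []
    for path in sorted(Path(smali_root).rglob("*.smali")):
        lines = path.read_text(encoding="utf-8", errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            stripped = line.strip()
            # Only invoke-* instructions actually call a method.
            if not stripped.startswith("invoke-"):
                continue
            for target in REFLECTION_TARGETS:
                if target in stripped:
                    sites.append((path.name, number, target))
    return sites
```

The reported line numbers refer to the .smali files themselves, which is what the debugger uses when stepping through the disassembly.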
Now we install the application to the emulator (via ADB to ensure it doesn’t start automatically as in some emulators).
To enable the debugger to connect to the application, we perform the following prior to starting the application:
Enable developer options by repeatedly clicking the build number in Settings > About device
In developer options, choose “Select debug app” and choose the malicious application – “Adobe Flash”
In developer options, enable “Wait for debugger”
Now start the application from the launcher. You will be prompted to attach the debugger:
In Android Studio, attach the debugger using the icon. Choose the malicious application process. The debugger then stops at our first breakpoint as shown below:
Note you should now set some variables to watch – as per above I have set v0 through v10 and p1 through p3. Our first breakpoint is hit and we see we are about to execute a method by reflection. Noting that we have not yet called newInstance() we can assume this is calling existing (loaded) classes – either one of the four loaded by the application, or some other Android framework classes.
Next we force step into the method to see which method it is calling (the Smali debugger seems a little buggy and we can’t at this point see the parameters being passed).
The first call retrieves the current context object, presumably to start retrieving local resources from the APK. We now allow the debugger to continue, and repeat this exercise several times to build up a flow of the reflected method calls:
//expected 2 arguments, got 1 – error in malware code, or to throw off debugging?
//Several more of these not shown
IllegalArgumentException java.lang.IllegalArgumentException(String s)
void java.lang.reflect.setAccessible(boolean flag)
// returns /data/user/0/com.kzcaxog.mgmxluwswb/app_ydtjq
String java.io.File.getAbsolutePath()
ContextImpl android.app.getImpl(Context context)
//filename is fytluah.dat
InputStream android.content.res.AssetManager.open(String fileName)
Pausing here, we can see the code is attempting to load the file that we had previously flagged as of interest in the static analysis section. Continuing we see the file is read, presumably decrypted and then written out again as a jar file:
int android.content.res.AssetManager.read(byte[] b)
//className = java.io.File
Class java.lang.Class.forName(String className)
//args = String “/data/user/0/com.kzcaxog.mgmxluwswb/app_ydtjq/gpyjzmose.jar”
T java.lang.reflect.Constructor.newInstance(Object... args)
void java.io.FileOutputStream.write(byte[] b) #25
Finally a DexClassLoader is invoked to load the additional code into the system:
//className is dalvik.system.DexClassLoader
Looking at the API for DexClassLoader, we can see that its constructor takes four arguments. The two that matter here are the location of the file to load and a writeable directory that is used to write out an optimised version of the code for the runtime on the device, e.g. the Android Runtime (ART). Further information on this can be seen in the Android API documentation:
Stage 4 – DEX Extraction and De-compilation
We can see the exact location of the jar file in the debugger below, and the next step is to recover this file via the ADB command line.
After execution of the classloader, connecting via ADB shell we see the two files, the original and the DEX optimised code:
We copy these files to /sdcard/Download (+chmod) and then pull the .jar file to the local machine for further analysis with adb pull.
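As a rough sketch, the copy, chmod and pull steps could be scripted as below. Several details are assumptions on my part: that the adb shell has enough privilege to read the application's private directory (for example a rooted emulator image), that the chmod is applied to the staged copy, and that mode 644 is sufficient. The helper only builds the command lists; nothing runs until they are passed to subprocess.

```python
import posixpath


def build_recovery_commands(remote_path, staging_dir="/sdcard/Download", local_dir="."):
    """Build adb commands that stage a private file and pull it to the host."""
    file_name = posixpath.basename(remote_path)
    staged = posixpath.join(staging_dir, file_name)
    return [
        # Copy out of the app's private directory (needs a privileged shell).
        ["adb", "shell", "cp", remote_path, staged],
        # Make the staged copy readable (target and mode are assumptions).
        ["adb", "shell", "chmod", "644", staged],
        # Transfer the staged copy to the analysis machine.
        ["adb", "pull", staged, local_dir],
    ]


# Example, using the jar path observed in the debugger:
# import subprocess
# jar = "/data/user/0/com.kzcaxog.mgmxluwswb/app_ydtjq/gpyjzmose.jar"
# for command in build_recovery_commands(jar):
#     subprocess.run(command, check=True)
```

Keeping the commands as argument lists avoids any shell quoting issues on the host side.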
Extracting the jar file we find the classes.dex file.
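Before handing the recovered file to dex2jar, it can be useful to sanity-check that the jar really does contain a DEX file. The sketch below reads classes.dex out of the archive and checks for the DEX magic bytes, which begin with "dex" and a newline followed by a three-digit version; the function name is hypothetical.

```python
import zipfile

# A DEX file starts with b"dex\n", then a 3-digit version and a NUL byte.
DEX_MAGIC_PREFIX = b"dex\n"


def extract_dex(jar_path, entry_name="classes.dex"):
    """Return (raw bytes, version string) for the DEX entry inside a jar."""
    with zipfile.ZipFile(jar_path) as archive:
        data = archive.read(entry_name)  # raises KeyError if the entry is absent
    if not data.startswith(DEX_MAGIC_PREFIX):
        raise ValueError(f"{entry_name} does not start with the DEX magic bytes")
    version = data[4:7].decode("ascii", errors="replace")
    return data, version
```

If the magic check fails, the file written by the unpacker is probably still encrypted or was captured before the write completed.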
Repeating the steps to convert this to a jar file using dex2jar and decompiling with JD-GUI, we confirm we now have the full (un-obfuscated) source code for this malware sample.
Stage 5 – Dynamic and Functional Analysis
Upon initial analysis we can see the codebase bears remarkable similarities to the leaked source identified during the public analysis. However, there are significant differences, and the code has been customised to specifically target the Danske Bank MobilePay application.
As the code is basically un-obfuscated, I’ll now briefly walk through the key functionality of this malware, starting from first installation.
Upon first installation and execution the application will perform two primary functions. It will initially harvest a range of the user’s data, including phone contacts, all SMS messages and other key data, and send this to the C&C server. The C&C server then returns a unique installation identifier that is used in all future communication to identify the compromised device.
Secondly, the malware will nag the user to accept the software as a device administrator. If the user declines, the request is re-triggered, making it very difficult for most users to escape this screen without accepting. With this permission in place, the malware achieves two objectives:
The application cannot be un-installed by the user easily, without de-activating the device administrator. Attempting to do this will trigger the launching of overlays that prevent removing the device admin
At some point in the future, once further data has been stolen from the phone, the C2 server can issue a command to wipe the device, removing evidence of the infection and restoring the device to a factory state
Ongoing Operations – including after each reboot
The malware maintains a regular heartbeat to the C2 server, which provides a mechanism for the attacker to issue specific commands to the device. Each heartbeat contains the installation ID and the current screen status. It is hypothesised that the attacker would ideally choose to execute malicious activities when the screen is off and the user is not watching the phone.
Firstly we see the ability to “lock” and “unlock” the phone. This simulates an Android software update screen, and effectively hides any other activity that is occurring behind the screen overlay (such as sending, receiving or deleting SMS messages). Additionally this could be used to disable the user, and prevent them from using the phone whilst their accounts or cards are being compromised in real time.
Next we see another function that intercepts SMS messages and forwards them to the C2 server, then attempts to remove evidence that they ever existed by deleting them. This is used to steal 2FA credentials.
Next, from a C2 server perspective, we see two “reset” commands. The first, a “soft” reset, is used to reset the internal flag to re-attempt stealing Nem ID credentials. The second is the “hard” reset that performs an immediate wipe of the device data.
Finally, we see the ability to send an arbitrary SMS message to a mobile defined by the attacker and a function to launch a customised push notification to another application on the device. It was not clear what this could be used for.
SMS Remote Control
By listening for incoming SMS messages the malware could also trigger a fake Android update screen that would then harvest, forward and attempt to delete messages as they arrived on the phone. This mode could be enabled and disabled by customised command messages delivered to the phone via SMS.
Automating Data Theft
As per the original article and many of the indicators from the static analysis, the primary purpose of the application is to steal data by performing overlays on top of legitimate applications. The malware targets three specific classes of applications:
Danske Bank’s MobilePay application, with specific intent to steal Nem ID credentials
Applications that trigger an attempt to steal credit card details via a custom overlay
Applications that trigger an attempt to steal the user’s mobile phone number (possibly for triggering the “admining” mode described above)
Danske Bank MobilePay
Upon launching the MobilePay application the overlay attempts to steal the user’s CPR number (a unique social security type ID), mobile number and Nem pass code. It then asks the user to take a photo of their Nem ID passbook, containing one-time-use codes which can be used by the attacker to then log into MobilePay (and other Danish systems) and issue payments.
Stealing Credit Card Details
Upon launching one of the targeted applications, a credit card overlay is displayed with a configurable icon depending on the application launched. After basic card details are collected, the application then attempts to recover the Verified by Visa password for the user. These details are then forwarded to the C2 server.
Stealing Phone Numbers
Finally we see the functionality that is targeted to capture the user’s phone number, presumably to enable further abuse of the victim’s account via abuse of text message 2FA.
The sample appears to be a specifically customised variant that is being used in a campaign to target the Danske Bank MobilePay application. We see evidence that it is probably not the original GM Bot author’s work: the coding style differs from that of the public source code, and the mix of languages in the resource files implies the sample has been adapted in a “quick and dirty” fashion to achieve the objectives.
This is a good example of how, once released, complex code can be quickly and easily forked by less skilled authors, a pattern we also see today with the release of the Mirai botnet code. We quickly see a spread of variants of the codebase that become harder to trace, detect and, importantly, attribute to any individual or group.