Android Reverse Journey - Static Analysis Techniques to Crack APK

2015/11/28


This article marks the start of our journey into cracking. The previous articles explained how we can secure our APK so it cannot be cracked by other people. Here we start to crack our own APK and apply the appropriate cracking techniques to the protection methods used before. In Android, cracking can be divided into static analysis and dynamic analysis. Each approach can be further divided into the Java layer (DEX and smali) and the native layer (SO). Today we will explain how to use static analysis to crack an APK, using both the Java layer and the native layer as examples. Some preparations are in order before we start.  
First, some basic knowledge:
1. An understanding of the APK file structure in Android.  
2. An understanding of the smali syntax and DEX file format.  
3. APK signature mechanism.  


We will not go into the details of these three knowledge points here. Students who do not already understand can study them on the web where there are a lot of resources.  
Second, some important tools:
1. Apktool: Powerful De-compiler  
2. Dex2jar: Convert DEX to JAR  
3. Jd-gui: A great tool for viewing JAR files  
4. IDA: the most comprehensive paid cracking tool (can be used for analyzing both DEX and SO)  

Extra: The four above are the most basic tools, but there are now some better tools on the Internet such as JEB and GDA. These tools, however, only build on the four above, so we really only need these four. I've provided a download address for IDA. The other tools can be found in the examples we provided.

Now that the preparations are out of the way, let's take a look at the cracking method we will be using today. In Android cracking, static analysis is both important and not important. Why do I say that? When we go into dynamic analysis later on, we will discover that static analysis on its own often falls short. Programs are becoming increasingly well protected, which makes purely static cracking almost useless, so dynamic analysis is essential. But is static analysis really useless? In certain scenarios, only static analysis can open the door, and dynamic analysis cannot even start until we have the results from static analysis. We will explain this in the following example. Static analysis and dynamic analysis therefore go hand in hand during the cracking process, and we need both to move forward. Below we will show how to crack an APK using static analysis.
First, the static analysis process


1. Use apktool to decompile the APK. We will discover during this process that some APKs are easily decompiled but others report all kinds of errors each time. This is normal because they have been protected. There are many APK protection methods on the web that block decompiling, for example by encrypting the AndroidManifest.xml and DEX files. apktool needs to parse these important files, and if they are encrypted it aborts. We will therefore assume that the APK here can be decompiled, because our focus today is on how to use static analysis for cracking. I will explain decompiling failures and list some examples in another article at a later date.


2. Get the program's smali source code and AndroidManifest.xml file. All of an Android program's entry information, such as the Application class and the entry activity, is in AndroidManifest.xml, so we naturally analyze this file first to get the information we need. There is a frequently used command to keep in mind here: adb shell dumpsys activity top, which gives the activity information of the program currently in the foreground. We then analyze the smali code to modify the code logic.


3. Directly extract the APK file to get the classes.dex file, use the dex2jar tool to convert it to a JAR, and view it in the jd-gui tool. We already have the smali source code from step 2 and could analyze the program from that alone. The smali syntax is not very complicated (it's simpler than assembly), but it's not as easy to read as Java code. So we use the jd-gui tool to view the code logic, then make changes to the smali code. The JEB tool mentioned above improves on jd-gui by translating smali source code directly into Java code. With JEB, there is no need to view the code in jd-gui first before making changes to the smali source code.  


4. If the native layer is involved in the program then we can use IDA to open the designated file. We must however look at the Java code first to find the designated SO file then use IDA to conduct a static analysis.  
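
As a rough illustration of step 4, finding the designated SO file can be sketched as a scan of the decompiled smali for System.loadLibrary calls. This is a minimal sketch, not part of the original toolchain: the function name find_loaded_libraries is made up for this example, and it assumes the library name is loaded into a register by a const-string shortly before the call, which is the common pattern but not guaranteed.

```python
import re

# Matches a line such as: const-string v0, "psProcess"
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+(\w+),\s*"([^"]*)"')
# Matches: invoke-static {v0}, Ljava/lang/System;->loadLibrary(Ljava/lang/String;)V
LOAD_LIBRARY = re.compile(
    r'invoke-static\s*\{(\w+)\},\s*'
    r'Ljava/lang/System;->loadLibrary\(Ljava/lang/String;\)V'
)


def find_loaded_libraries(smali_text):
    """Return the .so file names that a smali file loads via System.loadLibrary."""
    registers = {}   # register name -> last string constant stored in it
    libraries = []
    for line in smali_text.splitlines():
        line = line.strip()
        const = CONST_STRING.match(line)
        if const:
            registers[const.group(1)] = const.group(2)
            continue
        call = LOAD_LIBRARY.match(line)
        if call and call.group(1) in registers:
            # System.loadLibrary("foo") refers to the file libfoo.so
            libraries.append("lib" + registers[call.group(1)] + ".so")
    return libraries
```

The sketch tracks registers line by line and ignores control flow, so it should only be treated as a quick way to decide which file under the lib folder to open in IDA.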
Second, useful techniques


The static analysis process was introduced above. We will look at several static analysis techniques below. When using static analysis to crack the APK, we must first find an opening in the form of key classes and functions. This takes experience and there is no magic formula. Some techniques can however speed up the cracking process.  


1. Global search of key strings and log information. This technique depends completely on the human eye. When we run the program certain strings will appear such as text boxes, button text, and information displayed by toast. These may all be important information. We then run a global search of this string in the jd-gui tool and this will quickly give us the logical location we want.  

Another important point is the log information in Android. In large projects with multiple developers, each module will have its own developer and everyone will add some log messages for debugging. Not everyone remembers to turn off all log messages before the project is published however. This is a bad habit during project development. At this point, the log information printed by the program while it’s running will give us an opening to use. Android's log can be filtered by application or we can use the string in the log message to run a global search in jd-gui.
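
The same global search can also be run over the smali files that apktool produces, which is handy when jd-gui fails to open a class. The following is a minimal sketch under that assumption; the helper names iter_smali_files and find_string are made up for this example. Text that is defined in resources rather than in code will not appear as a const-string, so res/values/strings.xml may need to be searched as well.

```python
import re
from pathlib import Path

# Matches a string constant such as: const-string v1, "Login failed"
CONST_STRING = re.compile(r'const-string(?:/jumbo)?\s+\w+,\s*"(.*)"')


def iter_smali_files(root):
    """Yield (relative path, text) for every .smali file under root."""
    root = Path(root)
    for path in sorted(root.rglob("*.smali")):
        text = path.read_text(encoding="utf-8", errors="replace")
        yield str(path.relative_to(root)), text


def find_string(files, needle):
    """Return (file, line number, string) for each constant containing needle."""
    hits = []
    for name, text in files:
        for number, line in enumerate(text.splitlines(), start=1):
            match = CONST_STRING.search(line)
            if match and needle.lower() in match.group(1).lower():
                hits.append((name, number, match.group(1)))
    return hits
```

A call such as find_string(iter_smali_files("sq"), "signature") would then list every class in the decompiled folder that mentions the string.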


2. Code injection technique 

In the first method we use a global search of key strings to find an opening. This might not always work though, so we need to add in some code for observation. A common approach is to add our own log code to track the code's execution logic. As we are talking about static analysis here, we use code injection to track the execution logic. When we come to dynamic analysis techniques later on, things become a lot simpler because we can set arbitrary breakpoints for debugging. Here, adding code means modifying the smali code, and it's enough to just add our log message. This will be explained in the following example, and this is the technique that we use most often. 
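
To make the idea concrete, here is a minimal sketch of how log calls could be inserted automatically into the body of one smali method. It is not the procedure used later in this article, where the lines are added by hand. The function name inject_log_calls is made up, and the log instruction is passed in by the caller. One rule is built in: in Dalvik bytecode a move-result instruction must directly follow the invoke whose value it collects, so the log call is placed after the pair rather than between the two.

```python
def inject_log_calls(method_lines, log_call):
    """Insert log_call after every invoke in a list of smali lines.

    If an invoke is followed by a move-result* instruction, the log call is
    placed after the move-result so that the pair is not split.
    """
    result = []
    pending = False  # an invoke was seen and its log call is not yet emitted
    for line in method_lines:
        stripped = line.strip()
        if pending and stripped and not stripped.startswith("move-result"):
            # The previous invoke has no move-result: log before this line.
            result.append(log_call)
            pending = False
        result.append(line)
        if pending and stripped.startswith("move-result"):
            # The invoke/move-result pair is complete: log after it.
            result.append(log_call)
            pending = False
        if stripped.startswith("invoke-"):
            pending = True
    if pending:
        result.append(log_call)
    return result
```

A log call that takes no registers, for example a static method with an empty argument list, is the safest thing to inject because it cannot overwrite a register that the surrounding code still needs.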


3. Use system hook to inject cracking process and acquire the execution logic of critical functions 

We will not go into the details of process injection and hook technology in Android. Students who do not understand these techniques can refer to two separate articles that introduce them. We will not actually use them the way they are introduced in those articles, however, because the articles only cover the basic principles and mature tools already exist. There are two well-known, very practical frameworks on the web for these techniques: Cydia Substrate and Xposed. There is a lot of information available on the web and they are very easy to use, so we will not go into details here. We do not make much use of this method in actual cracking because it is rather inefficient. It is therefore only used in certain scenarios. 


4. Using IDA for static analysis of SO files 

We finally come to the use of the IDA tool. I personally think this tool is awesome. It can view the code logic in SO but what we see might be the assembly instructions. So there is another problem – we need to master another skill when cracking SO – the ability to read assembly instructions. Otherwise, using IDA for cracking programs will be really taxing. We encountered assembly instructions in university but we did not really pay much attention to it because it felt hard and was used in only a few places. That's not true though. You must know assembly in order to be a good programmer.  


The sight of these assembly instructions is a real head-banger. But once you use it a lot and do a lot of cracking then it's not a problem. In the left column, we can see our functions. All we need to do is find the definitions of specific functions. The most powerful feature of IDA is dynamic debugging of SO files. The next article will introduce how to dynamically debug SO files. IDA can of course view APK files directly:  

We can view all of the files in the APK file such as classes.dex. 

We might run into a problem here though – if the program is too big, it may take a long time to open and IDA might stop responding. So you just have to wait: 

Once it's open, we can see our classes and functions. Using Ctrl+F to search class and function names is supported, as is search by string (Shift+F12).  

IDA is great for analyzing Java code as well, so it's a really awesome tool! Having introduced the static analysis techniques for cracking above, we will now use an example to illustrate the use of static analysis.


First, static analysis of the Java (smali) code 

We start by using the apktool.jar tool to decompile the APK we want to crack: 

java -jar apktool.jar d xxx.apk 

This APK was easily decompiled, so it does not appear to be protected in any way. This makes things easy. We just change the android:debuggable attribute in its AndroidManifest.xml to make the APK debuggable, as this is a prerequisite for the dynamic debugging we plan. In a release APK this value in AndroidManifest.xml is false.  
Let's have a look in the AndroidManifest.xml file:  

We change this value to true then re-compile it. Now we can dynamically debug this APK. This shows that static analysis is a prerequisite for dynamic analysis. If this value is not changed, we cannot continue with the subsequent dynamic debugging. 
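
The manifest change is normally made by hand in a text editor, but it can also be scripted. The following is a minimal sketch, assuming the input is the plain-text AndroidManifest.xml that apktool writes out (not the binary manifest inside the APK). The function name make_debuggable is made up for this example.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"
# Keep the usual "android:" prefix when the tree is written back out.
ET.register_namespace("android", ANDROID_NS)


def make_debuggable(manifest_xml):
    """Return manifest_xml with android:debuggable="true" on <application>."""
    root = ET.fromstring(manifest_xml)
    application = root.find("application")
    if application is None:
        raise ValueError("manifest has no <application> element")
    # Adds the attribute if it is missing, overwrites it if it is "false".
    application.set("{%s}debuggable" % ANDROID_NS, "true")
    return ET.tostring(root, encoding="unicode")
```

ElementTree drops XML comments when it parses, so the output can differ cosmetically from the input even though the elements and attributes are preserved.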

Once the change has been made, we re-compile it: 

cd C:\Users\jiangwei\Desktop\Static Analysis\apktool_2.0.0rc4 

del debug.sig.apk 

java -jar apktool.jar b -d 123 -o debug.apk 

java -jar .\sign\signapk.jar .\sign\testkey.x509.pem .\sign\testkey.pk8 debug.apk debug.sig.apk 

del debug.apk 

adb uninstall com.shuqi.controller 

adb install debug.sig.apk 

adb shell am start -n com.shuqi.controller/.Loading 

pause 

Here is a batch process written to simplify things. First, enter the folder then run the compile commands: 

java -jar apktool.jar b -d sq -o debug.apk 

sq is the folder produced by decompiling and debug.apk is the rebuilt file. debug.apk cannot be installed at this point because it has no signature, and Android does not allow unsigned APKs to be installed. A signature is needed for the rest of the process, and we can sign the file using the system's own test key files: 

java -jar .\sign\signapk.jar .\sign\testkey.x509.pem .\sign\testkey.pk8 debug.apk debug.sig.apk 

Note: When we build an Android project in an IDE, the APK is also signed automatically with a default debug key. The IDE just does it for us. 


The next step is to install the APK then run it. All we need to know during this process is the name of the package and the entry activity. We can find this information in AndroidManifest.xml, but here we use a command to achieve the same result, adb shell dumpsys activity top:  

After re-compiling, we run the program and discover a problem – when we click on the program icon there is no response and it does not run. We check the log for error messages and discover that no errors were thrown. We can then conclude there must be some kind of internal verification. Generally speaking, if a re-compiled program does not run then internal verification is present. There are generally two types of verification: 

1. Verification of DEX to prevent any modifications to the DEX 

2. Verification of the APK signature to prevent re-packaging. 
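
To show what the first kind of check compares, here is a minimal sketch of one simple form a DEX check could take: reading the CRC-32 that the APK's zip directory records for classes.dex and comparing it with a value fixed at release time. This is an assumption for illustration only. A real app would do this at runtime in Java or native code, possibly with a different hash, and nothing here is taken from the APK analyzed in this article. The function names are made up.

```python
import zipfile


def dex_crc32(apk):
    """Return the CRC-32 that the APK's zip directory records for classes.dex.

    apk may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(apk) as archive:
        return archive.getinfo("classes.dex").CRC


def dex_was_modified(apk, expected_crc):
    """Compare the recorded CRC-32 with a value fixed at release time."""
    return dex_crc32(apk) != expected_crc
```

Any change to the smali code changes classes.dex when the APK is rebuilt, so a check of this kind fails even if the APK is re-signed correctly.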

We now need to re-examine the code to check for verification: 

When we are analyzing the code, we must first see if the app defines its own Application class. If it does, we need to look at that class first. Here we can see that the app defines its own Application:

com.shuqi.application.ShuqiApplication 
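
Reading the Application class and the entry activity out of the manifest can also be scripted. The following is a minimal sketch, assuming the decoded text manifest from apktool as input. The function name entry_points is made up, and activity-alias elements are ignored for brevity.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"


def _attr(element, name):
    """Read an android:-prefixed attribute from an element."""
    return element.get("{%s}%s" % (ANDROID_NS, name))


def entry_points(manifest_xml):
    """Return (package, Application class, launcher activities)."""
    root = ET.fromstring(manifest_xml)
    application = root.find("application")
    # android:name on <application> is only present for a custom Application.
    app_class = _attr(application, "name") if application is not None else None
    launchers = []
    for activity in root.iter("activity"):
        for intent_filter in activity.findall("intent-filter"):
            actions = {_attr(a, "name") for a in intent_filter.findall("action")}
            categories = {_attr(c, "name") for c in intent_filter.findall("category")}
            if ("android.intent.action.MAIN" in actions
                    and "android.intent.category.LAUNCHER" in categories):
                launchers.append(_attr(activity, "name"))
    return root.get("package"), app_class, launchers
```

The package name and launcher activity returned here are the two values needed for the adb uninstall and adb shell am start commands in the batch script above.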

We decompress the APK to get the DEX then convert it to JAR using dex2jar. We then examine this class using jd-gui:  


Here we can see that the code is obfuscated, but some system callbacks such as onCreate cannot be obfuscated. Here is how we usually look for an entry point: 

1. First, we check to see if this class has any static methods and code blocks because this type of code is run before object initialization. SO may be loaded or encryption verification carried out here. 

2. Next, we look at the constructor of this class. 

3. Finally, we look at the lifecycle methods. 
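
The review order above can be sketched as a small script that lists the methods of one smali class with the static initializer first, then constructors, then lifecycle callbacks. This is only an illustration. The function name review_order is made up, and the LIFECYCLE tuple is our own choice of callbacks for an Application class.

```python
import re

# Captures the method name from a line such as:
#   .method public static constructor <clinit>()V
METHOD = re.compile(r'^\.method\s+(?:.*\s)?([^\s(]+)\(')

# Callbacks of an Application class; attachBaseContext runs before onCreate.
LIFECYCLE = ("attachBaseContext", "onCreate")


def review_order(smali_text):
    """List the methods of one smali class in the order suggested above."""
    def rank(name):
        if name == "<clinit>":
            return 0   # static initializer, runs when the class is loaded
        if name == "<init>":
            return 1   # constructors
        if name in LIFECYCLE:
            return 2   # lifecycle callbacks
        return 3       # everything else, in file order

    names = []
    for line in smali_text.splitlines():
        match = METHOD.match(line.strip())
        if match:
            names.append(match.group(1))
    return sorted(names, key=rank)
```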

Here we see that the core code in onCreate calls a lot of methods. So maybe one of the methods here is the one? We then inject our code to trace which method is the problem. Some students ask: if these are all the methods there are, why not just read through them one by one? Well, this article is supposed to be an introduction to static analysis techniques, so we need an example for demonstration. Let's now look at how we can add our own log message. It's actually simple. Adding the log requires a change to the smali file. Let's go look at the smali source code:  

I personally do not think smali syntax is hard, so everyone should go find some resources on the web to learn from. Here we can see very clearly which methods are called. We then add our own log message after each method call. There are two ways of adding the log message. The first is to call the system's log method directly, but this has two problems:  


1. Need to import the package and modify the smali. 

2. Need to define one or two parameters (tag, msg) in order for the log to print normally. 

This method is obviously a little cumbersome, so here we just define our own MyLog class, then decompile it to get the smali file for MyLog. Add this file to the root of the decompiled smali folder, then just call it directly in the code. Because it is placed under the root folder, you can call it from the code without having to import it first, similar to some of the static method calls in ShuqiApplication.smali.  

For writing the log class MyLog, there is no need to paste in code here. We create a new project, build it, then decompile it to get the MyLog.smali file and place it in the folder:  

Once we have this file, make sure you remove the package name information from MyLog.smali. As we placed it under the root folder, this MyLog class has no package name. Be sure to note this, because if the package name is left in, it'll throw an error. 
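
Removing the package name means rewriting every occurrence of the class descriptor inside the file. A minimal sketch of that edit follows. The function name move_class_to_root is made up, and the package in the usage example (com/example) is hypothetical, since the package of the helper project is not given here.

```python
def move_class_to_root(smali_text, old_descriptor):
    """Rewrite a descriptor such as Lcom/example/MyLog; to LMyLog;.

    Every occurrence is replaced, including the .class directive and any
    references the class makes to itself.
    """
    # Drop the leading "L" and trailing ";" and keep the last path component.
    simple_name = old_descriptor[1:-1].rsplit("/", 1)[-1]
    new_descriptor = "L" + simple_name + ";"
    return smali_text.replace(old_descriptor, new_descriptor)
```

For example, move_class_to_root(text, "Lcom/example/MyLog;") turns ".class public Lcom/example/MyLog;" into ".class public LMyLog;", which matches the LMyLog; descriptor used in the call that is injected next.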

We insert our log method into the onCreate method of ShuqiApplication:

invoke-static {}, LMyLog;->printV  

When we add the code, be sure to add it in the right place. The right place is after the previous method call has completed, that is, after an instruction such as: 

invoke-virtual, invoke-static

However, our code must not be placed between such an instruction and a following:

move-result-object

This is because move-result-object collects the value returned by the method call and must come directly after it. We generally add code like this: 

Once our log code has been added, we re-compile and run it. If we encounter a smali syntax error during this process, we just edit the corresponding file. When we have the re-compiled APK, we can de-compile it again to see the Java code: 

 

Here we can see how the code we added prints a message after each method. We now run the program and filter the log by our tag: adb logcat -s JW.  

We now see the printed log. Three logs were printed, but what we need to note here is that the three logs all came from different processes. Since only one log was printed per process, we can deduce that the problem occurred in the vr.h method.  

We examine the source code for this method: 

As expected, signature verification is carried out by this method. If the signature does not match, it aborts the program. If we want the program to run normally, we can just comment out this line of code: vr.h(this) 

Next, we re-compile it and it no longer reports an error while running, so we will end the demonstration here: 

Above is how code injection is used to trace the problem. This method is used frequently and is very practical. 

Second, static analysis of native code 

Below, we will introduce how to use IDA for static analysis of native code. Familiarity with assembly instructions is essential or the code will be hard to read. We see after de-compiling that there is code for loading an SO file in the onCreate method.  

Looking at the code:  

The method for getting the password is native so we take a look at the getDbPassword method by using IDA to open the libpsProcess.so file: 

We look at how this function is implemented. We usually go straight to the BL/BLX instructions, the jump logic and the return value. At the end of the function, we discover an important item: BL __android_log_print. This calls the log function on the native layer. We go up from there and see that the tag is System.out.c.  
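
Strings such as the log tag can also be pulled out of an SO file without opening IDA. The following is a minimal sketch that works like the Unix strings tool; it is not a replacement for IDA's string view, since it only finds ASCII runs and knows nothing about which function uses them. The function name printable_strings is made up for this example.

```python
import re


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII found in data (a bytes object)."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Reading the library with open(path, "rb").read() and passing the bytes to printable_strings gives a quick list in which log tags and exported function names usually stand out.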

We run the program to look at the log. At this point, we can also add a log to the Java layer: we run a global search for this method and it's called from the class yi.  

We modify the yi.smali code: 

Re-compile, run and open log:  

adb logcat -s JW 

adb logcat -s System.out.c 

We discovered that the returned password is the same for the Java and native layers. This proves that our static analysis works for native as well. Well, that's all of our content for today. There are many other static analysis methods that can be used with APK of course but here I cover the methods I use. 


1. How to fix errors during decompiling with apktool 

As I said at the beginning, the reason why there is an error here is usually a protected APK. I will talk about how to solve problems like this separately. 

2. How to prepare an APK for debugging 

We saw above that to make an APK ready for debugging the android:debuggable value must be changed. Sometimes, this change fails and the program does not run. There will be a separate article describing several ways of preparing a published APK for debugging. 

We introduced in this article how to use static analysis to crack an APK. When we crack an APK we are basically changing some of the code so it can run and deliver the functions we want. These are usually: 

1. Comment out certain functions such as ad display. 

2. Get a method's return value, such as the user's password. 

3. Add our own code, such as our own monitoring code and ads. 

When we apply static analysis to the code we must follow this general outline: 


1. First, we must be able to de-compile it to get the AndroidManifest.xml and find the program entry code. 

2. Find the code logic we want. This usually involves interface analysis. For example, if we want to let the user login successfully then we must get the activity of the user login interface. Here we can use the command adb shell dumpsys activity top to get the Activity name. We then use the built-in program of Eclipse as the GUI analysis tool: find the control variable name, or extract the layout file from the code. The usual approach is to use the place where the setContentView method is called then combine the layout file with the code to get the user login logic then make the changes. 

3. Use the code injection technique at key points to trace the code execution logic. 

4. Pay attention to the return values of methods and to conspicuous code such as conditional checks. 

5. The source code of some APK may contain its own encryption algorithm. We then need to get this encryption algorithm. If the encryption method is more complicated then we need a large amount of test data to find the logic of this encryption method. Generally speaking, the input and output serves as one test example. The first question in the 1st Alibaba Security Challenge for example can be cracked using static analysis. It contains an encryption algorithm that we can crack using test data. 

6. For code that uses System.loadLibrary to load SO files, we just need to find the SO file then open it with IDA for static analysis. This is because some APKs place the encryption algorithm in the SO. Here we can use test data to extract the encryption algorithm as well. 

7. Based on the above examples, we can arrive at this conclusion - a lot of APKs now perform verification and the code usually contains the string: "signature." If we do a global search with this it might give us some important information. 
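
The test data approach from points 5 and 6 can be sketched as follows: each recorded input and output pair is one test case, and a guess about the algorithm is accepted only if it reproduces every pair. This is a toy illustration. The Caesar shift used as the guess is purely hypothetical and is not the algorithm of any APK or challenge mentioned in this article, and all function names are made up.

```python
def failing_samples(candidate, samples):
    """Return the (input, expected, actual) triples that candidate gets wrong."""
    failures = []
    for plain, expected in samples:
        actual = candidate(plain)
        if actual != expected:
            failures.append((plain, expected, actual))
    return failures


def caesar(shift):
    """Hypothetical guess: each lowercase letter is shifted by a fixed amount."""
    def encrypt(text):
        return "".join(
            chr((ord(c) - 97 + shift) % 26 + 97) if "a" <= c <= "z" else c
            for c in text
        )
    return encrypt


def guess_shift(samples):
    """Try every shift and return those consistent with all recorded samples."""
    return [s for s in range(26) if not failing_samples(caesar(s), samples)]
```

The more varied the recorded pairs are, the fewer wrong guesses survive, which is why a large amount of test data is needed for more complicated algorithms.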

 

That's it for this article. I've been wanting to write an article on cracking because cracking is more interesting than securing. At the very least, a successful cracking attempt gives a sense of achievement. This article mainly describes how to use static analysis for cracking. It introduces some tools along with the cracking process and techniques. The most used approaches are code injection and global search for key strings. We know however that there are a lot of APKs on the market that can no longer be cracked using static analysis alone and are very difficult to crack with dynamic analysis as well. There are a lot of things to learn, and I will introduce the techniques and commonly asked questions in dynamic cracking over several future articles. Cracking with static analysis is still very important though. As it is the prerequisite to dynamic analysis, as crackers we must master both of these techniques.
