Hackers of India

How my SV Machine nailed your Malware

 Nikhil P Kulkarni 

2017/11/20

Abstract

As we know the Android Application Industry from a security perspective, it is also quite well known that the Android platform is succeptible to malicious applications. And with the recent trend where all the vendors and customers going completely mobile, android has now become an attack surface for most of the malicious attacks. Moreover, the mechanisms used for android malware detection comprise of several known methods, and we also know that most of these mechanisms are permission based or based on API usage. But, when we go deeper in the analysis, we also realise the fact that these mechanisms are open to instruction level obfuscation techniques. Hence, we decided to bring in the approach of Machine Learning to the Android Malware analysis such as using the functional call graphs, and Hash Graph Kernel (Hido & Kashima) method which could be combined to implement a mechanism that could be used to find the similarities among the binaries while being stringent against these obfuscations used. This Project implementation is based on well known machine learning algorithm which is Support Vector Machines for solving the problem of android malware analysis. This method involves the mechanism of detection of android malware by effeciently embedding the functional call graphs along the feature map. The gamechanger in this concept would be the optimal utilization of the SVM Algorithm(Support Vector Machine) that proves to be better than other approaches with a minimalistic amount of false positives found and a higher detection rate. With the help of clean & real malware android application samples, an explicit classification model is developed. The functional call graphs are extracted out of the android applications, then the linear-time graph kernel based explicit mapping is deployed in order to efficiently map all the call graphs to the explicit feature space. After the above methods are implemented, the SVM algorithm is then trained to thoroughly differentiate between the real and the malicious applications.