Short Survey on Malware

Abstract:

Malware is software created to misuse a system or its data. The Internet has become the lifeblood of this era and a part of daily life, as more and more services, from classrooms to government, depend on it. A few works have attempted to survey the research area of malware. Cybercriminals not only steal data and information by sending malware, they also put every possible effort into making malware difficult to detect. We look forward to designing a technique that identifies malware based on executable file signatures. Before developing countermeasures against malware, it is important to study the existing research and trace the root causes in order to understand malware behavior, its effects, and the tools used to deploy it. This paper gives an overview of different surveys and techniques, as well as the tools that help in understanding the behavior of a malware instance, and outlines the future scope of techniques and tools against malware.

 

Keywords: Malware, Detection, Classification, Cybercrimes, Malcode

 

Introduction

A number of applications rely on identifying program features, including malware classification, software theft detection, plagiarism detection, and code clone detection.

New technologies open new opportunities for misuse, and this paves the way for malware. Creating malware is also an art of programming that is now followed by new generations, but creating a professional virus and releasing it is the most dangerous part and is usually handled by experienced developers. Malware classification is the process of determining whether a program is malicious. One approach to classification is to obtain a fingerprint of the malware based on program feature extraction. This fingerprint creates an invariant signature that can be used to identify evolutionary malware variants. For detection of completely novel malware, program features can be extracted to create a feature vector, which can subsequently be used in machine learning algorithms and statistical classification.
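As an illustration of turning a program into a feature vector, the sketch below computes a normalized byte histogram. The choice of feature, the function name byte_histogram, and the sample bytes are assumptions made for this example; the surveyed works use a variety of other features.

```python
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Return a 256-dimensional vector of relative byte frequencies."""
    if not data:
        # An empty input has no bytes to count, so every frequency is zero.
        return [0.0] * 256
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data)
    return [counts.get(value, 0) / total for value in range(256)]


# Hypothetical sample: in practice the bytes would be read from an executable.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x00, 0x00])
vector = byte_histogram(sample)
print(len(vector), vector[0x00])
```

Every file, whatever its size, maps to a vector of the same length, which is what most statistical classifiers expect as input.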

Malware classification helps fight the threat of malicious software. Such malicious software presents a significant challenge to modern desktop computing. Detecting malware before it adversely affects a computer system is highly desirable. Static detection of malware is still the dominant technique for securing computer networks and systems against untrusted executable content. Detecting malware variants improves signature-based detection methods. The size of signature databases is growing exponentially, and detecting entire families of related malicious software can prevent the blowout in the number of stored malware signatures. Detecting entire families of malware by using similarity measures instead of exact matching makes malware detection less fragile and more robust in the face of malware evolution and change.
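The difference between exact matching and similarity matching can be sketched as follows. This is a minimal example that assumes byte 4-grams as the feature set and Jaccard similarity as the measure; the function names and the sample strings are hypothetical, and real systems use far richer features.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of all contiguous n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of the n-gram sets of a and b, in the range 0..1."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0  # two inputs too short to yield n-grams are treated as equal
    return len(grams_a & grams_b) / len(grams_a | grams_b)


# Hypothetical samples: a known sample, a slightly modified variant,
# and an unrelated input.
original = b"push ebp; mov ebp, esp; call decrypt; jmp payload"
variant = b"push ebp; mov ebp, esp; nop; call decrypt; jmp payload"
unrelated = b"The quick brown fox jumps over the lazy dog"

print(original == variant)                      # exact matching fails
print(jaccard_similarity(original, variant))    # similarity stays high
print(jaccard_similarity(original, unrelated))  # similarity stays low
```

A single inserted instruction defeats the exact comparison, while the similarity score remains close to 1, so one stored sample can stand for a whole family of variants.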

The purpose of this work is to determine, through the survey, the feature extraction, feature representation, and classification methods that yield the best accuracy when used on top of Cuckoo Sandbox. Specifically, the classifiers considered are k-Nearest Neighbors, Decision Trees, Support Vector Machines, Naive Bayes, and Random Forest.

 

Literature review:

Traditional malware detection techniques require analysis of huge datasets. Most of them are based on signatures and behavior, and they also require a long time to produce results. There are many techniques for classification; the problem is that the classifiers are generally not comparable. Looking for the root of malware in history, we find ‘Creeper’, the first virus on a network, which appeared in 1970-71. Creeper only displayed a message like “I’m the creeper, catch me if you can!”. It was not a cyber attack by hackers; rather, it was an academic experiment demonstrating the possibility of a self-replicating program. The program was developed by Bob Thomas.

 

In [2], Rama Priyanka identified a drawback of previous systems when malware is present in images: the malware cannot be detected until the malicious activity starts from the image. The researchers therefore developed a steganalysis tool that detects malware by scanning images and can also inform the victim about an embedded malicious payload. In [4], the researchers presented a data mining classification approach to detect malware behavior and proposed different classification methods for malware detection. The approach is also based on the features and behavior of malware, and the study found that the regression classification method gives the best performance.

 

Table 1: Malware examples

| Malware | Developer | Type/year | Target |
|---|---|---|---|
| “Elk Cloner” | Rich Skrenta | Virus, 1981-1982 | First known virus; targeted an Apple computer |
| Chameleon | Ralf Burger | Worm | All systems |
| Wabbit (Rabbit) | John Walker | Trojan horse, 1974 | All systems |
| Brain boot sector | Basit Farooq Alvi, Amjad Farooq Alvi | Virus, 1986 | All systems |
| Baza | | Virus, 1996 | Infects Windows 95; infects Linux |
| CIH | Chen Ing Hau | Virus, 1998 | Outlook Express and Internet Explorer on Windows 95 and 98 |
| ILOVEYOU | | Virus, 2000 | Deletes files in JPEG, MP2, or MP3 formats |
| Anna Kournikova | Jan de Wit | Virus, 2001 | Microsoft Outlook |
| LFM-926 | | Virus, 2002 | Attacks Flash files |
| Beast or RAT | | Virus, 2002 | All versions of Windows OS |
| MyDoom or Novarg | | Worm, 2004 | |
| OSX/Leap | | Virus, 2006 | Mac OS X |
| Zeus | | Worm, 2007 | All systems |
| Koobface | | Virus, 2008 | Users of social networks such as Facebook and MySpace |
| Kenzero | | Virus, 2010 | Online systems |
| Cryptolocker | | Trojan horse, 2013 | Encrypts files on the machine and demands a ransom to unlock the files |

 

 

In his master's thesis, Bc. Jan Tomášek proposed the use of computational intelligence methods, namely various types of classifiers such as neural networks, support vector machines, and random forests, for malware classification and malware family detection, which also helps in finding an effective treatment [3]. Hackers attack systems in a network by using malware to run a bot. A bot is a remotely controlled piece of malware that has infected an Internet-connected computer system [5]. The bot tries to overload a webpage or another type of service by repeating requests. Such a bot can either be distributed voluntarily among the members of an association or be part of malware distributed to users' computers. A botnet is the collective set of such bots, controlled by a master that has power over all of them. The users are trapped by the hackers without any clue or hint that they are part of an anti-government offence [3].

 

Daniel Gibert proposed an approach based on Convolutional Neural Networks (CNNs) in his work on malware classification [6]. He concluded that his method is able to detect malware binaries. Lakshmanan Nataraj [8] focuses on malware represented as images. He considers images of binaries that belong to malware families. Compared with instructions, images are more informative, and the structure of the malware is clearer in image format. He found that packed and unpacked samples, whether packed with various packers or with a specific packer, show similar images. Being informative is the benefit of the image-based approach. On the other hand, a limitation of image-based malware analysis is that it works only for existing malware, so it is difficult to prevent zero-day attacks. Secondly, when characterization is considered, images cannot give much information about the signature of the malware.

 

Figure 1: Examples of malware

A virus reproduces by attaching to part of a program, whereas a worm sends copies of itself, that is, it replicates. Worms include email worms, IM worms, and many other types. Some Trojans, such as backdoors, exploits, and rootkits, perform actions that are not authorized by the user. They are able to add, delete, or update data blocks. Trojans are classified according to the type of actions they perform. Figure 2 shows the different malware detection techniques, which are classified as anomaly-based, specification-based, and signature-based, each of which is further classified into dynamic, static, and hybrid modes. A malware image contains information such as code, ASCII text, uninitialized data, and initialized data. Malware is classified on the basis of features computed from the image.
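The idea of viewing a binary as an image can be sketched by treating each byte as one grayscale pixel (0 to 255) and arranging the bytes row by row. This is a minimal sketch; the fixed width of 16 and the zero padding of the last row are assumptions made for this example, not parameters taken from the surveyed works.

```python
def bytes_to_image(data: bytes, width: int = 16) -> list[list[int]]:
    """Arrange bytes row by row into a 2D grid of grayscale values (0-255)."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))  # zero-pad the final, shorter row
        rows.append(row)
    return rows


# Hypothetical input: ten bytes laid out in rows of four pixels.
image = bytes_to_image(bytes(range(10)), width=4)
for row in image:
    print(row)
```

Sections of a binary with different content, such as code, text, or zero-filled data, then appear as regions with visibly different texture, and texture features can be computed from the grid.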

Different methods of classification and feature extraction are used, such as k-NN, SVM, Euclidean distance, and Gabor wavelets. The work in [9] uses the Gabor wavelet method to detect the pattern of Trojan malware in the image and a k-NN classifier to classify it. A limitation of static analysis methods is that they do not reveal the attacker's malicious code when it is repeated or modified.

A key advantage of anomaly-based detection is its ability to detect zero-day attacks. However, the authors of [5] state that the limitations of this technique are its high false alarm rate and the complexity involved in determining which features should be learned in the training phase [5].
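The trade-off between detecting unseen behavior and raising false alarms can be illustrated with a very simple anomaly detector. The sketch assumes a single numeric feature (a hypothetical count of system calls per minute) and a z-score threshold; both are illustrative choices and are not taken from [5].

```python
import statistics


def fit_baseline(benign_values):
    """Training phase: learn the mean and standard deviation of normal behavior."""
    return statistics.mean(benign_values), statistics.stdev(benign_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value that lies more than `threshold` standard deviations from the mean."""
    mean, std = baseline
    if std == 0:
        return value != mean
    return abs(value - mean) / std > threshold


# Hypothetical benign observations used for training.
benign = [100, 102, 98, 101, 99, 100, 103, 97]
baseline = fit_baseline(benign)

print(is_anomalous(150, baseline))                 # far from normal behavior
print(is_anomalous(101, baseline))                 # within normal behavior
print(is_anomalous(103, baseline, threshold=1.0))  # benign value, strict threshold
```

No signature of the attack is needed, which is why previously unseen behavior can be flagged. The last call shows the cost: with a strict threshold, a value that occurred in the benign training data is itself reported as an anomaly.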

 

 

Figure 2: Basic classification of malware

Figure 2 shows the classification of malware detection techniques. Detection techniques are divided into the following classes: anomaly-based, specification-based, and signature-based. Each class includes the following sub-classes: dynamic, static, and hybrid. Figure 3 shows the analysis methods of these three types.

Figure 3: Different ways of analysis

Summary

The summary drawn from the survey is shown in Table 2, which gives a brief idea of the different types of analysis, methods of analysis, and classifiers. A number of developers have chosen these methods to classify malware [10].

Table 2: Summary of analysis types, analysis methods, and classifiers

| Analysis type | Analysis methods | Classifiers |
|---|---|---|
| Image base | Visualization method | SVM classifier |
| Signature base | Texture base | Data mining |
| Pattern base | Wavelet transform | Entropy feature extraction |
| Feature base | Genetic algorithm | Feed-forward NN |

 

Conclusion:

From the survey we observed some important points. Malware analysis can be done using static and dynamic approaches. Function length and variable length can also play a role in distinguishing between types of malware. Some malware instances may exhibit the characteristics of multiple classes. It is important to note that detecting entire families of malware by using similarity measures instead of exact matching makes malware detection less fragile and more robust in the face of malware evolution and change.

 

References

[1]. Silvio Cesare and Yang Xiang, “Software Similarity and Classification”, book.

[2]. Rama Priyanka, P.K. Sahoo, “Scanning Tool For Identification of Image With Malware”, International Journal of Advance Computing Technique and Applications (IJACTA), ISSN: 2321-4546, Vol. 4, Issue 1, June 2016.

[3]. Bc. Jan Tomášek, “Computational Intelligence for Malware Classification”, Department of Theoretical Computer Science and Mathematical Logic, Prague, 2015.

[4]. Monire Norouzi, Alireza Souri, and Majid Samad Zamini, “A Data Mining Classification Approach for Behavioral Malware Detection”, Journal of Computer Networks and Communications, Volume 2016 (2016), Article ID 8069672, 9 pages, http://dx.doi.org/10.1155/2016/8069672.

[5]. N. Idika and A.P. Mathur, “A survey of malware detection techniques”, Purdue University, 2007.

[6]. Daniel Gibert, “Convolutional Neural Networks for Malware Classification”, thesis, 2016.

[7]. Mansour Ahmadi et al., “Novel Feature Extraction, Selection and Fusion for Effective Malware Family Classification”, arXiv:1511.04137v2 [cs.CR], 10 March 2016.

[8]. Lakshmanan Nataraj et al., “Malware Images: Visualization and Automatic Classification”, International Symposium on Visualization for Cyber Security (VizSec), Jul. 2011.

[9]. A. Makandar and A. Patrot, “Trojan Malware Image Pattern Classification”, in D. Guru, T. Vasudev, H. Chethan, Y. Kumar (eds), Proceedings of International Conference on Cognition and Recognition, Lecture Notes in Networks and Systems, vol. 14, Springer, Singapore, 2018.

[10]. D. S. Guru, T. Vasudev et al. (eds), Proceedings of International Conference on Cognition and Recognition: ICCR 2016, ebook.