Big Data and Information Security

I. Abstract

This paper examines the threats facing the huge volumes of data recorded and
stored by enterprises. Every enterprise, big or small, generates a huge stack
of data, which may include customers’ personal information, crucial information
about the enterprise’s profits and losses, or information about the company’s
private and confidential policies. This data is crucial to any enterprise and
can decide whether a company rises or collapses at once. Hence it needs to be
preserved and protected from capture by anyone who could misuse it.

Cloud storage is widely popular these days, as it reduces clutter on local
systems and makes data available to many systems without occupying space on any
of them. However, data stored in the cloud faces numerous cyberattacks that aim
to capture it and extract crucial information. Hence our paper focuses on this
side of the data: its security, the threats it faces and the measures to keep
it safe.

II. Introduction

Our research revolves around Big Data: huge stacks of information assembled bit
by bit to form a complete database. Big data can be found everywhere: in
colleges, hospitals, banks, stock markets and so on. It is called big data
because it deals with huge volumes of information, with complex data sets
brought together to represent that information. This valuable data has also
attracted attackers from the cyber world, who try to capture it and extract
confidential information. Some seek large profits, either directly, by
tampering with accounts and transactions, or by encrypting the files and
demanding a ransom, for example in bitcoins, in return for releasing the data.
Others do it simply to annoy and create chaos, for instance by making the data
public while gaining nothing from it.

It is now the need of the hour to secure this data and prevent leaks of
personal information, which can prove disastrous. Various conditions can be
held responsible for an attack, and some of them can be easily identified and
categorized. Big data has proved to be a liability for its own security: its
complexity and its capacity to hold large amounts of data make it more prone to
attacks. There are, however, a few ways to detect an attack before it actually
occurs, so that we can take extra measures to stop it or at least reduce its
ill effects. There has always been a need to protect data, and that need will
remain in future. Future techniques will have to be more robust, though, in
order to counter the growing capabilities of attackers and the new varieties of
attack they have developed.

III. Big Data and its sensitivity to attacks:

Enterprises hold personal data about their users, from your telephone service
provider, to Google, the well-known search engine that keeps track of your
searching habits, to app merchants that can access sensitive personal data
through their application and user licence agreements. The closer they get to
the end user, the more personal information about their customers and users
they hold. This information can reveal a lot about a user and can be a root
cause of compromised security and privacy, and some hackers look for it in
order to plant malware targeted at that user. Doing so requires insight into
the user’s internet usage habits, which attackers can gain by breaking into
this data and obtaining information about numerous users at once. Big data
therefore looks like a big pot of honey to these greedy, bee-like hackers, and
is more likely to experience a cyberattack. To protect the interests of their
users, it is the responsibility of these data holders to keep this personal
information as secure and confidential as possible.

Reasons these attacks happen:

·       User – This refers to end-user authentication and security. Attackers
can trace data such as a user’s work habits, taste preferences and location, as
well as the browsing trends the user has adopted. Most commonly they keep track
of which sites the user visits most, and whether any of those sites is
vulnerable to a cyberattack or otherwise makes the user easier to attack.

·       Content – The content present in the data also determines its
vulnerability to attack: the type of file or document, passwords, and patterns
such as 11 characters together (possibly an account number) or 4 digits
together (possibly an ATM PIN). An attacker could run an algorithm that scans
the device repeatedly for the information and data patterns being sought. Such
algorithms can slip past security measures easily, because all they do is check
file types and contents, which does not look like ransomware activity.

·       Customers – This factor matters most for the customers of debit card or
credit card companies, and of any company that processes premium payments.
Attackers would not attack the database of an ice-cream parlour just to obtain
a list of ice-cream prices. They look for databases that give them access to
people’s private information, for which they can demand something in return.
Hence the databases of banks and the companies mentioned above are more prone
to attack and need special attention to information security.

·       Networks – The type of network, together with its minute details, can
play a crucial role in determining an attack: the source and destination, the
time zone (date as well as time), the bandwidth of the network and the activity
on it. In brief, because security and network strength vary, a bank database in
some African country may be more prone to attack than that of a Canadian bank
if its security and network are weaker.

·       Device – This covers the device together with the types of software it
runs. Whether the software is updated regularly and whether the security
certificates are renewed regularly both affect how exposed the device is. A
device running outdated software is certainly more prone to attack, as the
attacker may have software that is many times more advanced and many times
faster than the device.
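
The pattern check described under "Content" can be sketched in a few lines of
Python. This is only an illustration of the idea, written from the defender’s
side so that an enterprise can find out which of its own files would stand out
to such a scan. The names PATTERNS and find_sensitive_patterns are hypothetical,
and treating "11 characters" as exactly 11 digits is our assumption.

```python
import re

# Assumed patterns, based on the examples in the text: a run of exactly
# 11 digits may be an account number, a run of exactly 4 digits may be a PIN.
PATTERNS = {
    "possible_account_number": re.compile(r"(?<!\d)\d{11}(?!\d)"),
    "possible_pin": re.compile(r"(?<!\d)\d{4}(?!\d)"),
}


def find_sensitive_patterns(text):
    """Return a sorted list of (label, matched_text) pairs found in text."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group()))
    return sorted(hits)


# Invented sample text, for illustration only.
sample = "acct 12345678901 pin 4821 ref 99"
print(find_sensitive_patterns(sample))
```

A file that produces many such hits is a natural candidate for encryption at
rest or stricter access control.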

 

Ways to identify an attack:

1.      Irregular trends in transactions – If a system or network makes far
more transaction requests than it has ever made before, this can be a warning
sign, since the requests may be part of an attack on the database.

2.      Anonymous IP addresses making requests to the network – If the IP
address behind a request is unusual or hidden, there is a higher possibility
that the request is an attack: the sender knows that an IP address could put
him behind bars, and therefore attacks from a source that cannot be traced.

3.      Unusual traffic in the network, which can also cause congestion – A
sudden surge of requests to a particular site or server can be a normal
scenario, but there is a fair chance that it is an attack planted to jam the
site or overload the server so that it cannot defend itself.

4.      Suspicious software making transaction requests – Sometimes the
software or technology involved can also help us detect an attack. If we
receive a request from system software we have never interacted with, it may be
a tool someone built to ease an attack on our database.
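
Two of the signs listed above, irregular request volume and unfamiliar client
software, can be sketched as simple checks. The following Python is a minimal
illustration under stated assumptions, not a production detector: the function
names are hypothetical, the threshold of three times the historical mean is an
arbitrary choice, and the client strings in the example are invented.

```python
from statistics import mean


def is_irregular(history, current, factor=3.0):
    """Flag `current` if it exceeds `factor` times the mean of `history`.

    `history` holds past request counts per time window for one source.
    With no history there is no baseline, so this rule does not fire.
    """
    if not history:
        return False
    return current > factor * mean(history)


def is_unknown_client(client_software, known_clients):
    """Flag a request whose client software has never been seen before."""
    return client_software not in known_clients


def review_request(history, current, client_software, known_clients):
    """Return the reasons a request looks suspicious (empty list if none)."""
    reasons = []
    if is_irregular(history, current):
        reasons.append("irregular request volume")
    if is_unknown_client(client_software, known_clients):
        reasons.append("unknown client software")
    return reasons


# Invented example: a source that usually sends about 11 requests per window
# suddenly sends 100, using software we have never interacted with.
print(review_request([10, 12, 11], 100, "new-tool/0.1", {"bank-app/2.0"}))
```

A flagged request is not proof of an attack, since a surge can also be a normal
scenario; the checks only mark requests that deserve a closer look.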

Protecting Big Data:

On July 26, 2017, an attack took place on the database of the Arkansas Oral
Center, which also specialises in facial surgery. X-ray files, documents and
emails were encrypted, although the patient database was safe and could not be
encrypted by the attacker. The attack was soon contained, but it emerged that
the hospital and its patients had to bear the resulting chaos for almost three
weeks.

Although the attack here was remedied and the losses were not counted as very
large, consider what would happen if the attack were made on a bank and its
database were encrypted or compromised by the attacker. It could easily result
in the loss of precious and confidential customer data, including ATM PINs,
card numbers and account numbers, and the losses could add up to millions of
dollars. This is where the concept of Big Data and its security comes into the
picture.

What makes Big Data more prone to attacks is:

·       Complexity – Big data is known to be complex, as it holds huge amounts
of information about customers. Some of it is so detailed that, if it falls
into the wrong hands, that person can use it to cause huge losses to the person
the information belongs to.

·       Huge pile of data – When we call it Big Data, we mean the data of a
great many people. Compare hacking one person’s email account to get his
personal information with hacking a bank’s server to get the data of almost 0.1
million customers. The second is of course more rewarding for the hacker, which
means big data is more prone to such attacks.

·       Storage – Data this big cannot fit onto just any system or electronic
device, so it needs to be stored on cloud storage or servers. A secondary
reason is that such databases need to be accessible globally, as numerous
systems may have both the right and the need to access the data. Hence the data
is stored in the cloud. The only thing protecting such data may be an
authentication key, which, if defeated by complex hacking algorithms, can lead
to the disaster of information being leaked and made public.

IV. Scope of improvement:

It is very important to recognise that safeguarding data, especially big data,
where the information of huge numbers of people is at risk, requires a safe and
secure environment as our primary need. If a person’s personal information
leaks or becomes public, like the confidential information in our Aadhaar
system, it is almost like cloning that person, since the attacker holds all of
his or her identification and other details. During our research we identified
some points that could be crucial starting points for future research. They are
as follows:

·       We need ways to measure the strength of a network, that is, how secure
it is.

·       If we maintained a record of the conditions every time an attack
happened, we could analyse it to find the weak points in our systems and
networks, and it could also help us categorize the IPs and systems that may
belong to attackers.

·       We also need techniques that are more efficient at securing data and at
reclaiming it when an attack has already happened.
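
The idea of keeping a record of the conditions of each attack can be sketched
as follows. This is a minimal illustration, assuming that the source IP, the
time, the targeted system and the client software are the conditions worth
recording. The AttackRecord fields, the function names and the sample log
entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackRecord:
    """Conditions noted for one attack; the field choice is an assumption."""
    source_ip: str
    timestamp: str  # ISO 8601 text, for example "2024-01-01T09:00:00"
    target: str
    client_software: str


def repeat_offenders(records, min_attacks=2):
    """Return the sorted source IPs seen in at least `min_attacks` records."""
    counts = Counter(record.source_ip for record in records)
    return sorted(ip for ip, n in counts.items() if n >= min_attacks)


def most_targeted(records):
    """Return the most frequently attacked target, or None for an empty log."""
    counts = Counter(record.target for record in records)
    return counts.most_common(1)[0][0] if counts else None


# Invented sample log, using IP addresses reserved for documentation.
log = [
    AttackRecord("192.0.2.10", "2024-01-01T09:00:00", "file-server", "tool-a/0.1"),
    AttackRecord("192.0.2.10", "2024-01-01T09:05:00", "mail-server", "tool-a/0.1"),
    AttackRecord("198.51.100.7", "2024-01-02T14:30:00", "file-server", "tool-b/2.3"),
]
print(repeat_offenders(log))
print(most_targeted(log))
```

Grouping the same records by target points to the weak spot in the system,
while grouping by source IP helps categorize likely attackers.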

 

V. Conclusion:

We cannot say that because big data is so prone to attacks we should simply
avoid it. Big Data has become a part of computer science and a necessity:
anything that happens gets recorded on systems, adding to the existing stack of
data and producing Big Data. The more we shift to technology, the more data we
have; the more we digitise, the more data we produce at our keyboards. This
data must be kept confidential, as it reveals a person’s identity. If it is
made public, the enterprise will not be able to protect the interests of its
customers, and people will no longer be able to share their information freely
and will be more exposed to misuse of their information.

If this information is misused, an innocent person could be implicated in a
crime simply because his personal data was misused. Misuse can also cause
losses of millions of dollars if the attack is on a bank or a stock exchange.
Fraudulent transactions would debit huge sums from the pockets of innocent
people. Once an attack occurs, if control is not reclaimed in time, the damage
done by ransomware and the resulting losses can grow. Therefore, to keep our
data systems safer, we must keep upgrading our software and security features
so as to keep user information as confidential as possible and avoid loss of
data and resources.