Cyber Security using Data Mining Techniques: A Survey
Siddharth Nayak1, Shivani Verma2, Deepak Kumar Deshmukh3
1Network Engineer, Benzfab Technology Pvt. Ltd., Bhubaneswar, Odisha, India.
2Lecturer, Department of Computer Science, Central College of IT, Raipur (C.G.), India.
3Assistant Professor, SoS in Computer Science and I.T., Pt. Ravishankar Shukla University, Raipur (C.G.), India.
*Corresponding Author E-mail: siddharthnayak0223@gmail.com, shivaniverma2303@gmail.com, deepakdeshmukh.bit@gmail.com
ABSTRACT:
KEYWORDS: Cybersecurity, Data Mining, DoS, Machine Learning, Cyber Crimes.
1. INTRODUCTION:
Cybersecurity:
To build reliable cyber infrastructure against deliberate and potentially malicious threats, a growing collaborative effort among cybersecurity professionals and researchers from institutions, private industry, academia, and government agencies has been engaged in exploiting and designing a variety of cyber defense systems. Cybersecurity researchers and designers aim to maintain the confidentiality, integrity, and availability of information and information management systems through various cyber defense systems that protect computers and networks from hackers who may want to intrude on a system or steal financial, medical, or other identity-based information. Traditional cybersecurity systems address a range of threats, including viruses, trojans, worms, spam, and botnets. These systems combat threats at two levels, providing network-based and host-based defenses. Network-based defense systems control network flow through network firewalls, spam filters, antivirus software, and network intrusion detection techniques. Host-based defense systems control incoming data on a workstation through firewall, antivirus, and intrusion detection techniques installed on the host. Even so, these techniques are insufficient to protect cyber infrastructure, because hackers continually find new ways to attack it. With the help of data mining techniques, however, we can understand the patterns associated with hackers' attacks and defend against them.
Figure 1 shows the parameters of cybersecurity [12].
Figure 1: Parameters of Cybersecurity.
Data Mining:
Because of the availability of large amounts of data in cyber infrastructure and the number of cybercriminals attempting to gain access to that data, data mining, machine learning, statistics, and other interdisciplinary capabilities are needed to address the challenges of cybersecurity. Intrusion detection systems, for example, make use of data mining. Data mining is the extraction, or "mining," of knowledge from a large amount of data. The strong patterns or rules detected by data mining techniques can be used for the nontrivial prediction of new data. In nontrivial prediction, information that is implicitly present in the data but was previously unknown is discovered.
There are two approaches to data mining: supervised and unsupervised. Supervised data mining techniques predict a hidden function using training data. The training data consist of pairs of input variables and output labels or classes, and the resulting model can predict the class label of new input variables. Examples of supervised mining are classification and prediction. Unsupervised data mining attempts to identify hidden patterns in the given data without using labelled training data. Classic examples of unsupervised mining are clustering and association rule mining.
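The difference between the two approaches can be illustrated with a minimal Python sketch. It is not taken from any of the surveyed papers: the feature names, the toy values, and the helper functions `fit_centroids`, `predict`, and `group_by_gap` are all assumptions made for illustration. The supervised part is a nearest-centroid classifier trained on labelled samples; the unsupervised part is a simple gap-based grouping (a one-dimensional form of single-linkage clustering) that uses no labels at all.

```python
from collections import defaultdict
from math import dist

# Hypothetical toy features per connection: (failed_logins, requests_per_second).
training_data = [
    ((0.0, 2.0), "normal"),
    ((1.0, 3.0), "normal"),
    ((8.0, 40.0), "attack"),
    ((9.0, 45.0), "attack"),
]

def fit_centroids(labelled):
    """Supervised step: average the feature vectors of each known class."""
    groups = defaultdict(list)
    for features, label in labelled:
        groups[label].append(features)
    return {
        label: tuple(sum(col) / len(col) for col in zip(*rows))
        for label, rows in groups.items()
    }

def predict(centroids, features):
    """Assign the class label whose centroid is closest to the new sample."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

def group_by_gap(values, gap):
    """Unsupervised step: split sorted 1-D values wherever the jump exceeds `gap`."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] > gap:
            clusters.append([value])
        else:
            clusters[-1].append(value)
    return clusters

centroids = fit_centroids(training_data)
label = predict(centroids, (7.0, 38.0))

# No labels are used here: the groups emerge from the data alone.
rates = [features[1] for features, _ in training_data]
clusters = group_by_gap(rates, gap=10.0)
```

In this sketch the classifier needs the `"normal"` and `"attack"` labels to learn anything, whereas the grouping step only sees the raw request rates and still separates the low-rate and high-rate samples.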
Data mining is also an essential part of knowledge discovery in databases (KDD), an iterative process of nontrivial extraction of information from very large databases, and it can be applied to developing secure cyber infrastructures [1].
Figure 2 depicts the structure of data mining in ethical hacking tools [13].
Figure 2: Structure of Data Mining in Ethical Hacking Tools
2. LITERATURE SURVEY:
2.1. Prof. Rukaiya Shaikh et al. presented a methodology for identifying the DoS attack using a mining technique:
This paper discusses cybercrime, which occupies a large space in the cyber world. Because cybercrime is increasing, the authors argue that national governments should enact laws against it. The DoS attack is among the attacks most used by cybercriminals at present and is a major problem today. A DoS attack can put a server in a critical state in which it cannot provide service to authorized users. To remedy this, cybersecurity is needed to protect information from unauthorized activity. The paper uses data mining for prevention, since it can uncover new, previously unseen patterns in the data. Within the clustering approach, the k-means algorithm is used to identify the probability of attacks [2].
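As a rough illustration of how k-means clustering can separate DoS-like traffic from ordinary traffic, the following Python sketch implements the standard assignment and update steps on made-up data. It is a minimal sketch, not the implementation from [2]: the two features (requests per second and open connections per source), the sample values, the fixed initial centroids, and the rule of flagging the cluster with the highest mean request rate are all assumptions.

```python
from math import dist

def k_means(points, initial_centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(col) / len(col) for col in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, clusters

# Hypothetical per-source features: (requests_per_second, open_connections).
traffic = [(5, 2), (7, 3), (6, 2), (480, 90), (510, 95), (495, 100)]

# Fixed starting centroids keep the sketch deterministic.
centroids, clusters = k_means(traffic, initial_centroids=[traffic[0], traffic[3]])

# Assumed heuristic: the cluster with the highest mean request rate is suspicious.
suspect_index = max(range(len(centroids)), key=lambda i: centroids[i][0])
suspected_dos_sources = clusters[suspect_index]
```

In practice the number of clusters, the features, and the initialization would all need tuning, and a flagged cluster would be a candidate for investigation, not proof of an attack.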
2.2. Anna L. Buczak et al. presented a survey of data mining and machine learning methods for cyber security intrusion detection:
This paper covers recent methods of achieving cybersecurity with the help of firewalls, antivirus software, and Intrusion Detection Systems (IDS). An IDS is concerned with security violations that originate both inside and outside the organization. The three types of cyber analytics for IDS, namely misuse-based, anomaly-based, and hybrid, are described thoroughly. Several approaches to cybersecurity are covered, including Artificial Neural Networks (ANNs), which deal with pattern recognition. Newer versions of ANNs, Bayesian networks, and clustering can support misuse detection, anomaly detection, and hybrid detection. The paper also describes the new and advanced learning methods needed to prevent the exploitation and inconsistency of data [3].
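The three detection styles can be contrasted in a few lines of Python. This is an illustrative sketch under stated assumptions, not a method from [3]: the signature strings, the request-size baseline, the z-score threshold of 3.0, and the function names are all hypothetical. Misuse detection matches known signatures, anomaly detection flags deviation from a learned normal profile, and the hybrid detector combines the two.

```python
from statistics import mean, stdev

# Hypothetical signature list for misuse (signature-based) detection.
KNOWN_BAD_PATTERNS = ("' OR '1'='1", "../../etc/passwd")

def misuse_alert(payload):
    """Misuse detection: flag payloads containing a known attack signature."""
    return any(pattern in payload for pattern in KNOWN_BAD_PATTERNS)

def anomaly_alert(baseline, observed, threshold=3.0):
    """Anomaly detection: flag values far from the normal baseline (z-score)."""
    mu, sigma = mean(baseline), stdev(baseline)
    return abs(observed - mu) / sigma > threshold

def hybrid_alert(payload, baseline, observed):
    """Hybrid detection: raise an alert if either detector fires."""
    return misuse_alert(payload) or anomaly_alert(baseline, observed)

# Request sizes in bytes, assumed to come from attack-free traffic.
normal_request_sizes = [510, 495, 505, 490, 500]

known_attack = misuse_alert("GET /index.php?id=' OR '1'='1")
unusual_size = anomaly_alert(normal_request_sizes, 900)
combined = hybrid_alert("GET /index.html", normal_request_sizes, 900)
```

The trade-off the sketch hints at is the usual one: the signature check cannot see attacks it has no pattern for, while the statistical check can flag novel behavior at the cost of also flagging unusual but legitimate traffic.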
2.3. Mr. G. Sivaselvan et al. presented a methodology on applying data mining techniques in cyber crimes:
This paper describes cybersecurity techniques and notes that data mining techniques are widely used in scientific research and in business, mostly to gather statistics and valuable information to improve customer relations and marketing strategies. Data mining is a technique that involves analyzing data, predicting upcoming trends, and making proactive, knowledge-driven decisions based on large datasets. Data mining is also termed Knowledge Discovery in Databases (KDD). KDD includes pre-processing, transformation, mining, and pattern evaluation, and it combines techniques from artificial intelligence and database systems. For detecting malicious activity, data mining detection methods such as anomaly detection, misuse detection, and hybrid detection are described in detail. An advantage of these methods is their ability to detect malicious software and to recognize both known and zero-day attacks [5].
2.5. Shivangi Gupta et al. presented a methodology on cyber security threat intelligence using data mining techniques and artificial intelligence:
This paper describes cybersecurity as a matter of serious interest for organizations, since the use of internet-connected information devices has its limits and opens gateways for cyber attackers. Every organization is likely to encounter an influx of attackers and cybercriminals, who target both large and small organizations in the public and private sectors. Large organizations in particular hold petabytes of data that must be kept secure. Cybersecurity protections are deployed against the threat of basic attacks, advanced attacks, and emerging attacks, and this development has created the need for dependable data security and systems. The survey includes statistical methods for testing threat intelligence based on data mining techniques. The authors add that the scope of the paper can be extended by using artificial intelligence algorithms together with data mining techniques to detect unknown patterns of threat [6].
2.6. U. U. Veerendra et al. presented a methodology on data mining for security applications:
This publication focuses on internet security techniques that bear on national security. Data mining is also being applied to provide solutions such as intrusion detection and auditing. The paper discusses threats to national security, which include attacks on buildings and the destruction of critical infrastructure such as power grids and telecommunication systems. To decide which mechanisms to apply to protect a country's security systems, one has to understand the types of threats, namely real-time and non-real-time threats, which the paper explains thoroughly. It also covers information-related terrorism, meaning cyber terrorism as well as security violations through access control and other means. Much crime is now committed through cyberspace, which is becoming a vast area for criminal business. To remedy this situation, the paper mentions several detection techniques related to this research and development. The authors state that they are refining the techniques they have developed to produce fewer false positives and false negatives, and that they are investigating the applicability of their techniques to distributed and pervasive environments [7].
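False positives and false negatives are usually reported as rates over a labelled evaluation set. The short Python sketch below shows one common way to compute them; the function name `error_rates` and the sample labels are assumptions for illustration and do not come from [7].

```python
def error_rates(actual, predicted):
    """Return (false_positive_rate, false_negative_rate) for binary labels.

    True means "attack"; False means "benign".
    """
    # Benign events wrongly reported as attacks.
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    # Real attacks that the detector missed.
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr

# Hypothetical ground truth and detector output for eight events.
actual    = [True, True, False, False, False, False, True, False]
predicted = [True, False, False, True, False, False, True, False]

fpr, fnr = error_rates(actual, predicted)
```

A detector can lower one rate at the expense of the other simply by moving its alert threshold, which is why both are normally reported together.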
2.7. Sudha Nagesh presented a methodology on roll of data mining in cyber security:
According to this paper, dedicated software is used for security purposes on very large data records. The techniques covered include anomaly detection, attack exposure, pattern discovery, clustering, representation, classification, and association rule discovery. Security is a major issue for the nation as well as for cyberspace. The threats to national security include attacks on buildings and the destruction of critical infrastructure such as power grids and telecommunication systems. The traditional approach to securing computer systems against cyber threats is to design mechanisms such as firewalls, authentication tools, and virtual private networks that create a protective shield. The discussion of intrusion detection covers cyber terrorism, insider threats, and external attacks. It also covers malicious attacks on infrastructure, both information-related and non-information-related. The main objective is to examine data mining and related data management technologies for detecting and preventing such infrastructure attacks [8].
2.8. S. Padmapriya et al. presented a methodology on enhanced cyber security for big data challenges:
In the current technological era, the growth of big data has left digital media more exposed to misuse. The paper deals with the use of large volumes of digital data for analysis, visualization, and drawing insights that help predict and prevent cyber attacks. Organizations carry a huge load of information, and handling this big data is a challenging task. Because the hacking of confidential data raises various privacy concerns, the paper addresses SQL injection, with the aim of cutting off database access for unauthorized users. It focuses on an advanced code-parsing technique for secure data access and correct user identification, prompted by the question of how big data can be protected. The review also identifies some probable solutions by which an organization can benefit from properly securing its data. Because the encrypted database transforms the original database, an attacker finds it hard to embed malicious code or to attack the data [9].
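To make the SQL injection risk concrete, the following Python sketch contrasts a vulnerable query built by string concatenation with a parameterized query. Note that parameterized queries are a standard defense swapped in here for illustration; they are not the code-parsing technique proposed in [9], whose details are not given in this survey. The table, the column, and the function names are hypothetical.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: user input is concatenated into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Safer: the driver binds the value, so it is never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

# In-memory database with two hypothetical users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "x' OR '1'='1"
leaked = find_user_unsafe(conn, malicious)   # the injected condition matches every row
blocked = find_user_safe(conn, malicious)    # treated as a literal name, matches nothing
```

With the unsafe function the attacker's input rewrites the WHERE clause, while the bound parameter is compared as an ordinary string and returns no rows.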
2.9. Huaglory Tianfield presented a methodology on data mining based cyber-attack detection:
Detecting cyber attacks has undoubtedly become a big data problem. This paper studies the use of data mining for cyber-attack detection. Cyber attacks are widespread, and the cybersecurity space is inherently dynamic: the latest attacks are more powerful than earlier ones, which creates an ongoing battle between attackers and defenders. Cybersecurity can be deployed to protect the system, and as attacks become more powerful, different cyber defense domains and various sources of security data are brought to bear. Cyber-attack detection involves the analysis of very large volumes of data, which becomes a serious problem if analytics cannot be applied credibly; false alarms are particularly problematic. The paper describes cybersecurity awareness as situational awareness applied to cybersecurity in information systems, which involves various methods. Data mining based cyber-attack detection comprises several stages. A detection system alerts the system administrator that an attack has occurred, using a number of techniques that can also terminate or contain attacks by closing network ports or killing processes. In tackling advanced cyber-attack problems, one of the technical challenges ahead is how the models of supervised and unsupervised learning, the models of cyber-attack detection, and the offline and online system modes can be integrated into an organic, robust cyber-attack detection system [10].
2.10. Muni raj Choopa presented a methodology on data mining and security in big data:
Data is growing quickly because of the digitization of the world. Organizations store this huge amount of data in their databases as big data, maintained on clusters that do not provide security and confidentiality for the stored data. Data accumulation is predicted to increase by around 650% over the next few years. Security issues in data mining must be kept in view, and the parties that deliver the results of data mining should take care not to reveal the identity of customers. The paper proposes multi-level security with a masking algorithm to secure information held as big data. Data mining is the process of extracting information, in which different users are involved stage by stage: the data provider, the data collector, the data miner, and the decision maker. The paper examines the security aspects of big data, where security matters both for protecting data against unauthorized use and for preserving the privacy of data held by the company. The paper also reports, in tabular form, the training time and the amount of data processed on a 300-node cluster. Its main concern is providing data security at every stage, from data acquisition at the source through to the final decision. The proposed multi-stage approach with masking aims to protect the data from security breaches and from access by unauthorized users. The paper notes that future research on this idea will pursue an improved algorithm [11].
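One simple way to keep customer identities hidden from later stages of the mining pipeline is to mask identifying fields before the data leaves the collector. The Python sketch below uses a keyed hash (HMAC-SHA256) for this purpose. It is an assumption-laden illustration and not the masking algorithm of [11], which is not specified in this survey; the key, the field names, and the helper functions are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key held only by the data collector; a real key must be kept secret.
SECRET_KEY = b"replace-with-a-real-secret"

def mask_identifier(value, key=SECRET_KEY):
    """Replace an identifier with a keyed hash so records stay linkable but not readable."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def mask_record(record, sensitive_fields=("customer_id", "email")):
    """Return a copy of the record with the sensitive string fields masked."""
    return {
        field: mask_identifier(value) if field in sensitive_fields else value
        for field, value in record.items()
    }

record = {"customer_id": "C-1001", "email": "user@example.com", "purchase_total": 42.5}
masked = mask_record(record)
```

Because the same input always maps to the same masked value under a given key, the data miner can still group records by customer without ever seeing the real identifier.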
3. CONCLUSION:
Cybersecurity plays an important role in today's cyber world. Because of the very large volumes of data involved, more security is needed to protect our data. Data mining can help secure data as far as possible. Both supervised and unsupervised data mining techniques are used with the help of machine learning, including the k-means algorithm for pattern finding, in which clusters represent known patterns of attack and outliers represent new attacks. These techniques provide reasonable security in the cyber world, but in the future, as the amount of data grows, more awareness is needed and new techniques must be invented to provide more security.
REFERENCES:
1. Sumeet Dua and Xian Du, “Data Mining and Machine Learning in Cybersecurity,” Auerbach Publications, Taylor and Francis Group, ISBN-13: 978-1-4398-3943-0, 2011.
2. Prof. Rukaiya Shaikh, Aman Memon, Manoj Kumar, and Ismaeil Pathan, “Preventing Cyber Crimes by Using Data Mining Techniques,” IJIRCCE, Vol. 5, Issue 10, October 2017.
3. Anna L. Buczak, Member, IEEE, and Erhan Guven, Member, IEEE, “A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection,” IEEE Communications Surveys and Tutorials, Vol. 18, No. 2, Second Quarter 2016.
4. Mr. G. Sivaselvan, Dr. V. Vennila, R. Senbagavalli, E. Shanmugapriya, K. Umadevi, and S. Suganthi, “Applying Data Mining Techniques in Cyber Crimes,” IJAREC, ISSN 2348-2079, Vol. 6, Issue 2.
5. Mrs. A. Meena, “Data Mining Techniques Used in Cyber Security,” International Journal on Future Revolution in Computer Science and Communication Engineering, ISSN 2454-4248, Vol. 4, Issue 11, pp. 19-21.
6. Shivangi Gupta, A. Sai Sabitha, and Ritu Punhani, “Cyber Security Threat Intelligence using Data Mining Techniques and Artificial Intelligence,” International Journal of Recent Technology and Engineering, ISSN 2277-3878, Vol. 8, Issue 3, September 2019.
7. U. U. Veerendra, B. Ravitheja, and K. Veeresh, “Data Mining For Security Applications,” International Journal of Recent Technology and Engineering, ISSN 2348-9510, Special Issue, pp. 38-44, 2017.
8. Sudha Nagesh, “Roll of Data Mining in Cyber Security,” Journal of Exclusive Management Science, ISSN 2277-5684, Vol. 2, Issue 5, May 2013.
9. S. Padmapriya, N. Partheeban, N. Kamal, A. Suresh, and S. Arun, “Enhanced Cyber Security for Big Data Challenges,” International Journal of Innovative Technology and Exploring Engineering, ISSN 2278-3075, Vol. 8, Issue 10, August 2019.
10. Huaglory Tianfield, “Data Mining Based Cyber-Attack Detection,” System Simulation Technology, ISSN 1673-1964, Vol. 13, No. 2, pp. 90-104, April 2017.
11. Muni raj Choopa, “Data mining and Security in Big data,” International Journal of Advanced Research in Computer Engineering and Technology, Vol. 4, Issue 3, pp. 1064-1069, March 2015.
12. https://www.researchgate.net/profile/Parashu_Pal2/publication/321528686/figure/fig1/ :631621552177191@1527601721595/Parameters-of-Cyber-Security-III-LITERATURE-REVIEW-In-2013-Preeti-Aggarwal-proposed.png , D.o.A.- 31-01-2020
13. https://www.google.com/search?q=architecture+of+data+mining+in+ethical+hacking+tools&tbm=isch&ved=2ahUKEwif9L7snOPnAhW4CrcAHVi1DxcQ2-cCegQIABAA&oq=architecture+of+data+mining+in+ethical+hacking+tools&gs_l=img.12...49697.69991..83884...1.0..0.357.9085.0j34j12j1......0....1..gws-wiz-img.......35i39j0i131j0j0i67j0i30j0i5i30j0i8i30j0i24.U2IlzondF0g&ei=1xpQXt-sEriV3LUP2Oq-uAE#imgrc=PpblD5VcmLfXcM, D.o.A.- 31-01-2020
Received on 18.10.2020 Accepted on 20.12.2020 © EnggResearch.net All Rights Reserved Int. J. Tech. 2020; 10(2):138-142. DOI: 10.5958/2231-3915.2020.00025.5