Sage Journals: Discover world-class research

Abstract

Computers and smart devices are now connected through communication networks, which makes them more vulnerable to attacks. Honeypots were proposed as deception tools, but they are usually used as part of a proactive defense strategy. Hence, this article demonstrates how honeypot data can be analyzed in an active defense strategy. Anomaly detection based on unsupervised machine learning makes it possible to build autonomous systems and to detect unknown anomalies without prior knowledge. However, the unsupervised techniques that have been applied to honeypot data analysis do not exploit the advantages of such data, particularly the high probability that it contains a large number of previously unseen anomalies with unexpected and diverse patterns. The present work therefore aims to improve unsupervised anomaly detection in honeypot data by varying both the data feature subset and the parameterization of the anomaly detection algorithm. To this end, an outlier ensemble with LOF (Local Outlier Factor) as the base algorithm is proposed. In the experiments, the ensemble outperforms existing solutions and achieves a detection rate higher than 92%.
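As a rough illustration of the idea in the abstract, the sketch below builds a small outlier ensemble in which each member runs LOF on a random feature subset with a randomly chosen neighbourhood size. It is not the authors' implementation: the brute-force LOF (which ignores distance ties), the averaging of member scores, the subset-size range, the synthetic data, and the names `lof_scores` and `lof_ensemble` are all assumptions made for illustration.

```python
import math
import random


def lof_scores(points, k):
    """Brute-force Local Outlier Factor (Euclidean distance, ties ignored)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]
    # k nearest neighbours of each point (excluding itself) and its k-distance
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])
    # local reachability density: inverse of the mean reachability distance
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(k / max(sum(reach), 1e-12))
    # LOF: mean ratio of the neighbours' density to the point's own density
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


def lof_ensemble(points, k_values, n_members, seed=0):
    """Average LOF scores over members that vary the feature subset and k.

    Averaging is an assumed combination function, chosen for simplicity.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    total = [0.0] * n
    for _ in range(n_members):
        # assumed subset size: between d // 2 and d - 1 features
        size = rng.randint(max(1, d // 2), max(1, d - 1))
        features = rng.sample(range(d), size)
        k = rng.choice(k_values)
        projected = [[p[f] for f in features] for p in points]
        for i, score in enumerate(lof_scores(projected, k)):
            total[i] += score
    return [t / n_members for t in total]


# Synthetic example: 30 "normal" records plus one record far from the rest.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(30)]
data.append([10.0, 10.0, 10.0, 10.0])
scores = lof_ensemble(data, k_values=[3, 5, 8], n_members=10)
print("index with highest ensemble score:",
      max(range(len(scores)), key=scores.__getitem__))
```

Scores well above 1 indicate points that are less dense than their neighbours. A real pipeline would first encode and scale the honeypot features, and would likely use an optimized LOF implementation rather than this quadratic-time version.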

Keywords

Outlier ensembles, network security, anomaly detection, honeypots

