
Hybrid Big Data Architecture for High-Speed Log Anomaly Detection

Natawut Nupairoj, Pittayut Tangsatjatham


Anomaly detection in network traffic can be very challenging, especially in environments with high-speed networks and many servers. In these environments, network traffic log data is usually large, arrives at high speed, and comes in various formats: the classic big data problem. This makes anomaly detection difficult because achieving good accuracy requires processing a large amount of data in real time. To solve this problem, this paper proposes a hybrid architecture for network traffic anomaly detection using popular big data frameworks, including Apache Spark and Apache Flume. To demonstrate the capabilities of our proposed solution, we implement SARIMA-based anomaly detection as a case study. The experimental results indicate that our proposed architecture enables effective anomaly detection with good accuracy in large-scale environments.
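
The abstract does not describe the architecture or the SARIMA model in detail, so the following is only a minimal Python sketch of the general hybrid (Lambda-style) idea, under stated assumptions. A batch step learns a seasonal baseline from historical log counts, and a speed step scores newly arriving counts against that baseline. A simple per-slot mean and standard deviation stands in for SARIMA here, and neither Apache Spark nor Apache Flume is used. The names `fit_batch_model` and `score_stream`, the 24-slot season, and the z-score threshold are all hypothetical choices for illustration, not taken from the paper.

```python
from collections import defaultdict
from statistics import mean, pstdev

SEASON = 24  # assumed seasonal period: 24 hourly slots per day


def fit_batch_model(history):
    """Batch step: per-slot mean and standard deviation of historical counts.

    `history` is a list of request counts, one per hour, oldest first.
    This seasonal baseline is a stand-in for SARIMA, not SARIMA itself.
    """
    slots = defaultdict(list)
    for i, count in enumerate(history):
        slots[i % SEASON].append(count)
    return {slot: (mean(values), pstdev(values)) for slot, values in slots.items()}


def score_stream(model, start_index, counts, threshold=3.0):
    """Speed step: flag incoming counts that deviate from the slot baseline."""
    anomalies = []
    for offset, count in enumerate(counts):
        i = start_index + offset
        mu, sigma = model[i % SEASON]
        if sigma > 0:
            z = abs(count - mu) / sigma
        else:
            # Zero variance in this slot: any difference counts as extreme.
            z = 0.0 if count == mu else float("inf")
        if z > threshold:
            anomalies.append((i, count))
    return anomalies


# Synthetic history (not real data): three days with a daily peak at hour 12.
day = [100 + (50 if h == 12 else 0) + (h % 3) for h in range(SEASON)]
history = [day[h] + d for d in range(3) for h in range(SEASON)]

model = fit_batch_model(history)

# Three new hourly counts arriving after the history; the last one is a spike.
incoming = [day[0] + 1, day[1] + 1, 500]
anomalies = score_stream(model, start_index=len(history), counts=incoming)
print(anomalies)
```

In a deployment along the lines the abstract suggests, the batch step would presumably run periodically over stored logs while the speed step consumes the live stream; how the paper actually divides this work is not stated in the abstract.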


Big data; Real-time; Log processing; Hybrid processing; Lambda architecture

Citation Format:
Natawut Nupairoj, Pittayut Tangsatjatham, "Hybrid Big Data Architecture for High-Speed Log Anomaly Detection," Journal of Internet Technology, vol. 18, no. 7, pp. 1681-1688, Dec. 2017.


Published by Executive Committee, Taiwan Academic Network, Ministry of Education, Taipei, Taiwan, R.O.C
JIT Editorial Office, Library and Information Center, National Dong Hwa University
No. 1, Sec. 2, Da Hsueh Rd. Shoufeng, Hualien 97401, Taiwan, R.O.C.
Tel: +886-3-931-7017  E-mail: