Telecom Operators face more complex operational challenges than ever before, which often translates into longer periods of incident investigation and recovery. The continuous proliferation of new services adds to network complexity, and 5G promises to bring further challenges to already hectic day-to-day operations. Network engineers are expected to manage, optimize, monitor, forecast and troubleshoot multi-layered, multi-technology and multi-vendor networks, while the market steadily demands higher reliability and shorter time to resolution. As operational complexity keeps growing, intelligent network visibility becomes a formidable challenge. Without a complete, correlated, multi-perspective network view, engineering teams lack the insights and guidance they need to ensure the network delivers optimal performance and meets customer expectations.
Over recent years, Machine Learning (ML) has promised to solve complex problems in the Telecommunication Operational domain much faster, more cheaply, and more accurately than the best human experts could. In fairness, Machine Learning applied correctly has the potential to predict, identify, and quickly isolate anomalies, yielding positive business outcomes. For example, ML applied to Network Observability has the potential to improve customer satisfaction by reducing Mean Time to Repair (MTTR) and to lower operational costs by offloading early incident detection and classification to machines rather than humans. This is possible now because of high-quality data sets, advances in computing power and the popularization of Machine Learning frameworks and libraries that simplify the data analysis process. However, despite the expectations, which are not always realistic, and the significant investment in the area, the adoption of Machine Learning in Network Operations remains immature. Proof of that is the modest presence of truly ML-powered applications in the Network Operations Centre (NOC). Most ML implementations are experimental efforts that face serious challenges when scaling to production, generalizing insights to other networks, or simply adapting as the network evolves. For more information, please see our paper “Why Machine Learning won’t take you to the promised land”.
The use case presented in this document was born from a real requirement: gaining better visibility into application and traffic behaviour in order to overcome a number of performance and operational challenges. The results are obtained by using Thingbook, an Ultra-Fast, Highly Scalable, Zero-touch, Multi-Perspective Anomaly Detection platform, alongside Smart Flow, a traffic flow and telemetry data acquisition, preparation and visualization engine created by Auben Networks. Consuming and analysing traffic flows and telemetry data typically yields many vital insights for Network Automation, enabling NOC engineers to rapidly identify problems and proactively spot trend changes by isolating and correlating multiple relevant anomalies that indicate potential threats and/or performance degradation. Both companies have jointly developed this initiative, which is commercially available under the name of ADN Smart-flow solution.
The dataset used for this experiment contains a 15-day sFlow capture from a tier 1 Telecom Operator Data Centre and includes a DoS attack carried out during this period. It serves as an example to validate Thingbook’s Multi-Perspective Anomaly Detection capabilities in detecting, correlating and isolating network-relevant anomalies. At its core, Multi-Perspective is achieved by deploying numerous Anomaly Detector Engines across the network as multivariate Sensors and correlating their output. Given its broad range of applicability, our joint solution should be able to detect, isolate and forecast the development of the attack as abnormal behaviour and track the Anomaly life cycle. Thanks to its capability to scale up to thousands of simultaneous Sensors, the solution presented in this document enables network engineers to correlate the information from all affected Sensors and isolate the Root Cause of the Anomaly in seconds. It is worth mentioning that, although Thingbook is not designed to be a DoS attack detector, the existence of the attack and of the preparation activities that preceded it remained unknown until Thingbook + Smart-Flow was run, which shows the valuable contribution Anomaly Detection can make to the CyberSec space.
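To make the idea of correlating several Sensors more concrete, the following is a minimal Python sketch. It is an illustration only and not Thingbook's algorithm, which is not described here: each hypothetical Sensor is reduced to a single metric series, anomalies are flagged with a simple modified z-score based on the median absolute deviation (a swapped-in textbook technique), and time slots flagged by several Sensors at once are reported together. The Sensor names, sample values and threshold are all assumptions made for the example.

```python
from statistics import median

# Hypothetical per-sensor metric series, one value per time slot.
# Names and values are invented for illustration only.
SENSORS = {
    "edge-router-pps": [100, 102, 98, 101, 99, 500, 103, 97],
    "firewall-syn-rate": [10, 11, 9, 10, 12, 90, 10, 11],
    "dns-latency-ms": [5.1, 4.9, 6.0, 5.2, 40.0, 5.0, 6.1, 4.8],
}


def mad_anomalies(series, threshold=3.5):
    """Return the set of indices whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), which are less distorted by outliers than the
    mean and standard deviation.
    """
    med = median(series)
    mad = median([abs(x - med) for x in series])
    if mad == 0:
        # A perfectly flat series gives no scale to compare against.
        return set()
    return {
        i for i, x in enumerate(series)
        if 0.6745 * abs(x - med) / mad > threshold
    }


def correlate(sensors, min_sensors=2):
    """Group anomalies by time slot across sensors.

    Returns {time_slot: [sensor names]} for every slot flagged by at
    least min_sensors sensors.
    """
    hits = {}
    for name, series in sensors.items():
        for i in mad_anomalies(series):
            hits.setdefault(i, []).append(name)
    return {
        i: sorted(names)
        for i, names in sorted(hits.items())
        if len(names) >= min_sensors
    }


if __name__ == "__main__":
    for slot, names in correlate(SENSORS).items():
        print(f"slot {slot}: anomalous on {', '.join(names)}")
```

In this toy data, the two traffic-related Sensors spike in the same time slot and are reported together, while the isolated latency spike on a single Sensor is filtered out unless `min_sensors` is lowered to 1. A production system would work on multivariate, streaming data at a far larger scale.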
See demo here
Access to the full document here