Every mobile operator’s nightmare: Excited fans are pouring into a 5G-hosted live stream of the World Cup finals. Three minutes into the match, the network collapses. Viewers are irate and social media is going crazy. Worst of all, there’s no end in sight as frantic efforts are made to isolate and address the issue.
What if, on the evening before the match, the operator had instead run a series of high-scale active assurance tests that used machine learning (ML), based on historical major events, to identify end-to-end streaming performance issues? Signature threshold violations would automatically trigger a workflow that deploys additional active test agents at key network interfaces along the end-to-end path to identify where the problem originates: a backhaul network, say. The traffic is re-routed to the secondary link and, within minutes, the required performance is restored. Eight hours later, the finals begin, and the operator’s network performs without a hitch.
We can all agree that the latter scenario is the type that brings sweet dreams. Knowledge-inspired confidence counts for everything during make-or-break moments when performance matters most.
As we have noted previously, finding issues before the customer experiences them is critical to ensuring Service Level Agreement (SLA) adherence, whether for low-latency network slices or any of 5G’s other new capabilities. Today, rapid problem identification and resolution are possible with the right balance of active assurance, ML, and automation.
Active assurance: 5G’s eyes and ears
Active assurance provides end-to-end visibility into complex 5G networks.
It proactively simulates users on the network by deploying active test software agents anywhere in the network—whether owned by the operator, a hyperscaler, or an enterprise. Large or small amounts of traffic can be generated that precisely emulate the traffic users would generate.
As in the World Cup example, more test agents can be deployed to find the root cause of an issue through network segmentation.
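To make the segmentation idea concrete, here is a minimal sketch in Python. It assumes each test agent reports one latency measurement per path segment and that each segment has a latency budget. The segment names, the numbers, and the names SegmentResult and find_faulty_segments are hypothetical illustrations, not part of any particular assurance product.

```python
from dataclasses import dataclass


@dataclass
class SegmentResult:
    """One active-test measurement for a single segment of the path (hypothetical shape)."""
    name: str
    latency_ms: float  # latency measured by the test agents on this segment
    budget_ms: float   # assumed latency budget for this segment


def find_faulty_segments(results):
    """Return the names of segments whose measured latency exceeds their budget."""
    return [r.name for r in results if r.latency_ms > r.budget_ms]


# Illustrative measurements only: the end-to-end path split into three segments.
path = [
    SegmentResult("radio access", latency_ms=8.0, budget_ms=10.0),
    SegmentResult("backhaul", latency_ms=45.0, budget_ms=15.0),
    SegmentResult("core", latency_ms=5.0, budget_ms=10.0),
]

faulty = find_faulty_segments(path)
print(faulty)
```

An end-to-end test alone would only show that the total latency is too high; testing each segment separately is what points to the backhaul in this made-up data.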
AI/ML: rapid intelligence and insights
ML is the workhorse of artificial intelligence (AI), creating the neural nets and prediction models that are then fed into AI systems. AI is the arbiter of the data and takes industry requirements and other factors into consideration to draw conclusions and initiate actions.
By using history to learn performance thresholds for each part of a network—and keep learning as conditions change—ML enables the detection of network issues with amazing speed and accuracy.
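As a rough illustration of learning a threshold from history, the Python sketch below uses a simple statistical baseline (mean plus a multiple of the standard deviation over a sliding window) as a stand-in for a real ML model. The class name LearnedThreshold, the window size, the multiplier k, and the sample values are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


class LearnedThreshold:
    """Learns an upper limit for one metric from its recent history.

    A simple statistical stand-in for an ML model: the limit is
    mean + k * standard deviation over a sliding window of samples.
    """

    def __init__(self, window=100, k=3.0):
        # deque(maxlen=...) drops the oldest sample automatically, so the
        # limit keeps adapting as network conditions change.
        self.samples = deque(maxlen=window)
        self.k = k

    def update(self, value):
        """Add a new measurement to the history."""
        self.samples.append(value)

    def threshold(self):
        """Return the current limit, or None until there are at least two samples."""
        if len(self.samples) < 2:
            return None
        return mean(self.samples) + self.k * stdev(self.samples)

    def is_violation(self, value):
        """True if the value exceeds the learned limit."""
        limit = self.threshold()
        return limit is not None and value > limit


# Illustrative latency history (ms) for one part of the network.
backhaul = LearnedThreshold(window=100, k=3.0)
for sample in [20.0, 22.0, 19.0, 21.0, 20.0, 23.0, 18.0, 21.0]:
    backhaul.update(sample)

print(backhaul.is_violation(22.0))  # within the normal range for this history
print(backhaul.is_violation(40.0))  # well above anything seen so far
```

One instance per network segment and per metric gives each part of the network its own limit, rather than one static threshold for everything.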
Automation: execution speed
Why are active test and AI/ML so important for 5G? Some say that 5G is 50 times more complex than previous generation networks. This should not be surprising, as network functions, network slices, encryption, edge devices, applications and infrastructure are all in a continuous state of change.
Continuous integration/continuous delivery (CI/CD) methods mean a constant stream of updates to the production network. These updates need to be validated before they go live. As network resources are dynamically instantiated in the network, they also need to be tested.
Fall behind and the network becomes a chaotic free-for-all that can’t deliver the high quality and performance that emerging use cases require. Keeping pace with manual processes alone would be impossible.
Automated assurance: a move toward self-healing networks
Traditional (manual) methods for performance monitoring, root cause analysis, and SLA management are no match for the complexity of 5G and the volume of traffic going through it.
Using AI and ML-triggered workflows, coupled with network topology information, active assurance systems work with the network orchestrator to insert active test agents wherever and whenever they are needed in the network, to:
Perform validation testing of newly activated functions, network slices and infrastructure (before, during and after they go live).
Proactively and continuously monitor critical links and services from an end-user perspective.
Troubleshoot any issues by deploying end-to-end and segment-level tests to isolate underlying root causes.
Confirm fixes with change management use cases, which completes the assurance cycle: activate, monitor, isolate fault, and validate change.
AI/ML, together with the ease of automatically deploying and redeploying agents where they are needed, leads to faster issue resolution and avoidance of customer-impacting issues. It is a major step toward closed-loop orchestration and self-healing networks.
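The assurance cycle described above (activate, monitor, isolate fault, validate change) can be pictured as a small state machine. The Python sketch below is one possible reading of that cycle. The transition rules, the Stage and next_stage names, and the sequence of test outcomes are assumptions made for illustration, not a description of any specific orchestrator.

```python
from enum import Enum


class Stage(Enum):
    ACTIVATE = "activate"
    MONITOR = "monitor"
    ISOLATE_FAULT = "isolate fault"
    VALIDATE_CHANGE = "validate change"


def next_stage(stage, test_passed):
    """Assumed transition rules for the assurance cycle."""
    if stage is Stage.ISOLATE_FAULT:
        # Once the fault is isolated and a fix applied, the change is validated.
        return Stage.VALIDATE_CHANGE
    # After activation, monitoring, or validation: a passing test leads to
    # (continued) monitoring, a failing test triggers fault isolation.
    return Stage.MONITOR if test_passed else Stage.ISOLATE_FAULT


# Illustrative run: activation passes, monitoring passes once, then a test
# fails, the fault is isolated, and the fix is validated successfully.
outcomes = [True, True, False, True, True]
stage = Stage.ACTIVATE
trace = [stage]
for passed in outcomes:
    stage = next_stage(stage, passed)
    trace.append(stage)

print(" -> ".join(s.value for s in trace))
```

In a closed-loop setup, each transition would be driven by automated test results instead of a hard-coded list of outcomes.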