Cloud-native networking originated from the realization by leading telcos and networking vendors that they could learn from the success of the web-scale world and apply those lessons to their own business and network evolution. Whether the project is a 5G core, edge computing, a SASE deployment or an enterprise digital transformation, operators are embracing cloud-native principles in their network designs. The networking industry, in turn, is delivering hyper-converged infrastructure, cloud-native network functions (CNFs), service mesh layers, and SDN-enabled network orchestration and automation solutions to enable cloud-era networks.
Cloud-native principles and how they apply to cloud-native networks
With applications, services and microservices popping up across distributed domains, you need highly scalable, resilient, agile and secure network connectivity to deliver an outstanding end-user experience and realize the promise of cloud-native networking.
The following are three key principles to keep in mind with cloud-native networking:
1. Small, stateless microservices
“Big iron” appliances and virtual appliances, which were monolithic in nature, are being transformed into a microservices architecture. A CNF is now made up of software components that are decoupled (little to no dependencies), function-specific and smaller (in code size and compute footprint).
Where possible, these discrete functions are stateless (i.e., the container hosting the microservice doesn’t store any state itself but keeps it in a common/shared database or storage) and immutable (i.e., once the container is launched you do not modify it). These characteristics make scaling and upgrading/downgrading much faster and more efficient, because when a service has an issue there is no need to restore it to its last known state.
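To make the stateless idea concrete, here is a minimal Python sketch. It is an illustration only: `SharedStore` and `SessionCounter` are hypothetical names, and the in-memory dictionary stands in for whatever shared database or cache a real CNF would use.

```python
from dataclasses import dataclass, field


@dataclass
class SharedStore:
    """Stand-in for an external database or cache shared by all instances."""
    data: dict = field(default_factory=dict)

    def get(self, key, default=0):
        return self.data.get(key, default)

    def put(self, key, value):
        self.data[key] = value


class SessionCounter:
    """A stateless worker: it keeps a handle to the store, but no session state."""

    def __init__(self, store: SharedStore):
        self.store = store

    def handle_packet(self, session_id: str) -> int:
        # Read-modify-write against the shared store. A real store would
        # need an atomic increment to be safe with concurrent workers.
        count = self.store.get(session_id) + 1
        self.store.put(session_id, count)
        return count


store = SharedStore()
worker_a = SessionCounter(store)
worker_a.handle_packet("session-1")
worker_a.handle_packet("session-1")

# Simulate a failure: discard the instance and launch a fresh one.
del worker_a
worker_b = SessionCounter(store)
print(worker_b.handle_packet("session-1"))  # prints 3
```

Because the replacement worker reads the count from the shared store, nothing has to be restored when an instance is lost, which is the property that makes scaling and upgrades fast.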
In addition, a microservice architecture gives you the flexibility to scale only the service(s) that need scaling, rather than the entire application as you would with a monolith. Because the smaller components are deployed in containers, scaling in and out can also be much faster than with VMs.
Upgrades work similarly: you launch a new container instance with the updated software and turn off the old instance, for a seamless process with no downtime. Stateful services must address the consistency and portability of state, which typically requires replicating it across one or more containers while making sure the state stays consistent during scaling and upgrading.
2. DevOps and automation
To enable agility and incremental innovation, cloud-native networks need to adopt the DevOps model.
As discussed earlier, cloud-native network functions built on a stateless microservices architecture tend to comprise a substantially larger number of components. Furthermore, each microservice is designed to scale in and out independently, which leads to multiple instances of each microservice being deployed to handle the load of a CNF.
This means instantiating a CNF can require deploying dozens or even hundreds of containers. It is infeasible to carry out such deployments manually, so CNF deployment is automated through orchestration. Likewise, orchestration is required to automate operations such as scaling the different microservices and healing failed instances, because these would be too complex and onerous to perform manually.
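The scaling and healing decisions described above can be sketched as a small planning function. This is a simplified illustration, not how any particular orchestrator is implemented; `plan_actions`, the action names and the microservice names are all hypothetical.

```python
from collections import Counter


def plan_actions(desired, running):
    """Return the actions that move each microservice to its desired replica count.

    desired: {service_name: replica_count}
    running: list of (service_name, healthy) tuples, one per container
    """
    healthy = Counter(name for name, ok in running if ok)
    failed = Counter(name for name, ok in running if not ok)
    actions = []
    for name in sorted(set(desired) | set(healthy) | set(failed)):
        # Healing: failed containers are always removed.
        if failed[name]:
            actions.append(("remove_failed", name, failed[name]))
        # Scaling: start or stop healthy instances to close the gap.
        gap = desired.get(name, 0) - healthy[name]
        if gap > 0:
            actions.append(("start", name, gap))
        elif gap < 0:
            actions.append(("stop", name, -gap))
    return actions


# Hypothetical CNF with three microservices.
desired = {"session-mgr": 3, "packet-fwd": 4, "policy": 2}
running = [
    ("session-mgr", True), ("session-mgr", True), ("session-mgr", False),
    ("packet-fwd", True), ("packet-fwd", True),
    ("policy", True), ("policy", True), ("policy", True),
]
for action in plan_actions(desired, running):
    print(action)
```

Even this toy CNF yields four separate actions from one pass. With dozens of microservices and hundreds of containers, running such a loop continuously is only practical when it is automated.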
Typically, network functions offer hundreds or thousands of configuration options manipulated by CLIs and APIs. Configuration of traditional networks is defined procedurally by following a sequence of steps towards a desired configuration state.
By contrast, the cloud-native approach to deployment and configuration is declarative. The desired deployment and configuration are described in full in a structured document (for example, YAML manifests or Helm charts) and made available to the container orchestrator. A declarative approach to configuration provides far greater control over changes, reduces the likelihood of bad configuration being injected into the network, and greatly speeds up recovery because it can be automated in the DevOps pipeline.
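As a rough illustration of the declarative model, the sketch below computes the change set between a desired state and the current state instead of replaying a sequence of steps. Plain Python dictionaries stand in for the structured document, and the function names and configuration keys are hypothetical.

```python
def diff_config(desired: dict, actual: dict) -> dict:
    """Compute the change set that turns `actual` into `desired`."""
    return {
        "add": {k: desired[k] for k in desired.keys() - actual.keys()},
        "update": {
            k: desired[k]
            for k in desired.keys() & actual.keys()
            if desired[k] != actual[k]
        },
        "remove": sorted(actual.keys() - desired.keys()),
    }


def apply_changes(actual: dict, changes: dict) -> dict:
    """Return a new configuration with the change set applied."""
    result = {k: v for k, v in actual.items() if k not in changes["remove"]}
    result.update(changes["add"])
    result.update(changes["update"])
    return result


# Hypothetical desired and current settings for one network function.
desired = {"mtu": 9000, "replicas": 3, "log_level": "info"}
actual = {"mtu": 1500, "replicas": 3, "debug_port": 8080}

changes = diff_config(desired, actual)
print(changes)

converged = apply_changes(actual, changes)
# Applying the same desired state a second time produces no further changes.
print(diff_config(desired, converged))
```

The second diff is empty, which is what makes recovery easy to automate: reapplying the same document always converges on the same state, regardless of how the current configuration drifted.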
Another key to a robust DevOps practice is having well-defined APIs that can be used to interconnect different microservices, as well as CNFs from different vendors, to create end-to-end service chains.
3. Platform agnostic deployments
At the end of the day, telcos and enterprises are looking for solutions that can be deployed anywhere, without vendor lock-in. Theoretically, a cloud-native container architecture delivers the abstraction needed to run applications and networks in a “platform agnostic” way: directly on COTS hardware, on-premises or in the public cloud.
Key challenges faced by cloud-native network operators
Cloud-native networking is clearly a promising transformation, but as with any innovation it is not without its challenges. Here are some of the key challenges that cloud-native network operators are facing:
Dealing with legacy baggage – The reality is that not everything can be cloud-native overnight. Many operators are dealing with legacy virtual networks and network functions as they transition. In addition, many cloud-native network functions are not yet sufficiently decoupled into microservices, which makes scaling and operationalizing the network tedious.
Many moving parts, many points of failure – A platform-agnostic journey means everything is disaggregated, enabling changes to the network on demand. This flexibility offers network operators many options for compute, NICs, container orchestrators (OpenShift, Tanzu, Kubernetes) and managed cloud services (AWS EKS, GCP GKE, Azure AKS), along with CNF options from multiple vendors, and each option is a potential point of failure. Even a simple upgrade to a newer CPU, or to a new version of a Kubernetes CNI or Linux distribution, can have unexpected consequences on performance. With so many moving parts, troubleshooting and root cause analysis become much more difficult.
Impact of elastic scaling – Cloud-native infrastructures enable dynamic scaling of workloads according to load and demand. The total lifetime of a workload shrinks from months or years to minutes or hours. This presents a new set of challenges in rightsizing cloud infrastructures and optimizing cost while minimizing disruptions to end-user quality of experience (QoE) as cloud-native networks and services are deployed. In addition, network functions in cloud-native environments share the underlying infrastructure with other services, so ensuring the performance of CNFs and services also means characterizing the impact of multi-tenancy (“noisy neighbors”) in order to compensate for it.
Built-in security – The network security paradigm is changing, too. The agile nature of containerized environments means that the middle boxes that are servicing or securing the environment face a set of unique challenges. They now need to handle the elastic, dynamic nature of containers at a much faster and larger scale since cloud-native applications tend to have hundreds of services associated with them. Network operators need to ensure security and application policies automatically scale in and out with application assets as soon as they are created, tracking all changes until that resource no longer exists. DevSecOps integrates parts of security into CI/CD pipelines, encouraging teams to bring security into the development phase. In addition, there is a need to characterize the impact of these policies on performance and user experience.
Exercising resiliency – Issues are inevitable in this dynamic environment, so it is essential to architect a distributed microservice solution and to ensure that the cloud platform or container orchestration engine can detect and mitigate infrastructure issues. Microservices may restart, scale out, and even be redistributed to a different node. We can’t simply assume they will work; instead, we need to characterize the impact of microservice impairments (e.g., pod failures, node reboots, network latency, CPU spikes) on CNFs. The goal is to emulate real-world failures during preproduction to proactively uncover issues before CNFs are deployed into production networks. This in turn makes it possible to set the correct thresholds at which resiliency actions take effect.
Keeping up with constant change – Lastly, the rate of change in a cloud-native environment is ten times greater than in legacy networks. Vendors delivering cloud platforms, NFVI and CNFs are continuously bringing new innovations to market, and at much faster rates than ever before. Network operators need to validate and optimize their infrastructure components to understand the impact of the changes and updates they are incorporating. They do this by comparing baseline performance (prior to the update) with new performance (after the update), which supports robust continuous integration, continuous delivery and continuous testing (CI/CD/CT) practices.
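The baseline-versus-update comparison described under “Keeping up with constant change” can be sketched as a simple regression check. This is a minimal illustration under stated assumptions: the metric names, the sample values and the 5% tolerance are all hypothetical, and metrics are assumed to be positive numbers.

```python
def find_regressions(baseline, candidate, tolerance=0.05,
                     lower_is_better=("latency_ms",)):
    """Return metrics where `candidate` is worse than `baseline` by more than `tolerance`.

    The result maps each regressed metric to its percentage change.
    """
    regressions = {}
    for metric, base in baseline.items():
        change = (candidate[metric] - base) / base
        # For latency-style metrics an increase is bad;
        # for throughput-style metrics a decrease is bad.
        worse_by = change if metric in lower_is_better else -change
        if worse_by > tolerance:
            regressions[metric] = round(change * 100, 1)
    return regressions


# Hypothetical measurements before and after an infrastructure update.
baseline = {"throughput_gbps": 40.0, "connections_per_sec": 200000, "latency_ms": 2.0}
after_update = {"throughput_gbps": 39.0, "connections_per_sec": 170000, "latency_ms": 2.5}

print(find_regressions(baseline, after_update))
# prints {'connections_per_sec': -15.0, 'latency_ms': 25.0}
```

In a CI/CD/CT pipeline, a non-empty result from a check like this could fail the stage, so an update that degrades performance is caught before it reaches production.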
Understanding the impact of change in cloud-native environments
Implementing proactive testing and continuous validation is key to overcoming the above-mentioned challenges. CyberFlood is built to deliver scalable, realistic application workload emulation across on-prem, VM, public cloud and container environments, and to help optimize the performance, user experience and security of cloud-native networking solutions.
The following are some of the key use cases CyberFlood helps address to enable cloud-native network readiness:
Right-size cloud infrastructures and cloud instances, and optimize them with the right NIC drivers, Linux distributions and CNIs, by testing the performance of various Kubernetes implementations such as OpenShift, AWS EKS and others.
Validate performance and scale of Ingress Controllers by emulating external-to-cluster (north-south) traffic.
Benchmark performance and elastic scalability of NGFW, WAF and ELB CNFs with real application workloads.
Validate complex distributed, hybrid, containerized deployments of CNFs.
Validate security efficacy and the impact of security policies on performance and user experience.
Provide comprehensive API support for seamless integration into any CI/CD pipeline, enabling automated, repeatable testing at any deployment phase of a new application or service.