Service meshes: Heard of them?
By now, you may have. Service meshes are becoming an increasingly important part of the container conversation. This article offers a brief overview of what service meshes do, then dives into what they mean for your enterprise’s security.
What Does a Service Mesh Do (And Why Does It Matter)?
The Connection Problem
To understand why service meshes exist, you have to start by thinking about network connections within container environments.
Consider what happens when you’re running an application that’s native to the cloud. If it’s of any size and complexity at all, it will typically consist of a large number of individual services which must be coordinated so that they will operate together as efficiently as if they were components of a monolithic desktop application.
Multiply this by the number of instances of each service operating at any given time, and the variations in the state and availability of those instances, and it isn’t hard to see how the simple act of connecting one service to another as required could turn into a nightmarish combinatorial problem.
Orchestration is Fundamental
The fact that cloud-native applications don’t collapse into chaos or freeze up from internal logjams is due in part to orchestration tools such as Kubernetes, which organize services and instances into neatly manageable and addressable units so that they can be found and accessed in a systematic manner.
These orchestration tools are a bit like a housing developer that lays out the streets and builds the homes in a new neighborhood—they set up the framework and the traffic routes, but for the most part, it isn’t their job to manage the details of traffic within the neighborhood.
Managing the Traffic
That’s where service meshes come in. When a service needs to make a request to another service, the service mesh provides a standardized interface which makes it possible for this to happen, and it manages the process.
Service meshes such as Istio and Linkerd typically act as a proxy for requests and other traffic between microservices, take care of service discovery, and perform a variety of related tasks, including ingress, egress, load balancing and failure handling. When the mesh receives a request for a service, it finds an available instance of that service which fits a configurable set of rules (covering such things as location and version) and routes traffic between the requesting service and the target service.
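To make the routing step concrete, here is a minimal Python sketch of rule-based instance selection. It is not how Istio or Linkerd are implemented; the `Instance` fields, the `REGISTRY` contents and the round-robin choice are all assumptions made for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Instance:
    """One running copy of a service (all fields are illustrative)."""
    service: str
    address: str
    version: str
    zone: str
    healthy: bool = True


def matching_instances(registry, service, **rules):
    """Return healthy instances of `service` that satisfy every rule.

    Each rule is an attribute name and the value it must equal,
    for example version="v2" or zone="us-east".
    """
    return [
        inst
        for inst in registry
        if inst.service == service
        and inst.healthy
        and all(getattr(inst, key) == value for key, value in rules.items())
    ]


def make_router(registry, service, **rules):
    """Return an iterator that hands out matching instances round-robin."""
    candidates = matching_instances(registry, service, **rules)
    if not candidates:
        raise LookupError(f"no healthy instance of {service!r} matches {rules}")
    return cycle(candidates)


# Hypothetical registry: the caller only ever names "payments".
REGISTRY = [
    Instance("payments", "10.0.0.1:8080", version="v1", zone="us-east"),
    Instance("payments", "10.0.0.2:8080", version="v2", zone="us-east"),
    Instance("payments", "10.0.0.3:8080", version="v2", zone="us-west", healthy=False),
    Instance("payments", "10.0.0.4:8080", version="v2", zone="us-east"),
]

if __name__ == "__main__":
    router = make_router(REGISTRY, "payments", version="v2")
    for _ in range(3):
        print(next(router).address)
```

The requesting code never sees an address until the router hands one over, and the unhealthy `us-west` instance is skipped without the caller knowing it exists.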
Heavy Lifting
This means that you can move service discovery and most tasks associated with it out of your application design and code (as well as infrastructure scripting), and let the service mesh handle them. The requesting service only needs to make its request using an abstract identifier for the target service; the service mesh will take care of the rest.
A service mesh may handle much more than this, of course, including tracing, metrics, encryption, authentication and other performance- and security-related tasks. Istio and Linkerd can be used together, integrating the strongest features of both packages for optimum management of microservice-related traffic.
Service Meshes and Security
What does all of this mean for security at the enterprise level?
Do the security and overall traffic management features of platforms such as Istio and Linkerd provide adequate protection? Or, conversely, do they present new attack surfaces and new opportunities for backdoor attacks?
The truth is that any new element of control infrastructure is likely to do a little of both. In the case of service meshes, features such as ingress/egress management, proxying and encryption add security-related elements to the system. At the same time, the mere fact that these platforms manage traffic and access, and are trusted by the application and other infrastructure elements, makes them tempting targets for exploits.
The overall effect of a service mesh is to provide some hardening at the perimeter (i.e., ingress rules) of your application, and to create efficient channels for traffic within that perimeter. In terms of enterprise security, this means that you need to be concerned about at least two (and possibly more) potential routes of attack:
Getting Past the Perimeter
What happens if an intruder gets past the service mesh’s basic perimeter defenses and is able to compromise even one instance of one service? If that service sends a request to or responds to a request from the service mesh, the intruder may be able to inject a malicious payload into the system, taking advantage of the service mesh’s efficient traffic management to deliver the payload to a maximum number of potential targets. If the service mesh trusts a service to be what it appears to be, and the application trusts the service mesh to pass non-malicious data between services, any malicious actor that can present itself as a valid service can take advantage of that trust.
In practice, of course, platforms such as Istio and Linkerd do include features for maintaining secure traffic, including TLS authentication; Istio’s Role-Based Access Control (RBAC) provides flexible, customizable control of access at multiple levels. Intruders who get past these defenses, however, may still be able to move within the system and do damage.
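The idea behind this kind of access control can be sketched in a few lines of Python. This is a generic default-deny check, not Istio's RBAC API or configuration format; the `Rule` class, the `POLICY` list and the service names are hypothetical, and the sketch assumes the caller's identity has already been verified (for example, from a TLS client certificate).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Permit `source` to call `target` with any of `methods` (illustrative)."""
    source: str         # verified identity of the calling service
    target: str         # service being called
    methods: frozenset  # HTTP methods the caller may use


def is_allowed(rules, source, target, method):
    """Default deny: a call passes only if some rule explicitly permits it."""
    return any(
        rule.source == source and rule.target == target and method in rule.methods
        for rule in rules
    )


# Hypothetical policy for a small shop application.
POLICY = [
    Rule("frontend", "orders", frozenset({"GET", "POST"})),
    Rule("orders", "payments", frozenset({"POST"})),
]

if __name__ == "__main__":
    print(is_allowed(POLICY, "orders", "payments", "POST"))    # permitted by the second rule
    print(is_allowed(POLICY, "frontend", "payments", "POST"))  # no rule, so denied
```

Because nothing is permitted unless a rule says so, a compromised `frontend` instance cannot reach `payments` directly. It can still abuse every call that `frontend` is legitimately allowed to make, which is the limit the paragraph above describes.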
Attacking the Service Mesh Infrastructure
A service mesh platform, like any other element of contemporary cloud-based infrastructure, is code, and it is as vulnerable to attack as any other kind of code. For an intruder, the most tempting attack surfaces might be the rules governing discovery and routing—if a request can be re-routed to an outside location, the entire system may be compromised.
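One way to watch for this kind of re-routing is to audit the routing table for destinations outside the cluster. The Python sketch below assumes a simplified table that maps a route name to a `host:port` string; the `ROUTES` data and the `.svc.cluster.local` suffix are illustrative assumptions rather than the format of any particular mesh, and IPv6 literals are not handled.

```python
def external_routes(routes, internal_suffixes):
    """Return, sorted, the names of routes whose host is not internal.

    `routes` maps a route name to a "host:port" destination.
    `internal_suffixes` lists the DNS suffixes treated as inside the
    cluster; each should start with a dot so that a look-alike host
    such as "evilsvc.cluster.local" cannot match by accident.
    """
    suffixes = tuple(suffix.lower() for suffix in internal_suffixes)
    flagged = []
    for name, destination in sorted(routes.items()):
        host = destination.partition(":")[0].lower()  # drop the optional port
        if not host.endswith(suffixes):
            flagged.append(name)
    return flagged


# Hypothetical routing table with two tampered entries.
ROUTES = {
    "payments": "payments.prod.svc.cluster.local:8080",
    "orders": "orders.attacker.example:443",
    "billing": "billing.svc.cluster.local.attacker.example:443",
}
INTERNAL = (".svc.cluster.local",)

if __name__ == "__main__":
    print(external_routes(ROUTES, INTERNAL))
```

Matching on the end of the host name matters here: the `billing` entry contains the internal suffix in the middle of the name, but it still resolves to an outside domain and is flagged.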
Attacks at other points may be possible. Ingress, egress, proxying and even features such as load balancing might turn out to present previously undetected points of entry. The bottom line is that the more control an element of infrastructure has over the application and the system as a whole, the more tempting it is as a target of attack, and the more closely it must be watched.
Defending Against Attack
What’s the best strategy for dealing with security in relation to service meshes? The good news is that if you are using Twistlock or a similar first-rate modern security service, you are already following the best strategy.
Strong perimeter defenses such as whitelists work with the defenses provided by the service mesh itself, further hardening your application against intrusion. Internal anomaly detection provides an even stronger defense; any out-of-the-ordinary behavior within the program can trigger an automatic response. Network security monitoring can detect and neutralize attacks on the service mesh infrastructure itself.
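As a rough illustration of internal anomaly detection, the Python sketch below learns which service-to-service calls are normal and flags anything else. It is a toy model under stated assumptions, not a description of how Twistlock or any other product works; the function names and the call data are invented for the example.

```python
from collections import Counter


def learn_baseline(calls):
    """Record every (source, target) pair seen during normal operation."""
    return set(calls)


def unexpected_calls(baseline, calls):
    """Count the observed calls whose (source, target) pair is not in the baseline."""
    return Counter(call for call in calls if call not in baseline)


# Hypothetical traffic: a training window, then live traffic.
TRAINING = [
    ("frontend", "orders"),
    ("orders", "payments"),
    ("frontend", "orders"),
]
LIVE = [
    ("frontend", "orders"),
    ("orders", "metrics-export"),  # never seen during training
    ("orders", "metrics-export"),
]

if __name__ == "__main__":
    baseline = learn_baseline(TRAINING)
    for (source, target), count in unexpected_calls(baseline, LIVE).items():
        print(f"unexpected call {source} -> {target} seen {count} time(s)")
```

A real system would also weigh volume, timing and payload, but even this simple allow-list of call pairs would surface a compromised `orders` instance that starts talking to a destination it has never used before.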
In a world of cloud-based, containerized applications, service meshes are indispensable tools for enterprise computing. Used in combination with a full-featured, enterprise-level security service like Twistlock, they do not need to, and will not, compromise your organization’s data security.