A sample cloud IDS solution (part 3 in a series)
At the beginning of this series, I discussed the rationale behind the need for an IDS and for implementing it using a lift-and-shift visibility strategy. The last post introduced a cloud-native visibility architecture that leverages the public cloud to provide a centralized management service and uses Docker containers as the primitive building block for the agents. In this finale of the cloud IDS series, I will give an overview of the logical components in an IDS, followed by an approach to adapting it for the cloud. I will then show an example of composing a cloud IDS solution out of CloudLens Public, Snort and ELK.
- Cloud-native network visibility architecture and docker containers help you compose a loosely coupled IDS solution.
- The solution lets you easily adapt to different runtimes and orchestrators.
- The solution lets you easily adapt to different sensors and managed event aggregation services.
- The solution offers the opportunity to aggregate different event sources, which allows analysis requiring complex data joins.
- More commercial IDS or packet tool vendors should offer à la carte Docker images for their sensor engines to ease composing a custom stack.
- The code for the sample solution is open source and available for play: https://github.com/OpenIxia/sample-cloud-ids
An IDS stack can be logically sliced into three layers: Sensors, Event Aggregation and Presentation.
The Sensors layer is responsible for collecting network traffic coming to or leaving a network interface. As packets are collected, they are fed to the analysis engine to compare against configured patterns of suspicious activities. The engine will generate events when there are matches. There is usually more than one sensor in the network to scale with load and to minimize blind spots.
The Event Aggregation layer provides persistent storage of detected events from all the sensors. It also provides a query API for exploration and analysis of the data.
The Presentation layer offers an interface to a domain-specific abstraction and interpretation of the events from the Event Aggregation layer. The interface can include near real-time dashboards as well as batched reports, both built by interacting with the query API in the Event Aggregation layer.
Below is a diagram depicting the logical layers and flows.
Most IDS solutions’ components can be mapped into these logical layers.
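To make the three layers concrete, here is a minimal sketch in Python. It is not how Snort or any particular IDS is implemented; the class names, the rule names and the byte patterns are all made up for illustration. The point is only that the sensor emits events, the aggregation layer stores them behind a query API, and the presentation layer talks to nothing but that API.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    sensor_id: str
    rule: str
    src_ip: str


class Sensor:
    """Sensors layer: compares each packet payload against configured patterns."""

    def __init__(self, sensor_id, patterns):
        self.sensor_id = sensor_id
        self.patterns = patterns  # rule name -> byte pattern (hypothetical rules)

    def inspect(self, src_ip, payload):
        # One event per pattern that appears in the payload.
        return [
            Event(self.sensor_id, rule, src_ip)
            for rule, pattern in self.patterns.items()
            if pattern in payload
        ]


class EventStore:
    """Event Aggregation layer: keeps events from all sensors, offers a query API."""

    def __init__(self):
        self._events = []

    def ingest(self, events):
        self._events.extend(events)

    def query(self, **filters):
        # Return events whose attributes equal every given filter value.
        return [
            event for event in self._events
            if all(getattr(event, key) == value for key, value in filters.items())
        ]


def alerts_per_rule(store):
    """Presentation layer: a tiny report built only from the query API."""
    return Counter(event.rule for event in store.query())


patterns = {"sql-injection": b"' OR 1=1", "path-traversal": b"../"}
store = EventStore()
for sensor_id, src_ip, payload in [
    ("sensor-a", "10.0.0.5", b"GET /item?id=1' OR 1=1"),
    ("sensor-b", "10.0.0.9", b"GET /static/../../etc/passwd"),
    ("sensor-b", "10.0.0.5", b"GET /index.html"),
]:
    store.ingest(Sensor(sensor_id, patterns).inspect(src_ip, payload))

report = alerts_per_rule(store)
print(dict(report))
```

Because the report only depends on `query`, the store behind it could be swapped for another backend without touching the presentation code, which is the loose coupling the rest of this post relies on.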
There are many open source and commercial IDS offerings. Snort is an open source, rules-based detection engine that performs real-time traffic analysis, logging and alerting. As a component in the Sensors layer, Snort relies on other tools to provide event ingestion, event storage and a UI.
Below is one possible configuration of a full-stack Snort implementation. The user interface is Snorby. The event aggregation layer is the Snorby database, which can use Postgres or MySQL as the underlying engine. Snort writes detected events in its unified file format, while a separate process called ‘barnyard2’ reads events from the output file and ingests them into the Snorby database. Handing event persistence to ‘barnyard2’ keeps that work off Snort's packet inspection path, which improves inspection performance.
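The offloading pattern itself is easy to sketch. The Python below is not barnyard2 and does not read Snort's unified format (which is binary); it assumes a stand-in spool file of JSON lines and an in-memory SQLite table, both hypothetical, to show how the sensor only appends while a separate spooler remembers its offset and does the database work.

```python
import json
import os
import sqlite3
import tempfile


def append_alert(spool_path, alert):
    """Sensor side: a cheap append, no database work on the packet path."""
    with open(spool_path, "a", encoding="utf-8") as spool:
        spool.write(json.dumps(alert) + "\n")


def drain_spool(spool_path, db, offset):
    """Spooler side: ingest records written since `offset`, return the new offset."""
    with open(spool_path, "rb") as spool:
        spool.seek(offset)
        while True:
            line = spool.readline()
            if not line.endswith(b"\n"):  # end of file or a half-written record
                break
            alert = json.loads(line)
            db.execute(
                "INSERT INTO events (rule, src_ip) VALUES (?, ?)",
                (alert["rule"], alert["src_ip"]),
            )
            offset = spool.tell()  # only advance past fully ingested records
    db.commit()
    return offset


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (rule TEXT, src_ip TEXT)")
spool_path = os.path.join(tempfile.mkdtemp(), "alerts.spool")

append_alert(spool_path, {"rule": "sql-injection", "src_ip": "10.0.0.5"})
offset = drain_spool(spool_path, db, 0)

append_alert(spool_path, {"rule": "path-traversal", "src_ip": "10.0.0.9"})
offset = drain_spool(spool_path, db, offset)

rows = db.execute("SELECT rule, src_ip FROM events ORDER BY rowid").fetchall()
print(rows)
```

Tracking the offset means the spooler can be restarted or fall behind without the sensor noticing, which is the same decoupling a log forwarder gives you later in this post.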
For other open source or commercial IDS stacks, you will find that specific tools have been written to complement the IDS sensor and to support different event persistence options. As I mentioned in an earlier post, you might end up using two different IDS stacks for different workloads. The different IDS stacks are likely to come with disjoint sets of technologies. As your use cases and stacks grow, so does the pain of operating and maintaining the systems. An analyst will likely need to master a different UI and API for each IDS stack, which makes it challenging to join events from a variety of sources when doing complex analysis.
I will now go through an example scenario as shown below. We have a web application workload that happens to have vulnerabilities. To monitor whether attackers are exploiting the vulnerabilities, I will use a cloud-native network visibility service to get the packets hitting the application workload to my IDS tool. I will compose a custom IDS stack using Snort as the sensor engine, with a generic event aggregation engine and UI for exploring alerts.
The composed solution relies heavily on Docker Engine and Docker Compose.
Components are containerized as Docker images, which allows us to maintain loose coupling among cooperating processes. Docker Compose is used when we need to orchestrate a group of containers together to achieve a logical function or service. In this scenario, we have three such groups: a monitored workload, a monitoring sensor, and an event aggregation engine with its UI, each orchestrated via ‘docker-compose’.
Using Docker as the primitive building block is nice because it lets you run the solution on your laptop, on a physical or virtual server, or on assets in the public cloud. You can easily adapt to other container orchestrators such as Kubernetes (where a Compose YAML file can be remapped to Kubernetes Pods and other resources).
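As a rough illustration of that remapping, the Python sketch below turns a Compose-style service map into a Kubernetes Pod manifest, one container per service. The service names, image names and environment variable are hypothetical placeholders, not the contents of the sample repository, and a real migration would also have to deal with volumes, networking and capabilities that this sketch ignores.

```python
import json


def compose_to_pod(pod_name, compose):
    """Map each Compose service to one container in a single Pod manifest."""
    containers = []
    for service_name, service in compose["services"].items():
        container = {"name": service_name, "image": service["image"]}
        environment = service.get("environment", {})
        if environment:
            # Kubernetes expects a list of {name, value} pairs with string values.
            container["env"] = [
                {"name": key, "value": str(value)}
                for key, value in environment.items()
            ]
        containers.append(container)
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"containers": containers},
    }


# Hypothetical Compose content, already parsed into a dict.
compose = {
    "services": {
        "snort": {"image": "example/snort:latest"},
        "cloudlens-agent": {
            "image": "example/cloudlens-agent:latest",
            "environment": {"API_KEY": "YOUR_PROJECT_API_KEY"},
        },
        "filebeat": {"image": "example/filebeat:latest"},
    }
}

pod = compose_to_pod("ids-sensor", compose)
print(json.dumps(pod, indent=2))
```

Containers in one Pod share a network namespace, which is why a group of cooperating sidecars maps naturally onto a single Pod.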
All the source code you need to bring this example scenario up is in GitHub.
You need to sign up for a CloudLens account. CloudLens is an implementation of the cloud-native network visibility architecture described in the last post. CloudLens is a paid service, but there is a free trial. Once the trial expires, the cost for a small number of instances is basically negligible. After you log in to the CloudLens portal, create a CloudLens project and copy the project’s API key enclosed in the red box below.
There are ample setup guides out there that show you how to install the different tools you need to form your IDS implementation, so I will not spend too much time on that. Instead, let’s look at how to adapt Snort for the cloud. We need to feed Snort with packets from cloud infrastructure in the cloud-native way introduced in the last post, and to take advantage of existing tools that make it easier for an analyst to query disparate event sources.
A lot of people are well versed in using various Linux command line utilities to analyze log files from IDS sensors. There are also many mature tools today that aggregate log events and aid in data exploration and visualization. Instead of relying on Linux utilities or a UI like Snorby, I will use an ELK stack (Elasticsearch, Logstash, and Kibana) to aggregate the events generated by the logical IDS sensor app into a search base.
For ease of demo, we will run ELK as a single container spanning the event aggregation and presentation layers. Logstash is responsible for ingesting alerts forwarded by lightweight forwarders running alongside each sensor. Logstash can optionally do the heavier work of transforming and normalizing the events before submitting them to Elasticsearch. Elasticsearch is your search base: it holds the indices and provides an API to query and search over your data. Kibana is a user interface that helps you explore, visualize and discover data held in Elasticsearch.
To get this up and running:
- Follow the instructions in the events_ui directory of the git repository. You should be able to bring this up on your laptop.
- Once up, make a note of the IP or hostname of the host where you launched this service.
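Once the container is up, the same query API that Kibana uses is available to scripts. The Python sketch below builds a standard Elasticsearch bool/range query for recent alerts from one source address. The field name `src_ip` is an assumption about how the alerts end up being indexed, and the `search` helper is defined but not called here because it needs a reachable Elasticsearch endpoint.

```python
import json
import urllib.request


def build_alert_query(src_ip, window="15m", size=20):
    """Build a query body: newest alerts from `src_ip` within the last `window`."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"src_ip": src_ip}},  # assumed field name
                    {"range": {"@timestamp": {"gte": f"now-{window}"}}},
                ]
            }
        },
    }


def search(base_url, index_pattern, body):
    """POST the query to the _search endpoint and return the decoded response."""
    request = urllib.request.Request(
        f"{base_url}/{index_pattern}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


body = build_alert_query("10.0.0.5")
print(json.dumps(body, indent=2))
```

If the container publishes Elasticsearch on its default port, a call might look like `search("http://YOUR_DOCKER_HOSTNAME_IP:9200", "logstash-*", body)`; both the published port and the `logstash-*` index pattern are assumptions to check against the events_ui directory.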
Now that the event sink is taken care of, we will focus on bringing up a sensor in the Sensors layer. In the diagram, we have a Logical IDS Sensor App block. The main components of this application are orchestrated via docker-compose.yml in https://github.com/OpenIxia/sample-cloud-ids/sensor:
- Snort container: this is the main application container that runs the Snort analysis engine
- CloudLens Public agent container: this is a sidecar that terminates the tunnels for monitored packets and delivers them to an interface that Snort can collect from.
- Filebeat container: this is a sidecar log forwarder to pump data into Logstash in the ELK container.
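To give a feel for what travels along the Snort, Filebeat and Logstash path, here is a small Python sketch that turns one text alert line into a structured event, the kind of normalization that would otherwise be done in Logstash. It assumes Snort's single-line "fast" alert layout; whether the sample repository uses that output mode is an assumption, and the sample line and the output field names are made up for illustration.

```python
import json
import re

# Assumed layout of a single-line ("fast") Snort alert.
ALERT_RE = re.compile(
    r"^(?P<timestamp>\S+)\s+\[\*\*\]\s+"
    r"\[(?P<gid>\d+):(?P<sid>\d+):(?P<rev>\d+)\]\s+"
    r"(?P<message>.*?)\s+\[\*\*\]\s+"
    r"(?:\[Classification:\s*(?P<classification>[^\]]+)\]\s+)?"
    r"\[Priority:\s*(?P<priority>\d+)\]\s+"
    r"\{(?P<protocol>\w+)\}\s+"
    r"(?P<src>\S+)\s+->\s+(?P<dst>\S+)$"
)


def split_endpoint(endpoint):
    """Split 'ip:port' into (ip, port); port is None when absent (e.g. ICMP)."""
    host, separator, port = endpoint.rpartition(":")
    return (host, int(port)) if separator else (endpoint, None)


def parse_fast_alert(line):
    """Return a dict for a recognized alert line, or None if it does not match."""
    match = ALERT_RE.match(line.strip())
    if match is None:
        return None
    src_ip, src_port = split_endpoint(match["src"])
    dst_ip, dst_port = split_endpoint(match["dst"])
    return {
        "timestamp": match["timestamp"],
        "sid": int(match["sid"]),
        "message": match["message"],
        "classification": match["classification"],
        "priority": int(match["priority"]),
        "protocol": match["protocol"],
        "src_ip": src_ip,
        "src_port": src_port,
        "dst_ip": dst_ip,
        "dst_port": dst_port,
    }


sample = (
    "08/28-12:34:56.789012  [**] [1:1000001:1] Possible SQL injection [**] "
    "[Classification: Web Application Attack] [Priority: 1] "
    "{TCP} 10.0.0.5:51234 -> 10.0.0.2:80"
)
event = parse_fast_alert(sample)
print(json.dumps(event, indent=2))
```

Keeping this parsing out of the Snort container is deliberate: the sensor only writes lines, the forwarder only ships them, and the field extraction can change without rebuilding either one.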
To bring this layer up:
- Make sure you have your CloudLens project API key and your ELK hostname or IP from the previous steps in hand.
- Follow the instructions in the sensor directory of the git repository. Again, you should be able to run this on your laptop if it is a docker host.
- Go to the CloudLens portal and click into your project. Verify that you have instances, then define a tool group as shown below.
- Once you click ‘Define a Group’, you will see the screen below. Follow the steps. The CloudLens management portal knows that this instance is a sample Snort application because we tagged it (see the docker-compose.yml file).
Launch the sample source workload
To test this out, we need to have a sample source workload and have a CloudLens agent running with it to forward the monitored traffic.
To bring up the sample workload:
- Make sure you have your CloudLens project API key.
- Follow the instructions in the app directory of the git repository to start up a vulnerable application.
- Verify that your app containers are running by running ‘docker-compose ps’.
- Have a look at the vulnerable web app by opening a browser to the IP address of your docker host. If you click on links in green, you are hitting the endpoint that is vulnerable. If you click on links in red, you are exploiting the vulnerable endpoint.
- As in the CloudLens tool group creation earlier, create an instance group following the steps below. The CloudLens management portal knows that this instance is a vulnerable app because, as part of the docker-compose definition that brings up the CloudLens agent, we explicitly instructed it to send a custom metadata tag named ‘workload’ with the value ‘sample_vulnerable_app’.
- To get traffic pumping, you need to create a connection between the source and the tool in the CloudLens portal. Check that you have two instances, one in each group. Drag and drop from the example_workloads group to the sample_snort_sensors group.
- After the connection is created, go back to your vulnerable web app, and click around the vulnerable links and exploit links.
- Access the Kibana dashboard by pointing your browser to http://YOUR_DOCKER_HOSTNAME_IP:5601
- You can explore the events by issuing searches.
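The tag-driven grouping and the source-to-tool connection used in the steps above can be modeled in a few lines. This Python sketch is a toy model, not the CloudLens API: the instance names are invented, and the tag value on the sensor is a hypothetical stand-in for whatever the sensor's docker-compose.yml actually sends. Only the ‘workload’ tag on the app and the two group names come from the walkthrough.

```python
def members(instances, tag_filter):
    """Instances whose tags match every key/value pair in the group's filter."""
    return [
        name for name, tags in instances.items()
        if all(tags.get(key) == value for key, value in tag_filter.items())
    ]


def forwarding_pairs(instances, groups, connections):
    """Expand group-to-group connections into (source, tool) instance pairs."""
    return [
        (source, tool)
        for source_group, tool_group in connections
        for source in members(instances, groups[source_group])
        for tool in members(instances, groups[tool_group])
    ]


instances = {
    "app-host": {"workload": "sample_vulnerable_app"},
    "sensor-host": {"workload": "sample_snort_sensor"},  # hypothetical tag value
}
groups = {
    "example_workloads": {"workload": "sample_vulnerable_app"},
    "sample_snort_sensors": {"workload": "sample_snort_sensor"},
}
connections = [("example_workloads", "sample_snort_sensors")]

pairs = forwarding_pairs(instances, groups, connections)
print(pairs)
```

Because membership is computed from tags rather than listed by hand, a newly launched instance carrying the same tag would join its group, and the existing connection, without any further configuration.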
With a cloud-native network visibility service, tapping packets is simple regardless of where your workload runs. And since we’ve dockerized our components, you can easily swap in different components and run them in various environments, including environments outside the cloud (the sample given can be up and running on your laptop). You can add other types of IDS or any other packet analysis sensors in place of Snort and still have a single analysis backend.
The solution also gives you the flexibility to minimize the toil of operating infrastructure if you choose at some point to replace the ELK container with a managed Elasticsearch or Splunk service. If you are the developer, the ops engineer and the analyst all at once, this solution helps you spend more of your time wearing the developer and analyst hats. As an analyst, you can easily do data joins across disparate event sources instead of limiting yourself to a single IDS tool. You can also focus more on improving and managing the detection rules and algorithms that are relevant for your workloads.
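As a small illustration of such a join, the Python sketch below enriches IDS alerts with records from a second event source, keyed on source IP. Both data sets and all field names are invented for the example; in practice the two inputs would be query results from the shared backend.

```python
from collections import defaultdict


def join_by_source_ip(alerts, access_logs):
    """Attach to each alert the request paths seen from the same source IP."""
    paths_by_ip = defaultdict(list)
    for entry in access_logs:
        paths_by_ip[entry["src_ip"]].append(entry["path"])
    return [
        {**alert, "paths": paths_by_ip.get(alert["src_ip"], [])}
        for alert in alerts
    ]


# Hypothetical events from an IDS sensor.
alerts = [
    {"rule": "sql-injection", "src_ip": "10.0.0.5"},
    {"rule": "port-scan", "src_ip": "10.0.0.77"},
]
# Hypothetical events from a second source, such as web access logs.
access_logs = [
    {"src_ip": "10.0.0.5", "path": "/login"},
    {"src_ip": "10.0.0.5", "path": "/item?id=1"},
    {"src_ip": "10.0.0.9", "path": "/index.html"},
]

joined = join_by_source_ip(alerts, access_logs)
for row in joined:
    print(row)
```

With every sensor writing to the same backend, this kind of correlation becomes one query or script instead of an export from two separate tools.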
The advantage of using Docker containers as the building block is that you are free to manage the containers with various orchestration tools, without heavy coding. The technique used to group a logical application together using Docker Compose is very similar to how you would group a set of cooperating containers together in a Kubernetes Pod. Horizontal scaling of sensors is achieved through the native scaling primitives available in the environment where you deploy your sensors. For example, if you create an Amazon AMI that contains Docker Engine, docker-compose and the sensor, you can scale in and out using an AWS Auto Scaling group. On Kubernetes, you can couple a HorizontalPodAutoscaler with a Deployment resource over the sensor Pod.
The Sensors layer of most open source packet analysis tools can be isolated and dockerized without heavy source code modifications. For those of you using commercial stacks, it’s probably worthwhile to ask your vendor whether they offer a Docker image for their sensors in an à la carte fashion, as opposed to the full stack and lift-and-shift architecture they tend to push you toward.
If you are interested in seeing other examples of how flexible a cloud-native network visibility architecture is, have a look at Kris Raney's threat hunt at scale series, where he shows how to bring up Bro on top of Kubernetes using CloudLens.
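For a sense of how the Kubernetes option decides on a replica count, the Python sketch below applies the core ratio that the HorizontalPodAutoscaler documentation describes: desired replicas is the ceiling of current replicas times current metric over target metric. It is a simplification that leaves out the autoscaler's tolerance band and stabilization behavior, and the metric (packets per second per sensor) and the replica bounds are assumptions chosen for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Core HPA ratio, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical metric: thousands of packets per second handled by each sensor,
# with a target of 60 per sensor.
print(desired_replicas(2, 90, 60))   # load above target: scale out
print(desired_replicas(4, 20, 60))   # load below target: scale in
print(desired_replicas(5, 300, 60))  # capped by max_replicas
```

An AWS Auto Scaling group with a target tracking policy follows the same idea of steering a per-instance metric toward a target value.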