awslabs/aws-vpc-flow-log-appender

Name: aws-vpc-flow-log-appender

Owner: Amazon Web Services - Labs

Description: Sample code to append additional information (e.g. Security Group IDs and geolocation data) to VPC Flow Logs for analysis in Elasticsearch.

Created: 2017-04-20 17:19:37.0

Updated: 2018-01-15 05:58:39.0

Pushed: 2017-11-03 20:14:39.0

Homepage: null

Size: 102

Language: JavaScript


README

aws-vpc-flow-log-appender

aws-vpc-flow-log-appender is a sample project that enriches AWS VPC Flow Log data with additional information, primarily the Security Groups associated with the instances to which requests are flowing.
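
As a rough illustration of what this enrichment involves, here is a minimal sketch that looks up the Security Groups attached to a flow log record's ENI using the AWS SDK for JavaScript; the function name and record shape are hypothetical and this is not the project's actual code:

    // Sketch: look up the Security Groups attached to the ENI referenced by a
    // flow log record and attach their IDs to the record (shapes are illustrative).
    const AWS = require('aws-sdk');
    const ec2 = new AWS.EC2();

    async function appendSecurityGroups(record) {
      // record.interfaceId is the ENI from the flow log line, e.g. "eni-1a2b3c4d"
      const result = await ec2.describeNetworkInterfaces({
        NetworkInterfaceIds: [record.interfaceId]
      }).promise();

      const groups = result.NetworkInterfaces.length > 0
        ? result.NetworkInterfaces[0].Groups // [{ GroupId, GroupName }, ...]
        : [];

      return Object.assign({}, record, {
        securityGroupIds: groups.map((g) => g.GroupId)
      });
    }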

This project makes use of several AWS services, including Amazon Elasticsearch Service, AWS Lambda, and Amazon Kinesis Data Firehose. These must be set up and configured in the proper sequence for the sample to work as expected. Here, we describe deployment of the Lambda components only; for details on deploying and configuring the other services, please see the accompanying blog post.
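
For context on where Lambda sits in this pipeline, the following is a minimal sketch of a Kinesis Firehose data-transformation handler (decode each record, enrich it, and return it marked Ok), assuming the enrichment function is wired into Firehose as a transformation; it is illustrative rather than the sample's actual implementation:

    // Sketch of a Kinesis Firehose transformation handler: decode each record,
    // enrich it (e.g. via appendSecurityGroups above), and hand it back to Firehose.
    exports.handler = async (event) => {
      const records = await Promise.all(event.records.map(async (r) => {
        const flowLogLine = Buffer.from(r.data, 'base64').toString('utf8');

        // ... parse and enrich the line here ...
        const enriched = flowLogLine; // placeholder for the enriched payload

        return {
          recordId: r.recordId,
          result: 'Ok', // or 'Dropped' / 'ProcessingFailed'
          data: Buffer.from(enriched, 'utf8').toString('base64')
        };
      }));

      return { records };
    };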

The following diagram is a representation of the AWS services and components involved in this sample:

[Diagram: VPC Flow Log Appender services]

NOTE: This project makes use of a free geolocation service (http://freegeoip.net/) that enforces a limit of 15,000 requests per hour. It is not intended for use in a production environment. We recommend using a commercial source of IP geolocation data if you wish to run this code in such an environment.
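
For illustration only, a lookup against that service might look like the sketch below; the endpoint path and response fields (country_code, city, latitude, longitude) are assumptions about freegeoip's JSON API, and a commercial provider would replace this in production:

    // Sketch: fetch geolocation data for a source IP from freegeoip.net.
    // The endpoint and response fields are assumptions; swap in your own
    // geolocation provider for production use.
    const http = require('http');

    function lookupGeo(ip) {
      return new Promise((resolve, reject) => {
        http.get(`http://freegeoip.net/json/${ip}`, (res) => {
          let body = '';
          res.on('data', (chunk) => { body += chunk; });
          res.on('end', () => {
            try {
              const geo = JSON.parse(body);
              resolve({
                countryCode: geo.country_code,
                city: geo.city,
                location: { lat: geo.latitude, lon: geo.longitude }
              });
            } catch (err) {
              reject(err);
            }
          });
        }).on('error', reject);
      });
    }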

Getting Started

To get started, clone this repository locally:

git clone https://github.com/awslabs/aws-vpc-flow-log-appender

The repository contains CloudFormation templates and source code to deploy and run the sample application.

Prerequisites

To run the vpc-flow-log-appender sample, you will need to:

  1. Select an AWS Region into which you will deploy services. Be sure that all required services (AWS Lambda, Amazon Elasticsearch Service, Amazon CloudWatch, and Amazon Kinesis Data Firehose) are available in the Region you select.
  2. Confirm your installation of the latest AWS CLI (at least version 1.11.21).
  3. Confirm the AWS CLI is properly configured with credentials that have administrator access to your AWS account.
  4. Install Node.js and NPM.
Preparing to Deploy Lambda

Before deploying the sample, install several dependencies using NPM:

cd vpc-flow-log-appender/decorator
npm install
cd ../ingestor
npm install
cd ..
Deploy Lambda Functions

The deployment of our AWS resources is managed by a CloudFormation template using the AWS Serverless Application Model (SAM).

  1. Create a new S3 bucket from which to deploy the source code (ensure that the bucket is created in the same AWS Region in which your network and services will be deployed):

    aws s3 mb s3://<MY_BUCKET_NAME>
    
  2. Using the Serverless Application Model, package your source code and serverless stack:

    aws cloudformation package --template-file app-sam.yaml --s3-bucket <MY_BUCKET_NAME> --output-template-file app-sam-output.yaml
    
  3. Once packaging is complete, deploy the stack:

    aws cloudformation deploy --template-file app-sam-output.yaml --stack-name vpc-flow-log-appender-dev --capabilities CAPABILITY_IAM
    
  4. Once we have deployed our Lambda functions, we need to return to CloudWatch and configure VPC Flow Logs to stream their data to the Lambda function; one way to wire this up is sketched below. (TODO: add more detail)
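
One way to wire this up is a CloudWatch Logs subscription filter on the flow logs log group, pointing at the ingest function. The sketch below uses the AWS SDK for JavaScript; the log group name, function name, Region, and ARNs are placeholders, and the accompanying blog post remains the authoritative setup guide:

    // Sketch: subscribe a Lambda function to the VPC Flow Logs log group.
    // All names and ARNs below are placeholders for your own resources.
    const AWS = require('aws-sdk');
    const logs = new AWS.CloudWatchLogs({ region: 'us-east-1' });
    const lambda = new AWS.Lambda({ region: 'us-east-1' });

    async function subscribeFlowLogs() {
      // Allow CloudWatch Logs to invoke the ingest function.
      await lambda.addPermission({
        FunctionName: '<INGEST_FUNCTION_NAME>',
        StatementId: 'AllowCWLInvoke',
        Action: 'lambda:InvokeFunction',
        Principal: 'logs.us-east-1.amazonaws.com',
        SourceArn: 'arn:aws:logs:us-east-1:<ACCOUNT_ID>:log-group:<FLOW_LOG_GROUP>:*'
      }).promise();

      // Stream every log event in the flow log group to the function.
      await logs.putSubscriptionFilter({
        logGroupName: '<FLOW_LOG_GROUP>',
        filterName: 'vpc-flow-logs-to-lambda',
        filterPattern: '', // empty pattern = all events
        destinationArn: 'arn:aws:lambda:us-east-1:<ACCOUNT_ID>:function:<INGEST_FUNCTION_NAME>'
      }).promise();
    }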

Testing

In addition to running aws-vpc-flow-log-appender using live VPC Flow Log data from your own environment, we can also use the Kinesis Data Generator (KDG) to send mock flow log data to our Kinesis Firehose delivery stream.

To get started, review the Kinesis Data Generator Help and use the included CloudFormation template to create necessary resources.

When ready:

  1. Navigate to your Kinesis Data Generator and log in.

  2. Select the Region to which you deployed aws-vpc-flow-log-appender and select the appropriate Stream (e.g. “VPCFlowLogsToElasticSearch”). Set Records per Second to 50.

  3. Next, we will use the AWS CLI to retrieve several values specific to your AWS account that are needed to generate realistic VPC Flow Log data:

    # ACCOUNT_ID
    aws sts get-caller-identity --query 'Account'

    # ENI_ID (e.g. "eni-1a2b3c4d")
    aws ec2 describe-instances --query 'Reservations[0].Instances[0].NetworkInterfaces[0].NetworkInterfaceId'
    
  4. Finally, we can build a template for KDG using the following. Be sure to replace <<ACCOUNT_ID>> and <<ENI_ID>> with the values you captured in step 3 (do not include quotes). The fields mirror the default VPC Flow Log record format; see the parsing sketch after this list.

    2 <<ACCOUNT_ID>> <<ENI_ID>> {{internet.ip}} 10.100.2.48 45928 6379 6 {{random.number(1)}} {{random.number(600)}} 1493070293 1493070332 ACCEPT OK
    
  5. Returning to KDG, copy and paste the mock VPC Flow Log data into Template 1, then click the “Send data” button.

  6. Stop KDG after a few seconds by clicking “Stop” in the popup.

  7. After a few minutes, check CloudWatch Logs and your Elasticsearch cluster for data.
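
For reference, the mock record in step 4 above follows the default (version 2) VPC Flow Log format. The sketch below (illustrative, not part of the sample) maps a space-delimited flow log line onto named fields:

    // Sketch: split a default-format VPC Flow Log line into named fields.
    const FIELDS = [
      'version', 'accountId', 'interfaceId', 'srcAddr', 'dstAddr',
      'srcPort', 'dstPort', 'protocol', 'packets', 'bytes',
      'start', 'end', 'action', 'logStatus'
    ];

    function parseFlowLog(line) {
      const values = line.trim().split(/\s+/);
      return FIELDS.reduce((record, field, i) => {
        record[field] = values[i];
        return record;
      }, {});
    }

    // Example:
    // parseFlowLog('2 123456789010 eni-1a2b3c4d 203.0.113.12 10.100.2.48 45928 6379 6 1 600 1493070293 1493070332 ACCEPT OK')
    // => { version: '2', accountId: '123456789010', interfaceId: 'eni-1a2b3c4d', ... }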

A few notes on the above test procedure:

Cleaning Up

To clean up the Lambda functions when you are finished with this sample:

aws cloudformation delete-stack --stack-name vpc-flow-log-appender-dev
Updates
Authors
