awslabs/lambda-refarch-fileprocessing

Name: lambda-refarch-fileprocessing

Owner: Amazon Web Services - Labs

Owner: AWS Samples

Description: Serverless Reference Architecture for Real-time File Processing

Created: 2015-09-16 20:48:06.0

Updated: 2017-12-30 00:31:59.0

Pushed: 2017-05-19 16:18:22.0

Homepage:

Size: 55

Language: JavaScript

README

Serverless Reference Architecture: Real-time File Processing

README Languages: DE | ES | FR | IT | JP | KR | PT | RU | CN | TW

The Real-time File Processing reference architecture is a general-purpose, event-driven, parallel data processing architecture that uses AWS Lambda. It is well suited to workloads that need more than one data derivative of an object. The architecture is described in this diagram and in the “Fanout S3 Event Notifications to Multiple Endpoints” post on the AWS Compute Blog. The sample application uses Lambda to convert Markdown files to HTML and plain text.
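As a rough illustration of the fan-out pattern, the sketch below lists the S3 objects referenced by an event that a processor function receives. It assumes the S3 notification is delivered through an SNS topic (as the blog post title suggests), so the original S3 event arrives as a JSON string in `Sns.Message`. The function name `extractS3Objects` is hypothetical and is not taken from the repository.

```javascript
// Sketch: list the S3 objects referenced by an event delivered to a Lambda function.
// Assumption: S3 notifications are fanned out through an SNS topic, so each SNS
// record carries the original S3 event as a JSON string in Sns.Message.
function extractS3Objects(event) {
  const objects = [];
  for (const record of event.Records || []) {
    // A direct S3 invocation puts the details on the record itself;
    // SNS delivery wraps them one level down.
    const s3Records = record.Sns
      ? JSON.parse(record.Sns.Message).Records || []
      : [record];
    for (const s3Record of s3Records) {
      if (!s3Record.s3) continue; // skip messages that carry no object details
      objects.push({
        bucket: s3Record.s3.bucket.name,
        // S3 URL-encodes object keys and uses "+" for spaces.
        key: decodeURIComponent(s3Record.s3.object.key.replace(/\+/g, " ")),
      });
    }
  }
  return objects;
}
```

Because every subscriber to the topic receives the same message, each processor function can run this step independently and then produce its own derivative of the object.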

Running the Example

You can use the provided AWS CloudFormation template to launch a stack that demonstrates the Lambda file processing reference architecture. Details about the resources created by this template are provided in the CloudFormation Template Resources section of this document.

Important: Because the AWS CloudFormation stack name is used in the names of the Amazon Simple Storage Service (Amazon S3) buckets, the stack name must contain only lowercase letters. The provided CloudFormation template retrieves its Lambda code from a bucket in the us-east-1 region. To launch this sample in another region, modify the template and upload the Lambda code to a bucket in that region.
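A stack name can be checked before launching. The following is a minimal sketch under an assumed rule: the name starts with a lowercase letter and otherwise contains only lowercase letters, digits, and hyphens, which matches the sample name `lambda-file-processing` used in this document. The helper name `isSafeStackName` and the exact pattern are illustrative and are not taken from the template.

```javascript
// Sketch: reject stack names that would produce invalid S3 bucket names.
// Assumed rule (not taken from the template): a lowercase letter first,
// then lowercase letters, digits, or hyphens only.
function isSafeStackName(name) {
  return /^[a-z][a-z0-9-]*$/.test(name);
}
```

For example, `isSafeStackName("lambda-file-processing")` passes, while a name containing uppercase letters does not.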

Choose Launch Stack to launch the template in the us-east-1 region in your account:

Launch Lambda File Processing into North Virginia with CloudFormation

Alternatively, you can use the following command to launch the stack using the AWS CLI. This assumes you have already installed the AWS CLI.

aws cloudformation create-stack \
--stack-name lambda-file-processing \
--template-url https://s3.amazonaws.com/awslambda-reference-architectures/file-processing/lambda_file_processing.template \
--capabilities CAPABILITY_IAM

Testing the Example

After you have created the stack using the CloudFormation template, you can test the system by uploading a Markdown file to the InputBucket that was created in the stack. You can use this README.md file in the repository as an example file. After the file has been uploaded, you can see the resulting HTML and plain text files in the output bucket of your stack. You can also view the CloudWatch logs for each of the functions in order to see the details of their execution.

You can use the following commands to copy a sample file from the provided S3 bucket into the input bucket of your stack.

BUCKET=$(aws cloudformation describe-stack-resource --stack-name lambda-file-processing --logical-resource-id InputBucket --query "StackResourceDetail.PhysicalResourceId" --output text)
aws s3 cp s3://awslambda-reference-architectures/file-processing/example.md s3://$BUCKET/example.md
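The `--query "StackResourceDetail.PhysicalResourceId"` expression selects a single field from the JSON that `describe-stack-resource` returns. When that response is handled in a script instead, the same lookup might look like the sketch below. The function name `physicalResourceId` is hypothetical, and the bucket name in the usage example is made up.

```javascript
// Sketch: pull the bucket name out of a describe-stack-resource response,
// mirroring the --query "StackResourceDetail.PhysicalResourceId" expression.
function physicalResourceId(response) {
  const detail = response && response.StackResourceDetail;
  if (!detail || typeof detail.PhysicalResourceId !== "string") {
    throw new Error("response has no StackResourceDetail.PhysicalResourceId");
  }
  return detail.PhysicalResourceId;
}

// Usage with a hand-written response; "example-input-bucket" is a placeholder.
const sampleResponse = {
  StackResourceDetail: {
    LogicalResourceId: "InputBucket",
    PhysicalResourceId: "example-input-bucket",
  },
};
console.log(physicalResourceId(sampleResponse));
```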

After the file has been uploaded to the input bucket, you can inspect the output bucket to see the rendered HTML and plain text output files created by the Lambda functions.
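To give a sense of what the plain text output involves, here is a toy Markdown-to-plain-text pass. It is not the repository's implementation, which is not shown in this document. It handles only a handful of common constructs with regular expressions, and the function name `markdownToPlainText` is hypothetical.

```javascript
// Toy Markdown-to-plain-text pass; handles only a few common constructs.
// Illustrative only: a real converter would use a proper Markdown parser.
function markdownToPlainText(markdown) {
  return markdown
    .replace(/^#{1,6}[ \t]+/gm, "")             // heading markers
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")   // images -> alt text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1")    // links -> link text
    .replace(/(\*\*|__)(.+?)\1/g, "$2")         // bold
    .replace(/(\*|_)(.+?)\1/g, "$2")            // italic
    .replace(/`([^`]+)`/g, "$1");               // inline code
}
```

The regular expressions are deliberately simple. For instance, the italic rule would also alter identifiers that contain underscores, which is one reason a parser-based approach is preferable for real input.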

You can also view the CloudWatch logs generated by the Lambda functions.

Cleaning Up the Example Resources

To remove all resources created by this example, do the following:

  1. Delete all objects in the input and output buckets.
  2. Delete the CloudFormation stack.
  3. Delete the CloudWatch log groups that contain the execution logs for the two processor functions.

CloudFormation Template Resources

Parameters

Resources

The provided template creates the following resources:

License

This reference architecture sample is licensed under Apache 2.0.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.