awslabs/aws-cfn-windows-hpc-template

Name: aws-cfn-windows-hpc-template

Owner: Amazon Web Services - Labs

Owner: AWS Samples

Description: This sample CloudFormation template will launch a Windows-based HPC cluster running Windows Server 2012 R2 and supporting core infrastructure including VPC, domain controllers and bastion servers.

Created: 2015-10-07 20:51:31.0

Updated: 2017-10-04 04:49:59.0

Pushed: 2018-01-08 14:38:58.0

Homepage: null

Size: 37

Language: PowerShell

README

aws-cfn-windows-hpc-template

This sample AWS CloudFormation template will launch a Windows-based HPC cluster running Windows Server 2012 R2 and supporting core infrastructure including Amazon VPC, domain controllers and bastion servers.

This document presents the steps required to deploy and get the platform running.

What does it do, how does it work, why should I use it?

This platform has been published as a companion to the “(CMP306) Dynamic, On-Demand Windows HPC Clusters On AWS” session at AWS re:Invent 2015.

This session is available on:

Interesting tricks:

Prepare an Amazon EBS Snapshot for installation material
Create the volume

Launch an Amazon EC2 instance, and in the wizard create a second Amazon EBS volume.

If you already have an instance running, create a new volume and attach it to your instance. Use Windows Server Manager (Local Server / Storage Services) to bring the volume online and format it (this step is done automatically for new instances).

The recommended settings for this volume are to use an Amazon EBS General Purpose SSD volume of 10 GiB in size.
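
If you prefer scripting to the console, the volume steps above can be sketched in PowerShell. This is a minimal sketch rather than part of the template: it assumes the AWS Tools for Windows PowerShell are installed with credentials and a default region configured, that the drive letter D is free on your instance, and that the only uninitialized (RAW) disk is the new volume. The function names, the instance ID, and the volume label are hypothetical.

```powershell
# Sketch only. Run New-InstallVolume from any machine with the AWS Tools configured,
# and Initialize-InstallDisk inside the Windows instance itself.

function New-InstallVolume {
    param(
        [Parameter(Mandatory)] [string] $InstanceId,        # hypothetical, e.g. 'i-0123456789abcdef0'
        [Parameter(Mandatory)] [string] $AvailabilityZone,  # must be the zone the instance runs in
        [int] $SizeGiB = 10,                                # size recommended in this README
        [string] $Device = 'xvdf'                           # device name used later in this README
    )
    # Create a General Purpose SSD (gp2) volume and wait until it can be attached.
    $volume = New-EC2Volume -Size $SizeGiB -VolumeType gp2 -AvailabilityZone $AvailabilityZone
    while ((Get-EC2Volume -VolumeId $volume.VolumeId).State.Value -ne 'available') {
        Start-Sleep -Seconds 5
    }
    Add-EC2Volume -InstanceId $InstanceId -VolumeId $volume.VolumeId -Device $Device | Out-Null
    $volume.VolumeId
}

function Initialize-InstallDisk {
    param([char] $DriveLetter = 'D')   # assumption: this letter is not already in use
    # Pick the first uninitialized disk; assumed to be the volume that was just attached.
    $disk = Get-Disk | Where-Object { $_.PartitionStyle -eq 'RAW' } | Select-Object -First 1
    if (-not $disk) { throw 'No uninitialized disk found.' }
    if ($disk.IsOffline) { Set-Disk -Number $disk.Number -IsOffline $false }
    Initialize-Disk -Number $disk.Number -PartitionStyle MBR
    New-Partition -DiskNumber $disk.Number -DriveLetter $DriveLetter -UseMaximumSize |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel 'HPCInstall' -Confirm:$false
}
```

For example, New-InstallVolume -InstanceId i-0123456789abcdef0 -AvailabilityZone us-east-1a returns the ID of the new volume, and Initialize-InstallDisk, run on the instance, brings the disk online and formats it.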

Get Microsoft HPC Pack 2012 R2 installation

Download the HPCPack2012R2-Full.zip file from http://www.microsoft.com/en-us/download/details.aspx?id=41630

Save it as D:\HPCPack2012R2-Full.zip.

Once the download has finished, extract the content to D:\HPCPack2012R2-Full

You can remove the D:\HPCPack2012R2-Full.zip file.
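
The extract-and-remove step can also be scripted. The sketch below uses the .NET ZipFile class because Windows Server 2012 R2 ships with a PowerShell version that predates the Expand-Archive cmdlet; the function name Expand-InstallArchive is hypothetical. Pass absolute paths, since .NET does not follow the PowerShell current directory.

```powershell
# Sketch only: requires .NET Framework 4.5 or later for System.IO.Compression.ZipFile.
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Expand-InstallArchive {
    param(
        [Parameter(Mandatory)] [string] $ZipPath,      # absolute path to the .zip file
        [Parameter(Mandatory)] [string] $Destination,  # absolute path of the folder to create
        [switch] $RemoveZip                            # delete the archive once extracted
    )
    # ExtractToDirectory refuses to overwrite files, so fail early with a clearer message.
    if (Test-Path $Destination) { throw "Destination '$Destination' already exists." }
    [System.IO.Compression.ZipFile]::ExtractToDirectory($ZipPath, $Destination)
    if ($RemoveZip) { Remove-Item -Path $ZipPath }
}
```

With the paths used in this section, the call would be Expand-InstallArchive -ZipPath D:\HPCPack2012R2-Full.zip -Destination D:\HPCPack2012R2-Full -RemoveZip.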

Prepare SQL Server installation

When Microsoft HPC Pack is installed in unattended mode, the Microsoft SQL Server installation wizard that it launches is not fully unattended: it tries to open a window on the desktop and gets stuck. To work around this limitation, pre-extract the SQL Server installation media so that SQL Server setup can be run before installing Microsoft HPC Pack 2012 R2.

Open a command prompt (or a PowerShell prompt) and go to D:\HPCPack2012R2-Full\amd64.

Run the following command: SQLEXPR_x64_ENU.exe /X:D:\SQLInstall

This creates a folder called SQLInstall on your D drive.
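
The same extraction can be run from a script that also checks the result. This is a sketch using the paths from this section; the function name Expand-SqlInstallMedia is hypothetical, and the exit-code check assumes that the extractor returns 0 on success.

```powershell
# Sketch only: runs the self-extracting SQL Server Express package with the /X switch
# shown above and verifies that the target folder was created.
function Expand-SqlInstallMedia {
    param(
        [string] $Extractor = 'D:\HPCPack2012R2-Full\amd64\SQLEXPR_x64_ENU.exe',
        [string] $Target = 'D:\SQLInstall'
    )
    if (-not (Test-Path $Extractor)) { throw "Extractor '$Extractor' was not found." }
    $process = Start-Process -FilePath $Extractor -ArgumentList "/X:$Target" -Wait -PassThru
    # Assumption: a non-zero exit code means the extraction failed.
    if ($process.ExitCode -ne 0) { throw "Extraction exited with code $($process.ExitCode)." }
    if (-not (Test-Path $Target)) { throw "Expected folder '$Target' was not created." }
    # List the first few extracted items as a quick sanity check.
    Get-ChildItem -Path $Target | Select-Object -First 10 -Property Name
}
```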

Prepare to update the AWS PV drivers (optional)

This step is optional, but you may want to check if you have the latest drivers.

Download the following file: https://s3.amazonaws.com/ec2-downloads-windows/Drivers/AWSPVDriverSetup.zip to D:\AWSPVDriverSetup.zip

Extract the content to D:\AWSPVDriverSetup

You can remove the D:\AWSPVDriverSetup.zip file.
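
To see which driver version is currently installed before deciding whether to update, and to fetch the package from a script, something like the following may help. It is a sketch: the function names are hypothetical, the '*AWS PV*' name pattern is an assumption about how the drivers appear in the device list, and forcing TLS 1.2 is a precaution that older Windows PowerShell versions may need for HTTPS downloads.

```powershell
# Sketch only: inspect installed drivers and download the AWS PV driver package.

function Get-InstalledDriverVersion {
    param([string] $NamePattern = '*AWS PV*')   # assumed display-name pattern; adjust as needed
    Get-CimInstance -ClassName Win32_PnPSignedDriver |
        Where-Object { $_.DeviceName -like $NamePattern } |
        Select-Object -Property DeviceName, DriverVersion, DriverDate
}

function Get-PVDriverPackage {
    param([string] $OutFile = 'D:\AWSPVDriverSetup.zip')
    # Older Windows PowerShell versions may default to a TLS version the server rejects.
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    $uri = 'https://s3.amazonaws.com/ec2-downloads-windows/Drivers/AWSPVDriverSetup.zip'
    Invoke-WebRequest -Uri $uri -OutFile $OutFile
    Get-Item -Path $OutFile
}
```

Calling Get-InstalledDriverVersion with a different -NamePattern value lists other drivers in the same way.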

Prepare to update the Intel SR-IOV drivers (optional)

This step is optional, but you may want to check if you have the latest drivers.

Download the following file: https://downloadcenter.intel.com/download/23073/Network-Adapter-Driver-for-Windows-Server-2012-R2- to D:\PROWinx64.exe

Rename the D:\PROWinx64.exe file as D:\PROWinx64.zip

Extract the content to D:\PROWinx64

You can remove the D:\PROWinx64.zip file.

Make a Snapshot

Recommended: In Windows Server Manager (Local Server / Storage Services), select the disk associated with your Amazon EBS volume, and take it offline. This helps ensure the consistency of the data on the disk.

In the Amazon EC2 console, select the instance that you are using. In the Description tab, click on xvdf in the Block devices area, then click on the volume name beside the EBS ID value. Click on Actions / Create Snapshot, enter HPC Pack 2012 Installation as both the Name and the Description, and click Create.

Wait for the snapshot to be created, and you are all set!
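
The snapshot steps can be scripted as well. The sketch below assumes the AWS Tools for Windows PowerShell are configured and that you know the ID of the volume attached as xvdf; both function names are hypothetical. Set-InstallDiskOffline runs inside the instance, while New-InstallSnapshot can run from any machine with access to your account.

```powershell
# Sketch only: take the data disk offline, snapshot the volume, and wait for completion.

function Set-InstallDiskOffline {
    param([char] $DriveLetter = 'D')
    # Resolve the disk that hosts the drive letter, then take the whole disk offline.
    $diskNumber = (Get-Partition -DriveLetter $DriveLetter).DiskNumber
    Set-Disk -Number $diskNumber -IsOffline $true
}

function New-InstallSnapshot {
    param(
        [Parameter(Mandatory)] [string] $VolumeId,          # hypothetical, e.g. 'vol-0123456789abcdef0'
        [string] $Label = 'HPC Pack 2012 Installation'      # Name and Description used in this README
    )
    $snapshot = New-EC2Snapshot -VolumeId $VolumeId -Description $Label
    New-EC2Tag -Resource $snapshot.SnapshotId -Tag @{ Key = 'Name'; Value = $Label }
    # Poll until the snapshot completes; stop if EC2 reports an error.
    do {
        Start-Sleep -Seconds 15
        $state = (Get-EC2Snapshot -SnapshotId $snapshot.SnapshotId).State.Value
        if ($state -eq 'error') { throw "Snapshot $($snapshot.SnapshotId) failed." }
    } while ($state -ne 'completed')
    $snapshot.SnapshotId
}
```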

Publish your content

This GitHub repository contains multiple resources (AWS Lambda functions, PowerShell scripts, configuration files, and AWS CloudFormation templates). To run the platform you will need to publish them to one of your existing Amazon S3 buckets.

Download or clone the content of this repository, and run:

Where:

Make sure you have the AWS CLI (https://aws.amazon.com/cli/) installed and configured if you publish from a Unix environment, or the AWS Tools for Windows PowerShell (http://aws.amazon.com/powershell/) if you publish from Windows.

The script will give you the URL of the global AWS CloudFormation template to use for creating a full platform.

WARNING: MAKE SURE YOUR BUCKET IS IN THE REGION WHERE YOUR CLUSTER WILL RUN, THIS IS A REQUIREMENT FROM AWS LAMBDA WHEN USED IN AWS CLOUDFORMATION
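
Before publishing, you can check the bucket's region from PowerShell. The sketch below is not part of the publication script; it assumes the AWS Tools for Windows PowerShell are configured, and the function names are hypothetical. Amazon S3 reports an empty location for buckets in us-east-1 and the legacy value EU for some buckets in eu-west-1, so the location is normalized before comparing.

```powershell
# Sketch only: compare an S3 bucket's region with the region planned for the cluster.

function ConvertTo-RegionName {
    param([string] $LocationConstraint)
    # S3 returns an empty location constraint for us-east-1 and may return 'EU' for eu-west-1.
    if ([string]::IsNullOrEmpty($LocationConstraint)) { return 'us-east-1' }
    if ($LocationConstraint -eq 'EU') { return 'eu-west-1' }
    return $LocationConstraint
}

function Test-BucketRegion {
    param(
        [Parameter(Mandatory)] [string] $BucketName,     # your existing bucket
        [Parameter(Mandatory)] [string] $ClusterRegion   # region where the stack will run
    )
    $location = (Get-S3BucketLocation -BucketName $BucketName).Value
    (ConvertTo-RegionName -LocationConstraint $location) -eq $ClusterRegion
}
```

Test-BucketRegion -BucketName my-bucket -ClusterRegion eu-west-1 (with a hypothetical bucket name) returns True only when the two regions match.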

Running the template

In AWS CloudFormation, use the URL provided as an output of the publication script to start a new stack.

Choose a name, then fill in the parameters.

Passwords: the template asks you for multiple passwords that will be used in the platform. For security reasons we don't provide you with a default password.

Networking: you will have to enter multiple IP ranges for managing the platform.

Instances: some details about the instances.

Many more configuration options are available if you look at the details of the sub stacks (look in the cfn directory).
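
The stack can also be started from PowerShell instead of the console. This is a sketch under several assumptions: the AWS Tools for Windows PowerShell are configured for the target region, the template needs the CAPABILITY_IAM acknowledgement (assumed because it creates AWS Lambda functions), and you supply every parameter the template requires. Only BastionAdminPassword and HPCUserPassword are parameter names taken from this README; the function names are hypothetical.

```powershell
# Sketch only: create the stack from the published template URL and return its outputs.

function ConvertTo-CfnParameterList {
    param([Parameter(Mandatory)] [hashtable] $Parameters)
    # Emit one ParameterKey/ParameterValue pair per entry, the shape New-CFNStack accepts.
    foreach ($key in $Parameters.Keys) {
        @{ ParameterKey = $key; ParameterValue = [string] $Parameters[$key] }
    }
}

function Start-HpcPlatformStack {
    param(
        [Parameter(Mandatory)] [string] $StackName,
        [Parameter(Mandatory)] [string] $TemplateUrl,    # URL printed by the publication script
        [Parameter(Mandatory)] [hashtable] $Parameters   # all parameters required by the template
    )
    $cfnParameters = @(ConvertTo-CfnParameterList -Parameters $Parameters)
    New-CFNStack -StackName $StackName -TemplateURL $TemplateUrl `
        -Parameter $cfnParameters -Capability CAPABILITY_IAM | Out-Null
    # Wait up to two hours (an arbitrary choice) for the stack to finish creating.
    Wait-CFNStack -StackName $StackName -Timeout 7200 | Out-Null
    (Get-CFNStack -StackName $StackName).Outputs
}
```

A call might look like Start-HpcPlatformStack -StackName hpc-demo -TemplateUrl $url -Parameters @{ BastionAdminPassword = $bastionPassword; HPCUserPassword = $hpcPassword }, extended with the networking and instance parameters. Avoid hard-coding the passwords in a script file.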

The template produces a single output, Bastion, which you use to connect to the bastion host.

Connecting and using the platform

Use the Bastion output of the main template to connect to the bastion host over Remote Desktop Protocol (RDP). Log in with the .\Administrator account and the password you specified as the BastionAdminPassword parameter.

Once on the bastion host, run MSTSC.EXE (Microsoft Remote Desktop Connection) to connect to the machine named head-node. As all cluster machines are in a domain, and the bastion host is configured to use the DNS servers of that domain, you will connect to the head node instance with the name head-node.awslab.local. Use AWSLAB\HPCUser as a user name, and the password you entered as the HPCUserPassword parameter to connect.

On the head node, you can interact with the Microsoft HPC Cluster Manager, or use mpiexec or PowerShell to start using the cluster.
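
As a first test from PowerShell on the head node, you could list the nodes and submit a trivial MPI job. This is a sketch using the HPC Pack PowerShell cmdlets; the function name, the job name, and the mpiexec hostname command line are illustrative choices, not something the template sets up for you.

```powershell
# Sketch only: run on the head node, logged in as a user allowed to submit jobs.
function Submit-HpcHelloJob {
    param([string] $CommandLine = 'mpiexec hostname')   # illustrative command line
    # Load the HPC Pack cmdlets; this is not needed inside the HPC PowerShell console.
    Add-PSSnapin Microsoft.HPC -ErrorAction SilentlyContinue
    # Show the nodes known to the scheduler and their current state.
    Get-HpcNode | Out-Host
    # Create a job with a single task and hand it to the scheduler.
    $job = New-HpcJob -Name 'hello-cluster'
    Add-HpcTask -Job $job -CommandLine $CommandLine | Out-Null
    Submit-HpcJob -Job $job
}
```

From the bastion host, mstsc /v:head-node.awslab.local opens the Remote Desktop session to the head node directly.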

FAQ
My Lambda functions fail, why?

When starting your template, sometimes the AWS Lambda functions will fail and mark the entire stack as failed.

Look at the AWS CloudWatch Logs Log Group that is named in the error. You may find the following error in the content:

Error occured while getting the object from S3. S3 Error Code: PermanentRedirect.
Error Message: The bucket you are attempting to access must be addressed using
the specified endpoint. Please send all future requests to this endpoint.

If this is the case, make sure that the region in which your Amazon S3 bucket resides is the same as the one where you are running your cluster.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.