IBM/starcraft2-replay-analysis

Name: starcraft2-replay-analysis

Owner: International Business Machines

Description: A jupyter notebook that provides analysis for StarCraft 2 replays

Created: 2017-04-20 16:09:54.0

Updated: 2018-05-09 13:50:43.0

Pushed: 2018-05-17 20:00:54.0

Homepage: https://developer.ibm.com/code/patterns/analyze-starcraft-ii-replays-with-jupyter-notebooks/

Size: 2013

Language: HTML

README

StarCraft II Replay Analysis with Jupyter Notebooks

Data Science Experience is now Watson Studio. Although some images in this code pattern may show the service as Data Science Experience, the steps and processes will still work.

In this Code Pattern we will use Jupyter notebooks to analyze StarCraft II replays and extract interesting insights.

When the reader has completed this Code Pattern, they will understand how to:

The intended audience for this Code Pattern is application developers who need to process StarCraft II replay files and build powerful visualizations.

Flow
  1. The Developer creates a Jupyter notebook from the included starcraft2_replay_analysis.ipynb file
  2. A StarCraft replay file is loaded into IBM Cloud Object Storage
  3. The object is loaded into the Jupyter notebook
  4. Processed replay is loaded into Cloudant database for storage
Included components
Featured technologies

Steps

Follow these steps to set up and run this developer Code Pattern. The steps are described in detail below.

  1. Sign up for Watson Studio
  2. Create IBM Cloud services
  3. Create the notebook
  4. Add the replay file
  5. Create a connection to Cloudant
  6. Run the notebook
  7. Analyze the results
  8. Save and share
1. Sign up for Watson Studio

Sign up for IBM's Watson Studio. When you create a project in Watson Studio, a free-tier Object Storage service is created in your IBM Cloud account. Take note of your service names, as you will need to select them in the following steps.

Note: When creating your Object Storage service, select the Free storage type in order to avoid having to pay an upgrade fee.

2. Create IBM Cloud services

Create the following IBM Cloud service by clicking the Deploy to IBM Cloud button or by following the links to use the IBM Cloud UI and create it.

Deploy to IBM Cloud

3. Create the notebook

4. Add the replay file
Add the replay to the notebook

Use Data (look for the 10/01 icon) and its Files tab. From there you can click browse and add a .SC2Replay file from your computer.

Note: If you don't have your own replays, you can get our example by cloning this git repo. Use the data/example_input/king_sejong_station_le.sc2replay file.

Create an empty cell

Use the + button above to create an empty cell to hold the inserted code and credentials. You can put this cell at the top or anywhere before Load the replay.

Insert to code

After you add the file, use its Insert to code drop-down menu. Make sure your active cell is the empty one created earlier. Select Insert StreamingBody object from the drop-down menu.

Note: This cell is marked as a hidden_cell because it contains sensitive credentials.

Fix-up variable names

The inserted code includes a generated method with credentials and then calls the generated method to set a variable with a name like streaming_body_1. If you do additional inserts, the method can be re-used and the variable will change (e.g. streaming_body_2).

Later in the notebook, we set replay_file = streaming_body_1. So you might need to fix the variable name streaming_body_1 to match your inserted code.
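As a minimal sketch of that fix-up, the cell below points replay_file at whichever variable the inserted code created. The io.BytesIO object is only a stand-in (an assumption for illustration) for the StreamingBody that Watson Studio generates, and streaming_body_2 is an example name.

```python
import io

# Stand-in for the object created by the generated "Insert to code" cell.
# In Watson Studio this would be a StreamingBody; io.BytesIO is used here
# only so the sketch runs anywhere.
streaming_body_2 = io.BytesIO(b"example replay bytes")

# The notebook expects the name replay_file, so point it at whichever
# variable your inserted code actually created.
replay_file = streaming_body_2

# Both objects expose read(), which returns the raw bytes of the replay.
replay_bytes = replay_file.read()
```

If your generated variable is already named streaming_body_1, no change is needed.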

5. Create a connection to Cloudant
Create a database

Before you can add a connection, you need a database. Use your IBM Cloud dashboard to find the service you created. If you used Deploy to IBM Cloud, look for sc2-cloudantNoSQLDB-service. If you created the service directly in IBM Cloud, you may have picked a different name or used the default name of Cloudant NoSQL DB- with a random suffix.

Add a new connection to the project

Create an empty cell

Add the Cloudant credentials to the notebook

Note: This cell is marked as a hidden_cell because it contains sensitive credentials.

Fix-up variable names

The inserted code includes a dictionary with credentials assigned to a variable with a name like credentials_1. It may have a different name (e.g. credentials_2). Rename it or reassign it if needed. The notebook code assumes it will be credentials_1.

6. Run the notebook

Running a notebook executes each of its code cells in order, from top to bottom.

Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:

There are several ways to execute the code cells in your notebook:

7. Analyze the results

The result of running the notebook is a report which may be shared with or without sharing the code. You can share the code for an audience that wants to see how you came to your conclusions. The text, code and output/charts are combined in a single web page. For an audience that does not want to see the code, you can share a web page that only shows text and output/charts.

Basic output

Basic replay information is printed out to show you how you can start working with a loaded replay. The output is also, of course, very helpful to identify which replay you are looking at.

Data preparation

If you look through the code, you'll see that a lot of work went into preparing the data.

Unit and building groups

Lists of strings were created for the known units and groups. These are needed to recognize the event types.
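A minimal sketch of the idea, assuming hypothetical list names and only a handful of unit and building names (the notebook's own lists are longer and may be organized differently):

```python
# Hypothetical groups for illustration; the notebook's actual lists are
# more complete and may use different names.
WORKER_UNITS = ["SCV", "Probe", "Drone"]
ARMY_UNITS = ["Marine", "Zealot", "Zergling"]
BASE_BUILDINGS = ["CommandCenter", "Nexus", "Hatchery"]


def classify(name):
    """Return the group a unit or building name belongs to, or None."""
    if name in WORKER_UNITS:
        return "worker"
    if name in ARMY_UNITS:
        return "army"
    if name in BASE_BUILDINGS:
        return "base"
    return None  # unknown names are simply not counted
```

With groups like these, an event that mentions "Probe" can be counted as a worker event, and one that mentions "Nexus" as a base event.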

Event handlers

Handler methods were written to process the different types of events and accumulate the information in the player's event list.
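The handler pattern can be sketched as below. This is an illustration under stated assumptions, not the notebook's code: events are modelled as plain dictionaries, and the event type names, field names, and function names are all hypothetical.

```python
from collections import defaultdict


def handle_unit_born(event, events_by_player):
    """Record that a unit appeared for the owning player."""
    events_by_player[event["player"]].append(
        (event["second"], "+", event["unit"]))


def handle_unit_died(event, events_by_player):
    """Record that the owning player lost a unit."""
    events_by_player[event["player"]].append(
        (event["second"], "-", event["unit"]))


# Map each (hypothetical) event type to its handler.
HANDLERS = {"unit_born": handle_unit_born, "unit_died": handle_unit_died}


def process(events):
    """Dispatch events to handlers and return one event list per player."""
    events_by_player = defaultdict(list)
    for event in events:
        handler = HANDLERS.get(event["type"])
        if handler is not None:  # event types without a handler are skipped
            handler(event, events_by_player)
    return dict(events_by_player)


sample_events = [
    {"type": "unit_born", "player": "A", "second": 12, "unit": "Probe"},
    {"type": "chat", "player": "B", "second": 15},
    {"type": "unit_died", "player": "A", "second": 90, "unit": "Probe"},
]
player_events = process(sample_events)
```

Keeping one small handler per event type makes it straightforward to add support for a new event type without touching the dispatch loop.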

The ReplayData class

We created the ReplayData class to take a replay stream of bytes and process them with all our event handlers. The resulting player event lists are stored in a ReplayData object. The ReplayData class also has an as_dict() method. This method returns a Python dictionary that makes it easy to process the replay events with our Python code. We also use this dict to create a Cloudant JSON document.
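The role of as_dict() can be illustrated with a much smaller stand-in class. ReplayDataSketch, its constructor arguments, add_event(), and the dictionary keys are assumptions made for this sketch; the real ReplayData class parses replay bytes and stores richer data.

```python
import json


class ReplayDataSketch:
    """Hypothetical, simplified stand-in for the notebook's ReplayData."""

    def __init__(self, map_name, players):
        self.map_name = map_name
        # One event list per player, filled in by event handlers.
        self.player_events = {name: [] for name in players}

    def add_event(self, player, second, description):
        self.player_events[player].append(
            {"second": second, "event": description})

    def as_dict(self):
        """Return plain dicts and lists that serialize directly to JSON."""
        return {
            "map": self.map_name,
            "players": {name: list(events)
                        for name, events in self.player_events.items()},
        }


replay = ReplayDataSketch("King Sejong Station LE", ["Neeb", "PlayerB"])
replay.add_event("Neeb", 12, "Probe born")

# Because as_dict() returns only built-in types, the result can be stored
# as a JSON document without custom encoders.
document = json.dumps(replay.as_dict(), sort_keys=True)
```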

Visualization

To visualize the replay we chose to use 2 different types of charts and show a side-by-side comparison of the competing players.

We generate these charts for each of the following metrics. You will get a good idea of how the players are performing by comparing the trends for these metrics.

Box plot charts

Once you get to this point, you can see that generating a box plot is quite easy thanks to pandas DataFrames and Bokeh BoxPlot.

The box plot is a graphical representation of the summary statistics for the metric for each player. The “box” covers the range from the first to the third quartile. The horizontal line in the box shows the median. The “whisker” shows the spread of data outside these quartiles. Outliers, if any, show up as markers outside the whisker lines.
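The statistics behind such a chart can be computed directly, as in the sketch below. It assumes the common convention that whiskers extend to the most extreme points within 1.5 times the interquartile range of the box; the charting library may use a different rule, and box_stats is a hypothetical helper name. The centre line computed here is the second quartile (the median), as in a conventional box plot.

```python
from statistics import quantiles


def box_stats(values):
    """Return quartiles, whisker ends, and outliers for a list of numbers."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 50])
```

For this sample, the value 50 lies far above the upper fence and is reported as an outlier, while the whiskers stop at 1 and 9.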

For each metric, we show the players' statistics side by side using box plots.

In the above screen shot, you see side-by-side vespene per minute statistics. In this contest, Neeb had the advantage. In addition to the box which shows the quartiles and the whisker that shows the range, this example has outlier indicators. In many cases, there will be no outliers.

Nelson rules charts

The Nelson rules charts are not so easy. You'll notice quite a bit of code in helper methods to create these charts.

The base chart is a Bokeh plotting figure with circle markers for each data point in the time series. This shows the metric over time for the player. The player charts are side-by-side to allow separate scales and plenty of additional annotations.

We add horizontal lines to show our x-bar (sample mean), 1st and 2nd standard deviations and upper and lower control limits for each player.
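The values for these horizontal lines can be derived from the series itself. The sketch below assumes control limits at three standard deviations from x-bar and uses the population standard deviation; both are common choices but are assumptions here, and control_lines is a hypothetical helper name.

```python
from statistics import mean, pstdev


def control_lines(values, sigma_limit=3):
    """Return x-bar, the 1st and 2nd sigma bands, and the control limits."""
    x_bar = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return {
        "x_bar": x_bar,
        "one_sigma": (x_bar - sigma, x_bar + sigma),
        "two_sigma": (x_bar - 2 * sigma, x_bar + 2 * sigma),
        "lcl": x_bar - sigma_limit * sigma,  # lower control limit
        "ucl": x_bar + sigma_limit * sigma,  # upper control limit
    }


lines = control_lines([2, 4, 4, 4, 5, 5, 7, 9])
```

For this sample series the mean is 5 and the standard deviation is 2, so the bands fall at 3 and 7, then 1 and 9, with control limits at -1 and 11.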

We use our detect_nelson_bias() method to detect 9 or more consecutive points above (or below) the x-bar line. Then, using Bokeh's add_layout() and BoxAnnotation, we color the background green or red for ranges that show bias for above or below the line respectively.
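The rule itself can be sketched in a few lines. This is not the notebook's detect_nelson_bias() implementation; find_bias_runs is a hypothetical name, and the return format (start index, end index, side) is an assumption chosen to make shading a range straightforward.

```python
from statistics import mean


def find_bias_runs(values, min_run=9):
    """Find runs of min_run or more consecutive points on one side of x-bar.

    Returns a list of (start_index, end_index, side) tuples, where side is
    "above" or "below". Points exactly on the line break a run.
    """
    if not values:
        return []
    x_bar = mean(values)
    runs = []
    start, side = None, 0
    for i, v in enumerate(values):
        cur = (v > x_bar) - (v < x_bar)  # +1 above, -1 below, 0 on the line
        if cur != side:
            # The previous run ended at i - 1; keep it if it was long enough.
            if side != 0 and i - start >= min_run:
                runs.append((start, i - 1, "above" if side > 0 else "below"))
            start, side = i, cur
    if side != 0 and len(values) - start >= min_run:
        runs.append((start, len(values) - 1,
                     "above" if side > 0 else "below"))
    return runs


biased = find_bias_runs([10] * 9 + [0] * 9)
alternating = find_bias_runs([1, 2] * 10)
```

Each returned range could then be used as the left and right bounds of a shaded background region on the chart.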

Our detect_nelson_trend() method detects when 6 or more consecutive points are all increasing or decreasing. Using Bokeh's add_layout() and Arrow, we draw arrows on the chart to highlight these up or down trends.
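A comparable sketch for the trend rule follows. It is an illustration rather than the notebook's detect_nelson_trend() code; find_trend_runs and its return format are hypothetical. Note that six points in a row correspond to five consecutive changes in the same direction.

```python
def find_trend_runs(values, min_points=6):
    """Find runs of min_points or more strictly increasing or decreasing points.

    Returns a list of (start_index, end_index, direction) tuples, where
    direction is "up" or "down". Equal neighbours break a run.
    """
    runs = []
    start, direction = 0, 0
    for i in range(1, len(values)):
        diff = values[i] - values[i - 1]
        cur = (diff > 0) - (diff < 0)  # +1 rising, -1 falling, 0 flat
        if cur != direction:
            # The previous run covers indexes start..i-1, i.e. i - start points.
            if direction != 0 and i - start >= min_points:
                runs.append((start, i - 1, "up" if direction > 0 else "down"))
            # The new run begins at the point shared with the old one.
            start, direction = i - 1, cur
    if direction != 0 and len(values) - start >= min_points:
        runs.append((start, len(values) - 1,
                     "up" if direction > 0 else "down"))
    return runs


trends = find_trend_runs([1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1, 0])
too_short = find_trend_runs([1, 2, 3, 4, 5])
```

The start and end indexes of each run give the two endpoints needed to draw an arrow along the trend.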

The result is a side-by-side comparison that is jam-packed with statistical analysis.

In the above screen shot, you see the time/value hover details that you get with Bokeh interactive charts. Also notice the different scales and the arrows. In this contest, Neeb made two early pushes and got an advantage in minerals. If you run the notebook, you'll see other examples showing where the winner got the advantage.

Stored replay documents

You can browse your Cloudant database to see the stored replays. After all the loading and parsing we stored them as JSON documents. You'll see all of your replays in the sc2replays database and only the latest one in sc2recents.

8. Save and share
How to save your work:

Under the File menu, there are several ways to save your notebook:

How to share your work:

You can share your notebook by selecting the "Share" button located in the top right section of your notebook panel. This produces a URL that displays a "read-only" version of your notebook. You have several options to specify exactly what you want shared from your notebook:

Sample output

The sample_output.html in data/examples has embedded JavaScript for interactive Bokeh charts. Use rawgit.com to view it with the following link:

Sample Output

Troubleshooting

See DEBUGGING.md.

License

Apache 2.0


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.