IBM/db2-event-store-clickstream

Name: db2-event-store-clickstream

Owner: International Business Machines

Description: Sample notebooks demonstrate a use case of clickstream analysis with IBM Db2 Event Store using Scala APIs to ingest and analyze web event data.

Created: 2018-05-04 16:17:49.0

Updated: 2018-05-23 22:07:00.0

Pushed: 2018-05-23 22:07:01.0

Homepage:

Size: 771

Language: Jupyter Notebook

README

Clickstream Analysis with IBM Db2 Event Store

IBM Db2 Event Store offers high-speed ingestion and real-time analytics for large volumes of streaming data. The platform enables event-driven applications to persist event data at scale and powers high-performance Spark analytics on all data for quick insights. In this Code Pattern, we will see how a retail business uses IBM Db2 Event Store to capture and analyze clickstream data from its web channels. The clickstream analysis helps the business closely track customer browsing patterns and better understand their changing interests. Acting on these insights, the business offers a personalized experience for every customer, with targeted offers to drive sales.

Sample notebooks demonstrate the use case of clickstream analysis with IBM Db2 Event Store using Scala APIs to ingest and analyze web event data. Credit goes to Siva Anne of the IBM Data Science Elite Team for the original Jupyter Notebooks.

When the reader has completed this code pattern, they will understand how to:

Flow
  1. Add a CSV file as a data asset
  2. Run a Jupyter Notebook using Scala to ingest data from the CSV file into Event Store
  3. Run a Jupyter Notebook using Scala and the Brunel visualization language to analyze the data from Event Store

Included components

Featured technologies

Steps

Run locally
  1. Install IBM Db2 Event Store Developer Edition
  2. Clone the repo
  3. Add the CSV file as a data asset
  4. Import and run the Jupyter Notebook to ingest data
  5. Import and run the Jupyter Notebook to analyze the data
  6. See the results

1. Install IBM Db2 Event Store Developer Edition

Install IBM® Db2® Event Store Developer Edition on Mac, Linux, or Windows by following the instructions here.

Note: This code pattern was developed with EventStore-DeveloperEdition 1.1.4

2. Clone the repo

Clone the db2-event-store-clickstream repo locally. In a terminal, run:

git clone https://github.com/IBM/db2-event-store-clickstream
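
To confirm that the clone contains the files used in the later steps, a small shell check can help. This is a minimal sketch, not part of the original pattern: the function name check_clone is made up for illustration, and it assumes the directories are literally named data and notebooks, as steps 3 and 4 suggest.

```shell
# Check that a cloned repo directory contains the files named in this README.
# Prints each missing path and returns non-zero if any are absent.
# check_clone is a hypothetical helper; the paths are assumptions based on
# the "data" and "notebooks" directories mentioned in steps 3 and 4.
check_clone() {
  repo_dir="${1:-db2-event-store-clickstream}"
  missing=0
  for path in data/clickstream_data.csv notebooks/ingest_clickstream_events.ipynb; do
    if [ ! -f "$repo_dir/$path" ]; then
      echo "missing: $path"
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_clone db2-event-store-clickstream from the directory where you cloned prints nothing and returns 0 when both files are present.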
3. Add the CSV file as a data asset

Use the Db2 Event Store UI to add the CSV input file as a data asset.

  1. From the drop-down menu (three horizontal lines in the upper left corner), select My Notebooks.

  2. Click on add data assets.

  3. Click browse and navigate to the data directory in your cloned repo. Select the file clickstream_data.csv.
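
Before uploading, it can be useful to glance at the file from a terminal. The sketch below prints the header row and counts the data rows. It assumes the first line of the CSV is a header and that no field contains an embedded newline; neither is confirmed by this README. csv_summary is a hypothetical helper name.

```shell
# Print the header row and the number of data rows in a CSV file.
# Assumption: line 1 is a header and no field spans multiple lines.
# csv_summary is a hypothetical helper, not part of the repo.
csv_summary() {
  file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read: $file" >&2
    return 1
  fi
  # Strip a trailing carriage return so Windows line endings do not leak
  # into the output, then report the header and the count of later lines.
  awk '{ sub(/\r$/, "") }
       NR == 1 { print "header: " $0 }
       NR > 1  { n++ }
       END     { print "rows: " (n + 0) }' "$file"
}
```

From the root of the cloned repo, csv_summary data/clickstream_data.csv would show which columns the ingest notebook has to handle and roughly how many events to expect.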

4. Import and run the Jupyter Notebook to ingest data

Import the notebook

Use the Db2 Event Store UI to create the notebook.

  1. From the drop-down menu (three horizontal lines in the upper left corner), select My Notebooks.

  2. Click on add notebooks.

  3. Select the From File tab.

  4. Provide a name.

  5. Click Choose File and navigate to the notebooks directory in your cloned repo. Select the file ingest_clickstream_events.ipynb.

  6. Scroll down and click on Create Notebook.

Run the notebook
  1. Edit the HOST constant in the first code cell. You will need to enter your host's IP address in place of the XXX.XXX.XXX.XXX value.

  2. Run the notebook using the menu Cell > Run all or run the cells individually with the play button.
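
How you look up the host's IP address depends on your operating system and network setup, so that part is left to you. Once you have a candidate value, a quick check like the following can catch a leftover placeholder or a typo before you paste it into the HOST constant. This is only a sketch: is_ipv4 is a hypothetical helper, and it checks the dotted-quad shape only, not whether the address is reachable.

```shell
# Return success if the argument looks like a dotted-quad IPv4 address
# with each octet in the range 0-255. is_ipv4 is a hypothetical helper.
is_ipv4() {
  candidate="$1"
  # Reject empty input, non-digit characters, and misplaced dots.
  case "$candidate" in
    "" | *[!0-9.]* | .* | *. | *..*) return 1 ;;
  esac
  # Split on dots into positional parameters.
  old_ifs="$IFS"; IFS=.
  set -- $candidate
  IFS="$old_ifs"
  [ "$#" -eq 4 ] || return 1
  for octet in "$@"; do
    [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
  done
  return 0
}
```

The unedited placeholder XXX.XXX.XXX.XXX fails this check, while a value such as 192.168.1.10 passes.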

This notebook demonstrates how to:

5. Import and run the Jupyter Notebook to analyze the data

Import the notebook

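Use the Db2 Event Store UI to create the notebook, following the same import steps as in step 4 but selecting the analysis notebook from the notebooks directory in your cloned repo.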

Run the notebook
  1. Edit the HOST constant in the first code cell. You will need to enter your host's IP address in place of the XXX.XXX.XXX.XXX value.

  2. Run the notebook using the menu Cell > Run all or run the cells individually with the play button.

This notebook demonstrates how to:

6. See the results

Sample output

See the notebook with example output and interactive charts here.

Links

Learn more

License

Apache 2.0


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.