IBM/node-red-dsx-workflow

Name: node-red-dsx-workflow

Owner: International Business Machines

Description: This journey helps to build a complete end-to-end analytics solution using IBM Watson Studio. This repository contains instructions to create a custom web interface to trigger the execution of Python code in Jupyter Notebook and visualise the response from Jupyter Notebook on IBM Watson Studio.

Created: 2017-08-01 14:38:47.0

Updated: 2018-03-22 01:54:03.0

Pushed: 2018-03-22 01:54:05.0

Homepage: https://developer.ibm.com/code/patterns/orchestrate-data-science-workflows-using-node-red/

Size: 2145

Language: Jupyter Notebook

README

Orchestration of the analytics workflow in IBM Watson Studio using a custom web user-interface built with Node-RED

Data Science Experience is now Watson Studio. Although some images in this code pattern may show the service as Data Science Experience, the steps and processes will still work.

IBM Watson Studio can be used to analyze data using Jupyter notebooks. However, Watson Studio exposes no mechanism to trigger execution of the notebook cells from outside. With this capability added, we can build a complete end-to-end analytics solution using IBM Watson Studio.

The below two requirements are addressed by this journey to help build a complete analytics solution with IBM Watson Studio.

We will use Node-RED to invoke the analytics workflows in Jupyter notebooks on IBM Watson Studio and also to render a custom web user-interface with minimal programming.

What is Node-RED?

Node-RED is a tool for wiring together APIs and online services on IBM Cloud. The APIs and online services are configured as nodes that can be wired together to orchestrate a workflow. It also acts as a web server where the UI can be deployed. It has nodes that support integration with many database services, Watson services and analytics services.

Node-RED reduces a lot of development effort. It is easy to improve the solution using other services with Node-RED. It opens a world of possibilities for developers.

When the reader has completed this journey, they will understand how to:

The intended audience for this journey is developers who want to develop a complete analytics solution on Watson Studio with a custom web user interface.

  1. Object Storage stores the data.
  2. The data is consumed as CSV files.
  3. The Jupyter notebook processes the data and generates insights.
  4. The Jupyter notebook is powered by Spark.
  5. Node-RED hosts a websocket server that serves as the communication channel between the Jupyter notebook on IBM Watson Studio and the Web UI.
  6. Node-RED hosts a web server that renders the Web UI.
Included components
Featured technologies

Watch the Video

Steps

Follow these steps to set up and run this developer journey. The steps are described in detail below.

  1. Sign up for Watson Studio
  2. Create IBM Cloud services
  3. Import the Node-RED flow
  4. Note the websocket URL
  5. Update the websocket URL
  6. Create the notebook
  7. Add the data
  8. Update the notebook with service credentials
  9. Run the notebook
  10. Analyze the results
1. Sign up for Watson Studio

Sign up for IBM's Watson Studio. Signing up for Watson Studio creates two services, Spark and ObjectStore, in your Bluemix account.

2. Create IBM Cloud services
3. Import the Node-RED flow

4. Note the websocket URL

The websocket URL is ws://<NODERED_BASE_URL>/ws/orchestrate where the NODERED_BASE_URL is the marked portion of the URL in the above image.

Note:

An example websocket URL for a Node-RED app with name myApp is ws://myApp.mybluemix.net/ws/orchestrate, where myApp.mybluemix.net is the NODERED_BASE_URL.

The NODERED_BASE_URL may contain additional region information, e.g. eu-gb for the UK region. In this case NODERED_BASE_URL would be: myApp.eu-gb.mybluemix.net.
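
The URL pattern described above can be captured in a small helper, for example inside the notebook. This is a minimal sketch, not code from the repository: the function name build_websocket_url is hypothetical, and the only thing it relies on is the ws://<NODERED_BASE_URL>/ws/orchestrate pattern given in this section. It also assumes that a base URL pasted from the browser may carry a scheme or a path, which it strips.

```python
def build_websocket_url(nodered_base_url):
    """Return the websocket URL for a Node-RED base URL (hypothetical helper)."""
    base = nodered_base_url.strip()
    # Drop a scheme if the value was pasted straight from the browser.
    for scheme in ("https://", "http://", "wss://", "ws://"):
        if base.lower().startswith(scheme):
            base = base[len(scheme):]
            break
    # Keep only the host part; anything after the first "/" is discarded.
    base = base.split("/", 1)[0]
    if not base:
        raise ValueError("NODERED_BASE_URL must not be empty")
    return "ws://" + base + "/ws/orchestrate"


print(build_websocket_url("myApp.mybluemix.net"))
print(build_websocket_url("https://myApp.eu-gb.mybluemix.net/"))
```

The two calls print ws://myApp.mybluemix.net/ws/orchestrate and ws://myApp.eu-gb.mybluemix.net/ws/orchestrate, matching the examples in the note.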

5. Update the websocket URL in HTML code

Click on the node named HTML.

Click on the HTML area and search for ws: to locate the line where the websocket URL is specified. Replace NODERED_BASE_URL in that line with the base URL noted in Section 4:

var websocketURL = "ws://NODERED_BASE_URL/ws/orchestrate";

Click on Done and re-deploy the flow.

6. Create the notebook

7. Add the data
Add the data to the notebook

8. Update the notebook with service credentials and websocket URL
Add the Object Storage credentials to the notebook

Update the websocket URL in the notebook

9. Run the notebook

When a notebook is executed, each code cell in the notebook is executed in order, from top to bottom.

Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:

There are several ways to execute the code cells in your notebook:

For this Notebook, you can simply Run All cells. The websocket client will be started when you run the cell under 7. Start websocket client. This will start the communication between the UI and the Notebook.
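
Once the websocket client is running, the notebook reacts to messages sent from the UI. This README does not document the message format, so the sketch below assumes a hypothetical JSON protocol with cmd and params fields. The names handle_message, HANDLERS and row_count are illustrative and are not taken from the notebook. The sketch covers only the dispatch step and does not depend on any particular websocket client library.

```python
import json


def row_count(params):
    """Hypothetical analytics step: count the rows passed in by the UI."""
    return {"rows": len(params.get("data", []))}


# Map command names (assumed, not from the repository) to Python callables.
HANDLERS = {"row_count": row_count}


def handle_message(raw_message, handlers=HANDLERS):
    """Parse one JSON message from the UI and return a JSON reply string."""
    try:
        message = json.loads(raw_message)
    except ValueError:
        return json.dumps({"status": "error", "reason": "invalid JSON"})
    if not isinstance(message, dict):
        return json.dumps({"status": "error", "reason": "expected a JSON object"})
    handler = handlers.get(message.get("cmd"))
    if handler is None:
        return json.dumps({"status": "error", "reason": "unknown command"})
    result = handler(message.get("params", {}))
    return json.dumps({"status": "ok", "result": result})


reply = handle_message('{"cmd": "row_count", "params": {"data": [1, 2, 3]}}')
print(reply)
```

In a notebook, a function like this would typically be called from the websocket client's message callback, with its return value sent back to Node-RED for display in the UI.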

10. Analyze the results

The UI can be accessed at the URL: http://<NODERED_BASE_URL>/dsxinsights. The <NODERED_BASE_URL> is the base URL noted in Section 4, Note the websocket URL.
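
Because the UI and the websocket endpoint share the same base URL, the UI address can be derived from the websocket URL already configured in the notebook. The following is a minimal sketch using Python's standard urllib.parse module. The function name ui_url_from_websocket_url is hypothetical, and mapping wss to https is an assumption, since this README only mentions ws:// and http://.

```python
from urllib.parse import urlsplit, urlunsplit


def ui_url_from_websocket_url(websocket_url):
    """Derive the /dsxinsights UI URL from a websocket URL (hypothetical helper)."""
    parts = urlsplit(websocket_url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError("expected a ws:// or wss:// URL")
    # Assumption: a secure websocket implies a secure web page.
    scheme = "https" if parts.scheme == "wss" else "http"
    # Keep the host, replace the path, and drop any query string or fragment.
    return urlunsplit((scheme, parts.netloc, "/dsxinsights", "", ""))


print(ui_url_from_websocket_url("ws://myApp.mybluemix.net/ws/orchestrate"))
```

For the example app name used earlier, this prints http://myApp.mybluemix.net/dsxinsights.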

Troubleshooting

See DEBUGGING.md.

License

Apache 2.0


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.