Name: node-red-dsx-workflow
Owner: International Business Machines
Description: This journey helps to build a complete end-to-end analytics solution using IBM Watson Studio. This repository contains instructions to create a custom web interface to trigger the execution of Python code in Jupyter Notebook and visualise the response from Jupyter Notebook on IBM Watson Studio.
Created: 2017-08-01 14:38:47.0
Updated: 2018-03-22 01:54:03.0
Pushed: 2018-03-22 01:54:05.0
Homepage: https://developer.ibm.com/code/patterns/orchestrate-data-science-workflows-using-node-red/
Size: 2145
Language: Jupyter Notebook
Data Science Experience is now Watson Studio. Although some images in this code pattern may show the service as Data Science Experience, the steps and processes will still work.
IBM Watson Studio can be used to analyze data using Jupyter notebooks. However, Watson Studio exposes no mechanism to trigger execution of the notebook cells from outside. With this capability added, we can build a complete end-to-end analytics solution using IBM Watson Studio.
The below two requirements are addressed by this journey to help build a complete analytics solution with IBM Watson Studio.
We will use Node-RED to invoke the analytics workflows in Jupyter notebooks on IBM Watson Studio and also to render a custom web user-interface with minimal programming.
Node-RED is a tool for wiring together APIs and online services on IBM Cloud. The APIs and online services are configured as nodes that can be wired to orchestrate a workflow. It is also a web server where the UI solution can be deployed. It has nodes that support integration with many database services, Watson services and analytics services.
Node-RED reduces a lot of development effort. It is easy to improve the solution using other services with Node-RED. It opens a world of possibilities for developers.
When the reader has completed this journey, they will understand how to:
The intended audience for this journey is developers who want to develop a complete analytics solution on Watson Studio with a custom web user interface.
Node-RED: Node-RED is a programming tool for wiring together APIs and online services.
IBM Watson Studio: Analyze data using RStudio, Jupyter, and Python in a configured, collaborative environment that includes IBM value-adds, such as managed Spark.
IBM Cloud Object Storage: An IBM Cloud service that provides an unstructured cloud data store to build and deliver cost effective apps and services with high reliability and fast speed to market.
Jupyter Notebooks: An open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text.
Follow these steps to set up and run this developer journey. The steps are described in detail below.
Sign up for IBM's Watson Studio. When you sign up for Watson Studio, two services, Spark and ObjectStore, are created in your Bluemix account.
Create the Node-RED Starter application.
Choose an appropriate name for the Node-RED application (App name:).
Click on Create.
On the newly created Node-RED application page, click on Visit App URL to launch the Node-RED editor once the application is in the Running state.
On the Welcome to your new Node-RED instance on IBM Cloud screen, click on Next.
On the Secure your Node-RED editor screen, enter a username and password to secure the Node-RED editor and click on Next.
On the Browse available IBM Cloud nodes screen, click on Next.
On the Finish the install screen, click on Finish.
Click on Go to your Node-RED flow editor.
Navigate to the orchestrate_dsx_workflow.json.
Open the file with a text editor and copy the contents to Clipboard.
On the Node-RED flow editor, click the Menu and select Import -> Clipboard and paste the contents.
Click the Deploy button.
The websocket URL is ws://<NODERED_BASE_URL>/ws/orchestrate, where NODERED_BASE_URL is the marked portion of the URL in the above image.
An example websocket URL for a Node-RED app with name myApp is ws://myApp.mybluemix.net/ws/orchestrate, where myApp.mybluemix.net is the NODERED_BASE_URL.
The NODERED_BASE_URL may have additional region information, e.g. eu-gb for the UK region. In this case NODERED_BASE_URL would be: myApp.eu-gb.mybluemix.net.
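The URL pattern described above can be sketched as a small Python helper. This is only an illustration: the function name `build_websocket_url` and its parameters are hypothetical, and the `mybluemix.net` domain is taken from the example above rather than from any API.

```python
def build_websocket_url(app_name, region=None, domain="mybluemix.net"):
    """Return the ws:// URL of the /ws/orchestrate endpoint for a Node-RED app.

    Hypothetical helper: it only assembles the string pattern described in
    the text and does not contact IBM Cloud.
    """
    host_parts = [app_name]
    if region:
        # Optional region segment, e.g. "eu-gb" for the UK region.
        host_parts.append(region)
    host_parts.append(domain)
    base_url = ".".join(host_parts)  # this is the NODERED_BASE_URL
    return "ws://{}/ws/orchestrate".format(base_url)


print(build_websocket_url("myApp"))
# ws://myApp.mybluemix.net/ws/orchestrate
print(build_websocket_url("myApp", region="eu-gb"))
# ws://myApp.eu-gb.mybluemix.net/ws/orchestrate
```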
Click on the node named HTML.
Click on the HTML area and search for ws: to locate the line where the websocket URL is specified.
Update the websocket URL with the base URL that was noted in Section 4:
var websocketURL = "ws://NODERED_BASE_URL/ws/orchestrate";
Click on Done and re-deploy the flow.
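If you prefer to prepare the edited line outside the Node-RED editor, the same substitution can be sketched in Python. This is a minimal sketch: the helper name `set_websocket_url` is hypothetical, and it assumes the HTML contains an assignment of the form shown above.

```python
import re

# Matches: var websocketURL = "ws://...";  (the form shown in the HTML node)
WEBSOCKET_LINE = re.compile(r'(var\s+websocketURL\s*=\s*")ws://[^"]*(";)')


def set_websocket_url(html, base_url):
    """Replace the websocket URL in the HTML text with one built from base_url.

    Hypothetical helper; raises ValueError if no websocketURL assignment
    is found, so a silent no-op cannot go unnoticed.
    """
    new_url = "ws://{}/ws/orchestrate".format(base_url)
    updated, count = WEBSOCKET_LINE.subn(
        lambda m: m.group(1) + new_url + m.group(2), html
    )
    if count == 0:
        raise ValueError("no websocketURL assignment found")
    return updated


html = 'var websocketURL = "ws://NODERED_BASE_URL/ws/orchestrate";'
print(set_websocket_url(html, "myApp.mybluemix.net"))
# var websocketURL = "ws://myApp.mybluemix.net/ws/orchestrate";
```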
Click Create notebook to create a notebook.
In the Assets tab, select the Create notebook option.
Select the From URL tab.
Click the Create button.
Download summer.csv and dictionary.csv from: https://www.kaggle.com/the-guardian/olympic-games.
Rename summer.csv to olympics.csv.
Click Find and Add Data (look for the 10/01 icon) and its Files tab.
Click browse and navigate to where you downloaded olympics.csv and dictionary.csv on your computer.
Go to the 2.1 Add your service credentials for Object Storage section in the notebook to update the credentials for Object Store.
Use Find and Add Data (look for the 10/01 icon) and its Files tab. You should see the file names uploaded earlier. Make sure your active cell is the empty one created earlier.
Select Insert to code below olympics.csv.
Click Insert Credentials from the drop-down menu.
If the credentials are named credential_2, change them to credentials_1.
In the cell below 6. Expose integration point with a websocket client, update the websocket URL noted in section 4 in the start_websocket_listener function.
When a notebook is executed, each code cell in the notebook is executed, in order, from top to bottom.
Each code cell is selectable and is preceded by a tag in the left margin. The tag format is In [x]:. Depending on the state of the notebook, the x can be:
blank, which indicates that the cell has never been executed.
a number, which represents the relative order in which this code step was executed.
*, which indicates that the cell is currently executing.
There are several ways to execute the code cells in your notebook:
Press the Play button in the toolbar.
From the Cell menu bar, there are several options available. For example, you can Run All cells in your notebook, or you can Run All Below, which will start executing from the first cell under the currently selected cell, and then continue executing all cells that follow.
Use the Schedule button located in the top right section of your notebook panel. Here you can schedule your notebook to be executed once at some future time, or repeatedly at your specified interval.
For this Notebook, you can simply Run All cells.
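To make the In [x]: tags and the top-to-bottom order of Run All more concrete, here is a toy simulation in plain Python. It is not how Jupyter or Watson Studio is implemented; the `Cell` and `Notebook` classes are hypothetical and only mimic the numbering behaviour described above.

```python
class Cell:
    def __init__(self, source):
        self.source = source
        self.execution_count = None  # None: the cell has never been executed
        self.running = False

    def tag(self):
        if self.running:
            label = "*"  # currently executing
        elif self.execution_count is None:
            label = " "  # blank: never executed
        else:
            label = str(self.execution_count)
        return "In [{}]:".format(label)


class Notebook:
    def __init__(self, sources):
        self.cells = [Cell(source) for source in sources]
        self.counter = 0      # shared counter that gives the relative order
        self.namespace = {}   # variables shared by all cells

    def run_cell(self, index):
        cell = self.cells[index]
        cell.running = True
        try:
            exec(cell.source, self.namespace)
        finally:
            cell.running = False
        self.counter += 1
        cell.execution_count = self.counter

    def run_all(self):
        # Run All: execute every cell, in order, from top to bottom.
        for index in range(len(self.cells)):
            self.run_cell(index)


nb = Notebook(["x = 1", "y = x + 1", "z = y * 2"])
print([cell.tag() for cell in nb.cells])  # all blank before any run
nb.run_all()
print([cell.tag() for cell in nb.cells])  # numbered 1, 2, 3
nb.run_cell(0)                            # re-running a cell gives it a new number
print(nb.cells[0].tag())
```

Because the counter is shared, re-running the first cell after Run All gives it the number 4, which is why the numbers in a real notebook reflect execution order and not cell position.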
The websocket client will be started when you run the cell under 7. Start websocket client. This will start the communication between the UI and the Notebook.
The UI can be accessed at the URL: http://<NODERED_BASE_URL>/dsxinsights.
The <NODERED_BASE_URL> is the base URL noted in the Note the websocket URL section.
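For readers curious what a websocket client of this kind might look like, the following is a rough sketch, not the notebook's actual code. It assumes the third-party `websocket-client` package; the JSON message format and the `handle_message` helper are hypothetical, and only the function name `start_websocket_listener` comes from the notebook.

```python
import json


def handle_message(raw_message):
    """Build a JSON reply for one message from the UI.

    Hypothetical format: the request is echoed back. A real notebook would
    run its analytics here and return the results instead.
    """
    try:
        request = json.loads(raw_message)
    except ValueError:
        return json.dumps({"status": "error", "detail": "message is not valid JSON"})
    return json.dumps({"status": "ok", "request": request})


def start_websocket_listener(websocket_url):
    """Connect to the Node-RED websocket endpoint and answer incoming messages."""
    # Imported here so handle_message stays usable without the package installed.
    import websocket  # provided by the "websocket-client" package (assumed)

    def on_message(ws, message):
        ws.send(handle_message(message))

    def on_error(ws, error):
        print("websocket error:", error)

    ws = websocket.WebSocketApp(
        websocket_url, on_message=on_message, on_error=on_error
    )
    ws.run_forever()  # blocks until the connection is closed


# Example call (not executed here, since it needs a running Node-RED app):
# start_websocket_listener("ws://NODERED_BASE_URL/ws/orchestrate")
```

Keeping the message handling in a separate function means it can be checked without a network connection, while the listener only wires it to the socket.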