Name: procurement-analysis-with-wks
Owner: International Business Machines
Description: ***WORK IN PROGRESS***
Created: 2018-01-31 05:35:42.0
Updated: 2018-05-24 13:30:27.0
Pushed: 2018-05-24 13:30:26.0
Size: 5607
Language: JavaScript
In this code pattern we will create a complete end-to-end solution for a procurement use case. Currently, customers perform analysis of various market reports on their own or hire experts to make procurement decisions. These experts analyze reports captured from data sources, a process that can be time-consuming and prone to human error, which could potentially cause a chain of issues impacting production.
By using our intelligent procurement system, based on Watson Discovery, a customer can receive expert analysis more quickly and accurately. The customer must first train the model with various use cases (via reports) to receive accurate results. The target end user of this system is a person working in a procurement role at a company.
As a developer going through this code pattern, you will learn how to:
As an end user, you will be able to:
To understand the significance of Watson Knowledge Studio (WKS) in this example, we will compare the output extracted from Watson Discovery when used with and without WKS.
Watson Discovery output without WKS:
```json
...
  "text": "Asahi Kasei Corp",
  "relevance": 0.227493,
  "type": "Company"
...
  "text": "Kawasaki",
  "relevance": 0.274707,
  "type": "Company"
...
```
Watson Discovery output with WKS:
```json
...
  "id": "-E114",
  "text": "Asahi Kasei Corp",
  "type": "Supplier"
...
  "id": "-E119",
  "text": "Kawasaki",
  "type": "Facility"
...
```
Looking at the output of Discovery without WKS, we can see that both Asahi Kasei and Kawasaki are identified as a Company. This is expected, as Discovery without WKS only performs basic Natural Language Understanding (NLU) processing; it cannot understand language specific to the procurement domain. If we use Watson Discovery with WKS, however, Asahi Kasei is identified as a Supplier, whereas Kawasaki is identified as a Facility.
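The practical payoff of the richer types can be seen in application code. The following is a minimal sketch (not code from this repo) showing how downstream logic can branch on the domain-specific entity types; the sample entities mirror the output shown above, using the `text` and `type` fields from Discovery's enriched output.

```javascript
// Group Discovery entities by their "type" field so the app can treat
// Suppliers and Facilities differently.
function groupEntitiesByType(entities) {
  const groups = {};
  for (const entity of entities) {
    (groups[entity.type] = groups[entity.type] || []).push(entity.text);
  }
  return groups;
}

// Sample entities mirroring the output shown above.
const withoutWks = [
  { text: 'Asahi Kasei Corp', type: 'Company' },
  { text: 'Kawasaki', type: 'Company' }
];
const withWks = [
  { text: 'Asahi Kasei Corp', type: 'Supplier' },
  { text: 'Kawasaki', type: 'Facility' }
];

console.log(groupEntitiesByType(withoutWks));
// { Company: [ 'Asahi Kasei Corp', 'Kawasaki' ] }
console.log(groupEntitiesByType(withWks));
// { Supplier: [ 'Asahi Kasei Corp' ], Facility: [ 'Kawasaki' ] }
```

Without the WKS model, everything collapses into the generic Company bucket; with it, procurement-specific distinctions are available to the application.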
The steps followed to create the solution are as follows. For commands, please refer to the Running the application on IBM Cloud section below.
```
git clone https://github.com/IBM/procurement-analysis-with-wks
```
Create the following services:
Launch the WKS tool and create a new workspace.
A type system allows us to define entities that are specific to our procurement documents. The type system controls how content can be annotated by defining the types of entities that can be labeled and how relationships among different entities can be labeled.
To upload our pre-defined type system, from the Access & Tools -> Entity Types panel, press the Upload button to import the Type System file data/wks-resources/types-36a431a0-f6a0-11e7-8256-672fd3d48302.json found in the local repository.
This will upload a set of Entity Types and Relation Types.
Corpus documents are required to train our machine-learning annotator component. For this Code Pattern, the corpus documents will contain example procurement documents.
From the Access & Tools -> Documents panel, press the Upload Document Sets button to import a Document Set file. Use the corpus documents file data/wks-resources/corpus-36a431a0-f6a0-11e7-8256-672fd3d48302.zip found in the local repository.
NOTE: Uploading the corpus documents provided in this Code Pattern is not required, but it is recommended to simplify the annotation process (all provided documents come pre-annotated). An alternative approach is to upload standard text files and perform the annotations manually.
NOTE: Select the option to “upload corpus documents and include ground truth (upload the original workspace's type system first)“.
Once the corpus documents are loaded, we can start the human annotation process. This begins by dividing the corpus into multiple document sets and assigning the document sets to human annotators (for this Code Pattern, we will just be using one document set and one annotator).
From the Access & Tools -> Documents panel, press the Create Annotation Sets button. Select a valid Annotator user, and provide a unique name for Set name.
Add a task for human annotation by creating a task and assigning it annotation sets.
From the Access & Tools -> Documents panel, select the Task tab and press the Add Task button.
Enter a unique Task name and press the Create button.
A panel will then be displayed of the available annotation sets that can be assigned to this task. Select the Annotation Set you created in the previous step, and press the Create Task button.
Click on the task card to view the task details panel.
Click the Annotate button to start the Human Annotation task.
If you select any of the documents in the list, the Document Annotation panel will be displayed. Since we previously imported the corpus documents, the entity and relationship annotations are already completed (as shown in the following examples). You can annotate mentions (occurrences of words/phrases which can be annotated as an entity) to play around, or you can modify one by annotating mentions with a different entity.
From the Task details panel, press the Submit All Documents button.
All documents should change status to Completed.
Press the blue “File” icon to toggle back to the Task panel, which will show the completion percentage for each task.
From the Access & Tools -> Documents panel, select the Task tab and select the task to view the details panel.
Select your Annotation Set Name and then press the Accept button. This step is required to ensure that the annotation set is considered ground truth.
NOTE: The objective of the annotation project is to obtain ground truth, the collection of vetted data that is used to adapt WKS to a particular domain.
Status should now be set to COMPLETED.
Go to the Model Management -> Performance panel, and press the Train and evaluate button.
From the Document Set name list, select the Annotation Set Name you created previously and press the Train & Evaluate button.
This process may take several minutes to complete. Progress will be shown in the upper right corner of the panel.
Note: In practice, you would create separate annotation sets (each containing thousands of messages) for training and evaluation.
Once complete, you will see the results of the train and evaluate process.
Now we can deploy our new model to the already created Discovery service. Navigate to the Version menu on the left and press Take Snapshot.
The snapshot version will now be available for deployment to Discovery.
To start the process, click the Deploy button associated with your snapshot version.
Select the option to deploy to Discovery.
Then enter your IBM Cloud account information to locate your Discovery service to deploy to.
Once deployed, a Model ID will be created. Keep note of this value as it will be required later in this Code Pattern.
NOTE: You can also view this Model ID by pressing the WDS button listed with your snapshot version.
Launch the Watson Discovery tool. Create a new data collection and give the data collection a unique name.
From the new collection data panel, under Configuration, click the Switch button to switch to a new configuration file, then click the Create a new configuration option.
Enter a unique name and press Create.
From the Configuration panel, press the Add enrichments option. Ensure that the following extraction options are added: Keyword, Entity, and Relation.
Also, assign your Model ID to both the Entity Extraction and the Relation Extraction.
Note: These Model ID assignments are required to ensure your data is properly enriched.
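These UI settings correspond to fields in the collection's configuration JSON. As a hedged sketch (field names follow the Watson Discovery v1 configuration schema as I understand it, and `<model_id>` stands in for the WKS Model ID noted earlier; the actual generated configuration may differ), the enrichments section would look roughly like:

```json
{
  "enrichments": [
    {
      "source_field": "text",
      "destination_field": "enriched_text",
      "enrichment": "natural_language_understanding",
      "options": {
        "features": {
          "keywords": {},
          "entities": { "model": "<model_id>" },
          "relations": { "model": "<model_id>" }
        }
      }
    }
  ]
}
```

The `model` values are what tie the collection's entity and relation extraction to the deployed WKS model.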
Close the Add enrichments panel by pressing Done.
Save the configuration by pressing Apply & Save, and then Close.
Once the configuration is created, you can proceed with loading discovery files.
From the new collection data panel, under Add data to this collection, use Drag and drop your documents here or browse from computer to seed the content with the procurement document files extracted from data/disco-docs/.
Copy the env.sample file to .env:

```
cp env.sample .env
```

Edit the .env file with the necessary settings.

env.sample:

```
# Replace the credentials here with your own.
# Rename this file to .env before starting the app.

# JanusGraph DB
GRAPH_DB_USERNAME=admin
GRAPH_DB_PASSWORD=<add_janusgraph_password>
GRAPH_DB_API_URL=<add_janusgraph_api_url>

# Watson Discovery
DISCOVERY_USERNAME=<add_discovery_username>
DISCOVERY_PASSWORD=<add_discovery_password>
DISCOVERY_ENVIRONMENT_ID=<add_discovery_environment_id>
DISCOVERY_CONFIGURATION_ID=<add_discovery_configuration_id>
DISCOVERY_COLLECTION_ID=<add_discovery_collection_id>
```
The settings can be found by navigating to the specific service instance from within the IBM Cloud dashboard.
For the JanusGraph entries, navigate to the Service Credentials panel for your JanusGraph service instance. The values can be found in the gremlin_console_yaml section of the generated credentials. For example:
```json
"gremlin_console_yaml": [
  "hosts: [portal-ssl204-25.bmix-dal-yp-299e7bd4.test1-ibm-com.composedb.com]\nport: 41590\nusername: admin\npassword: MASHDUVREXMCSZLR\nconnectionPool: { enableSsl: true }\nserializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}"
]
```
In this case, you would set your values to:
```
GRAPH_DB_API_URL=https://portal-ssl204-25.bmix-dal-yp-299e7bd4.test1-ibm-com.composedb.com:41590
GRAPH_DB_PASSWORD=MASHDUVREXMCSZLR
```
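The mapping from the credentials blob to the settings is mechanical, so it can be expressed in code. The helper below is written for this example only (it is not code from the repo, and the `GRAPH_DB_*` names assume the .env naming used in this pattern); it pulls the host, port, username, and password out of the gremlin_console_yaml string with simple regular expressions.

```javascript
// Derive the GRAPH_DB_* settings from the gremlin_console_yaml string
// found in the JanusGraph service credentials.
function graphDbSettingsFromYaml(yamlString) {
  const host = yamlString.match(/hosts:\s*\[([^\]]+)\]/)[1].trim();
  const port = yamlString.match(/port:\s*(\d+)/)[1];
  const username = yamlString.match(/username:\s*(\S+)/)[1];
  const password = yamlString.match(/password:\s*(\S+)/)[1];
  return {
    GRAPH_DB_USERNAME: username,
    GRAPH_DB_PASSWORD: password,
    GRAPH_DB_API_URL: `https://${host}:${port}`
  };
}

// The example credentials from above.
const yaml =
  'hosts: [portal-ssl204-25.bmix-dal-yp-299e7bd4.test1-ibm-com.composedb.com]\n' +
  'port: 41590\nusername: admin\npassword: MASHDUVREXMCSZLR\n' +
  'connectionPool: { enableSsl: true }';

console.log(graphDbSettingsFromYaml(yaml));
```

Note that the API URL is simply `https://` plus the host and port from the credentials.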
Install the dependencies and start the app by running `npm install`, followed by `npm start`.
Access the UI by pointing your browser at the address displayed by the `npm start` command. For example, http://localhost:6003.
To deploy to the IBM Cloud, make sure you have the IBM Cloud CLI tool installed. Then run the following commands to log in using your IBM Cloud credentials.
.To deploy to the IBM Cloud, make sure you have the IBM Cloud CLI tool installed. Then run the following commands to login using your IBM Cloud credentials.
```
cd procurement-analysis-with-wks
ibmcloud login
```
When pushing your app to the IBM Cloud, values are read in from the manifest.yml file. Edit this file if you need to change any of the default settings, such as application name or the amount of memory to allocate.
```yaml
applications:
- name: procurement-analysis-with-wks
  memory: 256M
  instances: 1
  path: .
  buildpack: sdk-for-nodejs
  random-route: false
```
Additionally, your environment variables must be set in your .env file as described previously in Step 11. Configure credentials.
To deploy your application, run the following command.
```
ibmcloud app push
```
NOTE: The URL route assigned to your application will be displayed as a result of this command. Note this value, as it will be required to access your app.
To view the application, go to the IBM Cloud route assigned to your app. Typically, this will take the form https://&lt;app name&gt;.mybluemix.net.
To view logs, or get overview information about your app, use the IBM Cloud dashboard.