Name: pixiedust-facebook-analysis
Owner: International Business Machines
Description: A Jupyter notebook that uses the Watson Visual Recognition, Natural Language Understanding and Tone Analyzer services to enrich Facebook Analytics and uses PixieDust to explore and visualize the results in Watson Studio
Created: 2017-06-15 16:19:37.0
Updated: 2018-04-11 23:17:51.0
Pushed: 2018-03-22 01:04:33.0
Homepage: https://developer.ibm.com/code/patterns/discover-hidden-facebook-usage-insights/
Size: 2952
Language: HTML
Data Science Experience is now Watson Studio. Although some images in this code pattern may show the service as Data Science Experience, the steps and processes will still work.
In this Code Pattern, we will use a Jupyter notebook to glean insights from a vast body of unstructured data. Credit goes to Anna Quincy and Tyler Andersen for providing the initial notebook design.
We'll start with data exported from Facebook Analytics. We'll enrich the data with Watson's Natural Language Understanding (NLU), Tone Analyzer and Visual Recognition.
We'll use the enriched data to answer questions like:
What sentiment is most prevalent in the posts with the highest engagement performance?
What are the relationships between social tone of article text, the main article entity, and engagement performance?
These types of insights are especially beneficial for marketing analysts who are interested in understanding and improving brand perception, product performance, customer satisfaction, and ways to engage their audiences.
It is important to note that this Code Pattern is meant to be used as a guided experiment, rather than an application with one set output. The standard Facebook Analytics export features text from posts, articles, and thumbnails, along with standard Facebook performance metrics such as likes, shares, and impressions. This unstructured content was then enriched with Watson APIs to extract keywords, entities, sentiment, and tone.
After the data is enriched with Watson APIs, there are several ways to analyze it. Watson Studio provides a robust, yet flexible method of exploring the unstructured, enriched Facebook content.
This Code Pattern provides mock Facebook data, a notebook, and several pre-built visualizations to jump-start your search for hidden insights.
When the reader has completed this Code Pattern, they will understand how to:
Follow these steps to set up and run this Code Pattern. The steps are described in detail below.
Sign up for IBM's Watson Studio. Creating a project in Watson Studio also creates a free-tier Object Storage service in your IBM Cloud account. Take note of your service names, as you will need to select them in the following steps.
Note: When creating your Object Storage service, select the `Free` storage type to avoid having to pay an upgrade fee.
Click `Create notebook` to create a notebook.
In the `Assets` tab, select the `Create notebook` option.
Select the `From URL` tab.
Click the `Create` button.
Create the following IBM Cloud services by clicking the `Deploy to IBM Cloud` button, or use these links to create the services in the IBM Cloud UI.
Find the notebook cell after `1.5. Add Service Credentials From IBM Cloud for Watson Services`.
Replace the five `<add_...>` placeholder values with information from the `Service Credentials` tab in IBM Cloud. Use your IBM Cloud dashboard to find each of the services and click on the `Service Credentials` tab. In some cases, you might need to create credentials with the `New Credential` option.
Note: This cell is marked as a `hidden_cell` because it will contain sensitive credentials.
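Before running the rest of the notebook, it can help to confirm that no `<add_...>` placeholder was left behind. The following is a minimal sketch of such a check, not part of the notebook itself. The dictionary keys and values are hypothetical; the notebook's actual variable names may differ.

```python
def unfilled_placeholders(credentials):
    """Return the keys whose values still look like '<add_...>' placeholders."""
    return sorted(
        key
        for key, value in credentials.items()
        if isinstance(value, str)
        and value.startswith("<add_")
        and value.endswith(">")
    )


# Hypothetical example values; real ones come from the Service Credentials tab.
example_credentials = {
    "nlu_username": "<add_nlu_username>",
    "nlu_password": "s3cret",
    "tone_username": "<add_tone_username>",
}

missing = unfilled_placeholders(example_credentials)
if missing:
    print("Still to fill in:", ", ".join(missing))
```

Running a check like this in a scratch cell avoids authentication errors that only surface later, when the first Watson call is made.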
Use `Find and Add Data` (look for the `10/01` icon) and its `Files` tab. From there you can click `browse` and add a `.csv` file from your computer.
Note: If you don't have your own data, you can use our example by cloning this git repo. Look in the `data/example_input` directory.
Find the notebook cell after `2.1 Load data from Object Storage`. Place your cursor after `# Insert pandas DataFrame`. Make sure this cell is selected before inserting code.
Using the file that you added above (under the `10/01` Files tab), use the `Insert to code` drop-down menu.
Select `Insert Pandas DataFrame` from the drop-down menu.
Note: This cell is marked as a `hidden_cell` because it contains sensitive credentials.
Note: There is an issue that causes files with non-UTF-8 encodings to fail to load, which requires a workaround. Fix this in the cell above by adding an encoding parameter to `read_csv()`. For our `example_facebook_data.csv`:
`df_data_1 = pd.read_csv(body, encoding='latin-1')`
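To see why the encoding parameter matters, here is a small standard-library sketch (it deliberately avoids pandas so it runs anywhere). It builds a Latin-1 encoded CSV in memory, shows that decoding it as UTF-8 fails, and falls back to Latin-1. The sample row and the `read_rows` helper are made up for illustration.

```python
import csv
import io

# A tiny CSV encoded as Latin-1; the "é" becomes the single byte 0xE9,
# which is not valid UTF-8 in this position.
raw = "post,likes\nCafé opening,12\n".encode("latin-1")


def read_rows(data, encodings=("utf-8", "latin-1")):
    """Decode bytes with the first encoding that works and parse the CSV."""
    for encoding in encodings:
        try:
            text = data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
        return encoding, list(csv.DictReader(io.StringIO(text)))
    raise ValueError("no candidate encoding could decode the data")


used_encoding, rows = read_rows(raw)
print(used_encoding, rows)
```

Note that Latin-1 can decode any byte sequence, so it should come last in the list of candidates. If your own export uses a different encoding, pass that name to `read_csv()` instead.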
The inserted code includes a generated method with credentials and then calls the generated method to set a variable with a name like `df_data_1`. If you do additional inserts, the method can be re-used and the variable will change (e.g. `df_data_2`).
Later in the notebook, we set `df = df_data_1`. So you might need to fix the variable name `df_data_1` to match your inserted code or vice versa.
We want to write the enriched file to the same container that we used above. So now we'll use the same file drop-down to insert credentials. We'll use them later when we write out the enriched CSV file.
After the `df` setup, there is a cell to enter the file credentials.
Place your cursor after the `#insert credentials for file - Change to credentials_1` line. Make sure this cell is selected before inserting credentials.
Use the CSV file's drop-down menu again. This time select `Insert Credentials`.
Note: This cell is marked as a `hidden_cell` because it contains sensitive credentials.
The inserted code includes a dictionary with credentials assigned to a variable with a name like `credentials_1`. It may have a different name (e.g. `credentials_2`). Rename it or reassign it if needed. The notebook code assumes it will be `credentials_1`.
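If you have inserted code several times and lost track of the generated names, a plain reassignment such as `credentials_1 = credentials_2` is usually all that is needed. As an alternative, the sketch below picks the highest-numbered variable with a given prefix from a namespace. It assumes that the highest number is the most recent insert, and the `latest_numbered` helper is hypothetical, not something the notebook provides.

```python
import re


def latest_numbered(namespace, prefix):
    """Return the value of the highest-numbered '<prefix><N>' name."""
    pattern = re.compile(r"^" + re.escape(prefix) + r"(\d+)$")
    numbered = []
    for name in namespace:
        match = pattern.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    if not numbered:
        raise NameError("no variable named like %s<N> was found" % prefix)
    _, name = max(numbered)
    return namespace[name]


# Stand-in for a notebook's globals() after two DataFrame inserts.
fake_globals = {
    "df_data_1": "first insert",
    "df_data_2": "second insert",
    "credentials_1": {"container": "demo"},
}

df = latest_numbered(fake_globals, "df_data_")
credentials_1 = latest_numbered(fake_globals, "credentials_")
```

In a real notebook you would pass `globals()` instead of `fake_globals`.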
When a notebook is executed, each code cell in the notebook is executed, in order, from top to bottom.
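The execution state of each cell is also recorded in the saved `.ipynb` file, which is plain JSON: every code cell has an `execution_count` field that is either a number or `null`. The sketch below rebuilds the `In [x]:` margin tags from a tiny hand-written notebook. The notebook content is made up for illustration.

```python
import json

# A minimal, hand-written notebook in nbformat 4 layout.
notebook_json = """
{
  "nbformat": 4,
  "nbformat_minor": 2,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
    {"cell_type": "code", "metadata": {}, "execution_count": 1,
     "outputs": [], "source": ["x = 1"]},
    {"cell_type": "code", "metadata": {}, "execution_count": null,
     "outputs": [], "source": ["y = x + 1"]}
  ]
}
"""


def margin_tags(notebook):
    """Build the 'In [x]:' tag for every code cell, in document order."""
    tags = []
    for cell in notebook["cells"]:
        if cell["cell_type"] != "code":
            continue  # markdown cells have no execution tag
        count = cell.get("execution_count")
        tags.append("In [%s]:" % (" " if count is None else count))
    return tags


tags = margin_tags(json.loads(notebook_json))
print(tags)
```

The `*` state never appears in a saved file, because it only exists while a cell is running in a live session.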
Each code cell is selectable and is preceded by a tag in the left margin. The tag format is `In [x]:`. Depending on the state of the notebook, the `x` can be:
A blank, which indicates that the cell has never been executed.
A number, which gives the order in which the cell was executed.
A `*`, which indicates that the cell is currently executing.
There are several ways to execute the code cells in your notebook:
Press the `Play` button in the toolbar to run the selected cell.
From the `Cell` menu bar, there are several options available. For example, you can `Run All` cells in your notebook, or you can `Run All Below`, which starts executing from the first cell under the currently selected cell and then continues executing all cells that follow.
Press the `Schedule` button located in the top right section of your notebook panel. Here you can schedule your notebook to be executed once at some future time, or repeatedly at your specified interval.
If you walk through the cells, you will see that we demonstrated how to do the following in Part I:
In Part II, we used pandas to create multiple DataFrames from our main enriched DataFrame. After slicing and dicing and cleaning, these new DataFrames are ready for PixieDust to use.
In Part III, we analyze the results by exploring and visualizing the metrics with PixieDust.
After all the prep work done earlier, you'll see that there is almost no code needed here (thanks to PixieDust). We just use one-liners like this:
`display(<data-frame>)`
You should also notice that we used `display(tones)` in two different cells, but the result was two different charts. How can that happen? Well, we used cell metadata to tell PixieDust how to display the data.
Notice the `Edit Metadata` button on each cell. If you don't see it, use the menu `View > Cell Toolbar > Edit Metadata` to make it visible. If you look at the metadata for the first two charts, you'll see how we got a bar chart and a pie chart.
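The same metadata can be read programmatically from the notebook file. The sketch below assumes that PixieDust stores its chart settings under a `pixiedust` key containing a `displayParams` dictionary with a `handlerId` entry. That layout and the sample values are assumptions for illustration, so check the `Edit Metadata` view of your own cells for the actual keys.

```python
# Hand-written stand-in for the "cells" section of a notebook file.
notebook = {
    "cells": [
        {
            "cell_type": "code",
            "source": ["display(tones)"],
            "metadata": {
                "pixiedust": {
                    "displayParams": {"handlerId": "barChart", "aggregation": "SUM"}
                }
            },
        },
        {
            "cell_type": "code",
            "source": ["display(tones)"],
            "metadata": {
                "pixiedust": {
                    "displayParams": {"handlerId": "pieChart", "aggregation": "SUM"}
                }
            },
        },
        {"cell_type": "code", "source": ["df.head()"], "metadata": {}},
    ]
}


def chart_settings(nb):
    """List (source, chart type) for cells that carry PixieDust settings."""
    settings = []
    for cell in nb["cells"]:
        params = cell.get("metadata", {}).get("pixiedust", {}).get("displayParams")
        if params:
            settings.append(("".join(cell["source"]), params.get("handlerId")))
    return settings


charts = chart_settings(notebook)
print(charts)
```

Two cells with identical source can therefore render differently, because the chart type lives in the cell metadata rather than in the code.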
PixieDust is interactive! This is where we explore to find out what the enriched data will tell us.
Use the `Options` button to change the chart settings. The first chart shows post consumption by the detected emotion in the article. Notice how changing the aggregation type from SUM to AVG gives you a very different conclusion. You can also change it to COUNT to see the frequency of each emotion, but when you do that the metric no longer matters.
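A small worked example shows how the aggregation type can flip the conclusion. The rows below are invented, and the column names `emotion` and `consumptions` are placeholders rather than the notebook's actual columns.

```python
from collections import defaultdict

# Invented sample: three modest "joy" posts and one strong "sadness" post.
posts = [
    {"emotion": "joy", "consumptions": 100},
    {"emotion": "joy", "consumptions": 120},
    {"emotion": "joy", "consumptions": 80},
    {"emotion": "sadness", "consumptions": 250},
]


def aggregate(rows, key, metric):
    """Group rows by key and compute SUM, AVG and COUNT of the metric."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[metric])
    return {
        name: {
            "SUM": sum(values),
            "AVG": sum(values) / len(values),
            "COUNT": len(values),
        }
        for name, values in groups.items()
    }


summary = aggregate(posts, "emotion", "consumptions")
for emotion, stats in summary.items():
    print(emotion, stats)
```

With SUM, joy leads (300 versus 250) simply because there are more joy posts. With AVG, sadness leads (250 versus 100 per post). COUNT ignores the metric entirely and only reports how many posts each emotion has.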
Explore by trying the following:
The right combination will give you insights into the impact of your Facebook posts. Once you uncover the insights, find the best presentation to convince others.
Under the `File` menu, there are several ways to save your notebook:
`Save` will simply save the current state of your notebook, without any version information.
`Save Version` will save the current state of your notebook with a version tag that contains a date and time stamp. Up to 10 versions of your notebook can be saved, each one retrievable by selecting the `Revert To Version` menu item.
The example output in `data/examples` has embedded JavaScript for PixieDust charts. View it via nbviewer.
Note: Some interactive functionality might not work in the saved example. Run the notebook for full functionality. To see the code and markdown cells without output, you can view notebooks/pixiedust_facebook_analysis.ipynb with the GitHub viewer.