A couple of weeks ago, I posted an introduction to Power BI for data mash up using D&B data. Generally speaking, before you buy anything, you want to test it out first. The same rule applies to data. You want to make sure that you get the right quality of data and do some testing on this. The easiest way for me to do this is by using Power Query. There are other ways to test it out of course, but in this article I will focus on using Power Query.
Case Study
I have a list of companies with City, State and Country, that may or may not be correct. The requirement is to get more company details, such as Number of Employees, Annual Sales, the year that the company started, the Legal Status and Phone Number. These are common information that competitors or marketers would like to investigate.
After a few minutes of research at Windows Azure Market Place, I have found a couple of interesting data services provided by D&B:
1. Company Cleanse & Match for Microsoft SQL Server Data Quality Services – I will refer this as Company Cleanse & Match for the rest of the article.
2. Company Firmographics – Industry and Sales – I will refer this as Company Firmographics for the rest of the article
Completed Excel Solution is available here: Power Query for DnB Data Mashup.xlsx. You will need Excel 2013 with Power Query installed – check full System Requirements for Power Query here.
Step by Step
1. Prepare test companies
Let’s first start with the data that we want to test out. I have a list of 7 companies in an Excel table, called “CompanyInput” as shown below.
2. Create a Power Query function for Company Cleanse & Match
The strategy here is to create a Power Query function that represent the call to Company Cleanse & Match, which can be called by another query.
To do so in the same Excel workbook:
a. Click on Power Query > From Other Sources > From OData Feed from the ribbon.
Note: At the time of testing this, the D&B Company Cleanse & Match service does not work with the Windows Azure MarketPlace option.
b. Enter the Service Root URL as provided on the “Details” tab of the D&B Company Cleanse & Match page, which is:
https://api.datamarket.azure.com/DNB/DQSCompanyMatch/v1/
Note: This step assumes that your Azure account has a valid subscription to the D&B Company Cleanse & Match service.
c. Clicking OK will display the Query Editor window with the following message:
d. Rename the “Query1” to “fnSuggestCompanyDetails”. The new name is now the friendly name that we will use in other Queries that we will create using Power Query.
Under the Navigator menu, you can see a “fx” symbol next to SuggestCompanyDetails, which is the function that is exposed by the D&B Company Cleanse & Match service. Clicking on the script symbol on the Formula Bar as shown below will bring up the Query Editor window.
So, all the steps that we have done in Step 2 essentially is to produce the script as shown below, which defines “fnSuggestCompanyDetails”.
Code:
let Source = OData.Feed("https://api.datamarket.azure.com/DNB/DQSCompanyMatch/v1/"), SuggestCompanyDetails = Source{[Name="SuggestCompanyDetails"]}[Data] in SuggestCompanyDetails
e. Clicking Done on Edit Query then on Query Editor, will create a new worksheet, Sheet2 containing the fnSuggestCompanyDetails definition.
3. Repeat Step 2 for Company Firmographics service. The OData Feed URL for this service is:
https://api.datamarket.azure.com/DNB/CompanyFirmographics/v1/
Don’t forget to rename the Query to fnGetFirmographic, so that you have something similar as below:
This step should create a new worksheet, called Sheet3 which contains the definition of the new function of fnGetFirmographic.
Code:
let Source = OData.Feed("https://api.datamarket.azure.com/DNB/CompanyFirmographics/v1/"), ReturnData = Source{[Name="ReturnData"]}[Data] in ReturnData
4. Go to Power Query and click on Fast Combine.
Note: This step is important as it allows. you to call other Queries (or functions that we have defined earlier) in another Query within the same workbook.
Click on Enable on the next dialog box: 5. Create a new Query based on the CompanyInput table in Excel. a. Place the cursor on CompanyInput table, and click on Power Query > From Table Clicking From Table will bring up the Query Editor dialog box as follows, which brings the data from the table in CompanyInput table.
b. Right click on Min Confidence column, then click on Insert Column > Custom… c. Enter the function as specified below. The order of the input parameter must match exactly as laid out by D&B Company Cleanse & Match in Step 2d.
d. Clicking OK on the dialog box will return Query Editor with a new column added called “Custom” with “Table” value. Right click on the Custom column header, and choose to Expand a handful of columns (including DunsNumber as you will need it as an input parameter for the fnGetFirmographic call later).
Clicking OK will cause Power Query to call D&B service to bring back the values for the asked columns, and after a few seconds, the Query Editor dialog box should display more Custom columns as shown below:
e. Scroll right across so you can see Custom.Confidence column, and insert a new Custom column to bring back more data from fnGetFirmogrpahic.
f. Similar to Step 5f, expand the Custom column to choose a few columns from the Company Firmographics output. g. Clicking OK from the drop down list above should return Query Editor with additional chosen columns as shown below. h. Clicking Done on Query Editor will create a new sheet that contains the final output of the data from both D&B Company Cleanse & Match as well as Company Firmograhics. Code:
let Source = Excel.CurrentWorkbook(){[Name="CompanyInput"]}[Content], InsertedCustom = Table.AddColumn(Source, "Custom", each fnSuggestCompanyDetails([Company Name],null,null,null,[State],[City],[Country],null,[Max Returned],[Min Confidence])), #"Expand Custom" = Table.ExpandTableColumn(InsertedCustom, "Custom", {"RecordId", "DunsNumber", "CompanyName", "AddressLine", "ZIP", "State", "City", "Country", "Confidence", "MatchGradeText", "MatchGradeComponentCount", "MatchGradeComponentTypeText", "MatchGradeComponentRating", "MatchGradeComponentScore"}, {"Custom.RecordId", "Custom.DunsNumber", "Custom.CompanyName", "Custom.AddressLine", "Custom.ZIP", "Custom.State", "Custom.City", "Custom.Country", "Custom.Confidence", "Custom.MatchGradeText", "Custom.MatchGradeComponentCount", "Custom.MatchGradeComponentTypeText", "Custom.MatchGradeComponentRating", "Custom.MatchGradeComponentScore"}), InsertedCustom1 = Table.AddColumn(#"Expand Custom", "Custom", each fnGetFirmographic([Custom.DunsNumber])), #"Expand Custom1" = Table.ExpandTableColumn(InsertedCustom1, "Custom", {"AnnualSalesUSDollars", "EmployeesHere", "EmployeesTotal", "LegalStatus", "CompanyStartYear", "SingleLocation", "OutOfBusinessInd", "LineOfBusiness", "IndustryCodeType1", "IndustryCode1", "IndustryCodeType2", "IndustryCode2", "FormerCompanyName", "TransactionFlag", "TransactionStatus", "TransactionStatusCode"}, {"Custom.AnnualSalesUSDollars", "Custom.EmployeesHere", "Custom.EmployeesTotal", "Custom.LegalStatus", "Custom.CompanyStartYear", "Custom.SingleLocation", "Custom.OutOfBusinessInd", "Custom.LineOfBusiness", "Custom.IndustryCodeType1", "Custom.IndustryCode1", "Custom.IndustryCodeType2", "Custom.IndustryCode2", "Custom.FormerCompanyName", "Custom.TransactionFlag", "Custom.TransactionStatus", "Custom.TransactionStatusCode"}) in #"Expand Custom1"
You can also rename columns. To do that, click on Filter & Shape under Query Settings side pane. For example, instead of Custom.DunsNumber, it would be more user friendly to call it DUNS Number.
Important Notes
Every time you make changes to the query, Power Query will make the calls to D&B Data Service. Since the D&B Services are based on number of transactions per month, it is advisable to make sure that you are using a trial subscription while developing.
The Step by Step section above is accurate as at 1 Sep 2013. There may be some changes to the API which may render the instruction above invalid.
Earlier I mentioned that try it before you buy it. Most companies that sell data, should be able to provide you with reasonable good documentation and would be able to supply you with the latest documentation in case their data service changes. As an example, the Company Cleanse & Match for Microsoft SQL Server Data Quality Services provides Reference Guide, which is helpful in understanding what data you are getting.
Wrap Up
Data has become a commodity where it is sold, massaged and reused in many different ways. One of the ways to purchase data is through Windows Azure Market Place. Power BI helps us easily to test data and to quickly prototype self-service analysis. Using Power Query, you can retrieve and perform data mash up, within Excel. The data sources can come from an Excel table and a number of other data sources such as D&B Data services on Windows Azure Market Place, as shown in the case study above.
No doubt when you deal with larger scale of data with more complexities, that it may be better to use SQL Server to do the job, as shown in this example.
Further Reading
Company Cleanse & Match for Microsoft SQL Server Data Quality Services – Windows Azure Market Place
Company Firmographics – Industry and Sales – Windows Azure Market Place
Power Query articles by Chris Webb (No doubt that if I have some questions about Power Query, I generally go to Chris Webb’s website first)
Power Query: Creating a Function to Correct and and Verify Addresses via External API – Rafael Salas (Thanks to Rafael for the great tutorial on how to reusing code in Power Query via functions)
Cleansing your Customer Data using Dun & Bradstreet DQS service – MSDN DQS Blog
Power BI for D&B Data Mashup – Introduction – Ms SQL Girl
No responses yet