Lyza

Data Profiling via Summary

Objective


To find every unique combination of values across multiple columns, or to analyze aggregate values such as MIN, LAST, and AVERAGE, use a Summary step.
 

How To


A sample workbook is available for download via the link at the bottom of this article. Use it to view and interact with the data shown in this article's examples and screenshots.

Drag a Summary step (circled in red) from the palette and drop it onto any source or intermediate dataset. Lyza connects the new step on the workbook canvas.

To find the distinct combinations of values across multiple columns (numeric, date, or text), drag one or more columns from the Input section to the Output section of the Summary step, and select GROUP BY for each of them.

The example at right shows there are only 11 unique combinations of these three columns across all the records of the Input dataset.
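For readers who think in code, the GROUP BY behavior described above can be approximated outside Lyza. The following is only a rough sketch in plain Python, not Lyza's implementation; the sample rows, the column names, and the helper `distinct_combinations` are hypothetical and exist purely for illustration.

```python
# Hypothetical sample rows; the column names are illustrative only
# and do not come from the Lyza sample workbook.
rows = [
    {"region": "East", "product": "A", "channel": "Web"},
    {"region": "East", "product": "A", "channel": "Web"},
    {"region": "West", "product": "B", "channel": "Store"},
    {"region": "East", "product": "A", "channel": "Store"},
    {"region": "West", "product": "B", "channel": "Store"},
]


def distinct_combinations(rows, group_by):
    """Return each unique combination of the group_by columns once,
    in the order it first appears (similar in spirit to selecting
    GROUP BY for every Output column)."""
    keys = (tuple(row[col] for col in group_by) for row in rows)
    # dict.fromkeys drops duplicates while preserving first-seen order.
    return list(dict.fromkeys(keys))


combos = distinct_combinations(rows, ["region", "product", "channel"])
for combo in combos:
    print(combo)
print("unique combinations:", len(combos))
```

Here five input rows collapse to three distinct combinations, just as the Summary step collapses the Input dataset to one row per combination.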



To analyze the frequency of each of these combinations, add a COUNT: drag another column (any one will do) from the Input to the Output and select COUNT in its drop-down.


In the example above, note that the user has sorted the distinct set in descending order of frequency of occurrence in the Input dataset. To sort, click the blue diamond in the column header, which changes it to an inverted green triangle. Click it again to switch to an ascending sort, which flips the triangle.
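The COUNT and sort steps can be sketched in code as well. This is an analogy under stated assumptions rather than a description of how Lyza works internally: the data, the column names, and the helper `combination_counts` are all hypothetical.

```python
from collections import Counter

# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"region": "East", "product": "A"},
    {"region": "East", "product": "A"},
    {"region": "West", "product": "B"},
    {"region": "East", "product": "B"},
    {"region": "East", "product": "A"},
    {"region": "West", "product": "B"},
]


def combination_counts(rows, group_by, descending=True):
    """Count how often each combination of the group_by columns occurs,
    then sort by that count (descending by default)."""
    counts = Counter(tuple(row[col] for col in group_by) for row in rows)
    return sorted(counts.items(), key=lambda item: item[1], reverse=descending)


by_frequency = combination_counts(rows, ["region", "product"])
for combo, count in by_frequency:
    print(combo, count)

# Equivalent of clicking the column header a second time: ascending order.
ascending = combination_counts(rows, ["region", "product"], descending=False)
```

Sorting by descending count puts the most common combinations first, which is usually the quickest way to spot dominant patterns and rare outliers when profiling data.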



You can drag the same Input column into the Output multiple times to compare different aggregations, such as MIN vs. MAX as in the example at right.

Summary also lets you select the FIRST, LAST, MEDIAN, STD DEV, AVERAGE, and SUM values for metric columns, again segmented by each combination of the GROUP BY values.
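As a rough illustration of applying several aggregations to the same metric column within each group, here is a plain Python sketch. It is not Lyza's implementation: the rows, the column names, and the helper `summarize` are hypothetical, and the use of the sample standard deviation is an assumption, since this article does not say which formula Lyza's STD DEV uses.

```python
import statistics
from collections import defaultdict

# Hypothetical sample rows; column names are illustrative only.
rows = [
    {"region": "East", "amount": 10.0},
    {"region": "East", "amount": 30.0},
    {"region": "West", "amount": 5.0},
    {"region": "East", "amount": 20.0},
]


def summarize(rows, group_by, metric):
    """Compute several aggregates of one metric column for each
    combination of the group_by columns."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in group_by)
        groups[key].append(row[metric])  # values stay in input order

    summary = {}
    for key, values in groups.items():
        summary[key] = {
            "FIRST": values[0],
            "LAST": values[-1],
            "MIN": min(values),
            "MAX": max(values),
            "SUM": sum(values),
            "AVERAGE": statistics.mean(values),
            "MEDIAN": statistics.median(values),
            # Assumption: sample standard deviation, which is undefined
            # for a single value, so None is reported in that case.
            "STD DEV": statistics.stdev(values) if len(values) > 1 else None,
        }
    return summary


result = summarize(rows, ["region"], "amount")
for key, aggregates in result.items():
    print(key, aggregates)
```

Note that FIRST and LAST depend on the order of the input rows, while the other aggregates do not.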

 

Downloadable Workbook with Sample Data


Follow these steps to import the workbook and sample data used for this tip into Lyza:
  1. Save the file (TIP_Data_Profiling_via_Summary.lwb) to your desktop
  2. Open Lyza
  3. From the menu, select File > Import Workbook and browse to the downloaded LWB file on your desktop
  4. Experiment with the tips in this article, using the sample data provided
  1. August 18, 2009