Cleaning data in Power BI: Transform (Edit) Data Lesson-5

Cleaning data in Power BI: Transform (Edit) Data Lesson-5

Cleaning Data in Power BI

Data cleaning is an essential step in data analysis, and Power BI offers powerful tools to transform your raw data into a clear and consistent format. Here is a comprehensive guide:

Problem identification:

  • Incomplete data: missing values, empty entries, or inconsistent formats.
  • Inaccurate data: typos, errors in data entry, or outdated information.
  • Conflicting data: Different units, formatting styles, or naming conventions.
  • Duplicate data: The same data points appear multiple times.

Common cleaning techniques:

  • Filtering: Remove irrelevant rows or columns based on specified criteria.
  • Substituting values: Replace missing values with suitable substitutes such as mean or median.
  • Formatting: Ensure consistent date, time, number, and text formats.
  • Column Splitting: Split a column into separate columns for clarity.
  • Merging Columns: Combine related columns for easier analysis.
  • Creating calculated columns: Derive new data points based on existing columns.
  • Duplicate Removal: Identify duplicate rows or entries and eliminate them.

Power BI Tools:

Power Query Editor: Offers a rich toolbox for cleaning and transforming data.
Conditional Formatting: Visually highlight potential problems such as missing values.
Data modeling tools: Create relationships between tables and manage data types.

Cleaning Points:

  1. Connect to your data source.
  2. Analyze data: Identify areas that need cleaning using Power BI visual or data profiling tools.
  3. Open the Power Query Editor: Transform data in Home Tab your data using the available options.
  4. Filter out irrelevant data.
  5. Replace missing values: Choose appropriate replacement or removal strategies.
  6. Format and Clean Data Types: Ensure consistency and compatibility.
  7. Split or merge columns: Refine the data structure for clarity.
  8. Create Calculated Columns: Gain new insights from existing data.
  9. Remove duplicates: Ensure data integrity.
  10. Review and Improve: Validate your cleaning process and make adjustments if needed.

Suggestions:

  • Document your cleaning steps for future reference.
  • Consider using data profiling tools for a comprehensive analysis.
  • Start with small datasets to practice and build your skills.
  • Take advantage of online resources and community forums for support and learning.
  • Remember: Data cleaning is an iterative process. Continually review your data and refine your cleaning techniques for optimal results.

 

Lab Work 

first of all connect required data file and then follow these steps:

With this option (Transform data) we can change:

  • Heading (fields)
  • Remove row
  • Remove column
  • Change data type
  • Keep Rows
  • Keep Columns
  • No change table data records and More...

 

1:- Start:

  • Home -> Transform data -> Transform data
  • Display this table view window (Which is editable mode)
  • Now you can modify your table

 

2:- Change or rename your field name (Heading)

  • Double click or Right Click on required field and change it.

 

3:- Undo

  • if you want to "undo" then you click on cross from "APPLIED STEPS" box
  • you can undo one by one

 

How to change datatypes (like string or numaric)

  • We can change the data type of table headings
  • click on right top corner icon(ABC123) of any headin and change type
  • i.e 123 means numaric type & ABC means string (Text type)

 

4:- Remove Rows

Remove Rows from table

How many rows to remove from the top (start)

  • Home -> Remove Rows -> Remove Top Rows
  •  
  • Type number you want to how many rows to delete from start (From Top) and press ok button

 

How many rows to remove from Bottom(end) to top 

  • Home -> Remove Rows -> Remove Bottom Rows

How many rows to remove from - to (OR) From Mid 

  • Home -> Remove Rows -> Remove Alternate

Remove Duplicate Rows

  • Remove all duplicate(Repeated) rows(Records) from table
    • Home -> Remove Rows -> Remove Duplicates

Remove Blank Rows

  • Remove all blank(Empty) rows from table
    • Home -> Remove Rows -> Remove Blank Rows

Remove Errors

  • Remove all errors from table
    • Home -> Remove Rows -> Remove Errors

 

5:- Keep Rows from table

Keep Top Rows

  • keep only the top N rows from table 
    • Home -> Keep Rows -> Keep Top Rows

 

Keep Bottom Rows

  • keep only the bottom N Rows fron this table
    • Home -> Keep Rows -> Keep Bottom Rows

Keep Range of Rows

  • Specify the number of rows to keep from start to end
    • Home -> Keep Rows -> Keep Range of Rows

Keep Duplicate

  • keep duplicate rows of selected column
    • Home -> Keep Rows -> Keep Duplicats

Keep Errors

  • not remove error from selected columns
    • Home -> Keep Rows -> Keep Errors

 

 


I hope this comprehensive guide empowers you to effectively clean your data in Power BI and unlock valuable insights!