timeseriesIQR
Helper I

Data Engineering Template

Does anyone use Excel/PowerQuery for Data Cleansing/Engineering? An Excel template I developed could be of interest.
 
There are only three buttons.
 
  1. The IQR Process button runs a process that flags a column of your data for unusually low or high statistical outliers using the IQR statistical rule.
  2. The K-means Clustering button lets you run a simple K-means Clustering process on the same original column of data.
  3. The Start Over button deletes Columns B and C created by the two buttons above.
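
For readers who want to see the idea behind button 1 outside Excel, here is a rough sketch of an IQR flagging rule in plain Python. It is not the template's actual code. The 1.5 multiplier, the inclusive quartile method, the function name `iqr_flags`, and the sample numbers are all assumptions made for illustration.

```python
from statistics import quantiles

def iqr_flags(values, k=1.5):
    """Label each value 'Low', 'High', or 'Normal' using IQR fences.

    k=1.5 is the conventional multiplier; it is an assumption here,
    not something taken from the template.
    """
    # Quartiles with the inclusive method (linear interpolation between data points).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return ["Low" if v < lower else "High" if v > upper else "Normal" for v in values]

# Hypothetical column of data with one very low and one very high value.
sample = [10, 12, 11, 13, 12, 95, 11, 10, -40, 12]
flags = iqr_flags(sample)
for value, flag in zip(sample, flags):
    print(value, flag)
```

In this sample only 95 and -40 fall outside the fences; everything else is labeled Normal.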
 
The pop-up messages and 3 pivot tables let data analysts compare the results from the two processes and provide a means to filter extremes in their data when creating Power BI and other dashboards so KPIs are not skewed in reporting.
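
As a rough illustration of what simple K-means clustering on a single column can look like, here is a minimal sketch of Lloyd's algorithm in plain Python. It is not the template's actual code. The choice of k=3, the deterministic starting centroids, the function name `kmeans_1d`, and the sample numbers are assumptions made for illustration.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Cluster one column of numbers with Lloyd's algorithm.

    Returns (labels, centroids). The starting centroids are k evenly
    spaced points from the sorted data, an assumption chosen so the
    result is repeatable.
    """
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids

# Hypothetical column of data with one very low and one very high value.
sample = [10, 12, 11, 13, 12, 95, 11, 10, -40, 12]
labels, centroids = kmeans_1d(sample, k=3)
print(labels)
print(centroids)
```

With this sample the two extreme values end up alone in their own clusters, which is the kind of grouping an analyst could compare against a fence-based outlier rule before filtering a dashboard.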
4 REPLIES
apturlov
Super User

@timeseriesIQR thank you for sharing your work with the community. While it's an interesting and potentially useful approach, using Excel with Microsoft Fabric for data cleansing would not be considered a recommended pattern. Fabric provides multiple native workloads for exactly this purpose, either as part of an ETL/ELT process or as standalone analytical tools. In no particular order, here is a recommended toolset for data cleansing in Microsoft Fabric:

  1. Dataflow Gen2 - a visual way of loading data from a data source, transforming it into a new shape, and storing it in native Fabric storage such as a Data Warehouse. Can be combined with other tools in data orchestration workflows.
  2. Spark Notebook - a code-first workload for the Spark engine that supports data manipulation in every phase of an ETL/ELT process. The preferred option in a Lakehouse-based architecture.
  3. SQL explorer - a T-SQL-oriented environment that supports all kinds of data manipulation in T-SQL. The preferred option in SQL-first data architectures.
  4. KQL Queryset - an environment built for working with time-series data stored in KQL databases. Can be used for DDL and DML operations.

If you are a beginner in Microsoft Fabric and are trying to find your way while migrating from convenient, familiar office-level data tools to the professional toolset of a unified, enterprise-grade data platform, I suggest you start with the specialized learning tracks on Microsoft Learn, such as this one: Course DP-600T00-A: Implement analytics solutions using Microsoft Fabric - Training | Microsoft Lear...

I don't have to. I am a programmer.

timeseriesIQR
Helper I

Here is my OneDrive link to the template described in my post above, if you want to download it, try it for free, and follow along: https://1drv.ms/x/c/e6e20e628ff89088/IQCMzu0JwWduSK6xF9D31_IqAZUZinFWHp7UtgbHVOUlcWk

v-saisrao-msft
Community Support

Hi @timeseriesIQR,

We appreciate you sharing your work with the community and taking the time to explain your approach. I'm sure other community members will find it helpful.

 

Thank you.
