Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more

Reply
Rob112358
Frequent Visitor

Power Query - Self Referencing Sheet Failure

Actual Problem I’m trying to solve:

  • Item numbers do not have unique customers.  Some customers were assigned inappropriately through the ERP system, duplicating the item number.  This duplication cannot be deleted from PLEX (our ERP).
  • Duplicates cannot be removed automatically.  There is no Primary / Secondary customer field.
  • All ERP data is in our lake house.
  • Ultimately, I need the actual Item – customer sheet to add to my Power BI Model.  This needs to have unique rows so it can be a dimension table.  I cannot have duplicate items.

 

Proposed Solution Using Self Referencing Sheet:

  1. Using Excel, query the lake house (Get Data – Power Platform – Lake House – Select Fields) pulling in the Items and customer names (duplicates included, as they cannot be separated automatically)
  2. A table (T1) is created automatically of the selected fields.
  3. Open Table 1 in Power Query A) add an Index Column and B) save query to a sheet.
  4. In the new sheet (Table 2) add a column where I differentiate the 300 or so duplicates as either Primary (keeping) or Secondary (Duplicates to be removed).
  5. Open Sheet 2 with the new column in Power Query
  6. While in Power Query, select Table 1 and merge (Left Outer) to Table 2 using the common Index column (from step 3).
  7. Expand the columns and select the “Primary-Secondary” field.
  8. Filter out the duplicates
  9. Looks beautiful, all duplicates are removed.  I can sort, move, the table is maintained.

Process Failure:

  • Upon Lakehouse refresh, the self-referenced sheet is not maintained and the duplicates return.
  • I am very open to other solutions!!
  • I have created working self-referencing sheets that are in "production".  I never had to add an index in the middle of the process to generate a unique column.


Software:

MS Azure – Lakehouse

MS Excel / Power Query

Windows 365 fully licensed / corporate.

Power BI Desktop (well, once we solve this it will be added to my data model)

 

Example Tables:

Rob112358_0-1706218655376.pngRob112358_1-1706218666473.png

Rob112358_2-1706218682527.png

Rob112358_3-1706218700182.png

 

 

 

 

2 ACCEPTED SOLUTIONS
Anonymous
Not applicable

Hi @Rob112358 

1.You can use add index group by the [Part_No] and [Main_part_key], the step you cna refer to the following link.

Numbering Grouped Data in Power Query • My Online Training Hub

2.Then filter the index column(created in fiest setp) to filter the index equal 1

vxinruzhumsft_0-1706249239062.png

Then when you refresh the datasouce , it will always will return the fist record without repeating.

 

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

View solution in original post

dufoq3
Super User
Super User

Try this:

dufoq3_0-1706384393451.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQwMDQ0UdJRCijKTylNLlHIS8xNVVA2BIqYG5hbGpsBGc75RUWpQCnn0uKS/NzUIoi0W2ZeZnFGaooCTKeuglNpTrZSrA5xpganJufnpSQWVcLNJcZQMwxDjUCGGhobmRphdaoRdlMDEpOzE9NTU4g2mVjnYhpsjmGwMcxgY6xONqaSyaQ7ORYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"GOLD PL_PartMain [Part_No]" = _t, #"GOLD PL_PartMain [Part_Name]" = _t, #"GOLD PL_PartMain [Main_Part_Key]" = _t, #"GOLD PL_CommonCustomerPart [Customer_Name]" = _t, #"GOLD PL_PartMain [Part_Type]" = _t]),
    #"Grouped Rows" = Table.Group(Source, {"GOLD PL_PartMain [Main_Part_Key]"}, {{"All", each Table.FirstN(_, 1)}}),
    // This step expands every column except 1st column of prev. step "Grouped Rows" which is "GOLD PL_PartMain [Main_Part_Key]"
    #"Expanded All Dynamic" = Table.ExpandTableColumn(#"Grouped Rows", "All", List.Select(Table.ColumnNames(Source), each _ <> Table.ColumnNames(#"Grouped Rows"){0}))
in
    #"Expanded All Dynamic"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

2 REPLIES 2
dufoq3
Super User
Super User

Try this:

dufoq3_0-1706384393451.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQwMDQ0UdJRCijKTylNLlHIS8xNVVA2BIqYG5hbGpsBGc75RUWpQCnn0uKS/NzUIoi0W2ZeZnFGaooCTKeuglNpTrZSrA5xpganJufnpSQWVcLNJcZQMwxDjUCGGhobmRphdaoRdlMDEpOzE9NTU4g2mVjnYhpsjmGwMcxgY6xONqaSyaQ7ORYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"GOLD PL_PartMain [Part_No]" = _t, #"GOLD PL_PartMain [Part_Name]" = _t, #"GOLD PL_PartMain [Main_Part_Key]" = _t, #"GOLD PL_CommonCustomerPart [Customer_Name]" = _t, #"GOLD PL_PartMain [Part_Type]" = _t]),
    #"Grouped Rows" = Table.Group(Source, {"GOLD PL_PartMain [Main_Part_Key]"}, {{"All", each Table.FirstN(_, 1)}}),
    // This step expands every column except 1st column of prev. step "Grouped Rows" which is "GOLD PL_PartMain [Main_Part_Key]"
    #"Expanded All Dynamic" = Table.ExpandTableColumn(#"Grouped Rows", "All", List.Select(Table.ColumnNames(Source), each _ <> Table.ColumnNames(#"Grouped Rows"){0}))
in
    #"Expanded All Dynamic"

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Anonymous
Not applicable

Hi @Rob112358 

1.You can use add index group by the [Part_No] and [Main_part_key], the step you cna refer to the following link.

Numbering Grouped Data in Power Query • My Online Training Hub

2.Then filter the index column(created in fiest setp) to filter the index equal 1

vxinruzhumsft_0-1706249239062.png

Then when you refresh the datasouce , it will always will return the fist record without repeating.

 

Best Regards!

Yolo Zhu

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

 

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors