Hi,
I have a data file coming in daily. I need to maintain an ongoing table and append each day's data to it. However, the daily file may contain rows that are already loaded, so I need to avoid duplicate records. I also need to add an integer unique ID that picks up from the last unique ID in the ongoing table.
Example:
Historical ongoing table:
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
Today's data I'm trying to append. Note that product Pants with Invoice 421 for $30 already exists in the historical table so it should not be added. All three columns must be checked against the historical data.
Product | Invoice | Cost |
Shirt | 235 | $20 |
Pants | 421 | $30 |
Dress | 848 | $50 |
Dress | 848 | ($20) |
Final result: 3 transactions added with unique ID 107-109
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
107 | Shirt | 235 | $20 |
108 | Dress | 848 | $50 |
109 | Dress | 848 | ($20) |
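The required behavior can be sketched in plain Python as a reference point (a minimal illustration only; the function name and the in-memory tables are hypothetical stand-ins for the real data source, and the $ amounts are written as plain numbers):

```python
def append_new_rows(history, today):
    """Append rows from today's file that are not already in history,
    with IDs continuing from the current maximum."""
    # Rows already loaded, compared on all three columns
    seen = {(r["Product"], r["Invoice"], r["Cost"]) for r in history}
    next_id = max(r["Unique ID"] for r in history) + 1
    for r in today:
        key = (r["Product"], r["Invoice"], r["Cost"])
        if key in seen:
            continue          # exact duplicate of an existing row: skip it
        seen.add(key)         # also guards against repeats inside today's file
        history.append({"Unique ID": next_id, **r})
        next_id += 1
    return history

history = [
    {"Unique ID": 100, "Product": "Pants", "Invoice": 753, "Cost": 35},
    {"Unique ID": 101, "Product": "Pants", "Invoice": 229, "Cost": 30},
    {"Unique ID": 102, "Product": "Pants", "Invoice": 421, "Cost": 30},
    {"Unique ID": 103, "Product": "Pants", "Invoice": 789, "Cost": 32},
    {"Unique ID": 104, "Product": "Shirt", "Invoice": 236, "Cost": 25},
    {"Unique ID": 105, "Product": "Shirt", "Invoice": 237, "Cost": 45},
    {"Unique ID": 106, "Product": "Dress", "Invoice": 358, "Cost": 80},
]
today = [
    {"Product": "Shirt", "Invoice": 235, "Cost": 20},
    {"Product": "Pants", "Invoice": 421, "Cost": 30},   # already loaded, skipped
    {"Product": "Dress", "Invoice": 848, "Cost": 50},
    {"Product": "Dress", "Invoice": 848, "Cost": -20},
]
result = append_new_rows(history, today)
# the three new rows receive IDs 107, 108 and 109
```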
Hi @az38, thanks for your attention to this issue.
Your answer is excellent!
I would like to share some additional solutions below.
Hello @buttercream, I am glad to help you.
I hope my test below will help.
I think you may need to modify your M code a bit to preprocess the data.
Here is my test data:
1. Record the first ID value of the original data, so the IDs can be regenerated later.
2. Delete the ID column of the original table (we will run an append query in a moment, and both tables need the same number of columns).
3. Merge the two tables with an append query.
4. Remove duplicate rows.
5. Add an index column.
6. Final results.
Here is my M code:
let
    // Load the historical table from the workbook
    Source = Excel.Workbook(File.Contents("C:\test_pbixfloders\test_10_11.xlsx"), null, true),
    Historical_Sheet = Source{[Item="Historical",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Historical_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Unique ID", Int64.Type}, {"Product", type text}, {"Invoice", Int64.Type}, {"Cost", Int64.Type}}),
    // Record the first ID so the regenerated index can start from it. Because the
    // historical IDs are contiguous, renumbering from the minimum gives the same
    // result as continuing from the last ID.
    #"minID" = List.Min(#"Changed Type"[Unique ID]),
    // Drop the ID column so both tables have the same columns before appending
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Unique ID"}),
    // "added" is the query holding today's data (Product, Invoice, Cost)
    #"Appended Query" = Table.Combine({#"Removed Columns", added}),
    // Remove rows that match an existing row on all three columns
    #"Removed Duplicates" = Table.Distinct(#"Appended Query"),
    #"Added Index" = Table.AddIndexColumn(#"Removed Duplicates", "Unique ID", #"minID", 1, Int64.Type),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Index",{"Unique ID", "Product", "Invoice", "Cost"})
in
    #"Reordered Columns"
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
107 | Shirt | 235 | $20 |
108 | Dress | 848 | $50 |
109 | Dress | 848 | ($20) |
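For anyone who wants to verify the same pipeline outside Power Query, the six steps map onto pandas like this (a sketch only, with the post's tables hard-coded and the $ amounts written as plain numbers):

```python
import pandas as pd

# Historical table and today's file, hard-coded from the post's example
historical = pd.DataFrame({
    "Unique ID": [100, 101, 102, 103, 104, 105, 106],
    "Product":   ["Pants", "Pants", "Pants", "Pants", "Shirt", "Shirt", "Dress"],
    "Invoice":   [753, 229, 421, 789, 236, 237, 358],
    "Cost":      [35, 30, 30, 32, 25, 45, 80],
})
today = pd.DataFrame({
    "Product": ["Shirt", "Pants", "Dress", "Dress"],
    "Invoice": [235, 421, 848, 848],
    "Cost":    [20, 30, 50, -20],
})

min_id = int(historical["Unique ID"].min())           # step 1: record the first ID
combined = pd.concat(                                 # steps 2-3: drop the ID, append
    [historical.drop(columns="Unique ID"), today], ignore_index=True)
deduped = combined.drop_duplicates().reset_index(drop=True)   # step 4: remove duplicates
deduped.insert(0, "Unique ID",                        # step 5: sequential IDs again
               range(min_id, min_id + len(deduped)))
print(deduped)
```

Because `drop_duplicates` keeps the first occurrence and `concat` preserves row order, the renumbered IDs come out identical to the M query's result.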
1. Using incremental refresh together with a scheduled refresh is really the better long-term approach. Power BI automatically fetches the latest data from the specified data source (such as an Excel file or a database), so data is loaded into the report daily, and the incremental refresh mechanism refreshes only a portion of the data (the new rows) on each refresh.
2. You can also achieve a similar effect by doing all the processing in Power Query:
Dynamically load the daily data files: write Power Query code that loads the daily files dynamically; for example, point at a folder path, load every file in the folder, and merge them.
Automatic duplicate removal and unique ID generation: write logic in Power Query that removes duplicate records and generates unique IDs automatically.
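The folder-based idea in point 2 can also be sketched outside Power Query; here is a minimal Python illustration (the folder path, CSV layout, and function names are assumptions for demonstration):

```python
import csv
from pathlib import Path

def load_and_merge(folder):
    """Read every CSV in `folder` and merge the rows, preserving file order."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        with open(path, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows

def drop_duplicates(rows, keys=("Product", "Invoice", "Cost")):
    """Keep only the first occurrence of each Product/Invoice/Cost combination."""
    seen, unique = set(), []
    for r in rows:
        k = tuple(r[c] for c in keys)
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique
```

Once the merged, deduplicated rows exist, sequential IDs can be reassigned exactly as in the M query above.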
I have given the specific Power Query processing steps above.
I hope my suggestions give you good ideas; if you have any more questions, please clarify in a follow-up reply.
Best Regards,
Carson Jian,
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly.