Hi,
I have a data file coming in daily. I need to maintain an ongoing table and append each day's data to it. However, the daily file may contain rows that are already loaded, so I need to avoid duplicate records. I also need to add an integer unique ID that picks up from the last unique ID in the ongoing table.
Example:
Historical ongoing table:
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
Today's data I'm trying to append. Note that product Pants with Invoice 421 for $30 already exists in the historical table so it should not be added. All three columns must be checked against the historical data.
Product | Invoice | Cost |
Shirt | 235 | $20 |
Pants | 421 | $30 |
Dress | 848 | $50 |
Dress | 848 | ($20) |
Final result: 3 transactions added with unique ID 107-109
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
107 | Shirt | 235 | $20 |
108 | Dress | 848 | $50 |
109 | Dress | 848 | ($20) |
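The required behavior can be sketched in plain Python as a reference point (a minimal illustration only; the function name and the in-memory tables are hypothetical stand-ins for the real data source, and the $ amounts are written as plain numbers):

```python
def append_new_rows(history, today):
    """Append rows from today's file that are not already in history,
    with IDs continuing from the current maximum."""
    # Rows already loaded, compared on all three columns
    seen = {(r["Product"], r["Invoice"], r["Cost"]) for r in history}
    next_id = max(r["Unique ID"] for r in history) + 1
    for r in today:
        key = (r["Product"], r["Invoice"], r["Cost"])
        if key in seen:
            continue          # exact duplicate of an existing row: skip it
        seen.add(key)         # also guards against repeats inside today's file
        history.append({"Unique ID": next_id, **r})
        next_id += 1
    return history

history = [
    {"Unique ID": 100, "Product": "Pants", "Invoice": 753, "Cost": 35},
    {"Unique ID": 101, "Product": "Pants", "Invoice": 229, "Cost": 30},
    {"Unique ID": 102, "Product": "Pants", "Invoice": 421, "Cost": 30},
    {"Unique ID": 103, "Product": "Pants", "Invoice": 789, "Cost": 32},
    {"Unique ID": 104, "Product": "Shirt", "Invoice": 236, "Cost": 25},
    {"Unique ID": 105, "Product": "Shirt", "Invoice": 237, "Cost": 45},
    {"Unique ID": 106, "Product": "Dress", "Invoice": 358, "Cost": 80},
]
today = [
    {"Product": "Shirt", "Invoice": 235, "Cost": 20},
    {"Product": "Pants", "Invoice": 421, "Cost": 30},   # already loaded, skipped
    {"Product": "Dress", "Invoice": 848, "Cost": 50},
    {"Product": "Dress", "Invoice": 848, "Cost": -20},
]
result = append_new_rows(history, today)
# the three new rows receive IDs 107, 108 and 109
```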
Hi @az38, thanks for your attention to this issue.
Your answer is excellent!
I would like to share some additional solutions below.
Hello @buttercream, I am glad to help you.
I hope my test below will help.
I think you may need to modify your M code a bit to preprocess the data.
Here is my test data:
1. Record the first ID value of the original data, so the IDs can be regenerated later.
2. Delete the ID column of the original table (we will run an append query in a moment, and both tables need the same number of columns).
3. Merge the two tables with an append query.
4. Remove duplicate rows.
5. Add an index column.
6. Final results.
Here is my M code:
let
    // Load the historical table from the workbook
    Source = Excel.Workbook(File.Contents("C:\test_pbixfloders\test_10_11.xlsx"), null, true),
    Historical_Sheet = Source{[Item="Historical",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(Historical_Sheet, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Unique ID", Int64.Type}, {"Product", type text}, {"Invoice", Int64.Type}, {"Cost", Int64.Type}}),
    // Record the first ID so the regenerated index can start from it. Because the
    // historical IDs are contiguous, renumbering from the minimum gives the same
    // result as continuing from the last ID.
    #"minID" = List.Min(#"Changed Type"[Unique ID]),
    // Drop the ID column so both tables have the same columns before appending
    #"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Unique ID"}),
    // "added" is the query holding today's data (Product, Invoice, Cost)
    #"Appended Query" = Table.Combine({#"Removed Columns", added}),
    // Remove rows that match an existing row on all three columns
    #"Removed Duplicates" = Table.Distinct(#"Appended Query"),
    #"Added Index" = Table.AddIndexColumn(#"Removed Duplicates", "Unique ID", #"minID", 1, Int64.Type),
    #"Reordered Columns" = Table.ReorderColumns(#"Added Index",{"Unique ID", "Product", "Invoice", "Cost"})
in
    #"Reordered Columns"
Unique ID | Product | Invoice | Cost |
100 | Pants | 753 | $35 |
101 | Pants | 229 | $30 |
102 | Pants | 421 | $30 |
103 | Pants | 789 | $32 |
104 | Shirt | 236 | $25 |
105 | Shirt | 237 | $45 |
106 | Dress | 358 | $80 |
107 | Shirt | 235 | $20 |
108 | Dress | 848 | $50 |
109 | Dress | 848 | ($20) |
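For anyone who wants to verify the same pipeline outside Power Query, the six steps map onto pandas like this (a sketch only, with the post's tables hard-coded and the $ amounts written as plain numbers):

```python
import pandas as pd

# Historical table and today's file, hard-coded from the post's example
historical = pd.DataFrame({
    "Unique ID": [100, 101, 102, 103, 104, 105, 106],
    "Product":   ["Pants", "Pants", "Pants", "Pants", "Shirt", "Shirt", "Dress"],
    "Invoice":   [753, 229, 421, 789, 236, 237, 358],
    "Cost":      [35, 30, 30, 32, 25, 45, 80],
})
today = pd.DataFrame({
    "Product": ["Shirt", "Pants", "Dress", "Dress"],
    "Invoice": [235, 421, 848, 848],
    "Cost":    [20, 30, 50, -20],
})

min_id = int(historical["Unique ID"].min())           # step 1: record the first ID
combined = pd.concat(                                 # steps 2-3: drop the ID, append
    [historical.drop(columns="Unique ID"), today], ignore_index=True)
deduped = combined.drop_duplicates().reset_index(drop=True)   # step 4: remove duplicates
deduped.insert(0, "Unique ID",                        # step 5: sequential IDs again
               range(min_id, min_id + len(deduped)))
print(deduped)
```

Because `drop_duplicates` keeps the first occurrence and `concat` preserves row order, the renumbered IDs come out identical to the M query's result.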
1. Using incremental refresh together with a scheduled refresh is really the better long-term approach. Power BI automatically fetches the latest data from the specified data source (such as an Excel file or a database), so data is loaded into the report daily, and the incremental refresh mechanism refreshes only a portion of the data (the new rows) on each refresh.
2. You can also achieve a similar effect by doing all the processing in Power Query:
Dynamically load the daily data files: write Power Query code that loads the daily files dynamically; for example, point at a folder path, load every file in the folder, and merge them.
Automatic duplicate removal and unique ID generation: write logic in Power Query that removes duplicate records and generates unique IDs automatically.
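The folder-based idea in point 2 can also be sketched outside Power Query; here is a minimal Python illustration (the folder path, CSV layout, and function names are assumptions for demonstration):

```python
import csv
from pathlib import Path

def load_and_merge(folder):
    """Read every CSV in `folder` and merge the rows, preserving file order."""
    rows = []
    for path in sorted(Path(folder).glob("*.csv")):
        with open(path, newline="") as f:
            rows.extend(csv.DictReader(f))
    return rows

def drop_duplicates(rows, keys=("Product", "Invoice", "Cost")):
    """Keep only the first occurrence of each Product/Invoice/Cost combination."""
    seen, unique = set(), []
    for r in rows:
        k = tuple(r[c] for c in keys)
        if k not in seen:
            seen.add(k)
            unique.append(r)
    return unique
```

Once the merged, deduplicated rows exist, sequential IDs can be reassigned exactly as in the M query above.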
I have given the specific Power Query processing steps above.
I hope my suggestions give you good ideas; if you have any more questions, please clarify in a follow-up reply.
Best Regards,
Carson Jian,
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly.