The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Every day we are refreshing data based on reporting output to SQL.
So every day the query is moving yesterday's data and replacing with today's data and it looks something like this -->
data_today
data_yesterday
The column headers are always indentical, but we have different values for Sales and the possibility for a different number of rows.
Example -->
Today -->
Category | Product | Sales |
Snacks | Cookies | 100 |
Snacks | Crackers | 75 |
Candy | Chocolate Bars | 200 |
Candy | Gum | 125 |
Category | Product | Sales |
Snacks | Cookies | 90 |
Snacks | Crackers | 75 |
Candy | Chocolate Bars | 200 |
So in today's report, I need to see that Cookies increased from 90 to 100 and also we have a new line item for Gum at 125.
Here are the options I have considered -
1) Use a unique id (snacks_cookies, snacks_crackers, etc) - create a relationship between the 2 tables and / or lookup the value from yesterday's table.
2) APPEND the 2 tables, first re-naming Sales to "Yesterday Sales" so now I have double the rows, all columns filled and 2 sales columns
3) MERGE the 2 tables using ALL ROWS Outer so the # of rows today is the same and I add a new column for "Yesterday Sales"
4) Tag each data set with a report date and use some time intelligence to compare sales vs prior day in the same dataset
I know there are other options out there - any suggestions or considerations?