Hi all,
I have a conundrum: multiple tables connected via PID (primary identifier). All tables have a record start date and a record end date (the end date is blank if it's a current record). For instance:
Table 1:
PID | Value | Start_date | End_date |
1 | 10 | 13/10/2023 | 26/10/2023 |
1 | 15 | 27/10/2023 | 19/03/2024 |
1 | 17 | 20/03/2024 | |
2 | 12 | 13/10/2023 | |
3 | 14 | 13/10/2023 | 17/11/2023 |
Table 2:
PID | Location | Start_date | End_date |
1 | G01 | 13/10/2023 | 23/02/2024 |
1 | F07 | 24/02/2024 | |
2 | B08 | 13/10/2023 | |
You'll notice that for the same PID I have different date ranges within the various tables. Now, I would like to introduce a visual that will show what a record was like on a specific date.
So, if I ask what PID 1 looked like on 20/10/2023, it should say Value 10, Location G01.
And on 01/01/2024 it should be Value 15, Location G01. On 25/02/2024 it should be Value 15, Location F07. On 19/05/2024 it should be the current record, Value 17, Location F07.
This is a simplified example... in reality it's 15 tables and about 18M records that change weekly, though not in every table. I do have an additional unique identifier for each record: a concatenation of PID and Start Date.
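For reference, that key is built roughly like this (a hypothetical M sketch using the demo table above, assuming Start_date is already typed as date):

// hypothetical sketch of the concatenated PID + Start Date key
AddKey = Table.AddColumn(Table1, "RecordKey", each Text.From([PID]) & "|" & Date.ToText([Start_date], "yyyy-MM-dd"), type text)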
Any suggestions?
Thanks @ManuelBolz and @dufoq3.
I agree that exploding the dates would not be a viable option; I'm already sitting at a 9.6 GB PBIX file... it would die.
I did decide to go down the custom-function route, but coded it in Synapse Studio instead. I have a main table with every PID & Date combination (a union of all tables to get all values), and in that same table I list, for each PID/date combination, which date field should be used in each of the tables.
I hope doing this server-side in Synapse means Power BI won't be too slow.
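For anyone who wants the same shape client-side, a minimal M sketch of that union step, assuming just two source queries named Table1 and Table2 (the real version has 15 tables and runs server-side):

let
// union the PID/Start_date pairs from every source table to get all change points
Combos1 = Table.SelectColumns(Table1, {"PID", "Start_date"}),
Combos2 = Table.SelectColumns(Table2, {"PID", "Start_date"}),
MainTable = Table.Distinct(Table.Combine({Combos1, Combos2}))
in
MainTable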
Hi @OzMartijn, since this is the PQ forum, here is a PQ solution (but it will be too slow for 18M rows of data).
You can filter on the [Date] column to see the Value and Location for an exact date.
Result
let
Table1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTI0ABHG+oYG+kYGRsZAjpEZnBOrA1VkChI3R1JkaKlvYAzimCAUmYMUGcDFdZQUwHJGIDkjdFsgcmCjTNDlDIE2GUJdEAsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [PID = _t, Value = _t, Start_date = _t, End_date = _t]),
Table2 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXI3AJGGxvqGBvpGBkbGQI6Rsb6BEYhjohSrA1HlZmAOkjCBS+goKYAljYAsJwMLdCOAkrEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [PID = _t, Location = _t, Start_date = _t, End_date = _t]),
// set proper column types on both demo tables
T1_ChangedType = Table.TransformColumnTypes(Table1,{{"PID", Int64.Type}, {"Value", Int64.Type}, {"Start_date", type date}, {"End_date", type date}}),
// expand each record into one row per day of its validity; open-ended records run up to today
T1_Ad_Date = Table.AddColumn(T1_ChangedType, "Date", each List.Dates([Start_date], Duration.TotalDays((if [End_date] = null then Date.From(DateTime.FixedLocalNow()) else [End_date]) - [Start_date])+1, #duration(1,0,0,0)), type list),
T1_ExpandedDate = Table.ExpandListColumn(T1_Ad_Date, "Date"),
T2_ChangedType = Table.TransformColumnTypes(Table2,{{"PID", Int64.Type}, {"Start_date", type date}, {"End_date", type date}}, "sk-SK"),
T2_Ad_Date = Table.AddColumn(T2_ChangedType, "Date", each List.Dates([Start_date], Duration.TotalDays((if [End_date] = null then Date.From(DateTime.FixedLocalNow()) else [End_date]) - [Start_date])+1, #duration(1,0,0,0)), type list),
T2_ExpandedDate = Table.ExpandListColumn(T2_Ad_Date, "Date"),
// join the exploded tables on PID + Date, then drop the helper columns
MergedQueries = Table.NestedJoin(T1_ExpandedDate, {"PID", "Date"}, T2_ExpandedDate, {"PID", "Date"}, "T2_ExpandedDateList", JoinKind.LeftOuter),
ExpandedT2_Location = Table.ExpandTableColumn(MergedQueries, "T2_ExpandedDateList", {"Location"}, {"Location"}),
RemovedColumns = Table.RemoveColumns(ExpandedT2_Location,{"Start_date", "End_date"}),
ChangedType = Table.TransformColumnTypes(RemovedColumns,{{"Date", type date}})
in
ChangedType
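To check a single point in time, you could then filter the result (a hypothetical follow-up step, using the sample data above):

PointInTime = Table.SelectRows(ChangedType, each [PID] = 1 and [Date] = #date(2023, 10, 20))
// expected: PID 1, Value 10, Location G01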
Hello @OzMartijn
If my post helped you, please give me a 👍 kudos and mark this post with Accept as Solution.
Thank you for your interesting problem. Here is my solution approach.
Table 1:
Your PID/Value Table: testTable1
Table 2:
Your PID/Location Table: testTable2
Custom Function:
fnGetLocationValue
let
Source1 = testTable1,
Source2 = testTable2,
// replace blank End_dates with today's date so the range test below also covers current records
Table1Updated = Table.TransformColumns(Source1, {{"End_date", each if _ = null then DateTime.Date(DateTime.LocalNow()) else _, type date}}),
Table2Updated = Table.TransformColumns(Source2, {{"End_date", each if _ = null then DateTime.Date(DateTime.LocalNow()) else _, type date}}),
// returns the Value and Location that were effective for a PID on a given date
CustomFunction = (inputPID as number, inputDate as date) =>
let
FilteredTable1 = Table.SelectRows(Table1Updated, each [PID] = inputPID and [Start_date] <= inputDate and [End_date] >= inputDate),
FilteredTable2 = Table.SelectRows(Table2Updated, each [PID] = inputPID and [Start_date] <= inputDate and [End_date] >= inputDate),
// each filter keeps at most the one record effective on inputDate,
// so it suffices to check that both lookups found a row
Result = if Table.RowCount(FilteredTable1) > 0 and Table.RowCount(FilteredTable2) > 0 then
[Value = FilteredTable1{0}[Value], Location = FilteredTable2{0}[Location]]
else
null
in
Result
in
CustomFunction
Result Table:
You can call the function like this (the step is renamed here so it doesn't shadow the function name, which would cause a cyclic reference):
AddedLocationValue = Table.AddColumn(Source, "fnGetLocationValue", each fnGetLocationValue([PID],[Date])),
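Since the function returns a record, you can then split it into columns (an assumed follow-up step, not part of the original answer):

Expanded = Table.ExpandRecordColumn(AddedLocationValue, "fnGetLocationValue", {"Value", "Location"})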
Best regards from Germany
Manuel Bolz