Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredPower BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.
Why does a linked table/entity get duplicated when refreshing a dataflow that uses the linked table/entity to create a computed table/entity?
I understand linked entities as shortcuts to Azure Data Lake entities when I read the MS documentation:
I'm trying to build a data pipeline that extracts data from a database view and performs some transformations without doing unnecessary work. However it appears that the transformation dataflow that ingests the extracted data as linked table also refreshes/duplicates the data. Detailed scenario of what I'm doing below.
Can anyone shed some light on why this could be happening? Am I doing something wrong? Would using the PowerBI.Dataflows() connector to link to the table instead of the PowerPlatform.Dataflows() connector be a better choice here?
Oracle database contains view V_DIM_SIEBEL_CONTACT
LZ_Dim Siebel Contact dataflow contains 1 query LZ_Dim Siebel Contact that selects all data from the V_DIM_SIEBEL_CONTACT view
let
ConnectToDB = Oracle.Database("TST_DWH", [HierarchicalNavigation = true]),
SchemaDWPBI = ConnectToDB{[Schema = "DWPBI"]}[Data],
SelectTable = SchemaDWPBI{[Name = "V_DIM_SIEBEL_CONTACT"]}[Data]
in
SelectTable
Requested on | Dataflow name | Dataflow refresh status | Table name | Partition name | Refresh status | Start time | End time | Duration | Rows processed | Bytes processed (KB) | Max commit (KB) | Processor Time | Wait time | Compute engine | Error |
2022-02-23 15:43 | LZ_Dim Siebel Contact | Completed | LZ_Dim Siebel Contact | FullRefreshPolicyPartition | Completed | 2022-02-23 15:43 | 2022-02-23 17:07 | 01:23:28 | NA | 5804536 | 54864 | NA | 00:00.1 | Not used | NA |
Dim Siebel Contact dataflow contains 2 queries:
Sensitive information scrubbed from code
let
Source = PowerPlatform.Dataflows([]),
Navigation = Source{[Id = "Workspaces"]}[Data],
#"Navigation 1" = Navigation{[workspaceId = "<WORKSPACEID>"]}[Data],
#"Navigation 2" = #"Navigation 1"{[dataflowId = "<DATAFLOWID>"]}[Data],
#"Navigation 3" = #"Navigation 2"{[entity = "LZ_Dim Siebel Contact", version = ""]}[Data]
in
#"Navigation 3"
Sensitive information scrubbed from code
let
Source = #"LZ_Dim Siebel Contact",
#"Changed column type" = Table.TransformColumnTypes(Source, {<COLUMN DATA TYPES>}),
#"Marked key columns" = Table.AddKey(#"Changed column type", {"Tech Dim Siebel Contact Key"}, true)
in
#"Marked key columns"
Query plan not available
Requested on | Dataflow name | Dataflow refresh status | Table name | Partition name | Refresh status | Start time | End time | Duration | Rows processed | Bytes processed (KB) | Max commit (KB) | Processor Time | Wait time | Compute engine | Error |
2022-02-24 09:18 | Dim Siebel Contact | Completed | LZ_Dim Siebel Contact | FullRefreshPolicyPartition | Completed | 2022-02-24 09:18 | 2022-02-24 09:58 | 39:32.5 | 9661869 | 5804536 | 56080 | 19:12.7 | 00:00.1 | Cached | NA |
2022-02-24 09:58 | Dim Siebel Contact | Completed | Dim Siebel Contact | FullRefreshPolicyPartition | Completed | 2022-02-24 09:58 | 2022-02-24 10:47 | 48:45.4 | 9661869 | 5804536 | 56640 | 33:41.6 | 00:00.1 | Cached + folded | NA |
Hi @Anonymous ,
Here are several relationships between Power Platform dataflows and Power BI dataflows.
If you want to know more about Power Platform dataflows, there is a document for your reference.
https://docs.microsoft.com/en-us/data-integration/dataflows/dataflows-integration-overview
In summary, Power Platform dataflow will be easier and faster.
Best Regards,
Jay
User | Count |
---|---|
47 | |
32 | |
30 | |
27 | |
26 |
User | Count |
---|---|
56 | |
55 | |
36 | |
33 | |
28 |