Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hi Experts,
I have 2 table queries in Power BI editor.
Query1 has many columns including below columns (data refreshed from SAP HANA DB)
Owner_id | Comp_code | Creation_date |
KANR | 1001 | 02/03/2023 |
TANJI | 1500 | 04/05/2023 |
KANR | 1003 | 06/06/2023 |
MPAVI | 1047 | 07/07/2023 |
KANR | 1002 | 31/03/2020 |
KANR | 1002 | 17/04/2023 |
Query2 has below columns only with mapping value ( Data sourced from excel sheet in Sharepoint) hardly 20 rows only.
Owner_id | comp_code | creation_date | BU |
KANR | 1002 | 02/02/2020 | BABL |
KANR | 1002 | 01/04/2023 | CALB |
KANR | null (no values) | null (no values) | BABL |
TANJI | null | null | DART |
MPAVI | null | null | RANM |
I need to a new column (BU) in my Query1 and map it from query2. I need to implement most specific to least with some conditions.
First condition is match owner_id, comp_code and creation_date in Query2 should be less than or equal to creation_date in query1.
If there is no match for this then select based on owner_id match only where comp_code = null , creation_date = null.
Based on these 2 conditions the result table would be as below
Owner_id | Comp_code | Creation_date | BU |
KANR | 1001 | 02/03/2023 | BABL (derived from row-3) |
TANJI | 1500 | 04/05/2023 | DART (derived from row-4) |
KANR | 1003 | 06/06/2023 | BABL (derived from row-3) |
MPAVI | 1047 | 07/07/2023 | RANM (derived from row-5) |
KANR | 1002 | 31/03/2020 | BABL (derived from row-1) creation_date in query2 is less than creation_date from query1. but only matching entry so it should take this. |
KANR | 1002 | 17/04/2023 | CALB (derived from row-2) creation_date in query2 is less than creation_date from query1. but there are 2 records for this condition but system should consider the latest date record |
But we have millions of record in Query1 and hence try to avoid merge queries instead want to use list functions. But would like to hear from the experts which option is more efficient from performance perspective. Please provide solution for both options.
Solved! Go to Solution.
Hi, @rraja2k2
The goal of your problem is to assign a value from Query2's "BU" column to Query1 based on certain conditions. You have correctly identified two approaches:
In terms of performance, neither approach is necessarily optimal for very large datasets, as both involve row-by-row operations. Power Query is best for transforming medium-sized datasets, while a database-level operation would be more efficient for large datasets. It may be worth considering moving this operation to the source system (SAP HANA DB) if possible, as databases are designed for handling this kind of large-scale operations more efficiently than Power BI.
If you find performance degrades with either approach, it may be helpful to simplify the model or reduce the amount of data being processed.
Proud to be a Super User!
I am finally able to arrive the result but the performance is very poor ...so I am planning to DAX query.
the code I used as below
let
exactMatch=Table.SelectRows(VIM_BUMAP, (IT) => IT[CO_OWNER_ID] = [CO_OWNER_ID] and IT[Company_Code] = [Company Code] and IT[Document_Create_Date] <= [Document Create Date] ),
partialMatch=Table.SelectRows(VIM_BUMAP, (IT) => IT[CO_OWNER_ID] = [CO_OWNER_ID] and IT[Company_Code] = null and IT[Document_Create_Date] = null )
in
if Table.RowCount(exactMatch) > 0 then List.Last(exactMatch[BU])
else if Table.RowCount(partialMatch) > 0 then partialMatch{0}[BU] else null
Thanks for your input.
Due to some reason , I want to do in power query only. Can you please help how to do as I tried many options in the M query but not succeeded ? or merge queries option also ?
for power query:
let
currentRow = [Owner_id = _[Owner_id], Comp_code = _[Comp_code], Creation_date <= _[Creation_date]],
matches = Table.SelectRows(Query2, each Record.HasFields(currentRow, _)),
result = if List.IsEmpty(matches) then
List.FirstN(Table.SelectRows(Query2, each [Owner_id] = _[Owner_id] and [Comp_code] = null and [creation_date] = null), 1)
else
List.FirstN(matches, 1)
in
result
and for merge:
Proud to be a Super User!
I am getting error on the M query it shows error against date field.
Also the merge queries option how to add the additional logic .
let
currentRow = [CO_OWNER_ID = _[CO_OWNER_ID], Company Code = _[Company Code],Document_Create_Date <= [Document Create Date]],
matches = Table.SelectRows(VIM_BUMAP, each Record.HasFields(currentRow, _)),
result = if List.IsEmpty(matches) then
List.FirstN(Table.SelectRows({VIM_BUMAP}, each [CO_OWNER_ID] = _[CO_OWNER_ID] and [Company Code] = null and [Document Create Date] = null), "1")
else
List.FirstN(matches,"1")
in
result
If removes date then it is no syntax error but the result column is error with below.
Expression.Error: We cannot convert a value of type Table to type List.
Details:
Value=[Table]
Type=[Type]
modified M
let
Source = Query1,
AddCustom = Table.AddColumn(Source, "BU", each
let
currentRow = [Owner_id = [Owner_id], Comp_code = [Comp_code], Creation_date <= DateTime.From([Creation_date])],
matches = Table.SelectRows(Query2, each Record.HasFields(currentRow, _)),
result =
if List.IsEmpty(matches) then
List.FirstN(Table.SelectRows(Query2, each [Owner_id] = [Owner_id] and [Comp_code] = null and [creation_date] = null)[BU], 1)
else
List.FirstN(matches[BU], 1)
in
result
)
in
AddCustom
in merger query,
to add the additional logic, use conditional column.
Learn more about conditional column.
Add a conditional column - Power Query | Microsoft Learn
Proud to be a Super User!
I am creating custom column in the query1 but the current code shows me the columns from Query1 instead of query2. Still the date is showing as invalid identifier.
you are creating custom column in query 1 so it is obvious that it will show columns from query1. nit query 2. And power Query uses #date and date funstions to identify date.
#date - PowerQuery M | Microsoft Learn
Date functions - PowerQuery M | Microsoft Learn
Proud to be a Super User!
Hi, @rraja2k2
The goal of your problem is to assign a value from Query2's "BU" column to Query1 based on certain conditions. You have correctly identified two approaches:
In terms of performance, neither approach is necessarily optimal for very large datasets, as both involve row-by-row operations. Power Query is best for transforming medium-sized datasets, while a database-level operation would be more efficient for large datasets. It may be worth considering moving this operation to the source system (SAP HANA DB) if possible, as databases are designed for handling this kind of large-scale operations more efficiently than Power BI.
If you find performance degrades with either approach, it may be helpful to simplify the model or reduce the amount of data being processed.
Proud to be a Super User!
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.
Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.
User | Count |
---|---|
14 | |
13 | |
12 | |
12 | |
12 |