Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
Hi!
It is too heavy for Excel to execute the function =index(a:a, match(true, exact(b2, c:c), 0)) for +800 rows. I thus want to do this via Power Query in Excel instead (I loaded the data using get data from csv); after trying for hours and even asking ChatGPT for help, I am unable to solve this.
Per row value in column B, I want to search for that row value in column C. If found, I want a new custom column to get the corresponding value in Column A.
Example:
Column A Column B Column C
Text1 1234 5678
Text 2. 2346 6789
Text 3. 5678 73785
Text 4
So when iterating the first time, the Custom column would have value Text3.
Using a similar algorithm as @m_dekorte , and generating a CSV file with 800,000 rows, and only single matches between Columns B & C, the PQ took about 15 seconds to run. (It took longer to write the results back to the Excel worksheet). I did not get any out of memory errors.
M-Code: Paste into Advanced Editor
let
//Change next line to reflect actual file path
Source = Csv.Document(File.Contents("C:\Users\ron\Documents\test.csv"),[Delimiter=",", Columns=3, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Column A", type text}, {"Column B", Int64.Type}, {"Column C", Int64.Type}}),
//Add index column to be able to re-sort into original order
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
//Join the table to itself, using different columns for the "key"
#"Self-Join" = Table.NestedJoin(#"Added Index","Column B",#"Added Index","Column C","Index Match",JoinKind.LeftOuter),
//Expand the joined table Column A
#"Expanded Index Match" = Table.ExpandTableColumn(#"Self-Join", "Index Match", {"Column A"}, {"Index Match.Column A"}),
//Sort back to original order
#"Sorted Rows" = Table.Sort(#"Expanded Index Match",{{"Index", Order.Ascending}}),
//Remopve Index column
#"Removed Columns" = Table.RemoveColumns(#"Sorted Rows",{"Index"})
in
#"Removed Columns"
Partial Results
How much memory do you have?
Are you running Excel 32 bit or 64 bit?
Will there only be a single match? Or might there be multiple matches?
Also, why would you expect Text 3 as a result? Your formula in Excel would return Text 1.
@TestssonNow thank you for your post, and leveraging Power Query to handle this for which it is aptly able. @m_dekorte is a rockstar, and her crazy wonderful work and warm gentle touch with folks on solutions is quite a gift. You are in good hands. Thank you!
Hi!
Thanks for the reply. What do you mean by extracting column 1? Also, in your image your extra column shows text1 instead of text3 which is the desired result.
Hi @TestssonNow
Select Merge on the ribbon, merge with the itself
That yields a nested table as result, which can be expanded, with the sideward arrows
Select Column1, from the available fields
Thank you for the explanation, much appreciated! I have a dataset with +800k rows and that seemed to be too much with the expansion solution (i.e. Power Equery says it ran out of memory and doesnt complete the process). Is there an m code / Power Query Custom column code I could use to solve this?
Hi @TestssonNow,
You can achieve this through a self-merge
And subsequently expanding/Extracting "Column1" from the result table.
I hope this is helpful
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
25 | |
12 | |
12 | |
11 | |
8 |
User | Count |
---|---|
42 | |
27 | |
14 | |
13 | |
13 |