Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
ITSNev
Frequent Visitor

Unable to Combine Data - Custom Function

Hi,

 

At first I was unable to combine data due to reference other queries to steps, so I split this one query in to 3 seperate queries, but now the Power BI Refresh GW is complaining

Processing error: [Unable to combine data] Section1/XXXXXXXXXXXXXXXXXXXXX/Invoked Custom Function references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.

This is the M for the Table

 

let
    Source = List.Generate (
    () => [offset = 500, actuals = #"FnGetProjects"( 0 ) ] ,
    //each [actuals][total_count] <> 0,
    each not List.IsEmpty([actuals][data]),
    each [offset = [offset] + 500, actuals = #"FnGetProjects" ([offset]) ],
    each [actuals]
),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"data"}, {"data"}),
    #"Expanded data" = Table.ExpandListColumn(#"Expanded Column1", "data"),
    #"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"XXXXXX"})
in
    #"Expanded data1"

 

 

And this is the customer Function.

 

let
    Source = (offset as number) =>
let
    //start_date=eRSActualsStartOfMonth,
    //end_date=MaxProjectEndDate,
    records = Number.ToText(offset),
    header = [  #"Authorization"="Bearer XXXXXXXXXXXXXXXXX",
                #"Content-Type"= "application/json"],
    content = "", //"{""last_date:ex"": [null, ""2022-05-1""]""}",
    url = "https://app.YYYYYYYYYYYYYYYY.cloud/rest/v1/",
    response = Web.Contents(url, 
        [RelativePath = "projects/search?",
            Query=[
                offset = records,
                limit = "500"
                ],
        Content=Text.ToBinary(content),Headers=header]),
    out = Json.Document(response,1252)
in
    out
in
    Source

 

 

I've tried setting privacy level to all data sources as None and Organisational, but getting the same error.  I can understand Microsoft privacy settings, but we should be allow to OVERRIDE the defaults in the GW for specific datasources that we are ok with merging data from.  Wasted several hours on this trying to work it out myself, but as I'm sure the community has experience this, I thought I would raise here.

Thanks

Nev

1 ACCEPTED SOLUTION
ImkeF
Super User
Super User

Hi @ITSNev ,
this can indeed be a pain.
Try combining these queries into one like so:

let
FnGetProjects_ = 
(offset as number) =>
let
    //start_date=eRSActualsStartOfMonth,
    //end_date=MaxProjectEndDate,
    records = Number.ToText(offset),
    header = [  #"Authorization"="Bearer XXXXXXXXXXXXXXXXX",
                #"Content-Type"= "application/json"],
    content = "", //"{""last_date:ex"": [null, ""2022-05-1""]""}",
    url = "https://app.YYYYYYYYYYYYYYYY.cloud/rest/v1/",
    response = Web.Contents(url, 
        [RelativePath = "projects/search?",
            Query=[
                offset = records,
                limit = "500"
                ],
        Content=Text.ToBinary(content),Headers=header]),
    out = Json.Document(response,1252)
in
    out,
    Source = List.Generate (
    () => [offset = 500, actuals = #"FnGetProjects"( 0 ) ] ,
    //each [actuals][total_count] <> 0,
    each not List.IsEmpty([actuals][data]),
    each [offset = [offset] + 500, actuals = #"FnGetProjects_" ([offset]) ],
    each [actuals]
),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"data"}, {"data"}),
    #"Expanded data" = Table.ExpandListColumn(#"Expanded Column1", "data"),
    #"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"XXXXXX"})
in
    #"Expanded data1"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

15 REPLIES 15
Element115
Power Participant
Power Participant

Functional programming is really cool and elegant BUT only if there is an extensive documentation so we can learn the language properly!  😁

ImkeF
Super User
Super User

Hi @Element115 ,
not sure I understand.
But it is a common pattern to combine different sources for API calls. 
What is important is that you combine them all into one query if you run into issues.
So in your OLC-query, include the function code like so:

 

OLC-Query:
let
 existing step 1 = ....,
 existing step 2 = ....,
 last step = .... 
 , myCustomFunction = (OLC as table) =>
   let
    handleID                = Table.AddColumn(
                                            OLC
                                        ,   "handleID"
                                        ,   each getHandleID([ID_External], getToken())
                                        ,   type text
                                        )
in 
    handleID,

, applyCustomFunction = myCustomFunction(#"last step"),

in
//last step
applyCustomFunction

 




Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Other quick question re syntax:  given the following function def:

 

addColumn = (T as table, col_name as text, f as function, col as list, token as text, col_type as type) =>

 

how can you refer to col, a column that belongs to table T?

For ex, this does not work as PQ complains that [ID_External] is an unknown identifier:

 

add_handleID = addColumn(Source, "handleID", getHandleID, [ID_External], getToken(), type text),

 

Is there a way to pass the name of column as a param instead of the column itself as a list?  Or is that the wrong way of doing it?

 

I should add:  [ID_External] belongs to Source and is of type number.  So if I pass the param as Source[ID_External] instead of just [ID_External], I get the error:  The column 'col' of the table wasn't found.  The function definition looks like this:

addColumn = (T as table, col_name as text, f as function, col as list, token as text, col_type as type) =>
let
    new_col	=	Table.AddColumn(
         			T
				,   col_name				//"handleID"
				,   each f(T[col], token)		//getHandleID([ID_External], getToken())
				,   col_type
			)
in
    new_col,

I found the answer!  

addColumn = (T as table, new_col_name as text, f as function, col_name as text, token as text, col_type as type) =>
    let
		new_col	= Table.AddColumn(
				T
				,   new_col_name				//"handleID"
				,   each f(Record.Field(_, col_name), token)		//getHandleID([ID_External], getToken())
				,   col_type
			)
	in
		new_col,

Credit goes to Lz-3068 who suggested this answer here:
https://learn.microsoft.com/en-us/answers/questions/253951/using-a-variable-as-a-column-name-in-powe...

Hi Imke, 

 

other than the Microsoft documentation, are there any exhaustive resources (books or websites) where I could learn all these details about M?  I've been looking but not finding anything.  

Yes.  I was meaning to thank you in fact in the other post you made where you actually go over this.  It helped steer me in the right direction.  Nonetheless, it is annoying that one has to do this. It is not elegant and that is why I wrote up the comment for Microsoft because surely they can do better.

 

I say this because I have 3 custom functions under the Other folder.  Then I call one function in a query in the Staging folder. Pass that query as a reference to the next query in Staging where I call another custom function, and repeat a 3rd time with a 3rd function.  

But as things stand today, I have to cram all 3 custom function definitions inside 1 query.  The query LOC now has exploded to over 100!

 

Anyhow, thank you very much for your help, Imke.

I'm also 100% with you on this @Element115 as its stupid that you have to do this. Some of my queries are HUGE now as a result of having to cover all REST API Calls into one query so we don't get the combine error. Even if say ignore privacy, it ignores this and fails on the combine.  Very shortsighted of the designers.

Element115
Power Participant
Power Participant

I wasted weeks!!!  And still after trying to decouple the REST API call from the local queries, it doesn't work.  My issue though is that in order for the first API call to succeed, I need to get IDs from an on-prem DB, so I have no choice but to refer to a local M query to iterate over 100s of IDs using 'each', like so:

each getHandleID([ID_External], getToken())

The function getHandleID() encapsulates the call to the REST endpoint as a POST request. 
The full code is:

//vs 2.0
let
    handleID                = Table.AddColumn(
                                            OLC
                                        ,   "handleID"
                                        ,   each getHandleID([ID_External], getToken())
                                        ,   type text
                                        )
in 
    handleID

So you see that in order to be able to add a new column of IDs coming from a REST API to table OLC, there is no choice when using the Table.AddColumn() function since you need to provide a reference to the source table (OLC).  
I can't simply put the handleIDs in a static file or a DB table because these IDs change over time and it is the API provider that manages that, hence why I need to initiate a REST call everything single time when I want to update the values associated with the IDs because I need the latest IDs to get those values. Talk about an asinine API design.

waeltken
Helper I
Helper I

Imke and ITSNev,

 

can I ask if the underlying problem that was causing the combination error in the PBI service had to do with calling out to a custom function in another query and if you are essentially migrating that function code into a single query above? I'm faced with a simpler scenario and have tried disaggregating portions of the same query which access data sources but cannot completely eliminate calling a helper function to format and treat a binary wb.contents call result. I had not considered moving the code for these functions into the main query instead.

 

Regards,

 

 

Henrik

I fully understand now what this is all about.  In another thread, I got Microsoft to point me to one of their doc pages:  this is main source of the problem:  the Data Privacy Firewall inside the PQ engine.  

https://learn.microsoft.com/en-us/power-query/dataprivacyfirewall

In a nutshell, all code outside of your query's top level let...in is not allowed to reference any code outside of the query itself.  I'll publish a more exhaustive answer example in response to Imke first response later as I am still busy testing.... I got the Refresh in the Service to work but now no values are coming out!  Fun, huh? 😁

ITSNev
Frequent Visitor

Thank you @ImkeF , I just needed that last peice of information to bring the function into the main table. This address the Combine data issue in the BI Service on Refresh and also helped me to address the remaining Combine data issues where I was referencing a Function.

ImkeF
Super User
Super User

Hi @ITSNev ,
this can indeed be a pain.
Try combining these queries into one like so:

let
FnGetProjects_ = 
(offset as number) =>
let
    //start_date=eRSActualsStartOfMonth,
    //end_date=MaxProjectEndDate,
    records = Number.ToText(offset),
    header = [  #"Authorization"="Bearer XXXXXXXXXXXXXXXXX",
                #"Content-Type"= "application/json"],
    content = "", //"{""last_date:ex"": [null, ""2022-05-1""]""}",
    url = "https://app.YYYYYYYYYYYYYYYY.cloud/rest/v1/",
    response = Web.Contents(url, 
        [RelativePath = "projects/search?",
            Query=[
                offset = records,
                limit = "500"
                ],
        Content=Text.ToBinary(content),Headers=header]),
    out = Json.Document(response,1252)
in
    out,
    Source = List.Generate (
    () => [offset = 500, actuals = #"FnGetProjects"( 0 ) ] ,
    //each [actuals][total_count] <> 0,
    each not List.IsEmpty([actuals][data]),
    each [offset = [offset] + 500, actuals = #"FnGetProjects_" ([offset]) ],
    each [actuals]
),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Expanded Column1" = Table.ExpandRecordColumn(#"Converted to Table", "Column1", {"data"}, {"data"}),
    #"Expanded data" = Table.ExpandListColumn(#"Expanded Column1", "data"),
    #"Expanded data1" = Table.ExpandRecordColumn(#"Expanded data", "data", {"XXXXXX"})
in
    #"Expanded data1"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

So I got a response from Microsoft.  Basically all this nonsense is because they are trying to prevent data leakage and side-channel attacks.  The issue is due to the data privacy firewall inside PQ as explained here:  https://learn.microsoft.com/en-us/power-query/dataprivacyfirewall .

Imke, you are correct.  Being green behind the ears, I misunderstood the definition of 'query.'  I had defined a project_id in the query pane, but coming from a software engineering background, I automatically started thinking of 

 

 

project_id = ######,

 

 

 as a variable with global scope and not a query.  Therefore when I brought all the queries underlined in green as shown in this screenshot:

queries_pane_Screenshot 2022-12-12 145301.png

into the RT_table query as inline code, I forgot to also bring in project_id (red box).  And because it is a local query being referenced inside the RT_Data query where all the REST web API calls occur, the Service kept issuing the PrivacyError message.  I also had forgotten to substitute OLC:

 

 

let
    Source = OLC,
...

 

 

for the inline code referencing the SQL on-prem data source:

 

 

    Source                  = Sql.Databases("DB_instance"),
    db                      = Source{[Name="DB_name"]}[Data],
    dbo_some_view           = db{[Schema="dbo",Item="some_view"]}[Data],

 

 

But once the project_id and the database referencing was inline, the Service stopped complaining and I could refesh at will all day long!  Ahhh life is good  😎

Imke, I think I found the solution. Will publish it here tomorrow after some testing to make sure.

Regarding my last question about syntax, I found the answer.

 

addColumn = (T as table, new_col_name as text, f as function, col_name as text, token as text, col_type as type) =>
    let
		new_col	= Table.AddColumn(
				T
				,   new_col_name				//"handleID"
				,   each f(Record.Field(_, col_name), token)		//getHandleID([ID_External], getToken())
				,   col_type
			)
	in
		new_col,

 

Credit goes to Lz-3068 who suggested this answer here:
https://learn.microsoft.com/en-us/answers/questions/253951/using-a-variable-as-a-column-name-in-powe...

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.