Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
ozmike
Resolver I
Resolver I

Filtering/Joining/Merging A Large DB table against an XLS file

Hi

 

 I have a table with 9 million rows in a readonly database. I get supplied a list of record say 12000 keys in an XLS which I want to filter against. I can't use direct query as I'm joining XLS to DB? So of when I filter down (join /merge) to the 12000 using power query it brings back all 9 million rows when applying changes. 

 

So any ideas with the desktop on how to speed things up..I'm using the desktop power bi and HANA database, I only need 12000 records out of the dataset.

1 ACCEPTED SOLUTION

Hi here is a turn key solution ( excuse the pun)  example joining the backend to an XLS, This will avoid the default action of pulling the whole table down. It uses SAP hana but i'm sure it could be used for other dbs. The key column in this example is a text field so needs single quotes here and there. Assume 'query' is an item that you can connect to normally in power bi.  Enjoy

 

let
    Source2 = Excel.Workbook(File.Contents("\\yourpath\KeyList.xlsx"), null, true),
    Sheet1_Sheet = Source2{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers1" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    Source1 = #"Promoted Headers1",
    KeyList =    Text.Combine(Source1[KEYfield],"','")  ,  // column to single line of text 
    sel2 = "select * from ""ZXXX1.ZXXX_REP.bp::Query"" where ""KeyField"" IN  ('" & KeyList  & "')  ",
    Source = SapHana.Database("server:portnumber", [Query=sel2])
in
   Source

View solution in original post

6 REPLIES 6
v-shex-msft
Community Support
Community Support

Hi @ozmike,

 

You can refer to below steps:

 

1. Write a function to parameterize connect to hana database.

2. Get data from excel file, then analysis key list and format them to string.

3. Write t-sql query with above keys string.

4. Invoke the custom function with t-sql query as parameter.

 

Sample function:

 

let 
    loadData=(ServerName as text,source as text, table as text, Query as text) =>
let
    Source = SapHana.Database(ServerName ,Query ),
    Contents = Source{[Name="Contents"]}[Data]{[Name=source]}[Data]{[Name=table]}[Data],
    #"Added Items" = Cube.Transform(Contents, {{Cube.AddAndExpandDimensionColumn, "STLNR", {"STLNR"}, {"Bill of material"}}})
in
    #"Added Items"
in
    loadData

 

Use: last parameter is the t-sql query.

 

let
    Source = loadData("xxxxxxx", "xxxxx", "xxxxx", T-sqlquery)
in
    Source

 

 

 

Notice:

Before these steps you should input the sap hana certificate to power bi.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi Thanks

 

Its a little over my head ..you might have to break your original into 20 steps!

 

1. Write a function to parameterize connect to hana database.

Where do write a function ? I assume in query editor.?

 

loadData=(ServerName as text,source as text, table as text, Query as text) =>

 

Servername = "myhanaserver:12345"

Source= "foldername" ( folder icons in the connect to data source)

table="cubename" ( cube icon - in data source)

query ? - you mean tsql parameter

also what is  "Bill of material" and  "STLNR" in my world

    #"Added Items" = Cube.Transform(Contents, {{Cube.AddAndExpandDimensionColumn, "STLNR", {"STLNR"}, {"Bill of material"}}})

"Bill of material" = cubename, anything else I need to change or is his a column name? 

"STLNR" - ?

 

2. Get data from excel file, then analysis key list and format them to string.

Open a data source to an excel file and format key to text?

 

3. Write t-sql query with above keys string.

I want to join the list of keys? to the cube to filter rows? what t-sql ? coudl you give an example 😉 I know SQL.

 

4. Invoke the custom function with t-sql query as parameter.

You mean set in the advanced editor the query's definition as calling the function.

 

5. before these steps you should input the sap hana certificate to power bi.

I have a server name , user and password - whats a hana certificate?

Hi @ozmike,

 

>>Where do write a function ? I assume in query editor.?

Yes, at query editor, you can right click to creata a blank query and write these formula.

 

>>Open a data source to an excel file and format key to text?

get data from excel, deal with excel file and output the key list.

Capture.PNG

 

>>I want to join the list of keys? to the cube to filter rows? what t-sql ? Smiley Wink I know SQL.

You can use above key string as the parameter of the sql query.

E.g. "Select * from xxxx where key in ("&keylist&")"

 

>>You mean set in the advanced editor the query's definition as calling the function.

Invoke the method which I mentioned, open the advanced editor to input the parameters.

 

Capture2.PNGCapture3.PNG

Notice: query is the sql query which added the key list.

 

>>I have a server name , user and password - what a hana certificate?

It means you need to sign in your database, after this operation the certificate will be stored in power bi.

 

You can find it at datasource -> global permissions.

Capture4.PNG

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Hi

 

 

 

Ok

1) Create a query of the EXCEL spreadsheet with just one column the KEY column only

call this query KEYLISTQUERY and remove duplicates.

 

2) Then create another query. in the adavance query editor paste this.

 

let
    
    #"Changed Type" = Table.TransformColumnTypes(#"KEYLISTQUERY",{{"KEY", type text}}),
    Custom1 = "'" & Text.Combine(#"Changed Type"[KEY],"','") & "'"
in
    Custom1

 

The output of this is something like this

'key1','key2','key3'  -  (This video here helped me create comma separated list.)

 

We are going to use this text in the dynamic where clause later. This assumes you have a text column number columns and dates will be different, of course.

 

3. Right mouse on the this new query and select create a function called KEYListfunction

with no parameters. to run this function  press the  invoke button after clicking on the function. This function output the query 'key1','key2','key3'.

 

4. Finally create a another query . This now has the dynamic where clause. Which will push down the query to the server. This is what we want to do the join  to the XLS in the hana server not in power bi pc client.

let
xx =   KEYListfunction(),
sel = "select * from ""ZXXX.XXX::mytable""  where ""KEYCOL"" IN ("  & xx & ")  ",

Source = SapHana.Database("myhanaserver:12345", [Query=sel])
in
Source

Finally is you get a weird Firewall error see this post here

 

Have tested  with a XLS key list of 12000 against a HANA TABLE 9 million row.  Once you have the data its a lot quicker as you don't download the 9 million!

Hi here is a turn key solution ( excuse the pun)  example joining the backend to an XLS, This will avoid the default action of pulling the whole table down. It uses SAP hana but i'm sure it could be used for other dbs. The key column in this example is a text field so needs single quotes here and there. Assume 'query' is an item that you can connect to normally in power bi.  Enjoy

 

let
    Source2 = Excel.Workbook(File.Contents("\\yourpath\KeyList.xlsx"), null, true),
    Sheet1_Sheet = Source2{[Item="Sheet1",Kind="Sheet"]}[Data],
    #"Promoted Headers1" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
    Source1 = #"Promoted Headers1",
    KeyList =    Text.Combine(Source1[KEYfield],"','")  ,  // column to single line of text 
    sel2 = "select * from ""ZXXX1.ZXXX_REP.bp::Query"" where ""KeyField"" IN  ('" & KeyList  & "')  ",
    Source = SapHana.Database("server:portnumber", [Query=sel2])
in
   Source

I know this is a very old post but hoping someone has an answer. Has anyone been able to get this to work once published to the Power BI service and refreshing through a gateway? I seem to get the privacy errors no matter what I set the levels to in the gateway data sources. 

Helpful resources

Announcements
Sept PBI Carousel

Power BI Monthly Update - September 2024

Check out the September 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors