Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
mjohnsonuk
Frequent Visitor

WHERE EXISTS in a csv file

Does anyone know how I can reference a .csv table I have loaded into Power BI
-- The csv file I have uploaded has a field wr_rfa_ID and the table is called NTS_Report
--The table I would like to load is the wmis_activity_component_visit

I only need the wr_rfa_ID`s listed in the csv file not the whole table (I have tried a couple of ways - 1 example below)

SELECT
a.wr_no + a.wr_rfa_no AS wr_rfa_ID,
a.vst_rpt AS NTS_vst_rpt,
FROM product_history.job_table a
WHERE a.wr_rfa_ID IN (SELECT wr_rfa_ID FROM NTS_Report)
AND a.acty_cmpnent_vst_date >='2017-01-01 00:00:00.0';

Thank you in advance

1 ACCEPTED SOLUTION
v-yuezhe-msft
Microsoft Employee
Microsoft Employee

@mjohnsonuk,

I make a test using the following csv source data and SQL source data.
1.PNG2.png


You can perform the following steps:

1. Remove all other columns except wr_rfa_ID column in CSV query, add a new custom column using formula: “’” & [wr_rfa_ID] & “’”.

 3.PNG


The code generated in Advanced Editor of the csv query is as below.

 

let
Source = Csv.Document(File.Contents("yourpath\csvname.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"wr_rfa_ID", type text}, {"Index", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Index"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each "'"&[wr_rfa_ID]&"'")
in
#"Added Custom"



 

2. Add the highlighted steps in the Advanced Editor of the current csv query.

let
Source = Csv.Document(File.Contents("yourpath\csvname.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"wr_rfa_ID", type text}, {"Index", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Index"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each "'"&[wr_rfa_ID]&"'"),
keylist= Text.Combine(#"Added Custom"[Custom],","),
select1="SELECT a.wr_no + a.wr_rfa_no ID, a.vst_rpt FROM servername.databasename.dbo.wmis_activity_component_visit a WHERE a.wr_no + a.wr_rfa_no IN (" & keylist & ") AND a.acty_cmpnet_vst_date >='2017-01-01 00:00:00.0'",
Source1 = Sql.Database("Servername", "databasename", [Query=select1])
in
Source1

 

 

3. Then you can get expected data.

 4.png

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

View solution in original post

12 REPLIES 12
v-yuezhe-msft
Microsoft Employee
Microsoft Employee

@mjohnsonuk,

Please create two blank queries in Power BI Desktop to troubleshoot this issue.

First:

let

    Source = Salesforce.Reports(),

    #"00O1o000005J5ixEAC" = Source{[Name="00O1o000005J5ixEAC"]}[Data],

    #"Removed Columns" = Table.RemoveColumns(#"00O1o000005J5ixEAC",{"Case Owner", "Case Number", "Visit No", "Opened Date", "Call Type", "Call Status"}),

    #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"Work Request Number", "RFA"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"wr_rfa_ID"),

    #"Duplicated Column" = Table.DuplicateColumn(#"Merged Columns", "wr_rfa_ID", "wr_rfa_ID - Copy"),

    #"Removed Columns1" = Table.RemoveColumns(#"Duplicated Column",{"wr_rfa_ID - Copy"}),

    #"Added Custom" = Table.AddColumn(#"Removed Columns1", "Custom", each "'"&[wr_rfa_ID]&"'"),

    keylist= Text.Combine(#"Added Custom"[Custom],",")

in
   keylist


Second(Replace keylist to static values): 

Let

 Source1 = Odbc.Query("dsn=Hortonworks_Hive_ODBC_64_dox" " SELECT a.wr_no + a.wr_rfa_no ID, a.vst_rpt FROM prod_history_capture_open.wmis_activity_component_visit a WHERE a.wr_no + a.wr_rfa_no IN (" & keylist & ") AND a.acty_cmpnet_vst_date >='2017-01-01 00:00:00.0' Limit 10;")

in

Source1


Check if you get any issues when creating the above query.

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
v-yuezhe-msft
Microsoft Employee
Microsoft Employee

@mjohnsonuk,

I make a test using the following csv source data and SQL source data.
1.PNG2.png


You can perform the following steps:

1. Remove all other columns except wr_rfa_ID column in CSV query, add a new custom column using formula: “’” & [wr_rfa_ID] & “’”.

 3.PNG


The code generated in Advanced Editor of the csv query is as below.

 

let
Source = Csv.Document(File.Contents("yourpath\csvname.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"wr_rfa_ID", type text}, {"Index", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Index"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each "'"&[wr_rfa_ID]&"'")
in
#"Added Custom"



 

2. Add the highlighted steps in the Advanced Editor of the current csv query.

let
Source = Csv.Document(File.Contents("yourpath\csvname.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"wr_rfa_ID", type text}, {"Index", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"Index"}),
#"Added Custom" = Table.AddColumn(#"Removed Columns", "Custom", each "'"&[wr_rfa_ID]&"'"),
keylist= Text.Combine(#"Added Custom"[Custom],","),
select1="SELECT a.wr_no + a.wr_rfa_no ID, a.vst_rpt FROM servername.databasename.dbo.wmis_activity_component_visit a WHERE a.wr_no + a.wr_rfa_no IN (" & keylist & ") AND a.acty_cmpnet_vst_date >='2017-01-01 00:00:00.0'",
Source1 = Sql.Database("Servername", "databasename", [Query=select1])
in
Source1

 

 

3. Then you can get expected data.

 4.png

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi Lydia

 

Thank you so much for taking the time to write this out for me ...... I have one quick question - I am not sure what to put in for the servername and database name

 

 

let
    Source = Salesforce.Reports(),
    #"000O1o000004J5ixEAC" = Source{[Name="000O1o000004J5ixEAC"]}[Data],
    #"Removed Columns" = Table.RemoveColumns(#"000O1o000004J5ixEAC",{"Case Owner", "Case Number", "Visit No", "Opened Date", "Call Type", "Call Status"}),
    #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"Work Request Number", "RFA"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"wr_rfa_ID"),
    #"Duplicated Column" = Table.DuplicateColumn(#"Merged Columns", "wr_rfa_ID", "wr_rfa_ID - Copy"),
    #"Removed Columns1" = Table.RemoveColumns(#"Duplicated Column",{"wr_rfa_ID - Copy"}),
    #"Added Custom" = Table.AddColumn(#"Removed Columns1", "Custom", each "'"&[wr_rfa_ID]&"'"),
keylist= Text.Combine(#"Added Custom"[Custom],","),

select1="SELECT #(lf)   
a.wr_no + a.wr_rfa_no AS wr_rfa_ID, #(lf)   

FROM prod_history_capture_open.wmis_activity_component_visit a
WHERE a.wr_no + a.wr_rfa_no IN (" & keylist & "),
Source1 = SQL.Database(prod_history_capture_open,wmis_activity_component_visit,[Query=select1])
in Source1

 

is it

Odbc.Query("dsn=Hortonworks_Hive_ODBC_64_dox", prod_history_capture_open.wmis_activity_component_visit 

@mjohnsonuk,

You can update the code as below:

 

 

let

    Source = Salesforce.Reports(),

    #"00O1o000005J5ixEAC" = Source{[Name="00O1o000005J5ixEAC"]}[Data],

    #"Removed Columns" = Table.RemoveColumns(#"00O1o000005J5ixEAC",{"Case Owner", "Case Number", "Visit No", "Opened Date", "Call Type", "Call Status"}),

    #"Merged Columns" = Table.CombineColumns(#"Removed Columns",{"Work Request Number", "RFA"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"wr_rfa_ID"),

    #"Duplicated Column" = Table.DuplicateColumn(#"Merged Columns", "wr_rfa_ID", "wr_rfa_ID - Copy"),

    #"Removed Columns1" = Table.RemoveColumns(#"Duplicated Column",{"wr_rfa_ID - Copy"}),

    #"Added Custom" = Table.AddColumn(#"Removed Columns1", "Custom", each "'"&[wr_rfa_ID]&"'"),

    keylist= Text.Combine(#"Added Custom"[Custom],","),

    Source1 = Odbc.Query("dsn=Hortonworks_Hive_ODBC_64_dox" " SELECT a.wr_no + a.wr_rfa_no ID, a.vst_rpt FROM prod_history_capture_open.wmis_activity_component_visit a WHERE a.wr_no + a.wr_rfa_no IN (" & keylist & ") AND a.acty_cmpnet_vst_date >='2017-01-01 00:00:00.0' Limit 10;")

in

Source1



Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thank you

Would this stop the millions of lines loading?

@mjohnsonuk,

Do you have millions rows that meet the following condition or only several rows meet the following condition? For latter case, only several rows are imported after you use the above script. For former case, you have used "LIMIT 10" to restrict the rows.

SELECT
a.wr_no + a.wr_rfa_no AS wr_rfa_ID,
a.vst_rpt AS NTS_vst_rpt,
FROM product_history.job_table a
WHERE a.wr_rfa_ID IN (SELECT wr_rfa_ID FROM NTS_Report)
AND a.acty_cmpnent_vst_date >='2017-01-01 00:00:00.0';



Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

The Salesforce file has about 10,000 lines so I would expect 10-15,000 lines to import
The limit 10 was just to rewrite the code (sorry)
My concern is around if I have to add more tables or refresh it will try and sort through the millions of lines and add unnecessary usage on the server

@mjohnsonuk,

Based on your original post, you want to limit data from ODBC query. But it seems that you want to limit data from Salesforce. Could you please describe it?

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Hi,

The full Salesforce table is about 10,000 lines so the full table can be downloaded directly out of SF
The other data has millions of other job numbers that aren’t relevant to the SF data so I want to limit only downloading the matching job numbers that are in the SF extract .

Hope this makes sense 😀

@mjohnsonuk,

Only matched records will be imported into Power BI using the above script.

Regards,
Lydia

Community Support Team _ Lydia Zhang
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Expression.Error: We cannot convert the value " SELECT a.wr_no + a...." to type Record. Details: Value= SELECT a.wr_no + a.wr_rfa_no ID, a.vst_rpt FROM prod_history_capture_open.wmis_activity_component_visit a WHERE a.wr_no + a.wr_rfa_no IN ('00','00','00','00','00','00','01','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','01','00','00','00','00','00','01','00','00','00','00','00','00','01','00','00','00','01','00','00','00','00','00','00','00','00','00','00','01','00','00','00','00','00','00','01','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','01','00','00','01','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','00','01','00','00','00','00','00','00','00','00','00','00','00','01','00000000','00000000','000000001','0000000000','0000000000','0000000000','0000000000','0000000000','0000000000','0000000000','0000000000','000... Type=TypeI get this error on save of the query ? any ideas??

I presume it is a mismatch in the field type (i.e Text / Number)

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Feb2025 Sticker Challenge

Join our Community Sticker Challenge 2025

If you love stickers, then you will definitely want to check out our Community Sticker Challenge!

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.