Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
This is question 752 in the ongoing saga of 'How in the heck do I do that!?'. What I'm going to say might get a bit garbled, so feel free to let me know that I'm making no sense.
I have the following piece of code that people on here helped me create:
(P_List as text) =>
let
Source = SharePoint.Files("https://SHAREPOINTFOLDER", [ApiVersion = 15]),
#"P_List xlsx_https://SHAREPOINTFOLDER/DOCUMENTS/" = Source{[Name=P_List,#"Folder Path"="https://SHAREPOINTFOLDER/DOCUMENTS/"]}[Content],
#"Imported Excel" = Excel.Workbook(#"P_List xlsx_https://SHAREPOINTFOLDER/DOCUMENTS/"),
#"Monthly_Sheet" = #"Imported Excel"{[Item=Date.ToText( Date.AddMonths(Date.From(DateTime.LocalNow()),-1),"MMMM yyyy") & " data",Kind="Sheet"]}[Data],
#"Removed Top Rows" = Table.Skip(#"Monthly_Sheet",1),
In the above, 'P_List' is a parameter. It's a list of spreadsheets within the Sharepoint folder. Each spreadsheet within P_List is opened and a specific sheet within that spreadsheet is pulled into Power BI. The line that starts '#Monthly_Sheet' contains a formula which identifies the required spreadsheet. In this case it is looking at the current date, going back one month, then adding the word data to the end. So, for instance, as we're in April, the sheet it will look for will be called 'March 2021 data'. There is a 'March 2021 data' sheet within every spreadsheet in P_List.
Now, however, instead of data from a specific sheet they want data from every sheet that has the word 'data' in the name.
So, each spreadsheet has a number of sheets in them all named in the same way: March 2021, March 2021 data, April 2021, April 2021 data, June 2021, June 2021 data etc. Rather than pulling data from a specifically named sheet. I now need to be able to pull data from every sheet that has 'data' in the name. It's essentially a second parameter/loop.
As a basic example of what needs to happen, let's say the parameter P_List is comprised of the following spreadsheets:
Spreadsheet1
Spreadsheet2
Spreadsheet3
(the list is non-exhaustable as more spreadsheets could be added at any time)
Each spreadsheet contains the same sheets: March 2021, March 2021 data, April 2021, April 2021 data, June 2021, June 2021 data etc. I need to be able to pull data from each sheet with 'data' in the name from each spreadsheet in P_List.
Is this possible?
Solved! Go to Solution.
Hi,
This can be done in PowerQuery/M. Given below is a step-by-step demo of the outputs at each stage. After that, I have given the PowerQuery/M code which you need to modify - mainly the first line. I have used "Folder.Files" function. You have to change the line to your Sharepoint source. Other than that, this should work without needing any major changes.
Please note that it is possible to achieve the same results using various other functions and a little bit of more complex code to optimize the process, but I have given it step-by-step based on our natural thought process for your understanding so that you would be able to modify it if required according to your needs.
Step 1: Fetch the contents of the Folder.
Step 2: Count the files.
Step 3:
Generate a list of file indices from 0 to (n-1) where n is the number of files in the folder.
Step 4:
Covert the list we got in Step 3 to a table.
Step 5:
Add the contents of the spreadsheets as binary using the file indices (0 to n-1) rather than predefined names.
Step 6: Add the file names.
Step 7:
Add the field with the count of sheets in each of those files. In this example, all the 3 files have 3 sheets in them.
Step 8:
Add the list of sheet indices. (Just like we did for the files)
Step 9:
Expand the SheetIndices list.
Step 10:
Add the sheet names of each file.
Step 11:
Add the sheet contents
Step 12:
Filter the sheet names with the word "Data" in them.
Step 13:
Remove unnecessary columns.
That's it. Now the contents of each file and each sheet is added to this table as a "Table". If you expand the last column, you will have your results.
Given below is the PowerQuery/M Code for the same.
let
Source = Folder.Files("C:\Users\Sreenath\Documents\PowerBIDataSource"),
FileCount = Table.RowCount(Source),
FileIndexList = List.Generate(()=>0,each _ < FileCount, each _ +1),
FileIndexTable = Table.FromList(FileIndexList, Splitter.SplitByNothing(), null, null, ExtraValues.Ignore),
FileAdditions = Table.AddColumn(FileIndexTable,"ContentBinary", each Source{Record.Field(_,"Column1")}[Content]),
FileAdditions2 = Table.AddColumn(FileAdditions,"FileName", each Source{Record.Field(_,"Column1")}[Name]),
AddSheetCount =
Table.AddColumn(
FileAdditions2,
"SheetCount",
each Table.RowCount(Excel.Workbook(Record.Field(_,"ContentBinary"),null,true))
),
#"Changed Type" = Table.TransformColumnTypes(AddSheetCount,{{"SheetCount", Int64.Type}}),
AddSheetIndices =
Table.AddColumn(
#"Changed Type",
"SheetIndices",
each let RC = Record.Field(_,"SheetCount") in List.Generate(()=>0,each _ < RC,each _ +1)
),
ExpandIndicesColumn = Table.ExpandListColumn(AddSheetIndices, "SheetIndices"),
RenameFileIndices = Table.RenameColumns(ExpandIndicesColumn,{{"Column1","FileIndex"}}),
AddSheetNames =
Table.AddColumn(
RenameFileIndices,
"Sheet Name",
each
let F = Excel.Workbook(Record.Field(_,"ContentBinary"),null,true), S=Record.Field(_,"SheetIndices")
in F{S}[Name]
),
AddSheetContents =
Table.AddColumn(
AddSheetNames,
"Sheet Contents",
each
let S2 =
let F = Excel.Workbook(Record.Field(_,"ContentBinary"),null,true), S=Record.Field(_,"SheetIndices")
in F{S}[Data]
in
Table.PromoteHeaders(S2, [PromoteAllScalars=true])
),
FilterSheetsWithWord_Data_inIt = Table.SelectRows(AddSheetContents, each Text.Contains([Sheet Name], "Data")),
RemoveUnnecessaryColuumns = Table.RemoveColumns(FilterSheetsWithWord_Data_inIt,{"ContentBinary","SheetCount"})
in
RemoveUnnecessaryColuumns
Please remember to change this line
Source = Folder.Files("C:\Users\Sreenath\Documents\PowerBIDataSource"),
to
Source = SharePoint.Files("https://SHAREPOINTFOLDER", [ApiVersion = 15]),
or whatever is your path to the directory/folder.
Hi,
This can be done in PowerQuery/M. Given below is a step-by-step demo of the outputs at each stage. After that, I have given the PowerQuery/M code which you need to modify - mainly the first line. I have used "Folder.Files" function. You have to change the line to your Sharepoint source. Other than that, this should work without needing any major changes.
Please note that it is possible to achieve the same results using various other functions and a little bit of more complex code to optimize the process, but I have given it step-by-step based on our natural thought process for your understanding so that you would be able to modify it if required according to your needs.
Step 1: Fetch the contents of the Folder.
Step 2: Count the files.
Step 3:
Generate a list of file indices from 0 to (n-1) where n is the number of files in the folder.
Step 4:
Covert the list we got in Step 3 to a table.
Step 5:
Add the contents of the spreadsheets as binary using the file indices (0 to n-1) rather than predefined names.
Step 6: Add the file names.
Step 7:
Add the field with the count of sheets in each of those files. In this example, all the 3 files have 3 sheets in them.
Step 8:
Add the list of sheet indices. (Just like we did for the files)
Step 9:
Expand the SheetIndices list.
Step 10:
Add the sheet names of each file.
Step 11:
Add the sheet contents
Step 12:
Filter the sheet names with the word "Data" in them.
Step 13:
Remove unnecessary columns.
That's it. Now the contents of each file and each sheet is added to this table as a "Table". If you expand the last column, you will have your results.
Given below is the PowerQuery/M Code for the same.
let
Source = Folder.Files("C:\Users\Sreenath\Documents\PowerBIDataSource"),
FileCount = Table.RowCount(Source),
FileIndexList = List.Generate(()=>0,each _ < FileCount, each _ +1),
FileIndexTable = Table.FromList(FileIndexList, Splitter.SplitByNothing(), null, null, ExtraValues.Ignore),
FileAdditions = Table.AddColumn(FileIndexTable,"ContentBinary", each Source{Record.Field(_,"Column1")}[Content]),
FileAdditions2 = Table.AddColumn(FileAdditions,"FileName", each Source{Record.Field(_,"Column1")}[Name]),
AddSheetCount =
Table.AddColumn(
FileAdditions2,
"SheetCount",
each Table.RowCount(Excel.Workbook(Record.Field(_,"ContentBinary"),null,true))
),
#"Changed Type" = Table.TransformColumnTypes(AddSheetCount,{{"SheetCount", Int64.Type}}),
AddSheetIndices =
Table.AddColumn(
#"Changed Type",
"SheetIndices",
each let RC = Record.Field(_,"SheetCount") in List.Generate(()=>0,each _ < RC,each _ +1)
),
ExpandIndicesColumn = Table.ExpandListColumn(AddSheetIndices, "SheetIndices"),
RenameFileIndices = Table.RenameColumns(ExpandIndicesColumn,{{"Column1","FileIndex"}}),
AddSheetNames =
Table.AddColumn(
RenameFileIndices,
"Sheet Name",
each
let F = Excel.Workbook(Record.Field(_,"ContentBinary"),null,true), S=Record.Field(_,"SheetIndices")
in F{S}[Name]
),
AddSheetContents =
Table.AddColumn(
AddSheetNames,
"Sheet Contents",
each
let S2 =
let F = Excel.Workbook(Record.Field(_,"ContentBinary"),null,true), S=Record.Field(_,"SheetIndices")
in F{S}[Data]
in
Table.PromoteHeaders(S2, [PromoteAllScalars=true])
),
FilterSheetsWithWord_Data_inIt = Table.SelectRows(AddSheetContents, each Text.Contains([Sheet Name], "Data")),
RemoveUnnecessaryColuumns = Table.RemoveColumns(FilterSheetsWithWord_Data_inIt,{"ContentBinary","SheetCount"})
in
RemoveUnnecessaryColuumns
Please remember to change this line
Source = Folder.Files("C:\Users\Sreenath\Documents\PowerBIDataSource"),
to
Source = SharePoint.Files("https://SHAREPOINTFOLDER", [ApiVersion = 15]),
or whatever is your path to the directory/folder.
@Anonymous that is absolutely amazing. It actually runs faster than what I had in place initally and does exactly what I need it to.
I'm going to have to rebuild my entire report to work with the new layout but I don't care. This does what I need.
Thank you so so much.
I am happy that my code is useful to you. But a word of caution - The powerquery that I have given is not designed to take a parameter. So it scans the folder and fetches all the excel files in that folder and processes them. If by mistake, there is an excel file in that folder that is not supposed to be there and by chance that file has a sheet that has the word "Data" in its name, then that scenario will mess up your entire table by loading its contents, probably in a haphazard manner. So for this code to work properly always, you have to ensure that only relevant files are stored in that SharePoint directory and any other files should be stored in some other directory so that this program does not fetch those unnecessary excel files and disrupts your data.
@Anonymous I did notice that. Staff are under strict instructions to put nothing in the folder that shouldn't be there.
They won't listen, but the instructions are there.
Hi @Mat42 ,
Sounds like nested functions in power query, not certain but you could refer:
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@v-yingjl Thanks for your reply.
I've checked out the links you provided and can basically see how it works, but I don't know enough about M to put it in to practice.
I know that I need to adjust this section of code:
Date.ToText( Date.AddMonths(Date.From(DateTime.LocalNow()),-1),"MMMM yyyy") & " data",Kind="Sheet"]}[Data]
At the minute it is looking specifically for a sheet named for the month previous to the current month with the word 'data' at the end (so, in this case, it is looking for a sheet named 'March 2021 data'). I know I need to adjust it to only look for sheets containing the word 'data', which I can do, but I can't work out how to implement nested functions to loop the code to look at all sheets containing the word 'data' in the name and pull data from them.
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
113 | |
91 | |
86 | |
76 | |
66 |
User | Count |
---|---|
149 | |
117 | |
111 | |
106 | |
95 |