Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi Guys,
I need to build a model using excel files which has different structure into a unique structure.
My client has a bunch of excel files coming from an APP (CostX)
The excel files have different structure for example:
The challenge here is the build a model in power bi to automate when the users save these files in the client's database.
Basically I need to build a model to retrieve all these excel files into a unique pattern to build the dashboards,
any idea or help would be very appreciate, thank you so much guys.
Fabricio,
Solved! Go to Solution.
HI @Anonymous ,
For large amount of randomly structure files, I'd like to suggest you do these transform/conversation by external tools instead of M query code. (notice: these M query codes of transform are complex, low performance and hard to maintain, you need to manually fix these steps codes every time major data structure changes)
In fact, current I haven't found any simple solutions to automatically handle fully randomly data structure formatting, maybe you can try to machine learning?
Guide to File Formats for Machine Learning: Columnar, Training, Inferencing, and the Feature Store
Regards,
Xiaoxin Sheng
HI @Anonymous,
>>Different descriptions for the same item
For this scenario, I'd like to suggest you create a dictionary table or use 'fuzzy match' feature to mapping different descriptions of similar items.
Automatically create data dictionary for your Power Pivot model
>>Bunch of spaces
You can take a look at the following blog about use 'Text.Split' function to remove space characters into values:
Clean WhiteSpace in PowerQuery
>>has no headers
In my opinion, I'd like to suggest you use 'Promoted Headers' function to define the table header if your header is stored in first row of table records or use 'Table.RenameColumns' with 'List.zip' and 'Table.ColumnNames' functions to achieve auto change default table names with default value:
Table.RenameColumns(PreviousStep,List.Zip({Table.ColumnNames(PreviousStep), {"Columname1","Columname2","Columname3","Columname4","Columname5"}}))
Notice: list.zip function expression should have similar count of items in left and right parameters.
>>The address is in the first rown then the second row start the columns (name, description of the item, cost, etc)
You can add an index column to your table, then you can add a custom column to extract the next/previous odd/even index records(current index +/- 1) then remove empty row to display address and detail descriptions at the same row.
>>subtotals in the middle then some spaces
I'd like to suggest you remove these blank columns and subtotal column. Power bi will aggregation your records based on its category and legends fields when you use them in visuals, so the subtotal fields are not necessarily.
Work with aggregates (sum, average, and so on) in the Power BI service
I hope the above description will help with your scenarios.
Regards,
Xiaoxin Sheng
Hi @Anonymous
thanks for your message
I'll try to use the fuzzy Match,
for you solution, I can do manually for one or two excel files but do you think would be good if I have hundreds of excel files? I don't know excel how many but I know this app where I'm getting the excel files provides all in different patterns, for example It doesn't provide exctely the same strucure of spaces or columns names and etc,
so I need somehow to automate this ETL,
Fabricio,
HI @Anonymous ,
For large amount of randomly structure files, I'd like to suggest you do these transform/conversation by external tools instead of M query code. (notice: these M query codes of transform are complex, low performance and hard to maintain, you need to manually fix these steps codes every time major data structure changes)
In fact, current I haven't found any simple solutions to automatically handle fully randomly data structure formatting, maybe you can try to machine learning?
Guide to File Formats for Machine Learning: Columnar, Training, Inferencing, and the Feature Store
Regards,
Xiaoxin Sheng
Hi @Anonymous ,
You need to create a function with Power query that will allow you to automate the treatment of each of the files in a block check the post below:
The only question here is that you probably need to apply this to a folder or a sharepoint folder to get all the files as a parameter in your custom function.
Regards,
MFelix
Regards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsRegards
Miguel Félix
Proud to be a Super User!
Check out my blog: Power BI em PortuguêsHi @MFelix
Thanks for your email,
Please see below 2 samples, both of the files are regarding to the same Project and the app provides this files in excel but in different format.
https://drive.google.com/file/d/17hJ32NxKC3Rz1f5ze7nxMtiJ472eJLWq/view?usp=sharing
https://drive.google.com/open?id=1Mbrh7IkGkxTAPrFPYp2BaFw8h4QiT8PD
Would be a good solution if the query can do the ETL and bring into the model in just one shape for all the following files.
Fabricio
User | Count |
---|---|
73 | |
70 | |
38 | |
23 | |
23 |
User | Count |
---|---|
96 | |
94 | |
50 | |
42 | |
40 |