Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends September 15. Request your voucher.
Hello,
I'm working on a single pdf file.
Once I've finished all the transformation steps, I'd like to turn it into a function
However, I'm having trouble selecting the range in the table
I've added an "Index" column but then I don't know how to continue.
I would like to keep dynamically the range from "Column1" to the "Index" column according to the interval :
- if Column1 = "INVOICE N°"
- if Column 5 = "Puissance :" (Row 20-1=Row 19)
In this example, il want tu keep the range Row 15 to 19 and delete the rest (see screenshot 3)
I don't know where to start with all the possibilities: List.PositionOf, Text.Contains....etc
Thanks in advance for your help
Best regards
let
Source = File.Contents(AllPath[Column1]{0}),
Custom1 = Pdf.Tables(Source),
#"Filtered Rows" = Table.SelectRows(Custom1, each ([Kind] = "Page")),
Custom2 = Table.TransformColumns(#"Filtered Rows", {"Data", each Table.AddIndexColumn(_, "Index", 1,1) })
in
Custom2
Hello,
after a headache, I got around the problem of my previous message,
Taking "Table003" instead of the "Page001", it worked for the 23 pdf, even though I would have liked to use "Page001"
I also simplified the code M (see below)
One question please, is it possible to combine the 2 lines "Custom3" and "Custom2" in one ?
Otherwise, I have not quite finished yet, I still have 4 columns to add but it is more complicated there
Thanks in advance
Best regards
(file_binary)=>
let
//Source = File.Contents(AllPath[Column1]{0}),
Custom1 = Pdf.Tables(file_binary),
#"Filtered Rows3" = Table.SelectRows(Custom1, each ([Id] = "Table003")),
Custom3 = Table.TransformColumns(#"Filtered Rows3", {"Data", each Table.PromoteHeaders(_)}),
Custom2 = Table.TransformColumns(Custom3, {"Data", each Table.SelectRows(_, each ([#"Unit Price"] <> null and [#"Unit Price"] <> "" ))}),
#"Removed Other Columns" = Table.SelectColumns(Custom2,{"Data"}),
#"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Reference", "Designation", "Qty/Time", "Unit Price", "Amount excl. taxes"}, {"Reference", "Designation", "Qty/Time", "Unit Price", "Amount excl. taxes"})
in
#"Expanded Data"
Hello everyone,
I haven't made any progress since my last message but I'm not giving up hope 🤗.
Today I converted the code into a function and applied it to the 23 Binary files (pdf)
It works very well for this part of the code except for 3 or 4 pdf files,
The data is shifted to the right with an almost empty column1
My code returns an empty table for these 3 or 4 files
I don't know how to correct this
Thank you in advance for your help
Best Regards
Hi @Mederic
You can try to use Table.RemoveMatchingRows() function.
e.g
= Table.AddColumn(#"Changed Type1", "Custom", each if [Column3]="apple" then Table.RemoveMatchingRows(#"Changed Type1",{[Index=1]},"Index") else null)
You can refer to the following link about the function.
Table.RemoveMatchingRows - PowerQuery M | Microsoft Learn
Best Regards!
Yolo Zhu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hello @Anonymous ,
Thank you for your message and for your help,
Last night, I got the result in screenshot with the code below
It seems to work even if I would have liked to see other more elegant solutions.
I still have to add the last columns but I'm currently Table.
Best Regards
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Column6", type text}, {"Column7", type text}, {"Column8", type any}, {"Column9", type text}, {"Column10", type any}, {"Column11", type text}, {"Column12", type text}, {"Column13", type text}}),
#"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 1, 1, Int64.Type),
RecordList = Table.AddColumn(#"Added Index", "Custom", each Record.ToList(_)),
ListContains = Table.AddColumn(RecordList, "Custom.1", each List.ContainsAny([Custom], {"Reference"})),
#"Filtered Rows" = Table.SelectRows(ListContains, each ([Custom.1] = true)),
StartPointRows = #"Filtered Rows"{0}[Index],
RefersChangedType = #"Changed Type",
#"Removed Top Rows" = Table.Skip(RefersChangedType,StartPointRows-1),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers",{{"Reference", type text}, {"Designation", type text}, {"Column3", type text}, {"Column4", type text}, {"Column5", type text}, {"Qty/Time", type text}, {"Column7", type text}, {"Column8", type any}, {"Unit Price", Int64.Type}, {"Column10", type any}, {"", type text}, {"Amount excl. taxes", Int64.Type}, {"Column13", Int64.Type}}),
#"Filtered Rows1" = Table.SelectRows(#"Changed Type1", each ([Unit Price] <> null)),
SelectColumns = #"Filtered Rows1"[[Reference],[Designation],[#"Qty/Time"],[#"Unit Price"],[#"Amount excl. taxes"]]
in
SelectColumns
Hello,
I know it's not easy, especially with images,
I enclose a similar Excel file of the pdf table and a new capture corresponding to the final result.
Thank you in advance
Best regards