Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
RiaanJS
Regular Visitor

Search words and return content in the row below

Good day,

I am quite new to Power Query and a little unsure of how everything works. I am looking to extract some data from pdf invoices that all have the same format. Currently I have set up a folder where all the invoices are stored and added to Power Query but I would like to retain only certain information by looking for certain Keywords such as "Date", "Invoice Number", "Total", etc. My idea was to create a new column that would look for each of these key words, but should return the value in the cell below or to the right of the cell that contains the Matched keyword. I have seen some tutorials on how to look for and retain certain keywords, but how do I set it up so that it gives me the value in the cell that actually contains the information that I am looking for?

An example would be like this:

GIVEN DATA: ||RETAINED INFORMATION:
(Invoice 1) ||DateInvoice numberTotal(Keywords to look for)
DateInvoice number||07/01/2022INV0001$99.00(Information gathered)
07/01/2022INV0001||08/01/2022INV0002 $150.00 
  ||    
Total$99.00||    
-------------------------------------||   
(Invoice 2) ||    
DateInvoice number||    
08/01/2022INV0002||    
  ||    
Total$150.00||    
1 ACCEPTED SOLUTION
ronrsnfld
Super User
Super User

One way to do it is to

  • Add duplicate columns that are offset by one from the original
  • Add the Date, Invoice Number and Total columns with appropriate logic
  • Remove the original columns
  • Filter the remaining columns by removing rows where all entries are null

Read the comments and also examine the Applied Steps window to better understand

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type any}, {"Column2", type any}}),

//add shifted columns
    shiftedCols = Table.FromColumns(Table.ToColumns(#"Changed Type") & 
                    {List.RemoveFirstN(#"Changed Type"[Column1]) & {null}} &
                    {List.RemoveFirstN(#"Changed Type"[Column2]) & {null}}),

//Add columns for Date, Invoice Number and Total with appropriate logic
   date = Table.AddColumn(shiftedCols, "Date", each if [Column1] ="Date" then [Column3] else null),
   invNum = Table.AddColumn(date,"Invoice Number", each if [Column2] = "Invoice number" then [Column4] else null),
   total = Table.AddColumn(invNum,"Total", each if [Column1] = "Total" then [Column2] else null),

//Remove the original and the shifted columns
    #"Removed Columns" = Table.RemoveColumns(total,{"Column1", "Column2", "Column3", "Column4"}),

//remove rows where all entries are null
    remAllNulls = Table.SelectRows(#"Removed Columns", each not ([Date]=null and [Invoice Number]=null and [Total]=null)),

//Fill up the Total column since that will be on a different row than the others
    #"Filled Up" = Table.FillUp(remAllNulls,{"Total"}),

//Remove the unneeded rows, filtering on Date<>null
    #"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([Date] <> null)),

//Set the data types
    #"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows",{
        {"Date", type date}, 
        {"Invoice Number", type text}, 
        {"Total", Int64.Type}})
in
    #"Changed Type1"

 

ronrsnfld_0-1641675462450.png

 

 

 

View solution in original post

2 REPLIES 2
ronrsnfld
Super User
Super User

One way to do it is to

  • Add duplicate columns that are offset by one from the original
  • Add the Date, Invoice Number and Total columns with appropriate logic
  • Remove the original columns
  • Filter the remaining columns by removing rows where all entries are null

Read the comments and also examine the Applied Steps window to better understand

 

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table7"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type any}, {"Column2", type any}}),

//add shifted columns
    shiftedCols = Table.FromColumns(Table.ToColumns(#"Changed Type") & 
                    {List.RemoveFirstN(#"Changed Type"[Column1]) & {null}} &
                    {List.RemoveFirstN(#"Changed Type"[Column2]) & {null}}),

//Add columns for Date, Invoice Number and Total with appropriate logic
   date = Table.AddColumn(shiftedCols, "Date", each if [Column1] ="Date" then [Column3] else null),
   invNum = Table.AddColumn(date,"Invoice Number", each if [Column2] = "Invoice number" then [Column4] else null),
   total = Table.AddColumn(invNum,"Total", each if [Column1] = "Total" then [Column2] else null),

//Remove the original and the shifted columns
    #"Removed Columns" = Table.RemoveColumns(total,{"Column1", "Column2", "Column3", "Column4"}),

//remove rows where all entries are null
    remAllNulls = Table.SelectRows(#"Removed Columns", each not ([Date]=null and [Invoice Number]=null and [Total]=null)),

//Fill up the Total column since that will be on a different row than the others
    #"Filled Up" = Table.FillUp(remAllNulls,{"Total"}),

//Remove the unneeded rows, filtering on Date<>null
    #"Filtered Rows" = Table.SelectRows(#"Filled Up", each ([Date] <> null)),

//Set the data types
    #"Changed Type1" = Table.TransformColumnTypes(#"Filtered Rows",{
        {"Date", type date}, 
        {"Invoice Number", type text}, 
        {"Total", Int64.Type}})
in
    #"Changed Type1"

 

ronrsnfld_0-1641675462450.png

 

 

 

Thank you. That worked perfectly!

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors