Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredPower BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.
Hello there!
Basically, I work with very large datasets containing hundreds of thousands of rows and a hundred or so columns. These datasets consist of lists of clients' accounts, with associated information in the various columns.
Every client is identified with an Account key, the first 10 digits of which represent the Cash account number, while the last 3 represent the Branch number. The two sequences are separated by an underscore.
Now, the problem is that many clients have the same account in different branches, as they may need different services provided by different branches: as a result, for some clients, the first 10 digits (cash account) remain the same, while the branch number changes. As a result, I end up with duplicate accounts in the dataset, belonging to the same customer: the extra account is called a mirror account, and it always has the same branch number (671).
I'm trying to add a custom column in Power Query, implementing this Excel formula with M, but I'm really struggling:
Excel Formula =IF(AND(COUNTIF(A:A,LEFT(A1,10)&"*")>1,RIGHT(A1,3)="671"),"",A1)
In this way, the blanks that I will have in the custom column will be easily identify and can be deleted (remove blanks).
Is there anyone able to help please? If yes, thanks a lot in advance!
Solved! Go to Solution.
Hi @antotom ,
If all you want is to delete rows with account ending with 671, you can use
Table.SelectRows(PreviousStep, each not Text.EndsWith([Account], "671"))
Otherwise to add a column:
Table.AddColumn(PreviousStep, "Custom", each if Text.EndsWith([Account], "671") then "" else [Account])
Hi @latimeria
Thanks for your answer. However I didn't explain myself as best as possible.
I don't have to delete all the accounts ending in 671, I have to delete only the ones for which the first 10 digits repeat themselves.
Imagine that you have an account like 0123456789_616 and another one 0123456789_671: I have to delete the latter and keep the former, that's the purpose of this formula:
=IF(AND(COUNTIF(A:A,LEFT(A1,10)&"*")>1,RIGHT(A1,3)="671"),"",A1)
Hi @antotom ,
Sorry it was crystal clear, I misunderstood.
Try this (copy and paste in a blank query)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyNjE1M7ewNIgHMpVidVCEzMwNwUJGhhiqjI0MUVTFAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Account = _t]),
#"Added Custom" = Table.AddColumn(
Source,
"Custom",
each if Table.RowCount(Table.SelectRows(Source, (a)=> Text.Start(a[Account],10) = Text.Start([Account],10))) > 1
and Text.End([Account],3) = "671" then "" else [Account]
)
in
#"Added Custom"
Thanks a lot for your help! I'm getting this error though, what's the matter in your opinion?
Expression.Error: We cannot convert a value of type Record to type Text.
Details:
Value=
Account=6300166326_613
Type=[Type]
You've got a syntax error.
send me or publish the query, I'll have a look.
I used only the part from "each if"of the code reported above:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ACCT_KY", type text}, {"CATEGORY_CD", Int64.Type}, {"CATEGORY_DESC", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each if Table.RowCount(Table.SelectRows(Source, (a)=> Text.Start(a[[ACCT_KY]],10) = Text.Start([[ACCT_KY]],10))) > 1
and Text.End([[ACCT_KY]],3) = "671" then "" else [[ACCT_KY]])
in
#"Added Custom"
You got it, thanks a lot, you've been super kind! 🙂
ACCT_KEY = Account. I have 300 columns actually, but I just reported the first ones.