The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Dear Community,
I have a some data and I'm updating every month and comparing those data. But I have a problem because of Duplicate.
I have created an example table myself below. My goal is to compare this previous month and the month before that. (February and March for today)
If I use the "Remove Duplicate" option in the query, I get an incorrect result. Because if there is a person named A in February or before, it automatically deletes that person from March. In comparison, person A in Germany does not appear at all in March. That's why my comparison is wrong.
What should I do for this situation? If I'm going to do the "Remove duplicate" that I need, I need to make it do it separately for each month. I divided it into different signs, but this time I am having trouble making comparisons as a comparison diagram.
01.03.2022 | A | Germany |
01.03.2022 | B | England |
03.03.2022 | B | England |
04.03.2022 | C | Spain |
05.03.2022 | D | Spain |
01.02.2022 | A | Germany |
01.02.2022 | A | Germany |
02.02.2022 | B | England |
04.02.2022 | C | Germany |
05.02.2022 | D | Spain |
Thank you for your support
Best Regards
Solved! Go to Solution.
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test (later on when you use the query on your dataset, you will have to change the source appropriately)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtY31DcyMDJS0lFyBGL31KLcxLxKpVgdFCknIHbNS89JzEuBShnjljKBSTljGGgKk3IB4uCCxMw8sIQRbkfglTLC5Qgj3I4wwuUIQxw2xQIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Group = _t, Country = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Group", type text}, {"Country", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "CurrentPreviousMonth", each [Date]>=Date.AddMonths(Date.StartOfMonth(Date.From(DateTime.FixedLocalNow())),-2) and [Date]<=Date.AddMonths(Date.EndOfMonth(Date.From(DateTime.FixedLocalNow())),-1)),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([CurrentPreviousMonth] = true)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"CurrentPreviousMonth"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Month", each Date.Month([Date])),
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Month"}, {{"Temp", each _, type table [Date=nullable date, Group=nullable text, Country=nullable text, Month=number]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Distinct([Temp], {"Group", "Country"})),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom2", "Custom", {"Date", "Group", "Country"}, {"Date", "Group", "Country"}),
#"Removed Columns1" = Table.RemoveColumns(#"Expanded Custom",{"Month", "Temp"})
in
#"Removed Columns1"
Thank you for your help.
I need to work on this code. It is a bit complicated.
For the given data, can you also post expected result?
Good Morning,
I'm sorry. I forgot to write expected result :
For February User-A connected to the system 2 times and its duplicate for me, I'm removing it and
for March User-B connected to the system 2 times and I need to see it only one time.
February | March | |
Germany | 2 | 1 |
England | 1 | 1 |
Spain | 1 | 2 |
But if I say from query "Remove duplicates" I cant see any result at March !!
Thanks.
Let me rephrase it.
1. User A appears one time in March and 2 times in Feb.
Do you need to keep March entry and delete both entries of Feb or you want to keep one entry from Feb as well?
2. User B has 2 entries for Mar and one for Feb.
Hello :
No. If I use in Query "Remove Duplicates" I see March is empty. Because user A B C and D has been connected to the system at February. That's why, system removes users from March.
I want to see like that :
1. User A is connected from Germany in February 2 times and in March 1 time. I want to remove duplicates From February also from March. But he connected in March 1 time, thats why I dont need to remove it. However in February he was 2 times in System. Then I don't need to see him in my result 2 times, only one time. and Result will come for user A : February 1 time and March 1 time. (For example: User A could connect 100 times at February and 2 times at March, Still I want to see him 1 time for both months.)
2. For User D : connected for both months only one time. There is no duplicate here.
I want to compare monthly. However, the Duplicates to be deleted will be via the user.
Finally, I calculate how many different user from which country are connected to the system.
Best Regards
See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test (later on when you use the query on your dataset, you will have to change the source appropriately)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtY31DcyMDJS0lFyBGL31KLcxLxKpVgdFCknIHbNS89JzEuBShnjljKBSTljGGgKk3IB4uCCxMw8sIQRbkfglTLC5Qgj3I4wwuUIQxw2xQIA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Date = _t, Group = _t, Country = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"Group", type text}, {"Country", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "CurrentPreviousMonth", each [Date]>=Date.AddMonths(Date.StartOfMonth(Date.From(DateTime.FixedLocalNow())),-2) and [Date]<=Date.AddMonths(Date.EndOfMonth(Date.From(DateTime.FixedLocalNow())),-1)),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each ([CurrentPreviousMonth] = true)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"CurrentPreviousMonth"}),
#"Added Custom1" = Table.AddColumn(#"Removed Columns", "Month", each Date.Month([Date])),
#"Grouped Rows" = Table.Group(#"Added Custom1", {"Month"}, {{"Temp", each _, type table [Date=nullable date, Group=nullable text, Country=nullable text, Month=number]}}),
#"Added Custom2" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.Distinct([Temp], {"Group", "Country"})),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom2", "Custom", {"Date", "Group", "Country"}, {"Date", "Group", "Country"}),
#"Removed Columns1" = Table.RemoveColumns(#"Expanded Custom",{"Month", "Temp"})
in
#"Removed Columns1"
this is my print out, but those results are with duplicate. If I remove duplicates, I see for example Austria in March as 0 (For my list) should be 3 (not 5 because there is also duplicated user.)