The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi,
I have a table with several columns including Date, ID and other variables.
I want to get a grouped table with a distinct count of the IDs grouped by 2 variables. Until here the group by option solves this quickly.
The added difficulty is that the IDs distinct count has to be done for the previous 3 months (e.g. for Feb-20 I will count the distinct IDs in Jan-20, Dec-19 and Nov-19)
I am working with the advanced editor but I don't find a way to add a "between dates" or any other option to add to the query that will provide all the IDs for a past period of time.
The data source looks like this:
And I want a table like the following:
It means that, for the month of Feb-20 I am counting how many distinct ID+Var+Var2 existed in the previous 3 months (so Date between Nov-2019 and Jan-2020)
The result table has to contain all the months, not only Feb-20.
**I'm adding a link to a sample file containing fake data: https://1drv.ms/u/s!AiEedMWX9yXiayCL3IzMb4-Shy0?e=SXa9FP
Anyone knows how can I do that from the advanced editor?
Thanks
Solved! Go to Solution.
change the column [all] in the expressione where
#"Added Custom 1" = List.Transform(#"Added Custom"[all],
to [rollingQ]
#"Added Custom 1" = List.Transform(#"Added Custom"[rollingQ],
but, please PLEASE, be carefull to the space in the text values. For instance Blue is not "Blue" but "Blue " and this cause problem.
I don't have now enough time to analyze and understand the real reason for these problems, but, for safety, find and remove the extra spaces
#########edited#############
use this code to try
let
s1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZTNbsIwEIRfpcqZSKU3jqCqhwh6IBcqxCGA+0MtQEhpVZ6+Voyn9o4Xou4hsfAnz2ZnzHJZDMvnw1c5HBWDovHlVhPbmjv3rovVIEHWXbnF3Gzdcyb3/454MdYevt1iemEezcYzm0tFOguCtl1BqJb7xlcsFJqpmn35cO9+eA0VKU0l9NYVhBYEZD6plgzGEmSoFzG4PircyntXGX80GWrjA5WzKNMNOfRk1pJJLQKAXsghILtQGYcAffpii2bNSVOqJYI8SYeYSB3CfhRbtmh8PHHm0rkAQeJoLkB4/iQknOZO9KFckeFjkDnlayj6kT0/mdClx4DR0pQBaGxgqBdqJQ4cpUlXom4wlX80U7X72+kHpKYfhG40EDX9TIj/p7B/Nf2g9PRXrb0hZHsJ2d5C+jUDol8zINo1y5yh65x9CWb1Cw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, ID = _t, Var1 = _t, Var2 = _t]),
s2=Table.ReplaceValue(s1,"Blue ","Blue",Replacer.ReplaceText,{"Var1"}),
#"Changed Type" = Table.TransformColumnTypes(s2,{{"Date", type date}},"en-US"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] <> null and [Date] <> ""),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Date"}, {"all", each _},GroupKind.Local,(x,y)=>Number.From(y[Date]>=Date.AddMonths(x[Date],1))),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "rollingQ", each Table.Combine(List.LastN(List.FirstN(#"Grouped Rows"[all], List.PositionOf(#"Grouped Rows"[all],[all])),3))),
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsopTVXSUQpWitWBc3yROT7InAgILyg1Ba4HwvZFYiOrgWqITM3JyS+H64FzfVG5aIpBmmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type 1" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Added Custom 1" = List.Transform(#"Added Custom"[rollingQ], (t)=> Table.AddColumn(#"Changed Type 1", "dist", each try Table.RowCount(Table.Distinct(Table.SelectRows(t, (r)=> r[Var1]=[Column1] and r[Var2]=[Column2] ),"ID" ))otherwise 0)),
q1=Table.FromColumns({#"Added Custom"[Date], #"Added Custom 1"}),
#"Expanded Column2" = Table.ExpandTableColumn(q1, "Column2", {"Column1", "Column2", "dist"}, {"Column2.Column1", "Column2.Column2", "Column2.dist"}),
#"Added Custom 2" = Table.AddColumn(#"Expanded Column2", "mmm", each Date.MonthName([Column1],"it-IT")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom 2",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"mmm", "Column1"}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Renamed Columns", {{"Column1", type text}}, "it-IT"), List.Distinct(Table.TransformColumnTypes(#"Renamed Columns", {{"Column1", type text}}, "it-IT")[Column1]), "Column1", "Column2.dist")
in
#"Pivoted Column"
The problem about space ending Blue was that I hand writed an auxiliary table for the colours and sizes and, of course, I couldn't realize there was a space after Blue in the table you linked
Hi, @Anonymous
What if the data before the current month is less than three months? and could you upload a complete spreadsheet of simulated data
Hi, @ziying35
If there is any ID only existing in the current month, it won't appear in the ID distinct count until the next 3 months.
The goal is to get the distinct IDs in groups of 3 months. If I have an ID only appearing in Jan-20, it will be counted in the lines for Feb-20, Mar-20 and Apr-20. It won't appear in May-20 if it doesn't exist in any of the previous 3 months. It won't appear either in Jan-20.
If the current data is less than 3 months - imagine we started logging info starting on Jun-20 - I'd like to get all the IDs distinct count for the month on June assigned to the July row in the result table.
I am not finding a way to upload a sample data file, my apologies.
@Anonymous
You can upload the file to onedrive or Google or Media Fire and post the link, File data as much as possible to simulate, the desired effect is also simulated, Because my English is not good, but I can understand what the spreadsheet says
Here it is the link to the file, I hope you can open it
https://1drv.ms/u/s!AiEedMWX9yXiayCL3IzMb4-Shy0?e=SXa9FP
I'm putting the link to the file in my first post.
Thank you 🙂
is this the expected result?
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZTNbsIwEIRfpcqZSKU3jqCqhwh6IBcqxCGA+0MtQEhpVZ6+Voyn9o4Xou4hsfAnz2ZnzHJZDMvnw1c5HBWDovHlVhPbmjv3rovVIEHWXbnF3Gzdcyb3/454MdYevt1iemEezcYzm0tFOguCtl1BqJb7xlcsFJqpmn35cO9+eA0VKU0l9NYVhBYEZD6plgzGEmSoFzG4PircyntXGX80GWrjA5WzKNMNOfRk1pJJLQKAXsghILtQGYcAffpii2bNSVOqJYI8SYeYSB3CfhRbtmh8PHHm0rkAQeJoLkB4/iQknOZO9KFckeFjkDnlayj6kT0/mdClx4DR0pQBaGxgqBdqJQ4cpUlXom4wlX80U7X72+kHpKYfhG40EDX9TIj/p7B/Nf2g9PRXrb0hZHsJ2d5C+jUDol8zINo1y5yh65x9CWb1Cw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, ID = _t, Var1 = _t, Var2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}},"en-US"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] <> null and [Date] <> ""),
#"Added Custom" = Table.AddColumn(#"Filtered Rows", "ddd", each Table.RowCount(Table.Distinct(Table.SelectRows(#"Filtered Rows",
(d)=> Date.AddMonths(d[Date],+3) >= [Date] and d[Date]<[Date]),{"ID","Var1","Var2"})))
in
#"Added Custom"
peraphs (but I'm not really sure ) now I understood the request 😀
this is the query quarter, which group the rows of the orginal table in table containings rows which date is in tree months
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZTNbsIwEIRfpcqZSKU3jqCqhwh6IBcqxCGA+0MtQEhpVZ6+Voyn9o4Xou4hsfAnz2ZnzHJZDMvnw1c5HBWDovHlVhPbmjv3rovVIEHWXbnF3Gzdcyb3/454MdYevt1iemEezcYzm0tFOguCtl1BqJb7xlcsFJqpmn35cO9+eA0VKU0l9NYVhBYEZD6plgzGEmSoFzG4PircyntXGX80GWrjA5WzKNMNOfRk1pJJLQKAXsghILtQGYcAffpii2bNSVOqJYI8SYeYSB3CfhRbtmh8PHHm0rkAQeJoLkB4/iQknOZO9KFckeFjkDnlayj6kT0/mdClx4DR0pQBaGxgqBdqJQ4cpUlXom4wlX80U7X72+kHpKYfhG40EDX9TIj/p7B/Nf2g9PRXrb0hZHsJ2d5C+jUDol8zINo1y5yh65x9CWb1Cw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, ID = _t, Var1 = _t, Var2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}},"en-US"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] <> null and [Date] <> ""),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Date"}, {"all", each _},GroupKind.Local,(x,y)=>Number.From(y[Date]>=Date.AddMonths(x[Date],3)))
in
#"Grouped Rows"
this is the query quarteDistinctID, which count for each type of product the distinc ID
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsopTVXSUQpWitWBc3yROT7InAgILyg1Ba4HwvZFYiOrgWqITM3JyS+H64FzfVG5aIpBmmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Added Custom" = List.Transform(quarter[all], (t)=> Table.AddColumn(#"Changed Type", "dist", each Table.RowCount(Table.Distinct(Table.SelectRows(t, (r)=> r[Var1]=[Column1] and r[Var2]=[Column2]),"ID"))))
in
#"Added Custom"
if you expect (is not clear to me) a sort of rolling info on previuos tree months:
let
s1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZTNbsIwEIRfpcqZSKU3jqCqhwh6IBcqxCGA+0MtQEhpVZ6+Voyn9o4Xou4hsfAnz2ZnzHJZDMvnw1c5HBWDovHlVhPbmjv3rovVIEHWXbnF3Gzdcyb3/454MdYevt1iemEezcYzm0tFOguCtl1BqJb7xlcsFJqpmn35cO9+eA0VKU0l9NYVhBYEZD6plgzGEmSoFzG4PircyntXGX80GWrjA5WzKNMNOfRk1pJJLQKAXsghILtQGYcAffpii2bNSVOqJYI8SYeYSB3CfhRbtmh8PHHm0rkAQeJoLkB4/iQknOZO9KFckeFjkDnlayj6kT0/mdClx4DR0pQBaGxgqBdqJQ4cpUlXom4wlX80U7X72+kHpKYfhG40EDX9TIj/p7B/Nf2g9PRXrb0hZHsJ2d5C+jUDol8zINo1y5yh65x9CWb1Cw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, ID = _t, Var1 = _t, Var2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(s1,{{"Date", type date}},"en-US"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] <> null and [Date] <> ""),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Date"}, {"all", each _},GroupKind.Local,(x,y)=>Number.From(y[Date]>=Date.AddMonths(x[Date],1))),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "rollingQ", each List.LastN(List.FirstN(#"Grouped Rows"[all], List.PositionOf(#"Grouped Rows"[all],[all])),3)),
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsopTVXSUQpWitWBc3yROT7InAgILyg1Ba4HwvZFYiOrgWqITM3JyS+H64FzfVG5aIpBmmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type 1" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Added Custom 1" = List.Transform(#"Added Custom"[all], (t)=> Table.AddColumn(#"Changed Type 1", "dist", each Table.RowCount(Table.Distinct(Table.SelectRows(t, (r)=> r[Var1]=[Column1] and r[Var2]=[Column2]),"ID"))))
in
Table.FromColumns({#"Added Custom"[Date], #"Added Custom 1"})
Hi, @Anonymous
I am trying the different solutions you've provided me. Thank you so much, they look much better than what I did 🙂
Comparing the last query with a manual calculation I've done in the excel, this how I'd like to get the month of July 2020:
The results are not exactly the same, the differences are in red color.
I'll try to understand why some totals are different.
As I need the last 3 months, to fill the rows for Jul-20 I am taking all the rows in Jun-20, May-20 and Apr-20.
Then I am filtering by the VAR1 and VAR2, and with the results I count the distinct IDs.
I have highlighted a case I think it's relevant: if I count all the IDs for Yellow+S, the total is 3 but, as far as there is an ID repeated (bbbbbb) it will be only counted once. So, in the row 01-Jul-20 | Yellow | S, the total I'd like to see is 2 instead of 3
Maybe I did something wrong when copying your query, I'll double check just in case.
Thanks again!! I really appreciate your help 🙂
change the column [all] in the expressione where
#"Added Custom 1" = List.Transform(#"Added Custom"[all],
to [rollingQ]
#"Added Custom 1" = List.Transform(#"Added Custom"[rollingQ],
but, please PLEASE, be carefull to the space in the text values. For instance Blue is not "Blue" but "Blue " and this cause problem.
I don't have now enough time to analyze and understand the real reason for these problems, but, for safety, find and remove the extra spaces
#########edited#############
use this code to try
let
s1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("nZTNbsIwEIRfpcqZSKU3jqCqhwh6IBcqxCGA+0MtQEhpVZ6+Voyn9o4Xou4hsfAnz2ZnzHJZDMvnw1c5HBWDovHlVhPbmjv3rovVIEHWXbnF3Gzdcyb3/454MdYevt1iemEezcYzm0tFOguCtl1BqJb7xlcsFJqpmn35cO9+eA0VKU0l9NYVhBYEZD6plgzGEmSoFzG4PircyntXGX80GWrjA5WzKNMNOfRk1pJJLQKAXsghILtQGYcAffpii2bNSVOqJYI8SYeYSB3CfhRbtmh8PHHm0rkAQeJoLkB4/iQknOZO9KFckeFjkDnlayj6kT0/mdClx4DR0pQBaGxgqBdqJQ4cpUlXom4wlX80U7X72+kHpKYfhG40EDX9TIj/p7B/Nf2g9PRXrb0hZHsJ2d5C+jUDol8zINo1y5yh65x9CWb1Cw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Date = _t, ID = _t, Var1 = _t, Var2 = _t]),
s2=Table.ReplaceValue(s1,"Blue ","Blue",Replacer.ReplaceText,{"Var1"}),
#"Changed Type" = Table.TransformColumnTypes(s2,{{"Date", type date}},"en-US"),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each [Date] <> null and [Date] <> ""),
#"Grouped Rows" = Table.Group(#"Filtered Rows", {"Date"}, {"all", each _},GroupKind.Local,(x,y)=>Number.From(y[Date]>=Date.AddMonths(x[Date],1))),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "rollingQ", each Table.Combine(List.LastN(List.FirstN(#"Grouped Rows"[all], List.PositionOf(#"Grouped Rows"[all],[all])),3))),
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsopTVXSUQpWitWBc3yROT7InAgILyg1Ba4HwvZFYiOrgWqITM3JyS+H64FzfVG5aIpBmmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Changed Type 1" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
#"Added Custom 1" = List.Transform(#"Added Custom"[rollingQ], (t)=> Table.AddColumn(#"Changed Type 1", "dist", each try Table.RowCount(Table.Distinct(Table.SelectRows(t, (r)=> r[Var1]=[Column1] and r[Var2]=[Column2] ),"ID" ))otherwise 0)),
q1=Table.FromColumns({#"Added Custom"[Date], #"Added Custom 1"}),
#"Expanded Column2" = Table.ExpandTableColumn(q1, "Column2", {"Column1", "Column2", "dist"}, {"Column2.Column1", "Column2.Column2", "Column2.dist"}),
#"Added Custom 2" = Table.AddColumn(#"Expanded Column2", "mmm", each Date.MonthName([Column1],"it-IT")),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom 2",{"Column1"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"mmm", "Column1"}}),
#"Pivoted Column" = Table.Pivot(Table.TransformColumnTypes(#"Renamed Columns", {{"Column1", type text}}, "it-IT"), List.Distinct(Table.TransformColumnTypes(#"Renamed Columns", {{"Column1", type text}}, "it-IT")[Column1]), "Column1", "Column2.dist")
in
#"Pivoted Column"
The problem about space ending Blue was that I hand writed an auxiliary table for the colours and sizes and, of course, I couldn't realize there was a space after Blue in the table you linked
Hi @Anonymous
That's fantastic!
It is exactly what I needed. I will adapt your query to my data, as I have many other columns and data processing needed but yes, the point I couldn't solve is this.
You're a star, thank you so much for your help 🙂
adapt this to you data and see if it works
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bdAxDoAgDAXQu3QmgXbzLIQBCLDoIove3iYWDdahXV5+W/Ae0CJacriAgRgj98TVIZjZUkpim7I7d3Ktw2hYzllyh8Zaq6C2UooMfRYykfuPDWqtcd+ndcPeM/uX9OtoTsmnhAs=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [when = _t, id = _t, v1 = _t, v2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"when", type date}, {"id", type text}, {"v1", type text}, {"v2", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "ddd", each Table.RowCount(Table.Distinct(Table.SelectRows(#"Changed Type",
(d)=> Date.AddMonths(d[when],+3) >= [when] and d[when]<[when]),{"id","v1","v2"})))
in
#"Added Custom"
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.