Re: Merge rows based on continuity

Martin_Songstad · ‎04-24-2023

Hi!

How can I calculate a new table like this?

Ages 19 and 20 are adjacent to each other in this filter context, so I want to lump them together in the background to reduce the number of iterations.

barritown · ‎04-24-2023

Hi @Martin_Songstad,

I can propose such a query:

Result Table = 
VAR _temp = ADDCOLUMNS ( data,
                         "NewMin", COALESCE ( LOOKUPVALUE ( data[Min], data[Max], data[Min] ), data[Min] ),
                         "NewMax", COALESCE ( LOOKUPVALUE ( data[Max], data[Min], data[Max] ), data[Max] ) )
RETURN DISTINCT ( SELECTCOLUMNS (_temp, "Min", [NewMin], "Max", [NewMax] ) )

If you have no more than 2 adjacent intervals, it should work.

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian

Martin_Songstad · ‎04-25-2023

Thanks @barritown !

I am not getting excactly the result that I was looking for, but I will look into that code.

barritown · ‎04-25-2023

My solution will break if you try to merge three lines into one, like here:

Age	Min	Max
19	6575	7000
20	7000	7670
21	7670	8766

If your problem is of a different nature, you can try generating more toy data - maybe I'll come up with some other idea.

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian

Martin_Songstad · ‎04-25-2023

Yes i would like to merge all continuously selected age groups.

This is the best take I have come up with myself so far:

EVALUATE

VAR __Simulated_selection =

FILTER(
	ALL(Aldersgrupper[Alder]),
	Aldersgrupper[Alder] = 19 || Aldersgrupper[Alder] = 20 || Aldersgrupper[Alder] = 22
	)

VAR __MinValues =
CALCULATETABLE(
CALCULATETABLE(
	SELECTCOLUMNS(Aldersgrupper,
					"Min", Aldersgrupper[Min],
					"Rank", RANKX(Aldersgrupper, [Min],,1)
					),
	EXCEPT( VALUES( Aldersgrupper[Min] ), 
			SELECTCOLUMNS(Aldersgrupper,
							"Max", Aldersgrupper[Max]) )
),
__Simulated_selection)

VAR __MaxValues =
CALCULATETABLE(
CALCULATETABLE(
	SELECTCOLUMNS(Aldersgrupper,
					"Max", Aldersgrupper[Max],
					"Rank", RANKX(Aldersgrupper, [Max],,1)
					),
	EXCEPT( VALUES( Aldersgrupper[Max] ), 
			SELECTCOLUMNS(Aldersgrupper,
							"Min", Aldersgrupper[Min]) )
),
__Simulated_selection)

VAR __AgeGroupsCompressed =
NATURALINNERJOIN(__MinValues,__MaxValues )

RETURN __AgeGroupsCompressed

barritown · ‎04-25-2023

I have an idea, which works on the test dataset below, but with my current [DAX] knowledge I am unable to pack it up into one single query. Take a look, maybe you will be able to achieve it.

Test dataset:

Step 1.

We put all the minimums and maximums into one table, mark (1 - max, 0 - min in Status) and rank them.

Step 2 (this is the main obstacle why I cannot pack it up into a single query).

We add a calculated column which flags the first minimum and last maximum in each sequence.

Step 3.

We produce an output by filtering, re-ranking and joining tables.

For convenience, here are those tables and column in the text format:

one_column_view = 
VAR _tmp = UNION (
ADDCOLUMNS ( SELECTCOLUMNS ( data, "Value", data[Min] ), "Status", {0} ), 
ADDCOLUMNS ( SELECTCOLUMNS ( data, "Value", data[Max] ), "Status", {1} ) )
VAR _tmp2 = ADDCOLUMNS ( _tmp, "id", RANKX ( _tmp, [Value], , 1, Dense ) )
RETURN _tmp2

flag = 
VAR cur_id = [id]
VAR part1 = IF ( one_column_view[Status] = 0 && one_column_view[Status] <> CALCULATE ( MAX ( one_column_view[Status] ) , ALL ( one_column_view ), one_column_view[id] = cur_id - 1 ) || [id] = 1, 1, 0 )
VAR part2 = IF ( one_column_view[Status] = 1 && one_column_view[Status] <> CALCULATE ( MAX ( one_column_view[Status] ) , ALL ( one_column_view ), one_column_view[id] = cur_id + 1 ), 1, 0 )
RETURN part1 + part2

result = 
VAR _tmp = CALCULATETABLE ( one_column_view, one_column_view[flag] = 1 )
VAR _tmp2 = SELECTCOLUMNS ( _tmp, "Value", [Value], "Rank", RANKX ( _tmp, [Value], , ASC ), "Status", [Status] )
VAR _mins = SELECTCOLUMNS ( FILTER ( _tmp2, [Status] = 0 ), "Min", [Value], "Rank", INT ( [Rank] / 2 ) + 1 )
VAR _maxs = SELECTCOLUMNS ( FILTER ( _tmp2, [Status] = 1 ), "Max", [Value], "Rank", INT ( [Rank] / 2 ) )
RETURN NATURALINNERJOIN ( _mins, _maxs )

Best Regards,

Alexander

My YouTube vlog in English

My YouTube vlog in Russian