High density sampling (error?)

BastiaanBrak · ‎02-07-2019

Hello, I'm confused about the High Density Sampling (HDS) algorithm. I've got a line chart with 166 different series. The documentation on High Density (Line) Sampling indicates that the maximum number of series that can be displayed is 60. It then describes how 60 representative series are selected if - as in my case - the actual total is higher.
https://docs.microsoft.com/en-us/power-bi/desktop-high-density-sampling

Specifically:

The algorithm creates as many bins as possible to create the greatest granularity for the visual. Within each bin, the algorithm finds the minimum and maximum data value, to ensure that important and significant values (for example, outliers) are captured and displayed in the visual.
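As a rough illustration of that description, here is a minimal Python sketch of min/max binning. It is not Power BI's actual implementation: the function name `sample_min_max`, the equal-width bins along the x-axis and the `n_bins` parameter are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y)


def sample_min_max(points: List[Point], n_bins: int) -> List[Point]:
    """Keep only the lowest and highest point of each equal-width x-axis bin.

    Hypothetical helper: it mirrors the documented idea (min and max per bin),
    not the real Power BI algorithm.
    """
    if not points or n_bins < 1:
        return []
    xs = [x for x, _ in points]
    lo, hi = min(xs), max(xs)
    # Fall back to a width of 1.0 when all x values are identical.
    width = (hi - lo) / n_bins or 1.0

    bins: Dict[int, List[Point]] = {}
    for x, y in points:
        # Clamp so the right-most point lands in the last bin.
        idx = min(int((x - lo) / width), n_bins - 1)
        bins.setdefault(idx, []).append((x, y))

    kept: List[Point] = []
    for idx in sorted(bins):
        bucket = bins[idx]
        low = min(bucket, key=lambda p: p[1])
        high = max(bucket, key=lambda p: p[1])
        # A set avoids keeping the same point twice in one-point bins.
        kept.extend(sorted({low, high}))
    return kept


# Made-up data: the spike at x=3 survives the sampling.
demo = [(0, 1), (1, 5), (2, 2), (3, 100), (4, 3), (5, 4)]
print(sample_min_max(demo, n_bins=2))
```

Note that in this sketch the min/max guarantee only covers points that are already in the data being sampled. It says nothing about which series make it into that data in the first place.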

Below are two screenshots, first one with HDS On, second with HDS Off:

Based on these screenshots, HDS appears to be applied as the documentation describes. However, it turns out that, at least in my use case, outliers at the top end are not represented at all when HDS is On; they are left out altogether. (I used targeted filtering to eliminate some series and leave the outliers in.)

I've tried to get my head round the 'Considerations and limitations' section to understand whether this is intended behaviour of HDS, but the points below confuse me. They appear to suggest that the outliers are excluded because they fall after the 60th series alphabetically, which to me would defeat the point of HDS altogether.

When the size of an overall data source is too big, the new algorithm eliminates series (legend elements) to accommodate the data import maximum constraint.
- In this situation, the new algorithm orders legend series alphabetically, starts down the list of legend elements in alphabetical order until the data import maximum is reached, and does not import additional series.
When an underlying data set has more than 60 series (the maximum number of series, as described earlier), the new algorithm orders the series alphabetically, and eliminates series beyond the 60th alphabetically-ordered series.
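Read literally, the 60-series rule quoted above amounts to something like the following Python sketch. This is an assumption-laden illustration, not Power BI code: `keep_first_alphabetical` is a hypothetical name, plain string sorting stands in for whatever ordering Power BI really uses, and the series names and values are made up.

```python
from typing import Dict, List

MAX_SERIES = 60  # series limit quoted in the documentation


def keep_first_alphabetical(
    series: Dict[str, List[float]], limit: int = MAX_SERIES
) -> Dict[str, List[float]]:
    """Keep the first `limit` series by name, ignoring the values entirely."""
    kept_names = sorted(series)[:limit]
    return {name: series[name] for name in kept_names}


# 166 made-up series; the last one alphabetically has by far the steepest slope.
data = {f"Series {i:03d}": [0.0, 1.0, 2.0] for i in range(1, 166)}
data["Series 166"] = [0.0, 50.0, 100.0]

kept = keep_first_alphabetical(data)
print(len(kept))             # 60
print("Series 166" in kept)  # False: dropped despite being the outlier
```

If the selection really works this way, a series is dropped purely because of its name, before any min/max logic can consider its values.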

In any case it seems to me this is undesirable behaviour from HDS but can anyone explain why the outliers are not included by HDS?

Many thanks, Bastiaan

v-yulgu-msft · ‎02-11-2019

Hi @BastiaanBrak,

In any case it seems to me this is undesirable behaviour from HDS but can anyone explain why the outliers are not included by HDS?

What is your desired output? Which outliers were you referring to?

Regards,

Yuliana Gu

Community Support Team _ Yuliana Gu

BastiaanBrak · ‎02-11-2019

hi Yuliana @v-yulgu-msft

Ideally, my desired output is for all 166 series to be visible. If that option is not available, I'd be content with what High Density Sampling is purported to do, i.e. "ensure that important and significant values (for example, outliers) are captured and displayed in the visual" but in my use case HDS does not work as described.

As you can see in the screenshot on the right (High Density Sampling = OFF), there are five series, three in the pink area and two in the blue area, that are not present when High Density Sampling = ON (screenshot on the left). I would argue that at least three of them represent "important and significant values", since they rise faster than the series that HDS includes.

Thanks and hope this helps, Bastiaan

BastiaanBrak · ‎02-13-2019

Anyone??

BastiaanBrak · ‎03-26-2019

Update: you can see the high density sampling error as described above in action in this web report:
https://ahdb.org.uk/bgmec

Specifically: the location in the south-west of England, which the high density sampling algorithm has omitted from the graph, represents the time series with the steepest slope (click the location on the map or select 'South West England' from the Region drop-down to verify), so it should NOT have been omitted.

Can you confirm whether this has now been raised as a glitch, @v-yulgu-msft?