Supplies are limited. Contact info@espc.tech right away to save your spot before the conference sells out.
Get your discountScore big with last-minute savings on the final tickets to FabCon Vienna. Secure your discount
I have a large dataset in PowerBI Service, that was fetched from a web API. It is separated in multiple partitions (one per month) and there is an incremental refresh setup once per day. It only refresh one day's worth of data. It contains about 2 years of data which took a long time to import.
Now here is my issue:
I need te create reports which can involve calculating some values, for example the difference between two dates (endDate - startDate). What is the best way to do that?
I would really like to not refresh the whole dataset because that will be resource and time intensive, and it seems it shouldn't be needed, since all the data is already there, and it just needs to compute a value based on already existing values.
Also, what if I need to compute another calculation later, will I also need to reimport the whole dataset? That doesn't seem efficient..
Is there a way to only compute a Calculated column? Or is there another solution I'm missing?
Thanks for your help!
Solved! Go to Solution.
I have found the solution.
I can actually use Calculated Columns. I thought the whole dataset needed to be refreshed in order for them to be computed, since I was seeing this error message
The expression referenced column 'MyTable'[My Calculated Column] which does not hold any data because it needs to be recalculated or refreshed.
But I've noticed that if I only refresh a single partition (through SSMS), the column is computed for the WHOLE dataset. I can therefore add any Calculated Column I might need, and then use Measure to compute aggregation functions on them.
I have found the solution.
I can actually use Calculated Columns. I thought the whole dataset needed to be refreshed in order for them to be computed, since I was seeing this error message
The expression referenced column 'MyTable'[My Calculated Column] which does not hold any data because it needs to be recalculated or refreshed.
But I've noticed that if I only refresh a single partition (through SSMS), the column is computed for the WHOLE dataset. I can therefore add any Calculated Column I might need, and then use Measure to compute aggregation functions on them.
Hi @mverwil,
Based on your description, if you don't want to import the dataset all together to do the calculations, you're better off using MEASUREMENT to do the calculations. This is because measure only needs to perform calculations based on the context. You mentioned that you used incremental refresh, which is a good thing to try. You can dynamically get the max and min dates.
DateDifference = DATEDIFF(MIN(Table[StartDate]), MAX(Table[EndDate]), DAY)
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Thanks for your reply! I cannot just have a measure, I need this calculation for each row.
id | startDate | endDate | duration (Calculated Column) |
1 | 01/03/2024 | 05/03/2024 | 4 days |
2 | 02/03/2024 | 03/03/2024 | 1 day |
3 | 05/03/2024 | 15/03/2024 | 10 days |
I need to have the duration for each row, and then I can add a measure for it. But I have found that the Calculated Column can be computed by refreshing any small partition (the last day), see my answer.
Hi mverwil
Is there a specific reason why you can't use Calculated Measures, instead of Calculated Columns?
They are both more dynamic in nature, and are only computed once the measure is used in a visualization.
Measures in Power BI Desktop - Power BI | Microsoft Learn
_____________________________________________________
I hope my comment was helpful.
If your question was answered, please mark your post as 'Solved' and consider giving me a 'Thumbs Up'.
Find me on LinkedIn, Sessionize, or my blog Downhill Data
Hi Jonvoge,
Thanks for your answer.
Maybe I misunderstand something, but I was under the impression that measures are used for aggregations. So I would use a measure to show the SUM of the prices for example. What I want is to have, for each row, the difference between two fields of this row.
I did try to create a Measure, but I got this error:
Calculation error in measure 'Table'[timeDifference]: A single value for column 'Column1.endDate' in table 'Table' cannot be determined. This can happen when a measure formula refers to a column that contains many values without specifying an aggregation such as min, max, count, or sum to get a single result.
I also tried to have directly a measure for the average by doing this
AVERAGE(Table[Column1.endDate] - IncidentsData[Column1.startDate])
But it said
The AVERAGE function only accepts a column reference as an argument.
So it seems I need to create both a Calculated column for the difference, and a Measure for the average of this calculated column.