Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hello All,
I would like to have blank values to be filled with interpolated values as per the dates and name column values.
Sample data:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVg7blwxDLxK4DoRxD+ZLgiQJilSBGkM3/8a4VvEi/UjtXqGXSzI0UokZzTy6+sLyJeJX3AifZrz65wvn1944o/vYDZ+/81PL2+fP2bBpSy8lEWXsvhSllzK0ktZdinLL2XFlSy4VHu4VHtoa080UJWnABRA2wbCAQAJICuAtiMEA12ExakA2ubkN0wyDjcpgLZPyGNOw/zjAmhbllsyJEefXgBt97JKk5hMIAqgbeRRVkyEkp4BeKmn2PY0S4MqSBRlH9i2F3WIehZTK+AJy3TDfpTBCCwmXADtxnMfkXOGzFIAy7kMC1CGAujnkgZD9rSm91OZ6aSiIF4A/VTSuE1AyW5H8rZ9ZeeogH4kZwJ4ekwqgHYk0YcTmBBaAbQjmQBDxThn9xqDMSxYaGqpTy83xwHQVKQeYKE8c5CEaFBpWa88SXOfIIqznLhXHrSBM/IIVA/d9xhGdszEpQL6NicfCdgI6hn6NuOQg44etay98iQgx4JTrwqgVx4cSXjI33KGXnmebGkpQgqU4ll21GvQkW8HdwALYHXF6I36/v8ItpGj4whJHE36F8CTu9E2GlSyVrsVOAqIXABPLIrt9CaLgIEUpgXwxK3YRmdSBbKyxkmkAlgNYCa7kdRmrAZQJPLn/XK1jdSUrP4GSUV1nDS93GTvQ/Hrz/ef355402UcN3HaxHkTl01cN3HbxH0Tj+dx2NQPNvWDU/2Uh0GOAFX/dqqk5pCn+VRHXJnJe6oPUASgxoKd6quRpsemp4isDOR7qkimMpNrNadWVgXPCWy+f1N/ONXfjmtmQop08dy4aQVuWoHnVszctPiUyhpsp7o6v3upbFDIjYMrz3dPjeGRyhVYzVXLtWru7rXC4wDAKOXy6GlXfdzDUkpAItUD9Qysnu1eVsv3AIajVv/rZcQNOK1p6Mqn3VNT4mhCesalSXugGAWpe6nKmayqYybHwr0ueh6WfCykhyOaxbSeiZu8MQdwtKUXu696PLaQj5OtXNgDxfLRl+tiqdWZuDbTDMrthbVyXg90A6ZsVbNXL6m5Zl52tVlnEq/M1TLeMrd6p2X8TAw43DBCWrGVSfq4VPVEy3i7leqDlvGWmNXwLOMtG6uzeRC5fMtHOFQHdm6wjZiRbKim6izSMtJ6sd//C1FtzGp/ZwqyH69lhODc39s/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Datetime = _t, Name = _t, value = _t]),
#"Added Suffix" = Table.TransformColumns(Source, {{"Datetime", each _ & ":00", type text}}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Added Suffix", {{"Datetime", type datetime}}, "en-NA"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"value", type number}})
in
#"Changed Type"
I have a column which has tag names and here i have sample data of two tags for two days.
The columns i have here as Datetime, NAME and value columns.
for each name i want to fill the blank values using interpolation by Name column of the datetime.
I found multiple blogs and community questions on interpolation most of them using dax but not with power query.
https://community.powerbi.com/t5/Community-Blog/Linear-Interpolation-with-Power-BI/ba-p/341202
Can anyone please help me on calculating interpolation values.
Thanks,
Mohan V.
@ronrsnfld i have added few more data in this one drive folder of excel file.
data.csv
https://1drv.ms/u/s!AmauTLNmHKexhGk_35S3lYSDxxQs?e=5NcD4K
Could you please check on the same.
Unable to download your file.
Below is code that
Custom function to do the interpolateion
Rename a blank query => fnStraightLineInterpolation
(t as table)=>
let
#"Grouped Rows" = Table.Group(t, {"value"}, {
{"Null Count", each if [value]{0}=null then List.Count([value]) else 0, Int64.Type}},
GroupKind.Local),
#"Shifted" = Table.FromColumns(
Table.ToColumns(#"Grouped Rows")
& {List.RemoveFirstN(#"Grouped Rows"[value] & {null})}
& {{null} & List.RemoveLastN(#"Grouped Rows"[value])},
type table[value=number,Null Count=Int64.Type, after Null=number, before Null=number]),
#"Added Custom" = Table.AddColumn(Shifted, "Interpolation", each
if [Null Count] = 0 then {[value]}
else let
increment = ([after Null] - [before Null]) / ([Null Count]+1),
values = if increment = null
then List.Repeat({null},[Null Count])
else List.Numbers([before Null] + increment, [Null Count], increment)
in
values, type list),
//Merge back with Datetime and Name columns
#"Result" = Table.FromColumns(
Table.ToColumns(Table.SelectColumns(t,{"Datetime","Name"}))
& {List.Combine(#"Added Custom"[Interpolation])},
type table[Datetime=datetime, Name=text, Value=number])
in
#"Result"
Main Query
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jVg7blwxDLxK4DoRxD+ZLgiQJilSBGkM3/8a4VvEi/UjtXqGXSzI0UokZzTy6+sLyJeJX3AifZrz65wvn1944o/vYDZ+/81PL2+fP2bBpSy8lEWXsvhSllzK0ktZdinLL2XFlSy4VHu4VHtoa080UJWnABRA2wbCAQAJICuAtiMEA12ExakA2ubkN0wyDjcpgLZPyGNOw/zjAmhbllsyJEefXgBt97JKk5hMIAqgbeRRVkyEkp4BeKmn2PY0S4MqSBRlH9i2F3WIehZTK+AJy3TDfpTBCCwmXADtxnMfkXOGzFIAy7kMC1CGAujnkgZD9rSm91OZ6aSiIF4A/VTSuE1AyW5H8rZ9ZeeogH4kZwJ4ekwqgHYk0YcTmBBaAbQjmQBDxThn9xqDMSxYaGqpTy83xwHQVKQeYKE8c5CEaFBpWa88SXOfIIqznLhXHrSBM/IIVA/d9xhGdszEpQL6NicfCdgI6hn6NuOQg44etay98iQgx4JTrwqgVx4cSXjI33KGXnmebGkpQgqU4ll21GvQkW8HdwALYHXF6I36/v8ItpGj4whJHE36F8CTu9E2GlSyVrsVOAqIXABPLIrt9CaLgIEUpgXwxK3YRmdSBbKyxkmkAlgNYCa7kdRmrAZQJPLn/XK1jdSUrP4GSUV1nDS93GTvQ/Hrz/ef355402UcN3HaxHkTl01cN3HbxH0Tj+dx2NQPNvWDU/2Uh0GOAFX/dqqk5pCn+VRHXJnJe6oPUASgxoKd6quRpsemp4isDOR7qkimMpNrNadWVgXPCWy+f1N/ONXfjmtmQop08dy4aQVuWoHnVszctPiUyhpsp7o6v3upbFDIjYMrz3dPjeGRyhVYzVXLtWru7rXC4wDAKOXy6GlXfdzDUkpAItUD9Qysnu1eVsv3AIajVv/rZcQNOK1p6Mqn3VNT4mhCesalSXugGAWpe6nKmayqYybHwr0ueh6WfCykhyOaxbSeiZu8MQdwtKUXu696PLaQj5OtXNgDxfLRl+tiqdWZuDbTDMrthbVyXg90A6ZsVbNXL6m5Zl52tVlnEq/M1TLeMrd6p2X8TAw43DBCWrGVSfq4VPVEy3i7leqDlvGWmNXwLOMtG6uzeRC5fMtHOFQHdm6wjZiRbKim6izSMtJ6sd//C1FtzGp/ZwqyH69lhODc39s/", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Datetime = _t, Name = _t, value = _t]),
#"Added Suffix" = Table.TransformColumns(Source, {{"Datetime", each _ & ":00", type text}}),
#"Changed Type with Locale" = Table.TransformColumnTypes(#"Added Suffix", {{"Datetime", type datetime}}, "en-NA"),
#"Changed Type" = Table.TransformColumnTypes(#"Changed Type with Locale",{{"value", type number}}),
//Group by Name
// then interpolate each subgroup
#"Grouped Rows" = Table.Group(#"Changed Type", {"Name"}, {
{"All", each fnStraightLineInterpolation(_)}}),
//Remove unneeded column and Expand the table
#"Removed Columns" = Table.RemoveColumns(#"Grouped Rows",{"Name"}),
#"Expanded All" = Table.ExpandTableColumn(#"Removed Columns", "All", {"Datetime", "Name", "Value"}, {"Datetime", "Name", "Value"}),
#"Changed Type1" = Table.TransformColumnTypes(#"Expanded All",{{"Datetime", type datetime}, {"Name", type text}, {"Value", type number}})
in
#"Changed Type1"
Part of output after interpolation
Graphic output before and after processing
Out of town now. But have a method for straight line Interpolation of missing data points ignoring missing data at very start or end. Will post later this week.
What kind of interpolation? Linear or something else?
Or just a straight line interpolation between the points surrounding the missing values?
If the latter, how do you want to extend the values for the beginning/end where there is no boundary value?
@ronrsnfld straight line interpolation between the points surrounding the missing values is what I'm looking for.
For the straight line interpretation solution, how will you handle the situation where there is no starting value as at the top of your table?
@ronrsnfld Is it possible to make it by max and min value's available just like how it is calculated using dax in below link
https://community.powerbi.com/t5/Community-Blog/Linear-Interpolation-with-Power-BI/ba-p/341202
That solution does not show values prior to the first given value. In your table, there are a number of empty values prior to the first known value. That requires extending the trend line. And my question is how do you want to handle that if you are doing a straight line Interpolation between given values?
@ronrsnfld Thanks for all the inputs but unfortunately I do not have any idea on how to manage it.
could you please suggest a best method of implementation where we can fill those missing data.
The suggestions would have to do with the shape of the data. I'm sure someone better versed in statistics would have a better idea, but the data you present does not seem lend itself to a least-squares trend.
Here is a graph of your data for the first day:
For where you have one or two blanks within the data, I suppose you could just "connect the dots". But what do you do for the first twelve data points? There is nothing to connect to, and the pattern is not well defined. Although anything can be programmed, you are the one that will need to decide what to do. I don't know of a viable method to interpolate long periods given this kind of data. One or two missing are probably not a problem. But you have much more than that at the start of your data.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.