Solved: Re: Fact Table Data Structure for non-business Tim...

Anonymous · ‎07-23-2020

Hi all,

I am new to PBI, but have been using Excel for many many years. I recently become interested in analyzing some time stamp data in PBI for its cloud based reporting rather than in Excel, and I'm in need of some advice. I have gone through many tutorials and I believe I understand how to use a star schema for the tables, and I believe that I can work out my "Dimension" tables quite easily. Where I need advice is the structuring of the "Fact" table.

Background: I analyze data that is generated from behaviors recorded on video, applying codes and timestamps to the video, and exporting these time stamps. The difference in time between some of the behavior codes is the meat of the data that is used.

Here is an example of the exported data:

Observation id	Behavior	Behavioral category	tStamp (s)
Subject1 Event1	ST	Ve	3.467
Subject1 Event1	AI	Ve	3.967
Subject1 Event1	RTOA	Rr	5.467
Subject1 Event1	FI	Fu	5.667
Subject1 Event1	FTOA	Ft	5.833
Subject1 Event1	RTOD	Rr	9.2
Subject1 Event1	FTOD	Ft	10.667
Subject1 Event1	RTIA	Rr	11.533
Subject1 Event1	FTIA	Ft	12.633

I have queried the data into PBI, and done some transformations, shown in the following M code:

let
    Source = Csv.Document(File.Contents("FILENAME.csv"),[Delimiter=",", Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Observation id", type text}, {"Behavior", type text}, {"Behavioral category", type text}, {"tStamp (s)", type number}}),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Changed Type", "Observation id", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Subject", "Event"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Subject", type text}, {"Event", type text}}),
    #"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"Event", "Subject", "Behavior", "Behavioral category", "tStamp (s)"}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type1", "Index", 1, 1, Int64.Type),
    #"Added Custom" = Table.AddColumn(#"Added Index", "NextObs", each if [Behavior]="ST" then 1 else 0),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "ObsID", each List.Sum(List.FirstN(#"Added Custom"[NextObs],[Index]))),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"NextObs"}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"Index", "Subject", "Event", "Behavior", "Behavioral category", "tStamp (s)", "ObsID"})
in
    #"Reordered Columns1"

you can see that this code adds an index, and some colums to add an observation ID that increments up every time there is an ST behavior code. This is simply because each observation starts with the ST code. in order to look at the specific quantities that I want to analyze, I need to take the difference in time between the tmiestamps of specific behavior codes.

Here is an example of what I mean:

Quantity	Start Behavior	Stop Behavior
Total	ST	DE
At	AI	AO
Ft	FI	FO
FTt	FTOA	FTID
RTt	RTOA	RTID
DCt	ST	DC

Previously I was doing this analysis in Excel, and basically parsing the vertical data from each observation into its own row using a macro like this:

Event	Subject	Observation	Total	At	Ft	FTt	RTt	DCt
1	1	1	DE-ST	AO-AI	FO-AI	FTID-FTOA	RTID-RTOA	DC-ST
1	1	2	DE-ST	AO-AI	FO-AI	FTID-FTOA	RTID-RTOA	DC-ST
1	2	1	DE-ST	AO-AI	FO-AI	FTID-FTOA	RTID-RTOA	DC-ST

This leads me my questions:

In order to be able to analyze the quantities for each event, subject, or even observation, do I have to build the same type/structure table of the data as I have done in excel? Or should I be able to keep the data structured as I have it in my query?
If I shouldn't structure a fact table like I have done, what is the most efficient way of doing the quantity calculations?
Is there a best practice for structuring the data for an analysis like this in PBI?
are there any specific tutorials I should be looking at? I've done/watched a bunch of standard tutorials, but I am having trouble getting past this simple step!

If necessary I can supply example data and pbix files as well.

thanks in advance!

v-zhenbw-msft · ‎07-28-2020

Hi @Anonymous ,

1. Try to open the pbix with Power BI Store version.

2. If you want to get the table like the following screenshot, you have to delete the Observation.id column and Behavioral category column, then pivot the Behavioral column.

3. If you want to filter the fact table based on Observation.id column, you need to stay Observation.id column before pivoting the Behavioral column.

Then you can create a relationship between the fact table and a dim table that contains unique Observation.id.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

View solution in original post

v-zhenbw-msft · ‎07-23-2020

Hi @Anonymous ,

Do you want to calculate the order within the group, and calculate the difference between the next Behavior and the current Behavior?

If yes, we can create two calculate columns to meet your requirement.

The first calculated column calculates the ranking within the group.

Rank group = 
RANKX(FILTER('Table 1','Table 1'[Behavior]=EARLIER('Table 1'[Behavior])),'Table 1'[tStamp (s)],,ASC,Dense)

The second calculated column calculates the difference.

difference = 
var _next_value = CALCULATE(SUM('Table 1'[tStamp (s)]),FILTER('Table 1','Table 1'[Rank group]=EARLIER('Table 1'[Rank group])+1 && 'Table 1'[Behavior]=EARLIER('Table 1'[Behavior])))
return
IF(ISBLANK(_next_value),BLANK(),_next_value - 'Table 1'[tStamp (s)])

If it doesn’t meet your requirement, could you please provide a mockup sample based on fake data?

It will be helpful if you can show us the exact expected result based on the tables.

Please upload your files to OneDrive For Business and share the link here. Please don't contain any Confidential Information or Real data in your reply.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

Anonymous · ‎07-24-2020

Thanks for the reply @v-zhenbw-msft

The method you posted may be useful for me as i delve into this analysis, but I don't believe its the base that I am looking for. I have linked example files and pictures of my current excel analysis for added insight into what I am trying to do. All data is made up.

PBI File:

ExampleAnalysis

Example data file:

ExampleData

Behavior time stamp definitions

BehaviorTimeDefinitions

Excel data structure with calculations for behavioral times example

ExcelDataStructure

Excel analysis cluster-stack plot example

ClusterStackAnalysis

information about the data and current analysis that may help:

currently, I read the data into excel with a macro, and then parse it based on the behavior code and observation number. this builds the initial structure of the data
from there, the calculated behavioral time values are then done horizontally from the main imported data. These calculations are easy arithmetic with some gating using if statements where necessary
the cluster stack plot is the main visual in the analysis. You can see that there are behavioral groups that stack on to each other, and are compared to other groups within each observation.
additional statistical analysis is done on each of the behaviors and their groups, which should be easy to do in PBI.

What I'm really interested in is being able to do similar analysis in PBI, but with the added benefit of being able to include data from all observations from different subjects and events all in the same data file. In that way, I will have a bigger data source for analysis. With PBI, it seems that I will also be able to filter the data much more effectively for reporting, which I'd like to be able to do at all levels: All data, Event(s), Subject(s), Observation(s), etc. From what I can tell in all of the tutorials is that its important to have the correct data structure, which is where I'm looking for help.

Thank you for putting some thought into this. Once I'm able to structure the data correctly for a wide array of analyses in PBI, I should be able to create some far more powerful reports.

v-zhenbw-msft · ‎07-27-2020

Hi @Anonymous ,

We can add an index column and use some transformations to meet your requirement.

1. We need to group the Behavior and Behavioral category.

2. Then we can add a customer column.

3. We removed other columns and expand the customer column.

4. Then we only leave three columns.

5. At last we need to pivot this table and get the result table like your screenshot.

You can add some columns based on your calculate logic, like this,

If you have any question, please kindly ask here and we will try to resolve it.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

Anonymous · ‎07-27-2020

@v-zhenbw-msft , Thank you again for the swift reply!

First off, I was undable to open your pbix file; it says that i need to update to the most recent version of Power BI (I do have the most current version). I tried both 32 and 64 bit versions, even tried backdating to the June Update with no luck.

Second, I was able to recreate the table you made, so thank you for the good instrutions. I did find that the index you addded to the new table did not reference the "DC" data correctly to the correct observation number. I believe I have fixed this by using my ObsNumber column from before and using that as the index column. This does however create some "null" values in the DC column. PLease see the .pbix file in the following link to uderstand what I mean. I have added the DE-ST column and a DC-ST column to test the "null" values.

ExampleAnalysis_v2

Third, here are my current questionsthat were brought up with the table structure you built here:

Will having the Null values in the DC column create any errors? Not all behaviors hapen within every observation, so they need to not interfere with analysis on that observation
Should this new pivoted table be the main fact table at the center of the star schema layout? If so, I believe I will ahve to connect the tables to be able to filter by event and subject as well?
I wanted to make sure that this is the best/mst efficient way of creating this analysis; it already seems quite powerful

Thank you again for the help!

v-zhenbw-msft · ‎07-28-2020

Hi @Anonymous ,

1. Try to open the pbix with Power BI Store version.

2. If you want to get the table like the following screenshot, you have to delete the Observation.id column and Behavioral category column, then pivot the Behavioral column.

3. If you want to filter the fact table based on Observation.id column, you need to stay Observation.id column before pivoting the Behavioral column.

Then you can create a relationship between the fact table and a dim table that contains unique Observation.id.

Best regards,

Community Support Team _ zhenbw

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

BTW, pbix as attached.

Fact Table Data Structure for non-business Time Stamp Analysis help

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly

Fabric Data Days starts November 4th!

Fact Table Data Structure for non-business Time Stamp Analysis help

Helpful resources

Fabric Data Days

Power BI Monthly Update - October 2025

FabCon Atlanta 2026

How to Get Your Question Answered Quickly