Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Duplicate/expand huge number of rows

Hi all,

 

I'm working on a forecast capacity report, and are pulling a Sharepoint list with 100.000+ rows where each row is a future job. I've got the following columns:

 

tblProductions

[ProductionStartDate], [ProductionEndDate], [EngineeringHours], [ProductionHours], [TestHours], [CalculatedNumberOfWorkDays]

 

tblCalendar (all dates from today and 365 days ahead)

[Date]

 

Result should be a column graph visual showing how many manhours needed every day

 

What I have tried to do is to add a custom column in tblCalendar and expanded tblProductions. I guess I could then create calculated fields to divide the number of hours over the production period, remove all records outside the production period, but havent really started on that part because:

 

It takes an hour just to pull the query preview. I gave up loading the data to the report after a few hours of waiting time. I assume what's happening is it runs through all 100.000 rows for each date, which I understand takes time. Note the tblProduction is filtered to only show a fraction of the 100.000 records, in fact only two records in my testing, but it still runs through all 100.000+ records.

 

Question is - are there ways to speed this up? Or even better, does anyone have ideas to simplify my logic? 

 

Cheers

3 REPLIES 3

It sounds like the filter isn't folding back to SharePoint. In PowerQuery, ensure the filter is the step after the Navigation step. This ensures the connector appends the filter to the call. You can see this in action in my post here: https://marqueeinsights.com/how-to-data-mine-a-sharepoint-list-with-power-bi/

 

Also, you may want to check your column data types as I show in the post above. That might be contributing to the error with the rollup.

 

Hope this helps!

--Treb, Power BI MVP

amitchandak
Super User
Super User

If You create a column, the time should be taken while creation. If taken as measure time will be taken while run time. 100K does not seem a big number.

 

Can you share the formula you created?

In between just check your system usage. Is power BI getting RAM and CPU or their other resources using it.

Anonymous
Not applicable

Thanks for replying, Amit. My description was probably not clear. The logic I'm using is to extract data in the query editor and this is what takes so long to update. Let me try to explain in a different way:

 

I have the following table (100 000 records and growing, but only showing two record for reference):

IDProductionStartDateProductionEndDateEngineeringHoursProductionHoursTestHoursCalculatedNumberOfWorkDays
12020/1/12020/1/10301058
22020/1/62020/1/24501002015

 

The number of hours for each type will be spread evenly (Example: EngineeringHours: 30/8 and 50/15) across the calculated number of workdays and presented in a graph. Something like this:

GraphExample.JPG

 

What would be the best way to achieve this?

 

Cheers

 

 

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.