Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Hi all,
I am rather new to PowerBI but I hope some of you can help me with this question.
I am working with an Oracle database holding data on compettition participation for customers. Each competition takes place over 6 days, and customers can compete numerous times on the same day.
The data I am working with is structured like this:
Column 1: CompetitionNumber
Column 2: CompetitionDay
Column 3: CustomerID
So as an example a couple of row could look like:
| CompetitionNumber | CompetitionDay | CustomerID |
| 1 | 1 | 778856 |
| 1 | 1 | 778856 |
| 1 | 2 | 778856 |
| 1 | 2 | 808012 |
What I am trying to do is to get a measure of how many customers participated each day, so what I am doing is a Group BY on CompetitionNumber and CompetitionDay and counting the number of distinct rows.
However I am running into severe PowerBI performance issues - with PowerBI slowly filling up my PC's RAM over severtal hours doing the Gropu By until PowerBI crashes. The table I am working is very large - holding around 100+ million rows. However, when I try to do this via SQLDeveloper on the Oracle database itself it takes around 10 minutes for Oracle to compute the GroupBY.
As I see it there three different ways forward:
What do you think - is there something else I could try to overcome this obstacle?
Many thanks beforehand!
Solved! Go to Solution.
@andershh When connecting to Oracle database you have an option to write SQL query. So in my opinion it is better option to bring data in the format required through SQL and let Oracle processing happen on the database as you've already mentioned.
Another option is to use R script within query editor. So only dataset is imported in power bi desktop, run R script to perform grouping via Run R Script option which adds R step to the applied steps. I think that would be second best option as R is very powerful in performing this type of operation and you will notice less amount of time being spend on grouping.
@andershh When connecting to Oracle database you have an option to write SQL query. So in my opinion it is better option to bring data in the format required through SQL and let Oracle processing happen on the database as you've already mentioned.
Another option is to use R script within query editor. So only dataset is imported in power bi desktop, run R script to perform grouping via Run R Script option which adds R step to the applied steps. I think that would be second best option as R is very powerful in performing this type of operation and you will notice less amount of time being spend on grouping.
@ankitpatira - Thank you for taking the time to try and help me! I think the SQL idea is very good (to be honest I hadn't noticed the option when connecting to a data source). However, since I am using the same Oracle connection for other PowerBI reports and data tasks - I haven't been able to add a duplicate connection to the database with the SQL inserted. If I change the configuration of the Oracle data source, then I will change it for all my Power BI reports that utilizes it - right?
Nevertheless, if this hadn't been the case I am certain your suggestion would have been an excellent solution - I am therefore marking this a solution.
@AlexChen - Also a big thank you to you for taking the time to write a thorough reply. I have moved forward with the third option I specified in the original. I think your suggestion for fixing the table is spot on - however I do not have the admin rights to the database (I can only extract data from it), so I can't utilize your excellent suggestion. But many thanks nevertheless!
Hi,
I think the 3rd option is the best.
But I found the another issue is that there are too many duplicated records in your table.
Please refer to my steps below:
1. create a table call competition_copy,
create table competition_copy as select * from competition;
2. remove duplicated records in competition_copy,
delete from competition_info_copy a where a.rowid > any (select b.rowid from competition_info_copy b where b.competition_number = a.competition_number
and b.competition_day = a.competition_day and b.customer_id = a.customer_id)
3. split competition_copy into severial smaller table such as :
create table competition_a as select * from competition_copy a where a.competition_number = 1;
create table competition_b as select * from competition_copy a where a.competition_number = 2;
Remember to add index for these columns that will be used for query frequently.
4. create an insert trigger for table competition.
When you insert a record into table competition, for example, the new records’ competition_number =1, competition_day = 3, customer_id = XX, Check if this record existed in competition_a table or not. If not , insert into competition_a table.
5. Now you can create a table to get your group by records by combine several smaller tables’ combined records
Table =
Union (
Summarize(
Competition_a,
Competition_a[competition_number],
Competition_a[competition_day],
Competition_a[customer],
“count”, countax(competition_a, customer_id)
),
Summarize(
Competition_b,
Competition_b[competition_number],
Competition_b[competition_day],
Competition_b[customer],
“count”, countax(competition_b, customer_id)
),
….
)
7. Now you can create a measure to show your count of customers quickly. And it will be auto refreshed using your on-premise gateway.
Best Regards
Alex
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!
| User | Count |
|---|---|
| 40 | |
| 38 | |
| 36 | |
| 29 | |
| 28 |
| User | Count |
|---|---|
| 126 | |
| 88 | |
| 78 | |
| 66 | |
| 65 |