The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi,
I have very large tables having more than 1 million records and assume they look like following. For a particular application, I am interested only on data regarding customers of prodtype "A".
To have a star schema, would below import SQL queries work? And are they the most efficient?
Applications Table : SELECT * FROM Applications WHERE ProdType = 'A'
Alerts Table:
WITH A_Customer_Codes AS(
SELECT
DISTINCT CustomerID
FROM
Alert
WHERE ProdType = 'A'
)
,
Alerts AS (
SELECT
*
FROM Alert
CusID IN (SELECT CustomerID FROM A_Customer_Codes)
I am asking this because If we import alert table as SELECT * FROM Alert and join using CustomerID/CusID, then it will load a huge amount of data. Thanks a lot for your guidance.
Solved! Go to Solution.
You can try this
SELECT *
FROM Alert
WHERE CusID IN (
SELECT DISTINCT CustomerID
FROM Applications
WHERE ProdType = 'A'
);
Thanks,
Pankaj
If this solution helps, please accept it and give a kudos (Like), it would be greatly appreciated.
Hi @luckygirl ,
@pankajnamekar25 , proposed query efficiently filters the Alert table by selecting customers from the Applications table with ProdType = 'A', meeting your requirement. This method minimizes unnecessary data loading. To enhance performance, please ensure that CustomerID in Applications and CusID in Alert are indexed.
Let us know if further optimizations are needed.
If your issue is resolved, please mark it as the accepted solution. This will help others facing a similar issue.
Hi @luckygirl ,
@pankajnamekar25 , proposed query efficiently filters the Alert table by selecting customers from the Applications table with ProdType = 'A', meeting your requirement. This method minimizes unnecessary data loading. To enhance performance, please ensure that CustomerID in Applications and CusID in Alert are indexed.
Let us know if further optimizations are needed.
If your issue is resolved, please mark it as the accepted solution. This will help others facing a similar issue.
You can try this
SELECT *
FROM Alert
WHERE CusID IN (
SELECT DISTINCT CustomerID
FROM Applications
WHERE ProdType = 'A'
);
Thanks,
Pankaj
If this solution helps, please accept it and give a kudos (Like), it would be greatly appreciated.
The queries look good. Depending on your capacity the 1M rows will not be a huge deal. I would start by trying to load all of it and see how big your PBIX file is after you load the data. You might be surprised how much you can load in the model and still be efficient and fast. I find it best practice as well to load more than you may need and then filter it for visuals so you don't leave out something you may potentially need later.