The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends September 15. Request your voucher.
Hi All
I'm completelly new to Power Query so have been struggling and stumbling along
I have a data source (3 CSV files totalling 2.5m rows) which I've pulled into one table. It has the below columns plus 20 more
Table A: 'Combined Table'
...and then ran the below steps
let
Source = Table.Combine({CUSTOMER_LTV_NOV_0_6300000, CUSTOMER_LTV_NOV_6300000_7300000, #"CUSTOMER_LTV_NOV_7300000+"}),
#"Added Custom" = Table.AddColumn(Source, "Custom", each [CONFIRMED_DATE]),
#"Changed Type" = Table.TransformColumnTypes(#"Added Custom",{{"Custom", type date}}),
#"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Custom]), Int64.Type),
#"Removed Columns" = Table.RemoveColumns(#"Inserted Year",{"Custom"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Year", "Confirmed Year"}}),
#"Added Custom1" = Table.AddColumn(#"Renamed Columns", "Custom", each [GROSS_REVENUE_EUR]-[PAYMENT_EUR]-[COMMISSION_FEE_EUR]),
#"Renamed Columns1" = Table.RenameColumns(#"Added Custom1",{{"Custom", "Net Revenue"}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Renamed Columns1",{{"Net Revenue", type number}}),
#"Removed Other Columns" = Table.SelectColumns(#"Changed Type1",{"CLIENT_ID"}),
#"Removed Duplicates" = Table.Distinct(#"Removed Other Columns"),
#"Added Custom2" = Table.AddColumn(#"Removed Duplicates", "Custom", (CurrentClient) => Table.AddIndexColumn ( Table.Sort ( Table.SelectRows (Source, (NewClient) => NewClient[CLIENT_ID] = CurrentClient[CLIENT_ID]), { "CONFIRMED_DATE", Order.Ascending}), "OrderCount", 1, 1 )),
#"Removed Other Columns1" = Table.SelectColumns(#"Added Custom2",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns1", "Custom", {"BOOKING_ID", "CLIENT_ID", "OrderCount"}, {"Custom.BOOKING_ID", "Custom.CLIENT_ID", "Custom.OrderCount"})
in
#"Expanded Custom"
I've tried to load the output to the datamodel but it takes forever to load. I left it overnight and it did not complete as I believe it's calling all 2.5m rows for each rank
Please help! Is there anyway to reorder the sorting to speed this up?
Hello @inzimam - Based on your script, it looks like the final result you are trying to achieve is a table with three columns - order count (an index of the client's order dates sorted ascending) grouped by booking ID and client ID. Make sure the three tables being appended are not loaded to the data model, then please try this new table instead. Based on the script you provided, this should achieve the same result in a more efficient way. Unnecessary transformations are omitted with the result being achieved by grouping the data instead of getting the distinct client IDs and adding an aggregation by means of referencing previous steps.
let
Source = Table.Combine({CUSTOMER_LTV_NOV_0_6300000, CUSTOMER_LTV_NOV_6300000_7300000, #"CUSTOMER_LTV_NOV_7300000+}),
#"Removed Other Columns" = Table.SelectColumns(Source,{"BOOKING_ID", "CONFIRMED_DATE", "CLIENT_ID"}),
#"Grouped Rows" = Table.Group(#"Removed Other Columns", {"BOOKING_ID", "CLIENT_ID"}, {{"Data", each Table.AddIndexColumn ( Table.Sort ( _, {{"CONFIRMED_DATE", Order.Ascending}} ), "OrderCount", 1, 1, Int64.Type )}}),
#"Expanded Data" = Table.ExpandTableColumn(#"Grouped Rows", "Data", {"OrderCount"}, {"OrderCount"}),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Data",{{"OrderCount", Int64.Type}})
in
#"Changed Type"
Hello @inzimam ,
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
I reworked your code a little. For example, you needed 4 steps to generate the “Year” column.
let
// Replace the Source Step
Source = Table.FromRecords({
[BOOKING_ID = 5537540, CLIENT_ID = 6033044, CONFIRMED_DATE = #date(2019, 4, 2), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5],
[BOOKING_ID = 7752055, CLIENT_ID = 3625175, CONFIRMED_DATE = #date(2022, 9, 5), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5],
[BOOKING_ID = 7877069, CLIENT_ID = 5082637, CONFIRMED_DATE = #date(2022, 12, 8), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5],
[BOOKING_ID = 8040605, CLIENT_ID = 6268429, CONFIRMED_DATE = #date(2023, 2, 15), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5],
[BOOKING_ID = 6258427, CLIENT_ID = 2057527, CONFIRMED_DATE = #date(2020, 6, 24), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5],
[BOOKING_ID = 5707824, CLIENT_ID = 6145096, CONFIRMED_DATE = #date(2019, 8, 7), GROSS_REVENUE_EUR = 10, PAYMENT_EUR = 2, COMMISSION_FEE_EUR = 5]
}),
Type = Table.TransformColumnTypes(Source,{{"CONFIRMED_DATE", type date}, {"BOOKING_ID", Int64.Type}, {"CLIENT_ID", Int64.Type}, {"GROSS_REVENUE_EUR", Currency.Type}, {"PAYMENT_EUR", Currency.Type}, {"COMMISSION_FEE_EUR", Currency.Type}}),
ColumConfirmedYear = Table.AddColumn(Type, "Confirmed Year", each Date.Year([CONFIRMED_DATE]), Int64.Type),
ColumnNetRevenue = Table.AddColumn(ColumConfirmedYear, "Net Revenue", each [GROSS_REVENUE_EUR]-[PAYMENT_EUR]-[COMMISSION_FEE_EUR], type number),
Column = Table.SelectColumns(ColumnNetRevenue,{"CLIENT_ID"}),
Duplicates = Table.Distinct(Column),
Custom = Table.AddColumn(Duplicates, "Custom", (CurrentClient) => Table.AddIndexColumn ( Table.Sort ( Table.SelectRows (ColumConfirmedYear[[BOOKING_ID], [CLIENT_ID], [CONFIRMED_DATE]], (NewClient) => NewClient[CLIENT_ID] = CurrentClient[CLIENT_ID]), { "CONFIRMED_DATE", Order.Ascending}), "OrderCount", 1, 1 )),
RemovedColumns = Table.SelectColumns(Custom,{"Custom"}),
Expanded = Table.ExpandTableColumn(RemovedColumns, "Custom", {"BOOKING_ID", "CLIENT_ID", "OrderCount"}, {"BOOKING_ID", "CLIENT_ID", "OrderCount"}),
Type2 = Table.TransformColumnTypes(Expanded,{{"BOOKING_ID", Int64.Type}, {"CLIENT_ID", Int64.Type}, {"OrderCount", Int64.Type}})
in
Type2
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
Thanks for your quick reply -
Just to be clear, the table image I shared was just a snippet - theree are 2.5m rows of data in total from the 3 files
Source = Table.Combine({CUSTOMER_LTV_NOV_0_6300000, CUSTOMER_LTV_NOV_6300000_7300000, #"CUSTOMER_LTV_NOV_7300000+"}),
Can I just replace the Source part of your code with the above? Sorry if this is a silly question
Hello @inzimam,
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
Please check that you have deactivated your three sub-queries. So "Enable load" should be disabled. This can also be seen from the fact that they are shown in italics.
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
Thank you, where do I go to deactivate them. I have changed their Load to connection Only but can't see where to disable completely
Hello @inzimam,
Right click the query and then uncheck Enable loading".
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
For some reason I don't have that option
Hello @inzimam,
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
1. right clickt on your query
2. uncheck "Enable load"
3 the 3 queries are deactivated if they are shown in italics.
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
Hello @inzimam,
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
The following code should then be the correct call.
let
Source = Table.Combine({CUSTOMER_LTV_NOV_0_6300000, CUSTOMER_LTV_NOV_6300000_7300000, #"CUSTOMER_LTV_NOV_7300000+"}),
Type = Table.TransformColumnTypes(Source,{{"CONFIRMED_DATE", type date}, {"BOOKING_ID", Int64.Type}, {"CLIENT_ID", Int64.Type}, {"GROSS_REVENUE_EUR", Currency.Type}, {"PAYMENT_EUR", Currency.Type}, {"COMMISSION_FEE_EUR", Currency.Type}}),
ColumConfirmedYear = Table.AddColumn(Type, "Confirmed Year", each Date.Year([CONFIRMED_DATE]), Int64.Type),
ColumnNetRevenue = Table.AddColumn(ColumConfirmedYear, "Net Revenue", each [GROSS_REVENUE_EUR]-[PAYMENT_EUR]-[COMMISSION_FEE_EUR], type number),
Column = Table.SelectColumns(ColumnNetRevenue,{"CLIENT_ID"}),
Duplicates = Table.Distinct(Column),
Custom = Table.AddColumn(Duplicates, "Custom", (CurrentClient) => Table.AddIndexColumn ( Table.Sort ( Table.SelectRows (ColumConfirmedYear[[BOOKING_ID], [CLIENT_ID], [CONFIRMED_DATE]], (NewClient) => NewClient[CLIENT_ID] = CurrentClient[CLIENT_ID]), { "CONFIRMED_DATE", Order.Ascending}), "OrderCount", 1, 1 )),
RemovedColumns = Table.SelectColumns(Custom,{"Custom"}),
Expanded = Table.ExpandTableColumn(RemovedColumns, "Custom", {"BOOKING_ID", "CLIENT_ID", "OrderCount"}, {"BOOKING_ID", "CLIENT_ID", "OrderCount"}),
Type2 = Table.TransformColumnTypes(Expanded,{{"BOOKING_ID", Int64.Type}, {"CLIENT_ID", Int64.Type}, {"OrderCount", Int64.Type}})
in
Type2
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
Thank you
I've tried to run this and i get this error
DataFormat.Error: We couldn't convert to Number.
Details:
UNKNOWN
Looks like it's at the Expanded step
Hello
Yes, but I can't see any error message on your screenshot. You can't find the "enable loading" button because you're using Power Query in Excel. We started with Power BI. However, there is a similar button here.
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github
Hello @inzimam
At which step exactly? Can you show a screenshot of your output?
If my post helped you, please give me a 👍kudos and mark this post with Accept as Solution.
Best regards from Germany
Manuel Bolz
🟦Follow me on LinkedIn
🟨How to Get Your Question Answered Quickly
🟩Fabric Community Conference
🟪My Solutions on Github