Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Sohaib
Helper II
Helper II

extract the data with respect to few columns

Hi I have these kind of data which i extract from the raw data set. I need to furthure extract it with respect to few columns

Sohaib_1-1712239676190.png

Sohaib_2-1712239866928.png

 

As you can see the above two mentioned columns for every "Eutrancells" unique value their is 0 to 23 "hour" row data for every "Date". I need to calculate the Max value of "RCC User" columns between the range of 0 to 24 hour for every "date" row for each Eutracells and only give the max value of "RRC user" column fo unique "EUtracells" and for each "date" value. Please help

1 ACCEPTED SOLUTION

Ok, but keep in mind that for L_JCL014_I with date 17.3.2024 you have 2 rows with same RRC_USERS max value!

 

dufoq3_0-1712245200798.png

 

let
    Source = Csv.Document(File.Contents("c:\Address\Hourly Table 20240404 192948.csv"),[Delimiter=",", Columns=55, Encoding=1250, QuoteStyle=QuoteStyle.None]),
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    TransformHeaders = Table.TransformColumnNames(PromotedHeaders, each Text.Trim(Text.BetweenDelimiters(_, "Hourly Table[", "]"))),
    ChangedType = Table.TransformColumns(TransformHeaders,{{"Date New", each Date.From(DateTime.From(_, "en-US")), type date}, {"RRC_USERS", each Number.From(_, "en-US"), type number}}),
    GroupedRows = Table.Group(ChangedType, {"EutranCells", "Date New"}, {{"Max RRC_USERS Row", each Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), type table}}),
    CombinedMaxRRC_USERS_Rows = Table.Combine(GroupedRows[Max RRC_USERS Row])
in
    CombinedMaxRRC_USERS_Rows

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

View solution in original post

24 REPLIES 24
Sohaib
Helper II
Helper II

Hi @dufoq3  thanks for your all kind of help. I am kind stuck with new thing. Currently I have to a situation of data in which i have not just only max but the 5 top value of max "VOICE_TRAFFIC_3G_Erl" in 7 days of data with respect to utrancel and date. 
Let me explain you in detail.
1. We have tables of 7 days hourly base data. For each day we have separate csv file.
2. The query we require is we took data from all 7 days or any days provided tables and it give us table with respect to top 5 values of "VOICE_TRAFFIC_3G_Erl". Means no matter how many days data we provide it only took top 5 value data of "VOICE_TRAFFIC_3G_Erl" . Means it check all the values of each date with hour respect and give us back the max top 5 value of all the data provided for 7 days. And all other relevant column included with it.
3. Currently I am attaching 5 days hourly data but it is not an issue as the query will automatically select the top 5 values from all the data. So whether it will be 5 days or any number of days it just has to select 5 top value "VOICE_TRAFFIC_3G_Erl"  in all the data with respect to unique "Utrancell" column. Means the last table have for every Utrancell we have 5 entries of data which is the top 5 values of "VOICE_TRAFFIC_3G_Erl" and all other relevant columns included.

Link of Tables 

@dufoq3 waiting for your reply please help

Hi, try this. Change folder address.

 

let
    Source = Folder.Files("c:\Downloads\PowerQueryForum\Sohaib\CS BH Tables\"),
    CsvToTable = Table.TransformColumns(Source, {{"Content", each Table.PromoteHeaders(Csv.Document(_)), type table}}),
    Combined = Table.Combine(CsvToTable[Content]),
    ChangedType = Table.TransformColumns(Combined, {{"Date New", each Date.From(Text.BeforeDelimiter(_, " ")), type date}, {"VOICE_TRAFFIC_3G_Erl", each Number.From(_, "en-US"), type number}}),
    GroupedRows = Table.Group(ChangedType, {"Utrancell"}, {{"Max5", each Table.MaxN(_, {"VOICE_TRAFFIC_3G_Erl"}, 5), type table}}),
    CombinedMax5 = Table.Combine(GroupedRows[Max5])
in
    CombinedMax5

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Sohaib
Helper II
Helper II

need two result one for weekly and one monthly

Try this:

 

Weekly:

let
    Source = Folder.Files("c:\Downloads\PowerQueryForum\Sohaib\"),
    FilteredHiddenFiles = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    FilteredCsv = Table.SelectRows(FilteredHiddenFiles, each [Extension] = ".csv"),
    BinaryToTable = Table.TransformColumns(FilteredCsv, {{"Content", each Table.PromoteHeaders(Csv.Document(_,[Delimiter=",", Encoding=1250, QuoteStyle=QuoteStyle.None])), type table}}),
    CombinedTables = Table.Combine(BinaryToTable[Content]),
    TransformHeaders = Table.TransformColumnNames(CombinedTables, each Text.Trim(Text.BetweenDelimiters(_, "Hourly Table[", "]"))),
    ChangedType = Table.TransformColumns(TransformHeaders,{{"Date New", each Date.From(DateTime.From(_, "en-US")), type date}, {"RRC_USERS", each Number.From(_, "en-US"), type number}}),
    Ad_YearWeek = Table.AddColumn(ChangedType, "YearWeek", each Date.Year([Date New])*100 +  Date.WeekOfYear([Date New]), Int64.Type),
    GroupedRows = Table.Group(Ad_YearWeek, {"EutranCells", "YearWeek"}, {{"Max RRC_USERS Row", each Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), type table}}),
    CombinedMaxRRC_USERS_Rows = Table.Combine(GroupedRows[Max RRC_USERS Row])
in
    CombinedMaxRRC_USERS_Rows

 

Monthly:

let
    Source = Folder.Files("c:\Downloads\PowerQueryForum\Sohaib\"),
    FilteredHiddenFiles = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    FilteredCsv = Table.SelectRows(FilteredHiddenFiles, each [Extension] = ".csv"),
    BinaryToTable = Table.TransformColumns(FilteredCsv, {{"Content", each Table.PromoteHeaders(Csv.Document(_,[Delimiter=",", Encoding=1250, QuoteStyle=QuoteStyle.None])), type table}}),
    CombinedTables = Table.Combine(BinaryToTable[Content]),
    TransformHeaders = Table.TransformColumnNames(CombinedTables, each Text.Trim(Text.BetweenDelimiters(_, "Hourly Table[", "]"))),
    ChangedType = Table.TransformColumns(TransformHeaders,{{"Date New", each Date.From(DateTime.From(_, "en-US")), type date}, {"RRC_USERS", each Number.From(_, "en-US"), type number}}),
    Ad_YearMonth = Table.AddColumn(ChangedType, "YearMonth", each Date.Year([Date New])*100 + Date.Month([Date New]), Int64.Type),
    GroupedRows = Table.Group(Ad_YearMonth, {"EutranCells", "YearMonth"}, {{"Max RRC_USERS Row", each Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), type table}}),
    CombinedMaxRRC_USERS_Rows = Table.Combine(GroupedRows[Max RRC_USERS Row])
in
    CombinedMaxRRC_USERS_Rows

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

thanks. But I also sort out this as well. In this way I learned a alot.

Sohaib
Helper II
Helper II

Hi @dufoq3 , one other query also arise need this data with respect to weekly or monthly, Currently we evaluate it on day level. I will attached the data set. Please help.

Data set 

Sohaib
Helper II
Helper II

Yes, but the data set i provided you need to extract their all other column value of the same row of maxium rrc column

Ok, but keep in mind that for L_JCL014_I with date 17.3.2024 you have 2 rows with same RRC_USERS max value!

 

dufoq3_0-1712245200798.png

 

let
    Source = Csv.Document(File.Contents("c:\Address\Hourly Table 20240404 192948.csv"),[Delimiter=",", Columns=55, Encoding=1250, QuoteStyle=QuoteStyle.None]),
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    TransformHeaders = Table.TransformColumnNames(PromotedHeaders, each Text.Trim(Text.BetweenDelimiters(_, "Hourly Table[", "]"))),
    ChangedType = Table.TransformColumns(TransformHeaders,{{"Date New", each Date.From(DateTime.From(_, "en-US")), type date}, {"RRC_USERS", each Number.From(_, "en-US"), type number}}),
    GroupedRows = Table.Group(ChangedType, {"EutranCells", "Date New"}, {{"Max RRC_USERS Row", each Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), type table}}),
    CombinedMaxRRC_USERS_Rows = Table.Combine(GroupedRows[Max RRC_USERS Row])
in
    CombinedMaxRRC_USERS_Rows

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Hey dufoq3 how are you? I need to understand if  entries appaer or the entries have same value like 0 we have multiple entries in the result how to avoid multiple entries if by chance max rrc users column have same value or we have 0 value, Because at 0 value there are multiple entries appear. I need to pick only one randomly if the value is for 2 entries have same result or in case 0 just 1 random entry should be given in answer

Hi, if you want to pick only 1 row, you can add this in GroupedRows step:

 

= Table.Group(ChangedType, {"EutranCells", "Date New"}, {{"Max RRC_USERS Row", each Table.FirstN(Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), 1), type table}})

dufoq3_0-1720078308268.png

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Ok 1 last thing if i want to select source as a folder not as CSV or Xlxs file what whould be the change in code

 

Load your excel document from folder as new query. Paste here that your code and also screenshot of imported table.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

let
Source = Folder.Files("C:\Users\dell\Documents\Working\Data Validation\KPI Calculations 28-2-2024\KPI Calculation new method working\Data For Busy Hour Tables"),
#"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (3)", each #"Transform File (3)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
#"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (3)"}),
#"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (3)", Table.ColumnNames(#"Transform File (3)"(#"Sample File (3)"))),
#"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Hourly Table[EutranCells]", type text}, {"Hourly Table[Date New]", type datetime}, {"Hourly Table[EnodeB new]", type text}, {"Hourly Table[Hour]", Int64.Type}, {"Hourly Table[Begin Time New]", type datetime}, {"Hourly Table[End Time New]", type datetime}, {"Hourly Table[Granularity]", type text}, {"Hourly Table[Availability_4G]", type number}, {"Hourly Table[DL_THP_CELL_4G_Mbps]", type number}, {"Hourly Table[DL_PRB_Usage]", type number}, {"Hourly Table[DL_THP_USER_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI1_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI2_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI3_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI4_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI5_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI6_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI7_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI8_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI9_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_VOL_4G_GB]", type number}, {"Hourly Table[ERAB Attempt]", Int64.Type}, {"Hourly Table[ERAB_DCR]", Int64.Type}, {"Hourly Table[ERAB_Drops]", Int64.Type}, {"Hourly Table[ERAB_SSR]", Int64.Type}, {"Hourly Table[ERAB_Failures]", Int64.Type}, {"Hourly Table[RRC_USERS]", type number}, {"Hourly Table[UL_THP_CELL_4G_Mbps]", type number}, {"Hourly Table[UL_PRB_Usage]", type number}, {"Hourly Table[UL_THP_USER_4G_Mbps]", type number}, {"Hourly Table[DL_Spectral_Eff]", type number}, {"Hourly Table[Avg_Agg_CQI]", type number}, {"Hourly Table[UL_VOL_4G_GB]", type number}, {"Hourly Table[VOLTE_SESSION_TIME_minutes]", Int64.Type}, {"Hourly Table[VOLTE_DCR]", Int64.Type}, {"Hourly Table[VOLTE_SSR]", Int64.Type}, {"Hourly Table[VoLTE_INTEGRITY_SESSION]", Int64.Type}, {"Hourly Table[VoLTE_INTEGRITY_UE]", Int64.Type}, {"Hourly Table[VOLTE_Erl]", Int64.Type}, {"Hourly Table[VOLTE_USERS]", type number}, {"Hourly Table[Reported Rank 1 %]", type any}, {"Hourly Table[Reported Rank 2 %]", type any}, {"Hourly Table[Reported Rank 3 %]", type any}, {"Hourly Table[Reported Rank 4 %]", type any}, {"Hourly Table[UL RSSI PUSCH New]", type any}, {"Hourly Table[UL RSSI PUCCH]", type any}, {"Hourly Table[QPSK Samples (%)]", type any}, {"Hourly Table[16 QAM Samples (%)]", type any}, {"Hourly Table[64QAM Samples(%)]", type any}, {"Hourly Table[RLC DL BLER]", type any}, {"Hourly Table[RLC UL BLER]", type any}, {"Hourly Table[UL SINR PUCCH]", type any}, {"Hourly Table[UL SINR PUSCH]", type any}, {"Hourly Table[PDCCH CFI Utilization]", type any}, {"Hourly Table[CCE Aggregation Level]", type any}})
in
#"Changed Type"

Have you read my whole request?


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

let Source = Folder.Files("C:\Users\dell\Documents\Working\Data Validation\KPI Calculations 28-2-2024\KPI Calculation new method working\Data For Busy Hour Tables"), #"Filtered Hidden Files1" = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true), #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (3)", each #"Transform File (3)"([Content])), #"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}), #"Removed Other Columns1" = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (3)"}), #"Expanded Table Column1" = Table.ExpandTableColumn(#"Removed Other Columns1", "Transform File (3)", Table.ColumnNames(#"Transform File (3)"(#"Sample File (3)"))), #"Changed Type" = Table.TransformColumnTypes(#"Expanded Table Column1",{{"Source.Name", type text}, {"Hourly Table[EutranCells]", type text}, {"Hourly Table[Date New]", type datetime}, {"Hourly Table[EnodeB new]", type text}, {"Hourly Table[Hour]", Int64.Type}, {"Hourly Table[Begin Time New]", type datetime}, {"Hourly Table[End Time New]", type datetime}, {"Hourly Table[Granularity]", type text}, {"Hourly Table[Availability_4G]", type number}, {"Hourly Table[DL_THP_CELL_4G_Mbps]", type number}, {"Hourly Table[DL_PRB_Usage]", type number}, {"Hourly Table[DL_THP_USER_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI1_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI2_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI3_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI4_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI5_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI6_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI7_4G_Mbps]", type number}, {"Hourly Table[DL_THP_USER_QCI8_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_THP_USER_QCI9_4G_Mbps]", Int64.Type}, {"Hourly Table[DL_VOL_4G_GB]", type number}, {"Hourly Table[ERAB Attempt]", Int64.Type}, {"Hourly Table[ERAB_DCR]", Int64.Type}, {"Hourly Table[ERAB_Drops]", Int64.Type}, {"Hourly Table[ERAB_SSR]", Int64.Type}, {"Hourly Table[ERAB_Failures]", Int64.Type}, {"Hourly Table[RRC_USERS]", type number}, {"Hourly Table[UL_THP_CELL_4G_Mbps]", type number}, {"Hourly Table[UL_PRB_Usage]", type number}, {"Hourly Table[UL_THP_USER_4G_Mbps]", type number}, {"Hourly Table[DL_Spectral_Eff]", type number}, {"Hourly Table[Avg_Agg_CQI]", type number}, {"Hourly Table[UL_VOL_4G_GB]", type number}, {"Hourly Table[VOLTE_SESSION_TIME_minutes]", Int64.Type}, {"Hourly Table[VOLTE_DCR]", Int64.Type}, {"Hourly Table[VOLTE_SSR]", Int64.Type}, {"Hourly Table[VoLTE_INTEGRITY_SESSION]", Int64.Type}, {"Hourly Table[VoLTE_INTEGRITY_UE]", Int64.Type}, {"Hourly Table[VOLTE_Erl]", Int64.Type}, {"Hourly Table[VOLTE_USERS]", type number}, {"Hourly Table[Reported Rank 1 %]", type any}, {"Hourly Table[Reported Rank 2 %]", type any}, {"Hourly Table[Reported Rank 3 %]", type any}, {"Hourly Table[Reported Rank 4 %]", type any}, {"Hourly Table[UL RSSI PUSCH New]", type any}, {"Hourly Table[UL RSSI PUCCH]", type any}, {"Hourly Table[QPSK Samples (%)]", type any}, {"Hourly Table[16 QAM Samples (%)]", type any}, {"Hourly Table[64QAM Samples(%)]", type any}, {"Hourly Table[RLC DL BLER]", type any}, {"Hourly Table[RLC UL BLER]", type any}, {"Hourly Table[UL SINR PUCCH]", type any}, {"Hourly Table[UL SINR PUSCH]", type any}, {"Hourly Table[PDCCH CFI Utilization]", type any}, {"Hourly Table[CCE Aggregation Level]", type any}}) in #"Changed Type"

 

Sohaib_1-1712254423028.png

 

Hi,

 

I recommend you delete the query of which you provided the code. Also delete connected group:

dufoq3_0-1712314749281.png

 

Create new blank query and paste there this code.

  • in Source step, replace whole address with your FOLDER addres where you have your .csv files stored. This query filters only .csv files, so you can have more files with other extensions in same folder or subfolders, but if you have other .csv files in same folder or subfolders, either move them or add additiional filters at after Source step.
let
    Source = Folder.Files("c:\Address\PowerQueryForum\Sohaib\"),
    FilteredHiddenFiles = Table.SelectRows(Source, each [Attributes]?[Hidden]? <> true),
    FilteredCsv = Table.SelectRows(FilteredHiddenFiles, each [Extension] = ".csv"),
    BinaryToTable = Table.TransformColumns(FilteredCsv, {{"Content", each Table.PromoteHeaders(Csv.Document(_,[Delimiter=",", Encoding=1250, QuoteStyle=QuoteStyle.None])), type table}}),
    CombinedTables = Table.Combine(BinaryToTable[Content]),
    TransformHeaders = Table.TransformColumnNames(CombinedTables, each Text.Trim(Text.BetweenDelimiters(_, "Hourly Table[", "]"))),
    ChangedType = Table.TransformColumns(TransformHeaders,{{"Date New", each Date.From(DateTime.From(_, "en-US")), type date}, {"RRC_USERS", each Number.From(_, "en-US"), type number}}),
    GroupedRows = Table.Group(ChangedType, {"EutranCells", "Date New"}, {{"Max RRC_USERS Row", each Table.SelectRows(_, (x)=> x[RRC_USERS] = List.Max([RRC_USERS])), type table}}),
    CombinedMaxRRC_USERS_Rows = Table.Combine(GroupedRows[Max RRC_USERS Row])
in
    CombinedMaxRRC_USERS_Rows

 


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

thanks. It works like a charm

You're welcome.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

dufoq3
Super User
Super User

Hi, we would like to help you, but provide sample data in usable format (not a screenshot) and expected result based on sample data please.


Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

AugPowerBI_Carousel

Power BI Monthly Update - August 2024

Check out the August 2024 Power BI update to learn about new features.

August Carousel

Fabric Community Update - August 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors