The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
HI All,
My data set contains a field to store the SQL query that was execute by certain users and I would need to indentify a way to parse the SQL query text from that column and add the table values to new column in Power Query or using DAX in a Power BI report.
I am looking for regex to use and indentify table names. If there are more than 1 tables or more then it would need to concatenate them with comma.
Regex should be something like this (?:FROM|from|join|into)\s?(\[?\w+\]?\.\[?\w+\]?)
Please help.
Solved! Go to Solution.
Power Query does not support RegExp natively. You can run R or Python scripts though.
Be aware that Power Query is case sensitive. Be aware that joins can be done via comma too.
Hi @AnjanaPothineni ,
You can try this
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKk7NSU0uUdBScAvy91UISUzKSTVUitWJVgp29XF1DlFI1NPSUUjSQ5FXSFTw9PNzDVLw8vf0g4gZKSQp+PsBVZdk5qZ6uijYAvVAmGDDPP2CXYNCgLpC/GHqNUDSmSk6Col5mbmJOToKRalpRanFGQX5mXklmgpQ6/EpQnVyLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"SQL QUERY" = _t]),
ExtractTableNames = Table.AddColumn(Source, "Table", each
let
SQL = [SQL QUERY],
Words = Text.Split(Text.Select(SQL, {"A".."Z", "a".."z", "0".."9", " ", "_"}), " "),
TableNames = List.Select(Words, each List.Contains({"FROM", "JOIN", "INTO"}, Text.Upper(_))),
NextWords = List.Transform(TableNames, each Text.AfterDelimiter(SQL, _ & " ")),
CleanedNames = List.Transform(NextWords, each Text.BeforeDelimiter(_, " ")),
NonEmptyNames = List.Select(CleanedNames, each _ <> "")
in
Text.Combine(NonEmptyNames, ", ")
)
in
ExtractTableNames
Final output
Or create a dax
NewColumn =
VAR Query = 'Table'[SQL QUERY]
VAR FromTable =
IF(
SEARCH("FROM ", Query, 1, 0) > 0,
MID(Query, SEARCH("FROM ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("FROM ", Query, 1, 0) + 5) - SEARCH("FROM ", Query, 1, 0) - 5),
BLANK()
)
VAR JoinTable =
IF(
SEARCH("JOIN ", Query, 1, 0) > 0,
MID(Query, SEARCH("JOIN ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("JOIN ", Query, 1, 0) + 5) - SEARCH("JOIN ", Query, 1, 0) - 5),
BLANK()
)
VAR IntoTable =
IF(
SEARCH("INTO ", Query, 1, 0) > 0,
MID(Query, SEARCH("INTO ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("INTO ", Query, 1, 0) + 5) - SEARCH("INTO ", Query, 1, 0) - 5),
BLANK()
)
RETURN
CONCATENATE(
CONCATENATE(
FromTable,
IF(NOT(ISBLANK(JoinTable)), ", " & JoinTable, "")
),
IF(NOT(ISBLANK(IntoTable)), ", " & IntoTable, "")
)
Final output
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Please provide usable sample data, not screenshots.
Be aware that SQL joins can be done via comma too.
Hi @AnjanaPothineni ,
You can try this
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WKk7NSU0uUdBScAvy91UISUzKSTVUitWJVgp29XF1DlFI1NPSUUjSQ5FXSFTw9PNzDVLw8vf0g4gZKSQp+PsBVZdk5qZ6uijYAvVAmGDDPP2CXYNCgLpC/GHqNUDSmSk6Col5mbmJOToKRalpRanFGQX5mXklmgpQ6/EpQnVyLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"SQL QUERY" = _t]),
ExtractTableNames = Table.AddColumn(Source, "Table", each
let
SQL = [SQL QUERY],
Words = Text.Split(Text.Select(SQL, {"A".."Z", "a".."z", "0".."9", " ", "_"}), " "),
TableNames = List.Select(Words, each List.Contains({"FROM", "JOIN", "INTO"}, Text.Upper(_))),
NextWords = List.Transform(TableNames, each Text.AfterDelimiter(SQL, _ & " ")),
CleanedNames = List.Transform(NextWords, each Text.BeforeDelimiter(_, " ")),
NonEmptyNames = List.Select(CleanedNames, each _ <> "")
in
Text.Combine(NonEmptyNames, ", ")
)
in
ExtractTableNames
Final output
Or create a dax
NewColumn =
VAR Query = 'Table'[SQL QUERY]
VAR FromTable =
IF(
SEARCH("FROM ", Query, 1, 0) > 0,
MID(Query, SEARCH("FROM ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("FROM ", Query, 1, 0) + 5) - SEARCH("FROM ", Query, 1, 0) - 5),
BLANK()
)
VAR JoinTable =
IF(
SEARCH("JOIN ", Query, 1, 0) > 0,
MID(Query, SEARCH("JOIN ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("JOIN ", Query, 1, 0) + 5) - SEARCH("JOIN ", Query, 1, 0) - 5),
BLANK()
)
VAR IntoTable =
IF(
SEARCH("INTO ", Query, 1, 0) > 0,
MID(Query, SEARCH("INTO ", Query, 1, 0) + 5, FIND(" ", Query & " ", SEARCH("INTO ", Query, 1, 0) + 5) - SEARCH("INTO ", Query, 1, 0) - 5),
BLANK()
)
RETURN
CONCATENATE(
CONCATENATE(
FromTable,
IF(NOT(ISBLANK(JoinTable)), ", " & JoinTable, "")
),
IF(NOT(ISBLANK(IntoTable)), ", " & IntoTable, "")
)
Final output
Best regards,
Albert He
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly
Hi @Anonymous Thank you so much for you reply. I tried DAX option and I am getting the results like this. (As our users have all sorts of comments in their code that this regex is also picking on like below screen shot
Please see belwo screenshot. How do I only identify genuine tables names alone.
Hignlighted in yellow.
Please provide usable sample data, not screenshots.
Be aware that SQL joins can be done via comma too.
See this short video, you can find how to use regex in Power Query
https://youtu.be/-M4Mh4QFxrg?si=m3G2zd7I0l2jJTMp
Power Query does not support RegExp natively. You can run R or Python scripts though.
Be aware that Power Query is case sensitive. Be aware that joins can be done via comma too.