The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
hello all, I am trying to extract Domain and sub domain, so far this M query,
let splitUrlandTitle = Splitter.SplitTextByDelimiter("/", QuoteStyle.None)([UrlandTitle]), splitsplitUrlandTitle2 = List.Reverse(Splitter.SplitTextByDelimiter(".", QuoteStyle.None)(splitUrlandTitle{2}?)) in splitsplitUrlandTitle2{2}?
gives the best result, it returns "lemoncity" but does not work on url's without a leading "/". I have url's like
https://desktop.lemoncity.ca.gov
http://desktop.lemoncity.ca.gov
desktop.lemoncity.ca.gov
lemoncity.ca.gov/more/more
lemoncity.ca.gov:404
Is there a way to insert a condition where if the text does not contain "/" then extract using another method?
thanks
Solved! Go to Solution.
Does your text always contain ca.gov as the domain extension? If yes, below will work.
(But if your code contains various other domain extensions also, then we will need to think about other strategy)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WyigpKSi20tdPSS3OLskv0MtJzc3PS84sqdRLTtRLzy9TitWBKCKkBq8kuqB+bn5RKpjAKm1lYmACtxif69DMgSnHpywWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
#"Added Custom" = Table.AddColumn(Source, "Domain Name", each List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), "./"))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(if Text.Contains([UrlandTitle],"//") then [UrlandTitle] else "//" & [UrlandTitle], ".ca.gov"), "//"), "."), 1)))
in
#"Added Custom1"
Does your text always contain ca.gov as the domain extension? If yes, below will work.
(But if your code contains various other domain extensions also, then we will need to think about other strategy)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WyigpKSi20tdPSS3OLskv0MtJzc3PS84sqdRLTtRLzy9TitWBKCKkBq8kuqB+bn5RKpjAKm1lYmACtxif69DMgSnHpywWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
#"Added Custom" = Table.AddColumn(Source, "Domain Name", each List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), "./"))),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(if Text.Contains([UrlandTitle],"//") then [UrlandTitle] else "//" & [UrlandTitle], ".ca.gov"), "//"), "."), 1)))
in
#"Added Custom1"
Thank you for the M, I am getting great results, there are a few where the extract in not happening but in similar strings the extract is happening? like these stings below from the title column
Anonymous OMG Name on "omg.ca.gov"
Session Obsession on "workinfo.workinginfo.ca.gov"
Open reversal Blahblah on somecity.ca.gov
Open reversal on oops.ca.gov
OEM Misunderstanding at oops.ca.gov
Thanks for the help
Use this
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZAxC4MwEIX/SnAusYOTWwulk3XoqA5RryqaXPCixX9fjbagFB0SHnnfyzsuipzSGE2+6+ZAtUHNG5CossoMPBO8wN5JTjN0xOya20dXYgv2+mv73tn7Fe9Nt/nnix9gF4VqkNgRC4M7ewgJDBWLHZTFEogdCz6BqBqtMKVFWe6NbV2pF/JZFFavgqEGxVrooSXRsGsjynQ8U5pQwnY7a3pqQU0r4BawoKJO5SNihMrHUibMmks+", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
#"Added Custom" = Table.AddColumn(Source, "Domain Name", each [a = List.Last(Text.Split([UrlandTitle]," ")), b=List.Last(Text.SplitAny(Text.BeforeDelimiter(Text.Replace(a,"""",""), ".ca.gov"), "./"))][b]),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each [a = List.Last(Text.Split([UrlandTitle]," ")), b=List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(Text.Replace(if Text.Contains(a,"//") then a else "//" & a,"""",""), ".ca.gov"), "//"), "."), 1))][b])
in
#"Added Custom1"
Thank you for the response, I appreciate your time. I tried the new code and it seems to extract too many characters to the left of ca.gov (in some cases).
I did modify your original M function, SplitAny to include a space and a colon and it worked on all but one record. 1184 of 1185. So happy with that. thanks again.
List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), ":./ ")))
The record it did not work on is a continuous string like,
https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwjHpcuWiZOAAxUCV2wGHQyUABQQ... lemoncounty.ca.gov%2Fa%2F98615&usg=AOvVaw2UleUK97W2GSs_U1-1P55p&opi=89978449