Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
rixmcx59
Helper V
Helper V

Extract domain parts

hello all, I am trying to extract Domain and sub domain, so far this M query,

let splitUrlandTitle = Splitter.SplitTextByDelimiter("/", QuoteStyle.None)([UrlandTitle]), splitsplitUrlandTitle2 = List.Reverse(Splitter.SplitTextByDelimiter(".", QuoteStyle.None)(splitUrlandTitle{2}?)) in splitsplitUrlandTitle2{2}? 

gives the best result, it returns "lemoncity" but does not work on url's without a leading "/". I have url's like

https://desktop.lemoncity.ca.gov 

http://desktop.lemoncity.ca.gov 

desktop.lemoncity.ca.gov

lemoncity.ca.gov/more/more

lemoncity.ca.gov:404

Is there a way to insert a condition where if the text does not contain "/" then extract using another method?

thanks

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User
Super User

Does your text always contain ca.gov as the domain extension? If yes, below will work.

(But if your code contains various other domain extensions also, then we will need to think about other strategy)

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WyigpKSi20tdPSS3OLskv0MtJzc3PS84sqdRLTtRLzy9TitWBKCKkBq8kuqB+bn5RKpjAKm1lYmACtxif69DMgSnHpywWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Domain Name", each List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), "./"))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(if Text.Contains([UrlandTitle],"//") then [UrlandTitle] else "//" & [UrlandTitle], ".ca.gov"), "//"), "."), 1)))
in
    #"Added Custom1"

 

 

View solution in original post

4 REPLIES 4
Vijay_A_Verma
Super User
Super User

Does your text always contain ca.gov as the domain extension? If yes, below will work.

(But if your code contains various other domain extensions also, then we will need to think about other strategy)

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WyigpKSi20tdPSS3OLskv0MtJzc3PS84sqdRLTtRLzy9TitWBKCKkBq8kuqB+bn5RKpjAKm1lYmACtxif69DMgSnHpywWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Domain Name", each List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), "./"))),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(if Text.Contains([UrlandTitle],"//") then [UrlandTitle] else "//" & [UrlandTitle], ".ca.gov"), "//"), "."), 1)))
in
    #"Added Custom1"

 

 

Thank you for the M, I am getting great results, there are a few where the extract in not happening but in similar strings the extract is happening? like these stings below from the title column

 

Anonymous OMG Name on "omg.ca.gov"
Session Obsession on "workinfo.workinginfo.ca.gov"
Open reversal Blahblah on somecity.ca.gov
Open reversal on oops.ca.gov
OEM Misunderstanding at oops.ca.gov

Thanks for the help

Use this

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("hZAxC4MwEIX/SnAusYOTWwulk3XoqA5RryqaXPCixX9fjbagFB0SHnnfyzsuipzSGE2+6+ZAtUHNG5CossoMPBO8wN5JTjN0xOya20dXYgv2+mv73tn7Fe9Nt/nnix9gF4VqkNgRC4M7ewgJDBWLHZTFEogdCz6BqBqtMKVFWe6NbV2pF/JZFFavgqEGxVrooSXRsGsjynQ8U5pQwnY7a3pqQU0r4BawoKJO5SNihMrHUibMmks+", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [UrlandTitle = _t]),
    #"Added Custom" = Table.AddColumn(Source, "Domain Name", each [a = List.Last(Text.Split([UrlandTitle]," ")), b=List.Last(Text.SplitAny(Text.BeforeDelimiter(Text.Replace(a,"""",""), ".ca.gov"), "./"))][b]),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Sub Domain Name", each [a = List.Last(Text.Split([UrlandTitle]," ")), b=List.Last(List.RemoveLastN(Text.Split(Text.AfterDelimiter(Text.BeforeDelimiter(Text.Replace(if Text.Contains(a,"//") then a else "//" & a,"""",""), ".ca.gov"), "//"), "."), 1))][b])
in
    #"Added Custom1"

Thank you for the response, I appreciate your time. I tried the new code and it seems to extract too many characters to the left of ca.gov (in some cases).

I did modify your original M function, SplitAny to include a space and a colon and it worked on all but one record. 1184 of 1185. So happy with that. thanks again.

 List.Last(Text.SplitAny(Text.BeforeDelimiter([UrlandTitle], ".ca.gov"), ":./ "))) 

 

The record it did not work on is a continuous string like,

https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=&ved=2ahUKEwjHpcuWiZOAAxUCV2wGHQyUABQQ...  lemoncounty.ca.gov%2Fa%2F98615&usg=AOvVaw2UleUK97W2GSs_U1-1P55p&opi=89978449

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors