Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
h_l
Post Patron
Post Patron

String categorizing, a little challenge, Please help! Thanks!

Looking for hero’s and experts’ help on string process.

Generally speak, there are URLs need to be classified into different level & category.

But the URLs structure & value are so much varies…. The problem has confused me over the past days & nights.

Here are the examples: (Sorry I cannot attach file, the table format data is at the end of the post)

图片.png

 

Theoretically, a full URL example as following, but, only red-highlighted structure/value are mandatory, the rest all are optional.

https://subdomain.domain.com/country/language/business-group/machine-series/machine-type/machine-SN/page-category/content-id-seo-friendly-name

Plus, any content-id can be accessed no matter the “category” is aligned or not.

E.g. following one accessory content can be accessed by both “accessories” category and “brochures”

https://sdm1.domain.com/us/en/accessories/acc002031

https://sdm1.domain.com/us/en/brochures/acc002031

 

Under this situation, I need to categories them as above table shows.

If you already have great idea, please ignore my following threads…

Please let me know the DAX can be used for: searching string, returning value, judging format etc.

Million THANKs in advance!

 

Here are my threads, but have no idea how to write the DAX:

The easiest is to identify “Product in URL” or not:

IF “/products/” is found, then return “Yes”, otherwise, return “No”.

 

 

Content ID: Looking for if there is any value matches format of “content id”,

- starts with “/bh” + 6 digits numbers
- starts with “/dl” + 6 digits numbers
- starts with “/vid” + 6 digits numbers
- starts with “/acc” + 6 digits numbers
- starts with “inte”+ characters after the first followed “-“
- contains fixed value “latest-news”

IF any of above format is matched, fetch the ID & fill in “Content ID”
ELSE, find if there is any value contains:
- “/brochures/”
- “/downloads/” (here is a point, you can see there is “Download Category” ends with /downloads value…
- “/documents/” (here is a point again, see “Documents Category” ends with /documents)
- “/videos/”
if any of above value is matched, return “Other” in Content ID
ELSE, fill “Not Content”

 

Page Type:

IF: find “.com” value, then count how many characters afterwards, if the counting is <=7, then return “Home Page”

ELSE, IF matches anything in “Content ID” step, return “Content Page”

ELSE, find “warrantylookup”, if there is, return “Function”

ELSE, find “partslookup”, if there is, return “Function Page”

ELSE, IF ends with “/documents” or “/documents/”, return “Function Page”

ELSE, IF ends with “/downloads” or “/downloads/”, return “Function Page”

ELSE, IF contains “/products/” return “Product Home”

 

Function Category:

IF: find “.com” value, and count how many characters afterwards, if the counting is <=7, then return “Home Page”

ELSE,

IF contains “/brochures/”, return “Brochure Content”
ELSE, IF contains “/Videos/”, return “Video Content”

ELSE: IF contains “/downloads/” AND not ends with “/downloads/”, return “Downloads Content”

ELSE, find “warrantylookup”, if there is, return “Warranty Lookup”

ELSE, find “partslookup”, if there is, return “Parts Lookup”

ELSE, IF ends with “/documents” or “/documents/”, return “Documents Category”

ELSE, IF ends with “/downloads” or “/downloads/”, return “Downloads Category”

ELSE, IF contains “/products/” return “Product Home”

 

Looking for your contribution and sincerely appreciate that!

Have a safe day.

H

 

Table Data:

URLProduct in URLsPage TypeFunction CategoryContent ID
https://sdm1.domain.com/NoHome PageHome PageNot Content
https://sdm1.domain.com/in/enNoHome PageHome PageNot Content
https://sdm1.domain.com/de/de/NoHome PageHome PageNot Content
https://sdm1.domain.com/il/en/brochures/bh087691NoContent PageBrochure Contentbh087691
https://sdm1.domain.com/us/en/brochures/bh087691-seo-sample-titleNoContent PageBrochure Contentbh087691
https://sdm1.domain.com/us/es/brochures/latest-newsNoContent PageBrochure Contentlatest-news
https://sdm1.domain.com/it/es/brochures/inte-4adlyqNoContent PageBrochure Contentinte-4adlyq
https://sdm1.domain.com/it/es/brochures/quick-resourcesNoContent PageBrochure ContentOther
https://sdm1.domain.com/documents/inte-75423NoContent PageDocuments Contentinte-75423
https://sdm1.domain.com/hk/en/videos/vid002031NoContent PageVideo Contentvid002031
https://sdm1.domain.com/hk/en/accessories/acc002031NoContent PageAccessories Contentacc002031
https://sdm1.domain.com/hk/en/brochures/acc002031NoContent PageAccessories Contentacc002031
https://sdm1.domain.com/us/en/partslookupNoFunction PageParts LookupNot Content
https://sdm1.domain.com/us/en/downloads/dl202102NoContent PageDownload Contentdl202102
https://sdm1.domain.com/us/en/downloads/dl202102-download-seo-sample-titleNoContent PageDownload Contentdl202102
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/2ab6/brochures/bh2021...YesContent PageBrochure Contentbh202192
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/documentsYesFunction PageDocuments CategoryNot Content
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/downloadsYesFunction PageDownloads CategoryNot Content
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/downloads/dl202102YesContent PageDownload Contentdl202102
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/partslookup/YesFunction PageParts LookupNot Content
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202YesProduct HomeProduct HomeNot Content
https://sdm1.domain.com/us/en/products/medical-cares/optical-series/optic-x202/4gh6YesProduct HomeProduct HomeNot Content
3 REPLIES 3
h_l
Post Patron
Post Patron

Is there anyone could help or give some thread?

Thanks~

Vera_33
Resident Rockstar
Resident Rockstar

Hi @h_l 

 

I would prefer M to do it, text munipulation is more intuitive...did not write all the conditions in DAX, just point a direction, the video is not correctly returned now

Vera_33_0-1615447951039.png

Product in URL = IF(SEARCH("/products/",[URL],,-1)=-1,"No","Yes")

Content ID = 
VAR BH = SEARCH("/bh",[URL],,-1)
VAR DL = SEARCH("/dl",[URL],,-1)
VAR VID = SEARCH("/vid",[URL],,-1)
RETURN
SWITCH(TRUE(),
BH >-1, MID([URL],BH+1,8),
DL >-1, MID([URL],DL+1,8),
VID >-1, MID([URL],VID+1,9),
"TEST")

 

if you go with M, you can split URL to a list, then match and extract the value accordingly

 

Vera_33_1-1615448803655.png

 

@Vera_33, I am very new to Power BI, it's the first time I know M, learning on it.

Thanks for reply.

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.