Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
memote1
Helper I
Helper I

Help standardizing format from inconsistently formatted data in PQ

Hello,

 

I have two datasets I am merging using an ID number. The ID number comes from an external source, and we manually enter it into our system. Unfortunately, it is rarely entered in the correct format, making it impossible to match records. I manually go into our system to fix them, but I am hoping for a solution that doesn't require me to. The proper format is "023459-000000-MS-01". The first 6 digits never change, and always have the leading zero. The next 6 digits are unique, and the "-MS-01" doesn't change.

 

Below is an example of how the ID number has been entered, with the corresponding correctly formatted ID number.


Thank you!

 

Raw ID NumberProperly Formatted ID Number
023459000208023459-000208-MS-01
023459-000235-MS-01023459-000235-MS-01
23459-00125-MS-01023459-00125-MS-01
23459000447023459-000447-MS-01
23459000302-MS-01023459-000302-MS-01
023459--000375-MS-01023459-000375-MS-01
023459000170-MS-01023459-000170-MS-01
023459-000288-ms-01023459-000288-MS-01
023459-00260023459-000260-MS-01
23459-00120-ms-01023459-000120-MS-01
1 ACCEPTED SOLUTION
ronrsnfld
Super User
Super User

You can use this formula in the Add Custom Column dialog:

= [a=Text.Upper([Raw ID Number]),
 b=Text.PositionOf(a,"23459"),
 c=Text.RemoveRange(a,0,7-b),
 d=Text.Replace(c,"MS-01",""),
 e=Text.Trim(d,"-"),
 f=Number.ToText(
     Number.From(e),"""023459-""000000""-MS-01""")][f]

ronrsnfld_0-1740418512436.png

 

ronrsnfld_2-1740418582979.png

 

 

 

 

View solution in original post

6 REPLIES 6
ronrsnfld
Super User
Super User

You can use this formula in the Add Custom Column dialog:

= [a=Text.Upper([Raw ID Number]),
 b=Text.PositionOf(a,"23459"),
 c=Text.RemoveRange(a,0,7-b),
 d=Text.Replace(c,"MS-01",""),
 e=Text.Trim(d,"-"),
 f=Number.ToText(
     Number.From(e),"""023459-""000000""-MS-01""")][f]

ronrsnfld_0-1740418512436.png

 

ronrsnfld_2-1740418582979.png

 

 

 

 

v-ssriganesh
Community Support
Community Support

Hi @memote1,

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will be helpful for other community members who have similar problems to solve it faster.

Thank you.

dufoq3
Super User
Super User

Hi @memote1, check this:

 

Output

dufoq3_0-1740142095922.png

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjAyNjG1NDAwMDKwUIrVgQnogkSMTXV9g3UNDMHiMGFDIwxRoFoTE3MUvrGBEZIqqJkgQ43NTTEkgMKG5gaY6kFusLDQzS3GEDcyM0BzlAFMWSwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Raw ID Number" = _t]),
    Ad_CorrectID = Table.AddColumn(Source, "Correct ID", each 
        [ col = [Raw ID Number],
          a = Text.PositionOf(col, "9") +1,
          b = Text.Start(col, a),
          start = Text.PadStart(b, 6, "0"),
          mid = Text.PadStart(Text.BeforeDelimiter(Text.TrimStart(Text.RemoveRange(col, 0, Text.Length(b)), {"-"}), "-"), 6, "0"),
          result = Text.Combine({start, mid, "MS-01"}, "-")
        ][result], type text)
in
    Ad_CorrectID

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

Omid_Motamedise
Super User
Super User

You can use Fuzzy merge


If my answer helped solve your issue, please consider marking it as the accepted solution.
Akash_Varuna
Super User
Super User

Hi @memote1 , Add a custom column to generate the corrected format
Text.PadStart(Text.Middle(Text.Remove(Text.Lower([Raw ID Number]), "-ms-01"), 6), 6, "0") & "-" &
Text.PadStart(Text.Middle(Text.Remove(Text.Lower([Raw ID Number]), "-ms-01"), 6, Text.Length(Text.Remove(Text.Lower([Raw ID Number]), "-ms-01")) - 6), 6, "0") &
"-MS-01"
If this post helped please do give a kudos and accept this as a solution
Thanks In Advance

v-ssriganesh
Community Support
Community Support

Hi @memote1,
Thanks for posting your query in Microsoft fabric community forum.

I have reproduced your scenario and successfully transformed the data using Power Query in Power BI Desktop. Below are the detailed steps I followed, along with the expected output. I am also attaching a screenshot and PBIX file for your reference.

 

Step 1: I used the following sample data to simulate your scenario:

Raw ID Number

023459000208

023459-000235-MS-01

23459-00125-MS-01

23459000447

23459000302-MS-01

023459--000375-MS-01

023459000170-MS-01

023459-000288-ms-01

023459-00260

23459-00120-ms-01

 

Step 2: Load Data into Power Query

  1. Open Power BI Desktop.
  2. Click on Home → Enter Data (or load from Excel/CSV).
  3. Enter the Raw ID Numbers into a table.
  4. Click OK, then click Transform Data to open Power Query Editor.

 

Step 3: Convert All IDs to Uppercase

  • Click on the Raw ID Number column.
  • Go to Transform → Format → UPPERCASE.

Step 4: Extract Only Numeric Digits

  • Click on Add Column → Custom Column.
  • Use the following formula to extract only numbers:

CODE:

CopyEdit

Text.Select([Raw ID Number], {"0".."9"})

  • Rename this column to NumericOnly.

Step 5: Extract the Unique 6-Digit Part

  • Click Add Column → Custom Column.
  • Use the formula below:
    CODE:

CopyEdit

Text.Middle([NumericOnly], 6)

  • Rename it as UniquePart.

Step 6: Ensure 6 Digits with Leading Zeros

  • Click Add Column → Custom Column.
  • Use this formula:

CODE:

CopyEdit

Text.PadStart([UniquePart], 6, "0")

  • Rename it as PaddedUniquePart.

Step 7: Construct the Final ID Format

  • Click Add Column → Custom Column.
  • Use this formula:

CODE:

CopyEdit

"023459-" & [PaddedUniquePart] & "-MS-01"

  • Rename it as Formatted ID Number.

 

Step 8: Finalizing the Data

  • Keep only Raw ID Number and Formatted ID Number.
  • Click Close & Apply.

 

Output:

vssriganesh_0-1740113593549.png

  • For your reference, please check the attached .pbix file

If this helps, then please Accept it as a solution and dropping a "Kudos" so other members can find it more easily.
Thank you.

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors