Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
Anonymous
Not applicable

Messy data handling in 1 column

Hi everyone, I've got a file format I'm trying to handle which puts a whole bunch of different information into the 1 column.

 

It has date fields - eg 31/07/2018 

It has user names 

It has invoice numbers

 

There are other columns shown too which are necessary but I wanted to be able to split out the data in the aforementioned column into separate columns. 

 

E.g. 1 column dates, another for users etc

 

Is there some sort of functions I can use to do this or DAX code that takes regular expressions (REGEX) to do this too?

 

thank you!

 

 

EDIT: Example image from Edit Query page attached - this shows how time fields and date fields are shown in the one column. I don't show it here but there are also user names in the same field that I'd like to parse. Hope this helps.

 

pbi question.JPG

23 REPLIES 23
v-eachen-msft
Community Support
Community Support

Hi @Anonymous ,

 

Do you mean to get the result like the following image?

3-1.PNG

 

Community Support Team _ Eads
If this post helps, then please consider Accept it as the solution to help the other members find it.
Anonymous
Not applicable

Hi @v-eachen-msft  - what was the DAX code you used to do what you had in your image?

Anonymous
Not applicable

@v-eachen-msft - yes - that's precisely what I'm trying to be able to do. How were you able to do that?

Greg_Deckler
Community Champion
Community Champion

Need to see examples. Text preferred. Please see this post regarding How to Get Your Question Answered Quickly: https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Thanks Greg - added some extra details for clarification

Is there any pattern to this? Like how do you know what rows go with ?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...
Anonymous
Not applicable

Hi @Greg_Deckler Greg - hopefully my response to @HotChilli above explains the pattern. I'm going to try the from examples idea now.

@ImkeF  @edhans  Any magic pixie dust you guys have lying around?



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
DAX For Humans

DAX is easy, CALCULATE makes DAX hard...

Hi @Anonymous 

Please give us some sample data using the directions linked to below, as well as a picture or another table of what the expected output would be.

 

Thanks.

 

How to get good help fast. Help us help you.
How to Get Your Question Answered Quickly
How to provide sample data in the Power BI Forum



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

Hi @edhans  and @Ashish_Mathur  - my apologies. Was busy the last 2 days. The recent examples shown in suggested solutions (above) are what I'm trying to get to but next time I post an issue like this I'll make sure to include before/after expected results. Thanks for looking into this though. Now just need to wait on the code or magic pixie dust as to how to solve this - ha!

Thanks for replying.  Are you able to post data as per previous requests?

 

Anonymous
Not applicable

@HotChilli here is an example XLS... here is an example of original data

 

ColumnValue
31/01/20201
Joe Bloggs 
1412123 
12314124 
52342344 
1/02/20204
N/A 
1314234234 
N/A 
10203030 
31312330 
Joe Bloggs 
2/02/20205
3/02/20203
Jane Doe 
31/01/20205
N/A 
12312344 
1/02/20204
2/02/20203

 

Expected Output is this

 

DateUserValue
31/01/2020Joe Bloggs1
1/02/2020Joe Bloggs4
2/02/2020Joe Bloggs5
3/02/2020Joe Bloggs3
31/01/2020Jane Doe5
01/02/2020Jane Doe4
02/02/2020Jane Doe3

 

 

Hi,

This data does not make any sense whatsoever.  For 2/2/2020 and 3/2/2020, there are no names in your source data.  So then why should there be Joe Bloggs against these 2 dates?


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/
Anonymous
Not applicable

Thanks @Ashish_Mathur. The names are not meant to be lined up against the dates - that's the reason for my posting. If things were lined up fine then I would have been able to solve it but unfortunately the data does not come out like that.

 

Take a look at the table again. I edited the data so you can see what happens when the next user name comes up in the last. In this case, I added Jane Doe. The rules are that the top section refers to Joe Bloggs, the next user is Jane Doe. The date values are the only row where value data is recorded against.

 

Power Query code:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bc4xCoAwDAXQq5TMhTY/8QA6OniB0lG6CA7eH4yKbQVJl7zwk6ZEwiFyQEQkT0zZJ5r31U3bXsph5G5iZTCk9ZCLtMIAUXsNbC3etXrLEsY2tfiTqPQZW06sKogFIB38/BHdxeFJdSKU8wk=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Column = _t, Value = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column", type date}, {"Value", Int64.Type}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Value] <> null)),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "User", each "Joe Bloggs")
in
    #"Added Custom"

Notice that I cheated on the User column.

Do the simplest thing that works is usually my motto.

If your data only has one user I'll be very surprised.

Anonymous
Not applicable

Thank you!

There are different users... the data types are:

  • Dates
  • Usernames
  • NA values (ignore)
  • Invoice numbers (also to ignore)

I thought I'd do something with filters like you. Would it be possible for the filter for the user name to change dynamically based on a separate list, such as a mapping file?

Anonymous
Not applicable

Thank you!

 

There are different users... the types of data are:

  • Dates
  • User Names
  • NA values (to ignore)
  • Invoice numbers (also to ignore)

I had thought of doing something with filters like you have. Would it be possible to have the filter for user name dynamically change based on a separate list - like a mapping file?

@Anonymous   A possible way to solve this would be to use calculated columns. The assumption here is that the value appears next to the date. What makes it a bit ambiguous is that the occurrence of name is not consistent, do we pick the pevious non-blank name or the next? Also, you do not have test data with different names, so not sure if everything is correct. 

 

In any case, here are the steps:

 

1. Add an index to the table (Transfom Data)

2. Add the below columns:

 

Possible Name = IF(ISBLANK(Messy[Value]),  IF(AND( ISERROR((ISNUMBER(VALUE(Messy[Column])))), Messy[Column]<>"N/A"), Messy[Column], ""), "")

Prev Idx = 
 CALCULATE (
    MAX(Messy[Index] ),
    FILTER (
       ALL(Messy),
       NOT(ISBLANK(Messy[Possible Name])) && 
       Messy[Possible Name] <> "" &&
       Messy[Index]< EARLIER(Messy[Index]))
    )

Next Idx = 
 CALCULATE (
    MAX(Messy[Index] ),
    FILTER (
       ALL(Messy),
       NOT(ISBLANK(Messy[Possible Name])) && 
       Messy[Possible Name] <> "" &&
       Messy[Index]> EARLIER(Messy[Index]))
    )

Picked Idx = IF(ISBLANK(Messy[Prev Idx]), Messy[Next Idx], Messy[Prev Idx])

Final Name = CALCULATE(MAX(Messy[Possible Name]), FILTER(ALL(Messy), Messy[Index]=EARLIER(Messy[Picked Idx])))

 

 

3. Filter the table where the value column is non-blank. 

 

This is how it looks for me. 

PBI help messy.PNG

I'll chime in here if that's ok.

If you duplicate the column, you can then change the data type e.g. date or time.  Powerbi will parse the column and return an error for non-compliant fields.  Right-click on the column and 'Remove Errors'

 

The format of data provided looks reasonably straightforward so this should work.  For more complex situations, try adding a column 'from examples' and give a few examples (i.e. on several different rows).  Power Query will make a good effort at trying to get what you want.  It doesn't always work but it's pretty good.  You need to examine the M code generated as a sanity check.

Anonymous
Not applicable

Thanks Hot Chilli 

 

Will have a go at the M query one. I tried the duplicating columns but I think the problem is that whilst that would give me the dates, I need to find a way to bring in the other data too not get rid of them.

 

As you can see below, there is a user name, below that are dates and further below are times.

 

There is another column I'm trying to keep from this table which only has values on the date row.

 

There are other users as you scroll down and the table follows the same kind of format of User name then a whole bunch of dates and times. It then goes to the next user and a whole bunch of dates and times. Again, the value I'm trying to aggregate for users is only on the date row.

 

Imagine the pattern below repeats.

 

Power BI - single column parsing example.JPG

 

I'd like to be able to get a table that shows the User in 1 column, date in another and the value in the next.

 

I will try the from examples idea next.

 

Thank you!

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.