Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The ultimate Microsoft Fabric, Power BI, Azure AI & SQL learning event! Join us in Las Vegas from March 26-28, 2024. Use code MSCUST for a $100 discount. Register Now

Reply

Combine data and number column in Power Query

Hello,

 

I need to some help, I think it should be fairly simple but for the life of me I can't find the solution. I have some not so great data which I need to find the duplicates, to do that I need to combine a number column with a data column to group by and see the duplicate items. 

 

Number ColumnDateRef
000116/08/20210001 16/08/2021

 

Unfortunately I just keep getting an error message so I'm clearly doing something wrong.

 

Any help would be mightly appreciated. 

1 ACCEPTED SOLUTION
mahoneypat
Employee
Employee

FYI that you can Ctrl click on your number and date columns to select them both, and then choose Remove Duplicates from the ribbon under Remove Rows.  To concatenate them, they both need to be type text, so you'll need something like:

 

= Number.ToText([Number Column]) & " " & Date.ToText([Date Column], "yyyyMMMdd")

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


View solution in original post

6 REPLIES 6
Syndicate_Admin
Administrator
Administrator

@Shelley-Baynton 

 

Hope I'm understanding correctly.  You are trying to make a key field you can use to remove duplicates.

 

If this is correct this is what I do in these situations:

 

Clean up the data 1st then make your key field.

 

Since i don't want to lose my data, I'll start of by creating duplicates of the columns I'm trying to convert into a key field and do my cleaning on those columns especially since the column merge will remove the columns being merged, this means I only lose the duplicate columns not the originals.

 

Here are the steps:

  • Right clicking each column header and select Duplicate Column
    LarryAlx_0-1629134587497.png
    • (No need to rename these new columsn since these will be throw away columns)
  • In order to remove leading and trailing spaces from the data, select the 2 newly copied columns
    • While they are highlighted right click on the header of one of them --> Change Type --> Text
      LarryAlx_3-1629135167151.png
    • While they are still highlighted, if not reselect them, right click on the header of one of them --> Transform --> Trim
      LarryAlx_4-1629135243181.png
    • You could filter, sort and replace through the copied columns to clean up the data but since you are just looking to remove duplicates, you could skip this and come back to it once your data set is smaller after deduping, making it a little quicker to clean up.
  • Finally Create your key by selecting both copied columns again, if they are not already selected, and right clicking on the header of one them, one more time, then select Merge Columns
    LarryAlx_5-1629135425367.png
    • Choose a seperator if you want to add one and name your new column

      LarryAlx_6-1629135479182.png

       

  • This will create your new key field that can be used to remove duplicates.  One way I remove duplicates is using the group by option.
    LarryAlx_7-1629135606824.png

Hope this helps.

LarryAlx
Frequent Visitor

@Shelley-Baynton 

 

Hope I'm understanding correctly.  You are trying to make a key field you can use to remove duplicates.

 

If this is correct this is what I do in these situations:

 

Clean up the data 1st then make your key field.

 

Since i don't want to lose my data, I'll start of by creating duplicates of the columns I'm trying to convert into a key field and do my cleaning on those columns especially since the column merge will remove the columns being merged, this means I only lose the duplicate columns not the originals.

 

Here are the steps:

  • Right clicking each column header and select Duplicate Column
    LarryAlx_0-1629134587497.png
    • (No need to rename these new columsn since these will be throw away columns)
  • In order to remove leading and trailing spaces from the data, select the 2 newly copied columns
    • While they are highlighted right click on the header of one of them --> Change Type --> Text
      LarryAlx_3-1629135167151.png
    • While they are still highlighted, if not reselect them, right click on the header of one of them --> Transform --> Trim
      LarryAlx_4-1629135243181.png
    • You could filter, sort and replace through the copied columns to clean up the data but since you are just looking to remove duplicates, you could skip this and come back to it once your data set is smaller after deduping, making it a little quicker to clean up.
  • Finally Create your key by selecting both copied columns again, if they are not already selected, and right clicking on the header of one them, one more time, then select Merge Columns
    LarryAlx_5-1629135425367.png
    • Choose a seperator if you want to add one and name your new column

      LarryAlx_6-1629135479182.png

       

  • This will create your new key field that can be used to remove duplicates.  One way I remove duplicates is using the group by option.
    LarryAlx_7-1629135606824.png

Hope this helps.

mahoneypat
Employee
Employee

FYI that you can Ctrl click on your number and date columns to select them both, and then choose Remove Duplicates from the ribbon under Remove Rows.  To concatenate them, they both need to be type text, so you'll need something like:

 

= Number.ToText([Number Column]) & " " & Date.ToText([Date Column], "yyyyMMMdd")

 

Pat

 





Did I answer your question? Mark my post as a solution! Kudos are also appreciated!

To learn more about Power BI, follow me on Twitter or subscribe on YouTube.


@mahoneypa HoosierBI on YouTube


goncalogeraldes
Super User
Super User

As @sean_w said it is quite hard to give a straight answer without a little bit more of context. My suggestion would be to, prior to the combine step in the Query Editor, changing the type of the two columns to "type any" and then combining the two. After the combine is performed, changing back to "date" and "Int64.Type" (i guess its this type you have now) to maintain data consistency and quality.

 

Hope this answer solves your problem! If you need any additional help please tag me in your reply.
If my reply provided you with a solution, pleased mark it as a solution ✔️ or give it a kudoe 👍
Thanks!

Best regards,
Gonçalo Geraldes

sean_w
Frequent Visitor

Without seeing the data types, how you're trying to combine the data, and the error, this is a bit of a shot in the dark.  Given you're dealing with two different data types, it's likely that you're attempting to use a function that doesn't handle the data types.  Here are a couple resources that might be helpful:

 

Power Query Merge Columns help article:
https://support.microsoft.com/en-us/office/merge-columns-power-query-80ec9e1e-1eb6-4048-b500-d5d42d9...

 

Power Query Function references:

https://docs.microsoft.com/en-us/powerquery-m/number-totext

https://docs.microsoft.com/en-us/powerquery-m/date-totext

 

Hi Sean,

Thanks for the feedback - @mahoneypat has managed to provide a solution 😊

Helpful resources

Announcements
Fabric Community Conference

Microsoft Fabric Community Conference

Join us at our first-ever Microsoft Fabric Community Conference, March 26-28, 2024 in Las Vegas with 100+ sessions by community experts and Microsoft engineering.

February 2024 Update Carousel

Power BI Monthly Update - February 2024

Check out the February 2024 Power BI update to learn about new features.

Fabric Career Hub

Microsoft Fabric Career Hub

Explore career paths and learn resources in Fabric.

Fabric Partner Community

Microsoft Fabric Partner Community

Engage with the Fabric engineering team, hear of product updates, business opportunities, and resources in the Fabric Partner Community.

Top Solution Authors
Top Kudoed Authors