Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
Anonymous
Not applicable

Messy Raw Data

hi, I'm hoping somebody here is able to guide me on the best way to transform this data coming from a very messy raw data file. The main problem is that the column headers & data are spread over 2 rows.

 

The objective is to transform the data & have 1 row of headers & data (per order)
I have the raw data in 2 different formats (txt and xls list), but neither one seems to be easier to work with than the other

 

Thanks in advance for anybody who was able to assist on this one

5 REPLIES 5
samdthompson
Memorable Member
Memorable Member

Double row header is not a problem:

 

1. Tanspose data

2. Merge first two columns

3. Transpose data

4. Promote first row as headers

 

 

// if this is a solution please mark as such. Kudos always appreciated.
Anonymous
Not applicable

I don't believe that handles the double rows of data

oh sorry, i missed the part on rows being affected too. Urgh, horrible situation. Okay, 

 

1. Bring in the data and make sure its in an order where the rows are in their order ie row 1a, row 1b, row row 2a, row 2b etc.

2. Add index column starting at 1 and filter for even numbers only,

3. Duplicate query and change the filter step to odd numbers only

4. On the 'odds' table create a custom column which is index +1.

5. Merge the two tables using the index from the 'evens' and the custom from the 'odds'

6. Expand out the data.

 

So long as the sorting in the first step is solid. this will work fine. Do the first suggestion too, sorting out the double row headers.

 

 

// if this is a solution please mark as such. Kudos always appreciated.
Anonymous
Not applicable

Thanks for responding again. Unfortunately the data is sometimes over 1 row, sometimes 2 and sometimes 3 (depending on what elements are updated).

 

data.png

Ah right.

1. When you bring it in the first column will have the DO/SO item wil nulls for any 2nd or 3rd line.

2. Use the fill down option to populate those down the column.

3. Group by the DO/SO column and get a count of rows for each and expand out again.

4. Merge all the columns other than DO/SO and the count column together and then pivot them by the count.

5. Split the columns by what ever delimiter you chose when you merged them

6. I cant quite imagine how many columns you will end up with after this process but there will be a bunch to delete

 

 

// if this is a solution please mark as such. Kudos always appreciated.

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors