Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
denkersmahlzeit
Frequent Visitor

Beginner question on data transformation

Dear community,

I have questions concerning data transformation.

Attached is a raw xlsx-file containing public data from the German Federal Statistical Office on student numbers.

It contains data per university (rows) on number of students specified by nationality (three top columns read “Germans” “Foreigners” and “Total”) and sex (“male” “female” “total” for each top column). It starts with data for the winter semester 1998/99 and continues on for each winter semester.

Here is a image snippet to get a first impression of the wide table.

Number of students per universityNumber of students per university

 

My goal is to prepare dashboards that visualize the data of specific subsets of universities and compare these subsets in development across time.

My question is:

  • is it good practice to initially transform this originally wide table in a long table?
  • would you do this transformation all from within Power Query Editor?
  • what would be your first steps to transform the data for best use?

 

Here is the link to the complete table: 
student numbers over years 

Thank you!

 

4 REPLIES 4
mlsx4
Memorable Member
Memorable Member

Hi @denkersmahlzeit 

 

1) Yes, you must do it. Otherwise, it wouldn't work.

2) I'm not sure if you will be able to achieve everything through Power Query, but almost everything.

3) It's difficult the part of getting the values through years in the way it seems to be structured, but I will do:
(In excel will fill all the row with the year)
- Transpose the table

- Fill down the column 1 to get the nationality

- Promote headers

-Unpivot columns

- Fix the values by: duplicating Atributte comparing with a conditional value to get the year per row, fill again, filter "extra rows" that I don't need, rewrite the attribute by extracting value before delimter and then removing extra columns.

 

But that's my approach. Maybe there is something that doesn't work.

 

My "dummie" data is the following:

mlsx4_6-1691494001784.png

 

And my advanced editor code:

 

let
    Origen = Excel.Workbook(File.Contents("C:\Users\Equipo2\Downloads\ex.xlsx"), null, true),
    Hoja1_Sheet = Origen{[Item="Hoja3",Kind="Sheet"]}[Data],
    #"Tabla transpuesta" = Table.Transpose(Hoja1_Sheet),
    #"Rellenar hacia abajo" = Table.FillDown(#"Tabla transpuesta",{"Column1"}),
    #"Encabezados promovidos" = Table.PromoteHeaders(#"Rellenar hacia abajo", [PromoteAllScalars=true]),
    #"Tipo cambiado" = Table.TransformColumnTypes(#"Encabezados promovidos",{{"Semester", type text}, {"Hoschulen", type text}, {"1989", Int64.Type}, {"HS1240", Int64.Type}, {"HS1241", Int64.Type}, {"HS1242", Int64.Type}, {"HS1243", Int64.Type}, {"HS1244", Int64.Type}, {"1990", Int64.Type}, {"HS1240_1", Int64.Type}, {"HS1241_2", Int64.Type}, {"HS1242_3", Int64.Type}, {"HS1243_4", Int64.Type}, {"HS1244_5", Int64.Type}}),
    #"Columnas con nombre cambiado" = Table.RenameColumns(#"Tipo cambiado",{{"1989", "Year"}}),
    #"Columna de anulación de dinamización" = Table.UnpivotOtherColumns(#"Columnas con nombre cambiado", {"Semester", "Hoschulen"}, "Atributo", "Valor"),
    #"Columna duplicada" = Table.DuplicateColumn(#"Columna de anulación de dinamización", "Atributo", "Atributo - Copia"),
    #"Tipo cambiado1" = Table.TransformColumnTypes(#"Columna duplicada",{{"Valor", type text}}),
    #"Columna condicional agregada" = Table.AddColumn(#"Tipo cambiado1", "Personalizado", each if [#"Atributo - Copia"] = "Year" then [Valor] else if [#"Atributo - Copia"] = [Valor] then [Valor] else null),
    #"Rellenar hacia abajo1" = Table.FillDown(#"Columna condicional agregada",{"Personalizado"}),
    #"Filas filtradas" = Table.SelectRows(#"Rellenar hacia abajo1", each ([#"Atributo - Copia"] <> "1990" and [#"Atributo - Copia"] <> "Year")),
    #"Texto extraído antes del delimitador" = Table.TransformColumns(#"Filas filtradas", {{"Atributo - Copia", each Text.BeforeDelimiter(_, "_"), type text}}),
    #"Columnas quitadas" = Table.RemoveColumns(#"Texto extraído antes del delimitador",{"Atributo"}),
    #"Columnas con nombre cambiado1" = Table.RenameColumns(#"Columnas quitadas",{{"Atributo - Copia", "Universitat"}, {"Valor", "Total"}})
in
    #"Columnas con nombre cambiado1"

  

Thank you for your reply! I learned a lot while following what you did.

However, when I follow your example there are now two columns which contain for each row years (1989 in "Year" and 1990 in "Personalizado"), but only one column with Total amount. 
So it seems that I cannot tell the values for each year apart anymore. And in the original data there a many more years than just two.

 

Is there a way to put all years into one column in continuous order?

Hi @denkersmahlzeit 

Yes... I know. Is there any possibility I can get a link with the code? The other one didn't work

Hi, I updated the link, now it should contain the original data! Thanks for taking a look.

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors