The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi everyone,
At the moment I am trying to work around Spark M-code restrictions to create an index column within Data Factory's Power Query.
While the option to add an index column is present in the interface, when used it will return the error:
"Expression.Error: The Power Query Spark Runtime does not support the function Table.AddIndexColumn."
I am now trying to use a Power Query formula in "add custom column" to create an index column based on row count.
It is succesful in creating a table with one index column but it is then impossible to join/merge that table to the table which is in need of an index column (there is no join condition since the table contains only the one column).
It is also possible to create an index column as a "Table.AddColumn" but this gives several errors and results in an indexing of every record in that table for the amount of records in that table (so with 7 rows, it will create 7 rows for every one of those rows and gives them an index from 1 to 7).
I feel like I'm getting close to a viable work-around but I am missing something to implement it succesfully.
Any and all help will be much appreciated!
Re:
While the option to add an index column is present in the interface, when used it will return the error:
"Expression.Error: The Power Query Spark Runtime does not support the function Table.AddIndexColumn."
It's a bit of a surpise that it's in the interface and then fails; I wonder if it's a legacy thing: PQ added an argument to AddIndexColumn, being the last one, the data type. Try missing it out, ie:
Instead of:
= Table.AddIndexColumn(PreviousStep, "Index", 1, 1, Int64.Type)
try:
= Table.AddIndexColumn(PreviousStep, "Index", 1, 1)
Starting with a table (Table1) with a single column headed stuff:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
AddedIndex = Table.FromColumns({Source[stuff],{1..Table.RowCount(Source)}},{"stuff","Index"})
in
AddedIndex
Further, if you have more than 1 column in your table you can try:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
AddIndex=Table.FromColumns(Table.ToColumns(Source) & {{1..Table.RowCount(Source)}}, Table.ColumnNames(Source) & {"Index"})
in
AddIndex
The technique behind this works perfectly, however, ADF still gives an error about the dataset being dynamic in length which it cannot handle. I currently can't get the exact wording of the error because I get an internal server error when trying to load datasets into ADF power query.