rvp_ordix
Frequent Visitor

Why are text lengths different?

I have a table, which I can't refresh due to the error "MashupException.Error: DataSource.Error: Microsoft SQL: String or binary data would be truncated."
My problem is that I have trouble debugging the issue, and applying Text.Start to the affected columns doesn't seem to work properly. The different ways of gauging the maximum string length in these columns give different results, 120 or 126, depending on how I do it.

An example script is below, with the returned length for each block added as a comment. The column's width in the destination table is 120, and the dataflow refresh fails with the above error, so 1 and 4 are likely correct, but I don't understand how these differing results are possible at all.

TabTextCut = Table.TransformColumnTypes(Table.TransformColumns(TabInhalt, {{"t_name", each Text.Start(_, 120)}}), {{"t_name", type text}})
,TargetTable = Table.AddColumn(
//temp-table, used by methods 1-3
    TabTextCut, "text_length", each [t_name] //Add column "text_length" as copy of "t_name"
  )

  //126
  ,LongestString = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(
          TargetTable, "text_length"
        )
      ), Text.Length
    )
  )

  //120
  ,LongestStringAlt = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(
          Table.TransformColumns(
            TargetTable, {{"text_length", (tex as nullable text) as nullable text => Text.From(Text.Length(tex))}}
          ), "text_length"
        )
      ), Number.FromText
    )
  )

  //120
  ,LongestStringTable = Table.First(
    Table.Sort(
      Table.TransformColumns(
        TargetTable, {{"text_length", Text.Length}}
      ), {"text_length", Order.Descending}
    )
  )[text_length]

  //126
  ,LongestStringDirect = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(TabTextCut, "t_name") //Explicitly not TargetTable
      )
      , Text.Length
    )
  )

Edit:
As a test, I have cut off every string after the first character with Text.Start(_, 1). The debug function below still somehow returns maximum string lengths of 4. base_table is the table I am examining; column_length_map stores each column name of base_table, its data type and, for text types, its allowed length.

(base_table as table, column_length_map as table) as table => 
let
  StringCols = Table.SelectRows(column_length_map, each [data_type] = "char" or [data_type] = "varchar"),
  test2 = Table.AddColumn(Table.SelectColumns(StringCols, "staging_column"), "length", each
      List.Max(
        List.Transform(
          Table.ToList(
            Table.SelectColumns(base_table, Record.Field(_, "staging_column"))
          )
          , (val) => Text.Length(val)
        )
      )
  )
  , test3 = Table.Join(test2, "staging_column", Table.SelectColumns(column_length_map, {"staging_column", "max_length"}), "staging_column", JoinKind.Inner)
in
  test3
1 ACCEPTED SOLUTION

Bit late, I suppose, but it wasn't an encoding issue. The dataset I was working on had quotation marks as part of the text, and it seems that Text.Start and Text.Length usually both ignore those. If I convert the column to a list, however, Text.Length suddenly counts the quotation marks, and that's where the disparity in lengths came from.
I'm also assuming (and this is somewhat hard to verify) that the quotation marks count towards the length when writing to my warehouse. Since Text.Start(_, 2) would cut the string '"testing"' to '"te', this leads to the errors I encountered.
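For anyone comparing lengths in a similar situation, here is a minimal sketch of measuring a column's longest string without going through Table.ToList. It assumes (the thread does not confirm this) that the extra characters appear because Table.ToList combines each row into a single text value and may add CSV-style quoting along the way, whereas Table.Column returns the raw cell values. The sample table and step names are hypothetical; only t_name is taken from the question.

```powerquery
let
    // Hypothetical sample data; the second value contains literal quotation marks,
    // like the '"testing"' example above
    Source = #table(type table [t_name = text], {{"plain"}, {"""testing"""}}),
    // Read the cell values directly, so no row-combining step can alter them
    Values = List.RemoveNulls(Table.Column(Source, "t_name")),
    // Length of each raw value, then the maximum
    MaxLength = List.Max(List.Transform(Values, Text.Length))
in
    MaxLength
```

If this number and the Table.ToList-based number disagree for the same column, that would point at the list conversion rather than at Text.Start.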


5 REPLIES 5
v-kongfanf-msft
Community Support

Hi @rvp_ordix ,

 

Did lbendlin's reply solve your problem? If so, please mark it as the correct solution; if the problem persists, please let us know.

 

Best Regards,
Adamk Kong

lbendlin
Super User

Remember that strings are Unicode (specifically UCS-2), so two bytes per character. I guess the extra two bytes come from the string terminator 0x0000.


Hi @rvp_ordix ,

 

Thanks for your feedback.

 

Best regards,

Adamk Kong

Oh dear, that would be both possible and irritating. I'll take another look, thanks.
