rvp_ordix
Helper I

Why are text lengths different?

I have a table that I can't refresh because of the error "MashupException.Error: DataSource.Error: Microsoft SQL: String or binary data would be truncated."
I'm having trouble debugging the issue, and applying Text.Start to the affected columns doesn't seem to work properly. The different ways of gauging the maximum string length in these columns give different results, 120 or 126, depending on how I do it.

An example script is below, and the returned length for each block is added as a comment. The column's width in the destination table is 120, and the dataflow refresh fails with the above error, so methods 1 and 4 (which return 126) are likely correct, but I don't understand how these differing results are possible at all.

TabTextCut = Table.TransformColumnTypes(Table.TransformColumns(TabInhalt, {{"t_name", each Text.Start(_, 120)}}), {{"t_name", type text}})
,TargetTable = Table.AddColumn(
//temp-table, used by methods 1-3
    TabTextCut, "text_length", each [t_name] //Add column "text_length" as copy of "t_name"
  )

  //126
  ,LongestString = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(
          TargetTable, "text_length"
        )
      ), Text.Length
    )
  )

  //120
  ,LongestStringAlt = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(
          Table.TransformColumns(
            TargetTable, {{"text_length", (tex as nullable text) as nullable text => Text.From(Text.Length(tex))}}
          ), "text_length"
        )
      ), Number.FromText
    )
  )

  //120
  ,LongestStringTable = Table.First(
    Table.Sort(
      Table.TransformColumns(
        TargetTable, {{"text_length", Text.Length}}
      ), {"text_length", Order.Descending}
    )
  )[text_length]

  //126
  ,LongestStringDirect = List.Max(
    List.Transform(
      Table.ToList(
        Table.SelectColumns(TabTextCut, "t_name") //Explicitly not TargetTable
      )
      , Text.Length
    )
  )

Edit:
I have, as a test, cut off every string after the first letter with Text.Start(_, 1). The debug function below still manages to return maximum string lengths of 4 somehow. base_table is the table I am examining, column_length_map stores each column name of base_table, their data type and, for text types, their allowed length.

(base_table as table, column_length_map as table) as table => 
let
  StringCols = Table.SelectRows(column_length_map, each [data_type] = "char" or [data_type] = "varchar"),
  test2 = Table.AddColumn(Table.SelectColumns(StringCols, "staging_column"), "length", each
      List.Max(
        List.Transform(
          Table.ToList(
            Table.SelectColumns(base_table, Record.Field(_, "staging_column"))
          )
          , (val) => Text.Length(val)
        )
      )
  )
  , test3 = Table.Join(test2, "staging_column", Table.SelectColumns(column_length_map, {"staging_column", "max_length"}), "staging_column", JoinKind.Inner)
in
  test3
1 ACCEPTED SOLUTION

Bit late I suppose, but it wasn't an encoding issue. The dataset I was working on had quotation marks as part of the text, and it seems that Text.Start and Text.Length both usually ignore those. If I convert the column to a list, however, Text.Length suddenly counts the quotation marks, and that's where the disparity in lengths came from.
I'm also assuming (and this is somewhat hard to verify) that the quotation marks count towards the length when writing to my warehouse. Since Text.Start(_, 2) would cut the string '"testing"' to '"te', this leads to the errors I encountered.
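
A possible explanation for the list behaviour, offered as an assumption rather than something verified in this thread: Table.ToList does not simply extract the column values. It builds one text value per row by applying a combiner function, and if the default combiner applies CSV-style quoting, values that contain quotation marks would come back longer than the original cell. The sketch below sidesteps the question by reading the column with Table.Column, which returns the cell values as they are. Sample and its two rows are hypothetical stand-ins for TabTextCut.

```powerquery
let
    // Hypothetical sample data standing in for TabTextCut;
    // the second value contains literal quotation marks
    Sample = Table.FromRecords({
        [t_name = "plain"],
        [t_name = """quoted"""]
    }),

    // Read the column values directly, without a row combiner
    Values = Table.Column(Sample, "t_name"),
    DirectMax = List.Max(List.Transform(Values, Text.Length)),

    // For comparison: the same measurement routed through Table.ToList,
    // as in methods 1 and 4 of the question
    ViaToListMax = List.Max(
        List.Transform(
            Table.ToList(Table.SelectColumns(Sample, "t_name")),
            Text.Length
        )
    )
in
    [Direct = DirectMax, ViaToList = ViaToListMax]
```

If Direct and ViaToList differ on your data, the extra characters are introduced by the list conversion rather than by the stored values.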


5 REPLIES
v-kongfanf-msft
Community Support

Hi @rvp_ordix ,

 

Did lbendlin's reply solve your problem? If so, please mark it as the correct solution, and let us know if the problem persists.

 

Best Regards,
Adamk Kong

lbendlin
Super User

Remember that strings are Unicode (specifically UCS-2), so two bytes per character. I guess the extra two bytes come from the string terminator 0x0000.


Hi @rvp_ordix ,

 

Thanks for your feedback.

 

Best regards,

Adamk Kong

Oh dear, that would be both possible and irritating. I'll take another look, thanks.
