Elitlogik
Frequent Visitor

Binary.Decompress possible error?

Binary.Length(Web.Contents("https://epss.cyentia.com/epss_scores-current.csv.gz"))

 

Returns 1391340

 

Binary.Length(Binary.Decompress(Web.Contents("https://epss.cyentia.com/epss_scores-current.csv.gz"), Compression.GZip))

 

Returns 63

 

Why? Please help me.

5 REPLIES
jennratten
Super User

Sometimes data is chunked when it is compressed or encoded, so you may be losing data if that is not handled. Try querying the response headers for more information, as in the example below. The content length should be included in the response.

let
    searchText = "Power Query"
in
    Web.Headers(
        "https://www.bing.com",
        [
            RelativePath = "search",
            Query = [q = searchText]
        ]
    )

 

Hi, thank you for your response.

 

The Gzipped CSV from cyentia.com is what I want to fetch and analyze in Power BI.

Downloading it manually, unzipping it with 7-Zip, and opening it in Excel works fine.

Using the code in the original post results in a table with only one row.

 

Isn't it strange that the file is 1.4 MB compressed but only 63 bytes decompressed?

I found your post while experiencing exactly the same error. It seems to be a problem with the compression format. If I use this code:

 

let
    web_path = "https://epss.cyentia.com/epss_scores-current.csv.gz",
    file = Binary.Decompress(Web.Contents(web_path), Compression.GZip),
    src=Lines.FromBinary(file)
...

 

 

...then I only get the topmost line from the file;

#model_version:v2023.03.01,score_date:2024-08-09T00:00:00+0000

 

But if I decompress the gzip file from cyentia.com and re-compress it in 7-zip using the default parameters for gzip and then use this code:

 

let
    compressed_file_path = "C:\temp\epss_scores-2024-08-09.csv.gz",
    file = Binary.Decompress(File.Contents(compressed_file_path), Compression.GZip),
    src=Lines.FromBinary(file),
...

 

 

...then I get the entire file:

#model_version:v2023.03.01,score_date:2024-08-09T00:00:00+0000
cve,epss,percentile
CVE-1999-0001,0.00383,0.73334
CVE-1999-0002,0.02080,0.89267
CVE-1999-0003,0.04409,0.92508
...

 

I have therefore filed a service request with support.first.org asking...

 

  1. ...if any special compression settings have been used (and if so, is it possible to use the default parameters instead, so that Power BI’s Binary.Decompress can decompress the file)?

  2. If the compression format cannot be changed to the default one, is it possible to learn which compression parameters were used, so that I can file a feature request or bug report with Microsoft regarding the Binary.Decompress function?

Did you find a workaround yourself?

 

BR,

 

-- 

Jakob

 

 

 

That particular compression method is very old and is limited in bits.  

I understand. Do you mean that Binary.Decompress with the Compression.GZip parameter is old code? GZip as a standard is still very widely used on the web.

 

Strangely enough it decompresses another 8 MB file from another source just fine.
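
One possible explanation, which nobody in this thread has confirmed: the gzip format allows several independently compressed "members" to be concatenated into one file, and a decoder that stops after the first member returns only part of the data. The header line quoted above is 62 characters long, so with a trailing line feed it is exactly 63 bytes, which matches the Binary.Length result in the original post. The Python sketch below only illustrates that effect on a small synthetic file. It assumes the EPSS file is built this way and that Binary.Decompress reads only the first member. Both are guesses, and the names HEADER, ROWS, make_two_member_gzip and split_members are made up for the example.

```python
import gzip
import zlib

# First line of the EPSS file as quoted in this thread, plus a line feed.
HEADER = b"#model_version:v2023.03.01,score_date:2024-08-09T00:00:00+0000\n"
# Two further lines from the thread, standing in for the rest of the CSV.
ROWS = b"cve,epss,percentile\nCVE-1999-0001,0.00383,0.73334\n"


def make_two_member_gzip():
    """Build a synthetic gzip file made of two members written back to back.

    ASSUMPTION: the real file may or may not be structured like this.
    """
    return gzip.compress(HEADER) + gzip.compress(ROWS)


def split_members(data):
    """Decompress every gzip member in `data`, one list entry per member."""
    members = []
    while data:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        decoder = zlib.decompressobj(wbits=31)
        members.append(decoder.decompress(data) + decoder.flush())
        # Bytes after the end of this member belong to the next one.
        data = decoder.unused_data
    return members


if __name__ == "__main__":
    sample = make_two_member_gzip()
    members = split_members(sample)
    print("members found:", len(members))
    print("bytes in first member only:", len(members[0]))
    print("bytes in all members:", len(b"".join(members)))
```

To test the guess against a manually downloaded copy of the file, read it in binary mode and pass the bytes to split_members. More than one entry in the result would support the multi-member explanation. It would also explain why re-compressing with 7-Zip helps, since that presumably writes a single member.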
