- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
UUID JSON export from Mongo DB
We are exporting JSON Files from a Mongo DB with the Data Factory. The UUID in the MongoDB is represented like this "5aa399c0-2701-401e-8e83-26b5ce82abc7". In the exported JSON File whe geht the following values:
"_id": {
"$binary": "wJmjWgEnHkCOgya1zoKrxw==",
"$type": "03"
},
Any ideas how I can get the representation as a GUID in the export file instead the "ugly" binary value?
I already tried modifying the connection string to the MongoDB but nothing worked. And there is no setting to configure the driver used by the data factory to read the values.
Solved! Go to Solution.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The 'ugly' string is base64 encoded.
We get the MongoDB file with a copy task from the and write it to a blob storage. Afterwards we import it with databricks and decode the base64.
The function we use:
def base64ToGUID(b64str):
try:
s = bitstring.BitArray(bytes=base64.b64decode(b64str)).hex
def rev2(s_):
def chunks(n):
for i in range(0, len(s_), n):
yield s_[i:i+n]
return "".join(list(chunks(2))[::-1])
return "-".join([rev2(s[:8]),rev2(s[8:][:4]),rev2(s[12:][:4]),s[16:][:4],s[20:]])
except:
return None
We read the JSON to a spark dataframe, convert it to a pandas dataframe and apply the decoding:
# Create pandas dataframe
df_pandas = df_spark.toPandas()
#iterate over rows to decode the columns
for index, row in df_pandas.iterrows():
row["row_name_1"] = base64ToGUID(row["row_name_1"])
row["row_name_2"] = base64ToGUID(row["row_name_2"])
row["row_name_n"] = base64ToGUID(row["row_name_n"])
Hope this helps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Alternative solutions can be found here: Convert UUID from MongoDb to UNIQUEIDENTIFIER
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The 'ugly' string is base64 encoded.
We get the MongoDB file with a copy task from the and write it to a blob storage. Afterwards we import it with databricks and decode the base64.
The function we use:
def base64ToGUID(b64str):
try:
s = bitstring.BitArray(bytes=base64.b64decode(b64str)).hex
def rev2(s_):
def chunks(n):
for i in range(0, len(s_), n):
yield s_[i:i+n]
return "".join(list(chunks(2))[::-1])
return "-".join([rev2(s[:8]),rev2(s[8:][:4]),rev2(s[12:][:4]),s[16:][:4],s[20:]])
except:
return None
We read the JSON to a spark dataframe, convert it to a pandas dataframe and apply the decoding:
# Create pandas dataframe
df_pandas = df_spark.toPandas()
#iterate over rows to decode the columns
for index, row in df_pandas.iterrows():
row["row_name_1"] = base64ToGUID(row["row_name_1"])
row["row_name_2"] = base64ToGUID(row["row_name_2"])
row["row_name_n"] = base64ToGUID(row["row_name_n"])
Hope this helps.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for your reply. I'm looking for another way because we don't use spark.
@GraceGu could you suggest anything? Can I use Power Query (Dataflow Gen2 in Fabric) or transform it in Data Pipeline before/after Copy?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Would you mind filling a support case for this? We'll need engineering to track this one down. Thank you!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I faced the same problem. any solutions?

Helpful resources
Join us at the Microsoft Fabric Community Conference
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Fabric Monthly Update - February 2025
Check out the February 2025 Fabric update to learn about new features.

Subject | Author | Posted | |
---|---|---|---|
02-21-2025 03:13 AM | |||
02-16-2024 08:16 AM | |||
06-20-2023 06:05 AM | |||
04-12-2024 02:31 AM | |||
02-14-2025 07:06 PM |
User | Count |
---|---|
6 | |
4 | |
2 | |
1 | |
1 |
User | Count |
---|---|
13 | |
10 | |
5 | |
5 | |
4 |