I have a column that needs some advanced parsing, which seems best done in Python. However, another column that I am not touching in the Python script is also being altered. Can anyone explain what is going on under the hood?
# 'dataset' holds the input data for this script
import pandas as pd
ds = dataset
ds['ROAD_NAME_CLEAN'] = ds['ROAD_NAME'].str.extract(r'(\d+\s+)?([NnEeSsWw](\.\s|\s|\.))?([^\(]*)(\s?\(.*\))?')[3]
ds['REFERENCE_ROAD_NAME_CLEAN'] = ds['REFERENCE_ROAD_NAME'].str.extract(r'(\d+\s+)?([NnEeSsWw](\.\s|\s|\.))?([^\(]*)(\s?\(.*\))?')[3]
dataset = ds
Before:
After:
@maiios - Sorry, what is being affected that shouldn't? It's hard to compare the pictures.
Sorry... the CENSUS_TRACT is changing from a string to a floating-point number, even though the column type is still text. The CENSUS_TRACT is supposed to be a six-digit code, so the added decimal and the dropped leading zeros cause issues.
I could reformat the string, but I want to understand why it's happening.
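One likely explanation, offered as a sketch rather than a confirmed diagnosis: if the table reaches pandas through a CSV-style load with default type inference (an assumption about how the script host hands over `dataset`), a column of digit-only strings is parsed as numbers, and any blank value forces the column to float. The snippet below reproduces that effect with made-up sample rows; the CSV text and the six-digit padding step are illustrative, not taken from the original data.

```python
import io
import pandas as pd

# Hypothetical sample rows: digit-only tract codes with leading zeros and one blank.
csv_text = (
    "CENSUS_TRACT,ROAD_NAME\n"
    "001234,N Main St\n"
    ",Oak Ave\n"
    "045600,12 E. Pine Rd (old)\n"
)

# Default inference: the digits are parsed as numbers, and the blank becomes NaN,
# which forces the whole column to a floating-point dtype.
inferred = pd.read_csv(io.StringIO(csv_text))

# Declaring the column as text keeps the leading zeros.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"CENSUS_TRACT": str})

# If the column has already arrived as float, pad it back to six digits.
# This assumes every non-missing value was originally a six-digit code.
restored = inferred["CENSUS_TRACT"].map(
    lambda v: None if pd.isna(v) else f"{int(v):06d}"
)

print(inferred["CENSUS_TRACT"].dtype)
print(inferred["CENSUS_TRACT"].tolist())
print(as_text["CENSUS_TRACT"].tolist())
print(restored.tolist())
```

If this is what is happening, the column was never modified by the regex lines; it changed when the data was loaded into the DataFrame. Padding it back inside the script, or setting the column type to text again in the step after the Python script, would both be reasonable workarounds.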
@dm-p You have any Python skillz??