I ran the script in a Jupyter notebook. The text (string) output looks normal, as shown below:
Note: The output is in Thai.
When I load the data into Power BI through a Python script (the same script), the text (string) output appears as shown below:
The Python script I used is:
from sqlalchemy import create_engine
import pandas as pd
engine = create_engine('mssql+pyodbc://xx.xxx.xx.xx/DB?driver=SQL+Server')
connection = engine.connect()
stmt = "SELECT * FROM XXX"
df = pd.read_sql(stmt, connection)
I tried some DataFrame methods such as .str.encode('utf-8'), but they did not help.
Can anyone help me figure this out?
Handling text encoding issues in Python can be tricky, especially when dealing with data from various sources. Encoding problems often arise due to mismatches between the expected and actual character sets. Here are some steps to troubleshoot and fix text encoding issues when getting data via a Python script:
### 1. Identify the Encoding
First, identify the encoding of the source data. Common encodings include UTF-8, ISO-8859-1 (Latin-1), and Windows-1252. If the source of your data specifies an encoding, use that. If not, you might need to guess or inspect the data.
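Since the text in this thread is Thai, the likely candidates also include the Thai code pages, which the list above does not mention. The following is a minimal sketch, assuming you can get at the raw bytes: it decodes the same bytes under several candidate encodings so you can see which one yields readable text. The sample string and the candidate list are illustrative assumptions, not taken from the original data.

```python
# Sample Thai text, encoded the way a Thai Windows code page would store it.
# (Illustrative sample; substitute the raw bytes from your own source.)
sample_text = "สวัสดี"
raw_bytes = sample_text.encode("cp874")

# Candidate encodings to try; this list is an assumption for the sketch.
candidates = ["utf-8", "cp874", "tis_620", "latin-1"]

results = {}
for name in candidates:
    try:
        results[name] = raw_bytes.decode(name)
    except UnicodeDecodeError:
        # Record a failed decode as None instead of stopping.
        results[name] = None

for name, decoded in results.items():
    print(f"{name:>8}: {decoded!r}")
```

A candidate that decodes without error is not necessarily the right one: `latin-1` accepts any byte sequence, so the output still has to be checked by eye.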
### 2. Read Data with Correct Encoding
When reading data from files, databases, or web APIs, ensure you're using the correct encoding.
**Example for Reading a File:**
```python
with open('data.txt', 'r', encoding='utf-8') as file:
    data = file.read()
```
### 3. Handling Web Data
When fetching data from the web using libraries like `requests`, check the response's encoding.
**Example with `requests`:**
```python
import requests
response = requests.get('https://example.com')
response.encoding = 'utf-8' # Set the correct encoding
data = response.text
```
### 4. Common Encoding and Decoding
When you have a byte string (binary data) that needs to be decoded to a regular string, or vice versa, use `encode()` and `decode()` methods.
**Example:**
```python
# Decoding bytes to string
byte_data = b'\xe2\x98\x83'
string_data = byte_data.decode('utf-8')
# Encoding string to bytes
string_data = '☃'
byte_data = string_data.encode('utf-8')
```
### 5. Handling Errors
Use error handling strategies to manage encoding and decoding issues. The `errors` parameter in `decode` and `encode` methods can handle errors gracefully.
**Options include:**
- `errors='ignore'`: Silently drop bytes or characters that can't be converted.
- `errors='replace'`: Replace problematic characters with a placeholder (`?` when encoding, the U+FFFD replacement character when decoding).
**Example:**
```python
byte_data = b'\xe2\x98\x83\x97'
string_data = byte_data.decode('utf-8', errors='ignore') # Ignoring errors
```
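To see what each option does, the sketch below decodes the same invalid byte sequence under three `errors` settings. It uses only the standard `bytes.decode` method; the variable names are illustrative.

```python
# Three valid bytes (a snowman) followed by one stray byte, 0x97.
byte_data = b'\xe2\x98\x83\x97'

# 'ignore' drops the stray byte silently.
ignored = byte_data.decode('utf-8', errors='ignore')

# 'replace' substitutes U+FFFD, the Unicode replacement character.
replaced = byte_data.decode('utf-8', errors='replace')

# 'strict' (the default) raises instead of hiding the problem.
try:
    byte_data.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(len(ignored), len(replaced), strict_failed)  # 1 2 True
```

Both `ignore` and `replace` discard information, so they are better suited to diagnosis than to repairing text that should display correctly.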
### 6. Using `chardet` for Automatic Detection
The `chardet` library can automatically detect the encoding of your data. Install it using `pip install chardet`.
**Example:**
```python
import chardet
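raw_data = b'\xe2\x98\x83\x97'
result = chardet.detect(raw_data)
# detect() returns a guess; 'encoding' can be None when nothing matches
encoding = result['encoding'] or 'utf-8'
# Decode using the detected encoding; replace bytes that still fail
string_data = raw_data.decode(encoding, errors='replace')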
```
### 7. Dealing with Pandas
If you’re working with data in pandas, specify the encoding when reading data.
**Example:**
```python
import pandas as pd
df = pd.read_csv('data.csv', encoding='utf-8')
```
### Example Python Script
Here’s a full example demonstrating handling encoding when fetching data from a URL and reading from a file:
```python
import requests
import chardet
# Fetch data from a URL
url = 'https://example.com/data'
response = requests.get(url)
response.encoding = 'utf-8' # Set the encoding to UTF-8
data = response.text
# Write data to a file with a specific encoding
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write(data)
# Read the data back from the file
with open('output.txt', 'rb') as file:
    raw_data = file.read()
# Detect the encoding (the guess can be None, so fall back to UTF-8)
detected_encoding = chardet.detect(raw_data)['encoding'] or 'utf-8'
print(f"Detected encoding: {detected_encoding}")
# Decode the data using the detected encoding
decoded_data = raw_data.decode(detected_encoding, errors='replace')
print(decoded_data)
```
### Conclusion
By understanding and handling text encoding properly, you can avoid common pitfalls and ensure that your data is processed correctly in Python. Use the right tools and methods to identify, decode, and encode your data to maintain its integrity and readability.
Maybe try encoding only the text column after loading the query result into a DataFrame, as in the link below (the same problem as yours, but in R).
Thank you, ritthikrai.
I tried encoding the column as below, but it still displays unknown characters.
df['column'] = df['column'].str.encode('utf-8')
Text encoding issues in Python can be quite common, especially when dealing with data from various sources that might use different character sets. Here’s a comprehensive guide to handle and fix text encoding problems when fetching data via a Python script.
### Steps to Handle Text Encoding Issues
#### 1. Identify the Encoding
Understanding the encoding of your source data is crucial. Common encodings include UTF-8, ISO-8859-1 (Latin-1), and Windows-1252. If the source specifies an encoding, use that. If not, you may need to guess or detect the encoding.
#### 2. Read Data with Correct Encoding
When reading data from files, web APIs, or other sources, ensure you specify the correct encoding.
**Example for Reading a File:**
```python
with open('data.txt', 'r', encoding='utf-8') as file:
    data = file.read()
```
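As a small illustration of why the `encoding` argument matters, the sketch below writes Thai text to a temporary file in `cp874` and reads it back twice: once with the matching encoding and once as UTF-8. The file name, the sample text, and the choice of `cp874` are assumptions made for the example.

```python
import os
import tempfile

sample_text = "ภาษาไทย"

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.txt")

    # Write the file in a Thai code page rather than UTF-8.
    with open(path, "w", encoding="cp874") as file:
        file.write(sample_text)

    # Reading with the matching encoding returns the original text.
    with open(path, "r", encoding="cp874") as file:
        matched = file.read()

    # Reading the same bytes as UTF-8 fails to decode.
    try:
        with open(path, "r", encoding="utf-8") as file:
            file.read()
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True

print(matched == sample_text, utf8_failed)  # True True
```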
#### 3. Fetching Web Data with Correct Encoding
When fetching data from the web, libraries like `requests` can help. Check and set the response's encoding.
**Example with `requests`:**
```python
import requests
response = requests.get('https://example.com/data')
response.encoding = 'utf-8' # Set the correct encoding
data = response.text
```
#### 4. Decoding and Encoding Strings
When dealing with byte strings, use `decode()` and `encode()` methods to convert between bytes and strings.
**Example:**
```python
# Decoding bytes to string
byte_data = b'\xe2\x98\x83'
string_data = byte_data.decode('utf-8')
# Encoding string to bytes
string_data = '☃'
byte_data = string_data.encode('utf-8')
```
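One point matters for the question in this thread: calling `encode()` on a string that is already garbled does not repair it, it only returns the garbled text as bytes. If the text was decoded with the wrong codec somewhere upstream, one possible repair is to reverse that step. The sketch below assumes the text was UTF-8 that got decoded as Latin-1; that pairing is an assumption for illustration, and the wrong codec in the Power BI case may well be a different one.

```python
original = "สวัสดี"

# Simulate the upstream mistake: UTF-8 bytes decoded as Latin-1.
garbled = original.encode("utf-8").decode("latin-1")

# Encoding the garbled string again does not repair it; the result is a
# bytes object that still carries the wrong characters.
still_wrong = garbled.encode("utf-8")

# Reversing the mistaken step does: back to the original bytes via
# Latin-1, then decode those bytes as UTF-8.
repaired = garbled.encode("latin-1").decode("utf-8")

print(repaired == original)  # True
```

The round trip only works when the wrong codec maps every byte to a character, as Latin-1 does; with other codecs some bytes may already be lost.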
#### 5. Handling Errors
Use error handling strategies to manage encoding and decoding issues. The `errors` parameter in `decode` and `encode` methods can handle errors gracefully.
**Options include:**
- `errors='ignore'`: Silently drop bytes or characters that can't be converted.
- `errors='replace'`: Replace problematic characters with a placeholder (`?` when encoding, the U+FFFD replacement character when decoding).
**Example:**
```python
byte_data = b'\xe2\x98\x83\x97'
string_data = byte_data.decode('utf-8', errors='ignore') # Ignoring errors
```
#### 6. Using `chardet` for Automatic Detection
The `chardet` library can automatically detect the encoding of your data. Install it using `pip install chardet`.
**Example:**
```python
import chardet
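raw_data = b'\xe2\x98\x83\x97'
result = chardet.detect(raw_data)
# detect() returns a guess; 'encoding' can be None when nothing matches
encoding = result['encoding'] or 'utf-8'
# Decode using the detected encoding; replace bytes that still fail
string_data = raw_data.decode(encoding, errors='replace')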
```
#### 7. Handling Encoding in Pandas
If you’re working with data in pandas, specify the encoding when reading data.
**Example:**
```python
import pandas as pd
df = pd.read_csv('data.csv', encoding='utf-8')
```
### Example Python Script
Here’s a complete example demonstrating how to handle encoding when fetching data from a URL and reading from a file:
```python
import requests
import chardet
# Fetch data from a URL
url = 'https://example.com/data'
response = requests.get(url)
response.encoding = 'utf-8' # Set the encoding to UTF-8
data = response.text
# Write data to a file with a specific encoding
with open('output.txt', 'w', encoding='utf-8') as file:
    file.write(data)
# Read the data back from the file
with open('output.txt', 'rb') as file:
    raw_data = file.read()
# Detect the encoding (the guess can be None, so fall back to UTF-8)
detected_encoding = chardet.detect(raw_data)['encoding'] or 'utf-8'
print(f"Detected encoding: {detected_encoding}")
# Decode the data using the detected encoding
decoded_data = raw_data.decode(detected_encoding, errors='replace')
print(decoded_data)
```
### Conclusion
Handling text encoding properly ensures that your data remains intact and readable. By following these steps and using the right tools, you can effectively manage and resolve text encoding issues in your Python scripts. Always verify the source encoding and handle potential errors to ensure your data processing is robust and error-free.
Alternatively, try setting the locale to UTF-8 before you query the data.
For more information, see the links below.
https://webkul.com/blog/setup-locale-python3/
https://www.programcreek.com/python/example/497/locale.setlocale
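Before changing the locale, it can help to check what each environment actually reports, since Jupyter and the Python process started by Power BI may not agree. The sketch below collects the relevant settings using only the standard library; run it in both places and compare the output. The function name `describe_encodings` is hypothetical.

```python
import locale
import sys


def describe_encodings():
    """Collect the encoding settings of the current Python process."""
    return {
        "default": sys.getdefaultencoding(),
        "filesystem": sys.getfilesystemencoding(),
        "preferred": locale.getpreferredencoding(False),
        # stdout can be None or replaced by a stream that lacks an
        # encoding attribute, so read it defensively.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


settings = describe_encodings()
for name, value in settings.items():
    print(f"{name}: {value}")
```

If the two environments report different preferred encodings, that difference is a reasonable place to start looking, though it does not by itself prove where the text is being garbled.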