🧾 In short: this notebook walks you through retrieving live data from X (formerly Twitter), applying free open-source AI models for insights, and visualising the results in Microsoft Fabric, blending a practical weather-monitoring use case with AI integration.
(This notebook can effectively be imported and run freely by anyone with a Microsoft Fabric Trial)
To retrieve live tweets in this demo, you’ll first set up access to the X (Twitter) Developer API, then store your token securely in Azure Key Vault.
1) Go to the X Developer Portal: https://developer.x.com/en
2) Select the Free plan (sufficient for this notebook), or choose a paid tier if you need more.
3) In the Developer Portal, create a Project and App, then generate a Bearer Token.
Security note: Treat tokens like passwords. Never commit them to source control and never print them in logs.
4) Store the token securely (Azure Key Vault)
Recommended: Save the Bearer Token as a secret in Azure Key Vault. Give the secret a clear name (e.g., TwitterBearer).
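If you prefer to create the secret from code rather than the Azure portal, here is a minimal sketch using the Azure SDK. It assumes the azure-identity and azure-keyvault-secrets packages are installed, that you have write access to a Key Vault, and that the placeholder names are replaced with your own.
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
# Replace with your own Key Vault URL (placeholder)
vault_url = "https://<your-key-vault>.vault.azure.net/"
client = SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())
# Store the Bearer Token under a clear secret name (matching the notebook below)
client.set_secret("TwitterBearer", "<paste-your-bearer-token>")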
Now that your X Developer access is set up and your token is safely stored, it's time to retrieve the latest weather notices directly from the Bureau of Meteorology (QLD).
Check out the official X account: @BOM_Qld
Note: This code uses plain Python, not PySpark. PySpark submits Spark jobs, which add overhead; for large datasets that overhead pays off in better performance, but since this demo collects only ~100 rows of data, Python is sufficient.
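As a minimal illustration of that trade-off (hypothetical data, not part of the demo):
import pandas as pd
# pandas runs in-process: no cluster, no Spark job scheduling
pdf = pd.DataFrame({"temp": [21.0, 23.5, 19.8]})
print(pdf["temp"].mean())  # instant for small data
# The PySpark equivalent distributes the work as Spark jobs,
# which only pays off once the data is too large for a single process:
# from pyspark.sql import functions as F
# spark.createDataFrame(pdf).agg(F.avg("temp")).show()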
Install the required packages (if not already installed):
%pip install tweepy
import tweepy
import pandas as pd
from notebookutils import mssparkutils
from notebookutils.visualization import display
# (Recommended Approach) Store and Get the Bearer Token secret from Azure Key Vault
bearer_token = mssparkutils.credentials.getSecret("https://< Add Your Key Vault Here >.vault.azure.net/", "TwitterBearer")
# (Alternative Approach) Swap the above with the below and enter bearer token directly
# bearer_token = "YOUR_BEARER_TOKEN_HERE"
# Connect to X API, authenticate with the Bearer Token and pause if you hit your X API rate limits
client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
# Get user id
user = client.get_user(username="BOM_Qld")
# Fetch the latest N tweets
tweets = client.get_users_tweets(
    id=user.data.id,
    max_results=100,  # Change as required (and as your X (Twitter) plan allows)
    tweet_fields=["created_at", "text"]
)
# Guard against an empty response (e.g. rate-limited or no recent tweets)
if not tweets.data:
    raise RuntimeError("No tweets returned from the X API")
# Convert to DataFrame and format the date
df = pd.DataFrame([{"created_at": t.created_at, "text": t.text} for t in tweets.data])
df["created_at"] = pd.to_datetime(df["created_at"]).dt.strftime("%Y%m%d")
# Display the collected data
display(df)
The data from X (i.e. 100 rows of tweets as specified)
AI can help us classify text into categories and summarise information for deeper insights.
We’ll now use AI models to enrich the collected tweets so that they can be visualised in Power BI more effectively.
Microsoft Fabric provides built-in AI Functions powered by OpenAI models, such as ai.summarize and ai.classify.
⚠️ However, at the time of creating this notebook, these built-in functions are not available in the Trial version of Fabric, so many interested and potential users have not been able to try them.
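For comparison, on a paid capacity the built-in AI Functions follow a one-liner pattern roughly like the sketch below. This is based on the documented preview API and is not runnable on a Trial capacity; the import and exact signatures may differ as the preview evolves.
# Sketch only: assumes a paid Fabric capacity with AI Functions (preview) enabled
import synapse.ml.aifunc as aifunc  # exposes the .ai accessor on pandas objects
df["summary"] = df["text"].ai.summarize()
df["category"] = df["text"].ai.classify("Severe Warning", "Flood", "Heat", "Forecast", "Outlook")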
To work around this limitation, we'll use open-source AI models from Hugging Face, specifically models released by Facebook (Meta).
These models are widely adopted, open source, and safe to experiment with in a demo context.
Let’s load the models and apply them to our dataset 👇
from transformers import pipeline
# Load AI models (download once, cached locally in Fabric runtime)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
# Categories for classification
categories = ["Severe Warning", "Flood", "Heat", "Forecast", "Outlook"]
# Apply AI models to the data
df["summary"] = df["text"].apply(lambda t: summarizer(t, max_length=30, min_length=5, do_sample=False)[0]["summary_text"])
df["category"] = df["text"].apply(lambda t: classifier(t, candidate_labels=categories)["labels"][0])
# Display the dataframe showing the TWO NEW AI-generated columns!
display(df)
With just those few lines of code, we get our two new AI-generated summary and category columns!
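If you want the enriched columns available to Power BI beyond this notebook session, one option is to persist the DataFrame to the attached Lakehouse. This sketch assumes a default Lakehouse is attached to the notebook; the file name is just an example.
# /lakehouse/default/Files is the mounted Files area of the default Lakehouse
df.to_parquet("/lakehouse/default/Files/bom_tweets_enriched.parquet", index=False)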
You may now want the actual weather from BoM.
The Bureau of Meteorology (BoM) provides updated weather information across Australia every 30 minutes.
The below notebook code fetches the latest weather data specifically for Brisbane.
Tip: To get data for other locations, visit https://www.bom.gov.au/ and find the Product and Station codes for your desired area.
import requests
import pandas as pd
import plotly.express as px
# Variables (Navigate to https://www.bom.gov.au/ to identify your desired Product and Station)
PRODUCT = "IDQ60801"
STATION = "94576"
URL = f"https://www.bom.gov.au/fwo/{PRODUCT}/{PRODUCT}.{STATION}.json"
HOURS = 3 # The last N hours to display
# Get the data
r = requests.get(URL, headers={"User-Agent": "Mozilla/5.0"}, timeout=20)
r.raise_for_status()
data = r.json().get("observations", {}).get("data", [])
df = pd.DataFrame(data)
if df.empty:
    raise RuntimeError("No data returned from BoM")
# Data cleansing - Format timestamps, drop duplicates, and sort by timestamp
df["Timestamp"] = pd.to_datetime(df["local_date_time_full"], format="%Y%m%d%H%M%S", errors="coerce")
df = df.dropna(subset=["Timestamp"]).drop_duplicates("Timestamp").sort_values("Timestamp").reset_index(drop=True)
# Rename the columns for readability
rename_map = {
    "air_temp": "Air Temp (°C)",
    # "rel_hum": "Humidity (%)",
    "wind_spd_kmh": "Wind (km/h)",
    "rain_trace": "Rain (mm)"
}
df = df.rename(columns={k: v for k, v in rename_map.items() if k in df.columns})
# Limiting the display to the last N hours (per set Variable)
cutoff = df["Timestamp"].max() - pd.Timedelta(hours=HOURS)
df = df[df["Timestamp"] >= cutoff].reset_index(drop=True)
# Ensure all metric columns are numeric
for col in rename_map.values():
    if col in df.columns:
        df[col] = pd.to_numeric(df[col], errors="coerce")
# Select only numeric columns for plotting
ycols = [c for c in rename_map.values() if c in df.columns and pd.api.types.is_numeric_dtype(df[c])]
# Plot the dataframe with Plotly
fig = px.line(df, x="Timestamp", y=ycols, title=f"Brisbane ({STATION}) – Last {HOURS} Hours BoM Observations")
fig.update_layout(
    plot_bgcolor="white",
    paper_bgcolor="white",
    font=dict(color="black"),
    xaxis=dict(showgrid=True, gridcolor="lightgrey"),
    yaxis=dict(showgrid=True, gridcolor="lightgrey"),
    legend=dict(title="Metric")
)
fig.show()
The Plotly visual above is interactive.
For those who may be wondering about visualising the data within the notebook using Power BI:
While QuickVisualize provides a familiar Power BI-like interaction, its current functionality is extremely limited.
Power BI remains the world’s leading BI tool, but using powerbiclient inside a notebook gives only a lightweight version (useful perhaps for spinning up a quick semantic model), not the full Power BI experience.
Interesting to see how this evolves over time…
from powerbiclient import QuickVisualize, get_dataset_config
# Prepare the same data for Power BI as used for Plotly
df_pbi = df[["Timestamp"] + ycols]
# Render the quick report in the Fabric notebook cell
qv = QuickVisualize(get_dataset_config(df_pbi))
qv.set_size(500, 1400)
qv
(The Notebook is attached for you to import and run)