01
Introduction
Good data visualization needs more than attractive charts; it also needs an interface that lets people explore them. Streamlit, an open-source Python framework built for data science work, turns analysis scripts into interactive web applications with very little extra code.
It is simple to use yet capable, which makes it well suited to data-driven applications.
This article builds a complete data visualization web application step by step, starting with basic page components and ending with interactive analysis features.
02
Environment Setup and Basic Page
First, let’s create a basic application framework:
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
# Configure the page
st.set_page_config(
    page_title="Data Analysis Dashboard",
    page_icon="📊",
    layout="wide",
    initial_sidebar_state="expanded"
)
# Create sample data
@st.cache_data
def load_data():
    dates = pd.date_range('2024-01-01', periods=100)
    df = pd.DataFrame({
        'Date': dates,
        'Sales': np.random.normal(1000, 200, 100).cumsum(),
        'Profit': np.random.normal(200, 50, 100).cumsum(),
        'Region': np.random.choice(['East China', 'North China', 'South China', 'West'], 100),
        'Product Category': np.random.choice(['Electronics', 'Clothing', 'Food', 'Home'], 100)
    })
    return df
# Load data
df = load_data()
# Create page title
st.title('📊 Data Analysis Dashboard')
st.markdown('### Sales Data Visualization Analysis')
This code accomplishes:
- Importing necessary libraries
- Configuring basic page settings
- Creating data loading function
- Setting cache decorator
- Adding page title
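Because load_data draws from NumPy's global random generator, the sample values change whenever the cache is cleared or the server restarts. The following is a minimal sketch of a reproducible variant, assuming a fixed seed is acceptable for demo data. The function name make_sample_data and its seed parameter are illustrative and do not appear in the original code.

```python
import numpy as np
import pandas as pd


def make_sample_data(n_rows: int = 100, seed: int = 42) -> pd.DataFrame:
    """Build the demo sales table from a seeded generator (hypothetical helper)."""
    rng = np.random.default_rng(seed)  # same seed -> same table on every call
    return pd.DataFrame({
        'Date': pd.date_range('2024-01-01', periods=n_rows),
        'Sales': rng.normal(1000, 200, n_rows).cumsum(),
        'Profit': rng.normal(200, 50, n_rows).cumsum(),
        'Region': rng.choice(['East China', 'North China', 'South China', 'West'], n_rows),
        'Product Category': rng.choice(['Electronics', 'Clothing', 'Food', 'Home'], n_rows),
    })


sample = make_sample_data()
print(sample.shape)
```

Inside the app, a function like this could carry the same @st.cache_data decorator as load_data, so the table is still built only once per session cache.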
03
Creating Sidebar Filters
Let’s add interactive filters:
def create_filters():
    st.sidebar.header('Data Filtering')
    # Date range selector
    date_range = st.sidebar.date_input(
        "Select Date Range",
        value=(df['Date'].min(), df['Date'].max()),
        min_value=df['Date'].min(),
        max_value=df['Date'].max()
    )
    # Region multi-select
    regions = st.sidebar.multiselect(
        "Select Region",
        options=df['Region'].unique(),
        default=df['Region'].unique()
    )
    # Product category multi-select
    categories = st.sidebar.multiselect(
        "Select Product Category",
        options=df['Product Category'].unique(),
        default=df['Product Category'].unique()
    )
    # The range picker returns a single date until the end date is chosen,
    # so pause the script here until both ends are available
    if len(date_range) != 2:
        st.sidebar.info('Select both a start and an end date.')
        st.stop()
    # Filter data
    filtered_df = df[
        (df['Date'].dt.date >= date_range[0]) &
        (df['Date'].dt.date <= date_range[1]) &
        (df['Region'].isin(regions)) &
        (df['Product Category'].isin(categories))
    ]
    return filtered_df
# Get filtered data
filtered_df = create_filters()
This code implements:
- Creating sidebar filters
- Adding date range selector
- Adding multi-select filters
- Implementing data filtering logic
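The filtering logic is easier to verify when it lives in a function that does not touch any widgets. The sketch below assumes the same column names as the sample data; the helper name filter_sales is hypothetical. It also tolerates the case where a range date picker hands back fewer than two dates while the user is still selecting.

```python
from datetime import date

import pandas as pd


def filter_sales(df, date_range, regions, categories):
    """Return rows matching the selections; widget-free so it can be unit-tested."""
    mask = df['Region'].isin(regions) & df['Product Category'].isin(categories)
    # Apply the date filter only when both ends of the range are present,
    # rather than indexing past the end of a partial selection.
    if len(date_range) == 2:
        start, end = date_range
        days = df['Date'].dt.date
        mask &= (days >= start) & (days <= end)
    return df[mask]


demo = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03']),
    'Region': ['West', 'East China', 'West'],
    'Product Category': ['Food', 'Food', 'Home'],
    'Sales': [100.0, 150.0, 120.0],
})

picked = filter_sales(demo, (date(2024, 1, 1), date(2024, 1, 2)), ['West'], ['Food', 'Home'])
print(len(picked))
```

In the app, the sidebar widgets would supply date_range, regions, and categories, and the function would be called once per rerun.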
04
Creating Main Charts
Now let’s add core visualization components:
def create_charts(data):
    # Create two-column layout
    col1, col2 = st.columns(2)
    with col1:
        # Sales trend chart
        st.subheader('Sales Trend Analysis')
        fig_trend = px.line(
            data,
            x='Date',
            y=['Sales', 'Profit'],
            title='Sales and Profit Trend'
        )
        st.plotly_chart(fig_trend, use_container_width=True)
        # Regional distribution pie chart
        st.subheader('Regional Distribution')
        region_data = data.groupby('Region').agg({
            'Sales': 'sum'
        }).reset_index()
        fig_pie = px.pie(
            region_data,
            values='Sales',
            names='Region',
            title='Sales Proportion by Region'
        )
        st.plotly_chart(fig_pie, use_container_width=True)
    with col2:
        # Product category bar chart
        st.subheader('Product Category Analysis')
        category_data = data.groupby('Product Category').agg({
            'Sales': 'sum',
            'Profit': 'sum'
        }).reset_index()
        fig_bar = px.bar(
            category_data,
            x='Product Category',
            y=['Sales', 'Profit'],
            title='Comparison of Sales and Profit by Product Category',
            barmode='group'
        )
        st.plotly_chart(fig_bar, use_container_width=True)
        # Data table
        st.subheader('Detailed Data')
        st.dataframe(
            data.style.highlight_max(axis=0),
            height=300
        )
# Create charts
create_charts(filtered_df)
This code demonstrates:
- Creating dual-column layout
- Adding trend chart
- Creating pie and bar charts
- Displaying interactive data table
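The pie chart depends on a per-region total, and that aggregation can be checked on its own before it is handed to a chart. A small sketch, assuming the same Region and Sales columns; the helper name sales_share_by_region and the 'Share (%)' column are illustrative additions, not part of the original code.

```python
import pandas as pd


def sales_share_by_region(df: pd.DataFrame) -> pd.DataFrame:
    """Total sales per region plus each region's percentage of the overall total."""
    totals = df.groupby('Region', as_index=False)['Sales'].sum()
    totals['Share (%)'] = (totals['Sales'] / totals['Sales'].sum() * 100).round(1)
    return totals


demo = pd.DataFrame({
    'Region': ['West', 'East China', 'West', 'North China'],
    'Sales': [100.0, 300.0, 100.0, 500.0],
})

shares = sales_share_by_region(demo)
print(shares)
```

The percentages here are the slice sizes a pie chart of the same totals would display, which makes them a convenient sanity check for the figure.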
05
Adding Statistical Metrics
Let’s add some key metrics for display:
def create_metrics(data):
    # Create four-column layout
    m1, m2, m3, m4 = st.columns(4)
    # Calculate key metrics
    total_sales = data['Sales'].sum()
    total_profit = data['Profit'].sum()
    # Guard against a zero total when the filters exclude every row
    profit_rate = (total_profit / total_sales) * 100 if total_sales else 0.0
    avg_sales = data['Sales'].mean()
    # Display metrics
    with m1:
        st.metric(
            label="Total Sales",
            value=f"{total_sales:,.0f}",
            # Delta: mean row-to-row change in Sales
            delta=f"{data['Sales'].diff().mean():,.0f}"
        )
    with m2:
        st.metric(
            label="Total Profit",
            value=f"{total_profit:,.0f}",
            # Delta: mean row-to-row change in Profit
            delta=f"{data['Profit'].diff().mean():,.0f}"
        )
    with m3:
        st.metric(
            label="Profit Rate",
            value=f"{profit_rate:.1f}%",
            # Delta: difference from a fixed 20% baseline
            delta=f"{profit_rate - 20:.1f}%"
        )
    with m4:
        st.metric(
            label="Average Sales",
            value=f"{avg_sales:,.0f}"
        )
# Add metrics display
create_metrics(filtered_df)
This code implements:
- Creating four-column metrics layout
- Calculating key business metrics
- Adding trend change metrics
- Setting formatted display
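The metric calculations can also be separated from the display calls, which makes edge cases such as an empty selection easy to test. This is a sketch under the assumption that an empty or zero-sales selection should report zeros; the helper name summarize and the returned dictionary keys are hypothetical.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Compute the dashboard's headline numbers without any display code."""
    total_sales = float(df['Sales'].sum())
    total_profit = float(df['Profit'].sum())
    # Fall back to 0.0 instead of dividing by zero when nothing is selected
    profit_rate = (total_profit / total_sales) * 100 if total_sales else 0.0
    avg_sales = float(df['Sales'].mean()) if len(df) else 0.0
    return {
        'total_sales': total_sales,
        'total_profit': total_profit,
        'profit_rate': profit_rate,
        'avg_sales': avg_sales,
    }


demo = pd.DataFrame({'Sales': [100.0, 300.0], 'Profit': [40.0, 60.0]})
stats = summarize(demo)
empty_stats = summarize(demo.iloc[0:0])
print(stats)
```

Each value in the returned dictionary can then be formatted and passed to st.metric in its own column.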
06
Adding Advanced Interactive Features
Finally, let’s add some advanced features:
def create_advanced_features(data):
    # Add data download feature
    st.sidebar.markdown("---")
    st.sidebar.subheader("Data Export")
    # CSV download button
    csv = data.to_csv(index=False).encode('utf-8')
    st.sidebar.download_button(
        label="Download CSV File",
        data=csv,
        file_name="sales_data.csv",
        mime="text/csv"
    )
    # Add data analysis options
    st.sidebar.markdown("---")
    st.sidebar.subheader("Data Analysis")
    analysis_type = st.sidebar.selectbox(
        "Select Analysis Type",
        ["Trend Analysis", "Correlation Analysis", "Seasonal Analysis"]
    )
    # "Trend Analysis" renders nothing extra; the trend chart is already on the main page
    if analysis_type == "Correlation Analysis":
        # Create correlation heatmap
        corr = data[['Sales', 'Profit']].corr()
        fig_corr = go.Figure(data=go.Heatmap(
            z=corr.values,
            x=corr.index,
            y=corr.columns,
            colorscale='RdBu'
        ))
        st.plotly_chart(fig_corr, use_container_width=True)
    elif analysis_type == "Seasonal Analysis":
        # Monthly statistics, grouped by a derived key so that the
        # filtered DataFrame passed in is not modified
        months = data['Date'].dt.month.rename('Month')
        monthly_data = data.groupby(months)[['Sales', 'Profit']].mean()
        st.line_chart(monthly_data)
# Add advanced features
create_advanced_features(filtered_df)
This code demonstrates:
- Adding data export functionality
- Creating analysis type selector
- Implementing correlation analysis
- Adding seasonal analysis chart
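The seasonal view comes down to averaging by calendar month. Assigning a new column to a filtered slice of a DataFrame can trigger pandas' SettingWithCopyWarning, so the sketch below groups by a derived Series instead. The helper name monthly_means is hypothetical, and the column names are assumed to match the sample data.

```python
import pandas as pd


def monthly_means(df: pd.DataFrame) -> pd.DataFrame:
    """Average Sales and Profit per calendar month, leaving df unchanged."""
    # Group by a derived key instead of writing a 'Month' column into df
    months = df['Date'].dt.month.rename('Month')
    return df.groupby(months)[['Sales', 'Profit']].mean()


demo = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-15', '2024-02-01']),
    'Sales': [100.0, 200.0, 400.0],
    'Profit': [10.0, 30.0, 50.0],
})

by_month = monthly_means(demo)
print(by_month)
```

The result is indexed by month number, so it can be passed directly to st.line_chart.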
07
Conclusion
This article walked through building a data visualization web application with Streamlit: page setup, sidebar filters, charts, key metrics, and export and analysis options.
Together, these steps cover the path from basic components to interactive analysis in a data-driven application.
Looking ahead, data visualization applications are likely to develop in the following directions:
- More powerful real-time data processing
- Richer interactive experiences
- Smarter data analysis
Mastering these skills will help us create more valuable data analysis tools.