Introduction
Plotly Express is a high-level Python visualization library. In fact, it’s a wrapper for Plotly.py that exposes a simple syntax for complex charts.
Inspired by Seaborn and ggplot2, it was specifically designed to have a terse, consistent and easy-to-learn API.
With a single import, you can make richly interactive plots in a single function call, including:
- faceting
- maps
- animations
- trendlines
It comes with built-in datasets, color scales and themes. Some of the datasets are:
- tips
- iris
- election
- gapminder
- medals_long
- medals_wide
- stocks
- wind
Installing Plotly
Plotly Express is a regular part of the Plotly Python package, so the easiest approach is to install the whole package.
# pip
pip install plotly
# anaconda
conda install -c anaconda plotly
Plotly Express requires pandas to be installed.
[In]: import plotly.express as px
[Out]: ImportError: Plotly express requires pandas to be installed.
If you don't have pandas installed, the import fails with the error above, so install pandas before trying to import Plotly Express.
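As a minimal sketch of checking for this up front, the snippet below looks for both packages before importing anything. The helper name `missing_packages` is hypothetical and only for illustration; it relies on the standard-library `importlib.util.find_spec`.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical pre-flight check before `import plotly.express as px`
missing = missing_packages(["pandas", "plotly"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("pandas and plotly are available")
```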
Get Started
Import libraries
import numpy as np
import pandas as pd
import plotly.express as px
Import datasets
tips = px.data.tips()
gapminder = px.data.gapminder()
iris = px.data.iris()
Create Plots
Bar plot
px.histogram(tips, x='smoker')
Histogram
px.histogram(tips, x='total_bill')
Customize Plots
Before we move to introducing a whole range of Plotly chart types, let’s explore basic techniques on how to update the axes, legend, titles, and labels.
For basic styling, we can pass additional parameters to the function that creates the plot.
This way, we can change or add the following things in the figure:
- Figure main title
- Width and height
- Labels
- Category orders
- Spacing
- Color scheme
- Template
First, let's see a simple figure with the default style.
df = px.data.tips()
fig = px.histogram(df, x="day", y="total_bill", color="sex")
fig.show()
Now, using the parameters of the px.figure_type() function (where figure_type stands for the chart function, such as px.histogram), we can customize the following aspects of the figure:
- Figure title
- Width and height
- Labels/variable name (manually specify axes labels)
- Order of the groups of categorical variables
- Color map
- Template
import plotly.express as px
df = px.data.tips()
fig = px.histogram(
    df, x="day", y="total_bill", color="sex",
    title="Receipts by Payer Gender and Day of Week",
    width=600, height=400,
    # replaces default labels by column name
    labels={"sex": "Payer Gender", "day": "Day of Week", "total_bill": "Receipts"},
    # replaces default order by column name
    category_orders={"day": ["Thur", "Fri", "Sat", "Sun"], "sex": ["Male", "Female"]},
    # replaces default color mapping by value
    color_discrete_map={"Male": "RebeccaPurple", "Female": "MediumPurple"},
    template="simple_white",
)
fig.show()
In addition, a Plotly Express figure can be styled using the following methods as well.
- update_layout
- update_xaxes
- update_yaxes
These three methods allow us to update most of the parameters of the figure.
fig.update_layout(xaxis_title="X-axis title")
# Another method with update_layout
fig.update_layout(xaxis = {"title": "X-axis title"})
# update_xaxes
fig.update_xaxes(title_text="X-axis title")
# update y axis title
fig.update_yaxes(title_text="Y-axis title")
Global and Local Font Specification
You can set the figure-wide font with the layout.font attribute, which applies to all titles and tick labels. It can be overridden for specific plot items such as individual axes and legend titles. In the following figure, we set the figure-wide font to Courier New in blue, and then override this for certain parts of the figure.
df = px.data.iris()
fig = px.scatter(df, x="sepal_length", y="sepal_width", color="species",
                 title="Playing with Fonts")
fig.update_layout(
    font_family="Courier New",
    font_color="blue",
    title_font_family="Times New Roman",
    title_font_color="red",
    legend_title_font_color="green"
)
fig.update_xaxes(title_font_family="Arial")
fig.show()
Options to customize title
Change Axes Labels
To update the labels or variable names, we can use the labels parameter of the main graphing function.
With labels, we map each column name whose label we want to change to its new label text. See the example below.
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="sex",
                 labels=dict(total_bill="Total Bill ($)", tip="Tip ($)", sex="Payer Gender"))
fig.show()
Legend Options
The legend can be easily customized by passing the legend parameter to the update_layout() function.
Example:
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="sex", title='Total bill vs Tip')
fig.update_layout(title_x=0.5)
fig.update_layout(legend=dict(
    title_font_color="red",
    title=dict(font_family="Times New Roman", font=dict(size=20))
))
fig.show()
The legend dict accepts many keys with appropriate values, such as title, font, orientation, x, and y.
Color Map
Different color mapping schemes apply to categorical and continuous data:
- Categorical: discrete color map
- Continuous: continuous color map
Default color map for categorical data
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="smoker",
                 title="String 'smoker' values mean discrete colors")
fig.show()
Examples of discrete color map
Default color map for continuous data
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="size",
                 title="Numeric 'size' values mean continuous color")
fig.show()
Examples of continuous color map
More examples
# df is assumed to be a DataFrame with "date" and "total_cases" columns; it is not defined in this article
fig = px.line(df, x="date", y="total_cases")
fig.update_layout(
    title="Number of Cumulative Cases of COVID-19 in India from January 1, 2021 to September 24, 2021",
    title_x=0.5,
    xaxis_title="",
    font_size=15,
    template="seaborn",
    yaxis=dict(ticks="outside", tickcolor='white', ticklen=10,
               title="Number of cumulative cases (in millions)"),
    xaxis=dict(ticks="outside", tickcolor='white', ticklen=10)
)
fig.show()
# df is assumed to be a DataFrame with 'Shares', 'Comments', 'Positive' and 'Negative' columns; it is not defined in this article
fig = px.line(df, y=['Shares', 'Comments', 'Positive', 'Negative'], width=1200, height=600)
fig.update_layout(
    title="Weekly engagement trends from January 1, 2021 to September 24, 2021",
    margin=dict(t=120),
    title_x=0.5,
    yaxis=dict(ticks="outside", tickcolor='white', ticklen=10,
               title="Average engagement per post"),
    xaxis=dict(ticks="outside", tickcolor='white', ticklen=10, title=""),
    font_size=14,
    legend=dict(
        orientation="h",
        yanchor="bottom", y=1.02,
        xanchor="right", x=1,
        title="",
    )
)
fig.update_yaxes(type="log", showgrid=True, dtick=1)
fig.show()
Adjust figure margin
Types of plots
Plots can be categorized based on the following:
- Number of variables
- Purpose of the plot
Plots based on number of variables
- Univariate plots
- Bivariate plots
- Multivariate plots
Plots based on the purpose of plots
- Distribution
- Ranking
- Part-to-whole
- Correlation
- Deviation
- Magnitude
- Change over time
- Flow/movement
- Spatial
Distribution Plots
The function of distribution charts is to show how data is spread across a group. This helps you spot outliers and commonalities, as well as see the shape of your data. For example, public policy officials might want to see the demographic or income characteristics of a certain population.
Histogram
df = px.data.tips()
fig = px.histogram(df, x="total_bill")
fig.show()
Change number of bins
df = px.data.tips()
fig = px.histogram(df, x="total_bill", nbins=20)
fig.show()
Histogram with date data
df = px.data.stocks()
fig = px.histogram(df, x="date")
fig.update_layout(bargap=0.2)
fig.show()
Several histograms for the different values of one column
df = px.data.tips()
fig = px.histogram(df, x="total_bill", color="sex")
fig.show()
Histogram with marginal plot
df = px.data.tips()
fig = px.histogram(df, x="total_bill", y="tip", color="sex", marginal="rug",
                   hover_data=df.columns)
fig.show()
Box Plots
df = px.data.tips()
fig = px.box(df, y="total_bill")
fig.show()
If a column name is given as the x argument, a box plot is drawn for each value of x.
df = px.data.tips()
fig = px.box(df, x="time", y="total_bill")
fig.show()
Box plots with points alongside
df = px.data.tips()
fig = px.box(df, x="time", y="total_bill", points="all")
fig.show()
df = px.data.tips()
fig = px.box(df, x="time", y="total_bill", color="smoker", notched=True,
             title="Box plot of total bill", hover_data=["day"])
fig.show()
Violin Plots
df = px.data.tips()
fig = px.violin(df, y="total_bill")
fig.show()
Violin plot with box plot and points
df = px.data.tips()
fig = px.violin(df, y="total_bill",
                box=True,      # draw box plot inside the violin
                points='all',  # can be 'outliers', or False
                )
fig.show()
Multiple violin plots
df = px.data.tips()
fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all",
                hover_data=df.columns)
fig.show()
Violin Plots with overlay
df = px.data.tips()
fig = px.violin(df, y="tip", color="sex",
                violinmode='overlay',  # draw violins on top of each other
                # default violinmode is 'group' as in example above
                hover_data=df.columns)
fig.show()
Strip Plot
df = px.data.tips()
fig = px.strip(df, x='day', y='tip')
fig.show()
Part-to-whole Plots
The function of this type of plot is to show the relative frequency of each group in an easy-to-understand format.
Pie Chart
We can use px.pie() to create pie charts.
In px.pie, data visualized by the sectors of the pie is set in values. The sector labels are set in names.
df = px.data.gapminder().query("year == 2007").query("continent == 'Europe'")
df.loc[df['pop'] < 2.e6, 'country'] = 'Other countries'  # Represent only large countries
fig = px.pie(df, values='pop', names='country', title='Population of European continent')
fig.show()
Pie chart with repeated labels
# This dataframe has 244 lines, but 4 distinct values for `day`
df = px.data.tips()
fig = px.pie(df, values='tip', names='day')
fig.show()
Styling Pie Charts
Setting the color of pie sectors with px.pie
df = px.data.tips()
fig = px.pie(df, values='tip', names='day',
             color_discrete_sequence=px.colors.sequential.RdBu)
fig.show()
Using an explicit mapping for discrete colors
df = px.data.tips()
fig = px.pie(df, values='tip', names='day', color='day',
             color_discrete_map={'Thur': 'lightcyan', 'Fri': 'cyan',
                                 'Sat': 'royalblue', 'Sun': 'darkblue'})
fig.show()
Text orientation inside pie sectors
Correlation Plots
Correlation plots, also known as correlograms when they cover more than two variables, help us visualize the correlation between continuous variables.
Scatter Plots
# x and y given as DataFrame columns
import plotly.express as px
df = px.data.iris()  # iris is a pandas DataFrame
fig = px.scatter(df, x="sepal_width", y="sepal_length")
fig.show()
Scatter Plot with Color Scheme
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color='petal_length')
fig.show()
Scatter Plot with Different Symbols
The symbol argument can be mapped to a column as well. A wide variety of symbols are available.
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species", symbol="species")
fig.show()
Scatter Plot With Faceting
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="smoker",
                 facet_col="sex", facet_row="time")
fig.show()
Scatter plot with trend line
# trendline="ols" requires the statsmodels package to be installed
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", trendline="ols")
fig.show()
Displaying a single trendline with multiple traces
df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", symbol="smoker", color="sex",
                 trendline="ols", trendline_scope="overall")
fig.show()
Trend line options
trendline_color_override="black"
trendline_scope="overall"
trendline="ols"
trendline_options=dict(log_x=True)
Trend line in log
df = px.data.gapminder(year=2007)
fig = px.scatter(df, x="gdpPercap", y="lifeExp",
                 trendline="ols", trendline_options=dict(log_x=True),
                 title="Log-transformed fit on linear axes")
fig.show()
Trend line and x axis in log
df = px.data.gapminder(year=2007)
fig = px.scatter(df, x="gdpPercap", y="lifeExp", log_x=True,
                 trendline="ols", trendline_options=dict(log_x=True),
                 title="Log-scaled X axis and log-transformed fit")
fig.show()
Dot Plot
Scatter plots where one axis is categorical are often known as dot plots.
df = px.data.medals_long()
fig = px.scatter(df, y="nation", x="count", color="medal", symbol="medal")
fig.update_traces(marker_size=10)
fig.show()
Bubble Plot
Scatter plots with variable-sized circular markers are often known as bubble charts.
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()