Introduction

Plotly Express is a high-level Python visualization library. In fact, it’s a wrapper for Plotly.py that exposes a simple syntax for complex charts.

Inspired by Seaborn and ggplot2, it was specifically designed to have a terse, consistent, and easy-to-learn API.

With a single import, you can make richly interactive plots in a single function call, including

  • faceting
  • maps
  • animations
  • trendlines

It ships with built-in datasets, color scales, and themes. The datasets include

  • tips
  • iris
  • election
  • gapminder
  • medals_long
  • medals_wide
  • stocks
  • wind

Installing Plotly

Plotly Express is a regular part of the plotly Python package, so the easiest way to get it is to install the whole package.

                            # pip 
                            pip install plotly

                            # anaconda
                            conda install -c anaconda plotly
                        

Plotly Express requires pandas to be installed

                            [In]: import plotly.express as px
                            [Out]: ImportError: Plotly express requires pandas to be installed.
                        

If you don't have pandas installed, the import fails with the error above. So, install pandas (for example, with pip install pandas) before trying to import Plotly Express.
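
If you prefer to check for the packages before importing, something like the following sketch can help. It uses only the standard library's importlib; the variable names and messages are our own.

```python
import importlib.util

# find_spec returns None when a top-level package cannot be found,
# and it does not import the package itself.
has_pandas = importlib.util.find_spec("pandas") is not None
has_plotly = importlib.util.find_spec("plotly") is not None

if has_pandas and has_plotly:
    import plotly.express as px
    print("Plotly Express is ready to use")
else:
    print("Missing package(s); try: pip install plotly pandas")
```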

Get Started

Import libraries

                            import numpy as np
                            import pandas as pd
                            import plotly.express as px
                        

Import datasets

                            tips = px.data.tips()
                            gapminder = px.data.gapminder()
                            iris = px.data.iris()
                        

Create Plots

Bar plot

                            px.histogram(tips, x='smoker')
                        

Histogram

                            px.histogram(tips, x='total_bill')
                        

Customize Plots

Before we move to introducing a whole range of Plotly chart types, let’s explore basic techniques on how to update the axes, legend, titles, and labels.

For basic styling, we can pass additional parameters to the function used to create the plot.

This way, we can set or change the following aspects of the figure

  • Figure main title
  • Width and height
  • Labels
  • Category orders
  • Spacing
  • Color scheme
  • Template

First, let's see a simple figure with the default style

                            df = px.data.tips()
                            fig = px.histogram(df, x="day", y="total_bill", color="sex")
                            fig.show()
                        

Now, using parameters of the plotting function (here px.histogram()), we can customize the following aspects of the figure

  • Figure title
  • Width and height
  • Labels/variable name (manually specify axes labels)
  • Order of the groups of categorical variables
  • Color map
  • Template

import plotly.express as px

df = px.data.tips()

fig = px.histogram(df, 
    x="day", 
    y="total_bill", 
    color="sex",
    title="Receipts by Payer Gender and Day of Week",
    width=600, height=400,
    
    # replaces default labels by column name
    labels={ 
        "sex": "Payer Gender",  "day": "Day of Week", "total_bill": "Receipts"
    },

    # replaces default order by column name
    category_orders = { 
        "day": ["Thur", "Fri", "Sat", "Sun"], "sex": ["Male", "Female"]
    },

    # replaces default color mapping by value
    color_discrete_map = { 
        "Male": "RebeccaPurple", "Female": "MediumPurple"
    },

    template="simple_white"
)

fig.show()
                        

In addition, a Plotly Express figure can be styled after it is created, using the following methods.

  • update_layout
  • update_xaxes
  • update_yaxes

These three methods allow us to update most of the parameters of the figure.

                            fig.update_layout(xaxis_title="X-axis title")

                            # Another method with update_layout
                            fig.update_layout(xaxis = {"title": "X-axis title"})

                            # update_xaxes
                            fig.update_xaxes(title_text="X-axis title")

                            # update y axis title
                            fig.update_yaxes(title_text="Y-axis title")

                        

Global and Local Font Specification

You can set the figure-wide font with the layout.font attribute, which will apply to all titles and tick labels, but this can be overridden for specific plot items like individual axes and legend titles etc. In the following figure, we set the figure-wide font to Courier New in blue, and then override this for certain parts of the figure.

df = px.data.iris()
fig = px.scatter(df, x="sepal_length", y="sepal_width", color="species",
                title="Playing with Fonts")
fig.update_layout(
    font_family="Courier New",
    font_color="blue",
    title_font_family="Times New Roman",
    title_font_color="red",
    legend_title_font_color="green"
)
fig.update_xaxes(title_font_family="Arial")
fig.show()
                            

Options to customize title

  • title
    • font
      • color
      • family
      • size
    • pad
      • b
      • l
      • r
      • t
    • text
    • x. Sets the x position with respect to `xref` in normalized coordinates from "0" (left) to "1" (right). Accepted values: number between or equal to 0 and 1. Default: 0.5
    • xanchor. Sets the title's horizontal alignment with respect to its x position. "left" means that the title starts at x, "right" means that the title ends at x and "center" means that the title's center is at x. "auto" divides `xref` by three and calculates the `xanchor` value automatically based on the value of `x`. Accepted values: one of ( "auto" | "left" | "center" | "right" ). Default: "auto"
    • xref. Sets the container `x` refers to. "container" spans the entire `width` of the plot. "paper" refers to the width of the plotting area only. Type: enumerated , one of ( "container" | "paper" ). Default: "container"
    • y. Sets the y position with respect to `yref` in normalized coordinates from "0" (bottom) to "1" (top). "auto" places the baseline of the title onto the vertical center of the top margin. Accepted values: number between or equal to 0 and 1. Default: 'auto'
    • yanchor. Sets the title's vertical alignment with respect to its y position. "top" means that the title's cap line is at y, "bottom" means that the title's baseline is at y and "middle" means that the title's midline is at y. "auto" divides `yref` by three and calculates the `yanchor` value automatically based on the value of `y`. Accepted values: one of ( "auto" | "top" | "middle" | "bottom" ). Default: 'auto'
    • yref. Sets the container `y` refers to. "container" spans the entire `height` of the plot. "paper" refers to the height of the plotting area only. Accepted values: one of ( "container" | "paper" ). Default: "container"

Change Axes Labels

To update the axis labels and variable names, we can use the labels parameter of the main plotting function.

The labels parameter takes a dict that maps column names to the label text we want displayed for them. See the example below

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", color="sex",
                                    labels=dict(total_bill="Total Bill ($)", tip="Tip ($)", sex="Payer Gender")
                                )
                                fig.show()
                            

Legend Options

The legend can be easily customized by passing the legend parameter to the update_layout() method.

Example:

df = px.data.tips()
fig = px.scatter(df, x="total_bill", y="tip", color="sex", title='Total bill vs Tip')
fig.update_layout(title_x=0.5)

fig.update_layout(legend=dict(
    title_font_color="red", 
    title=dict(font_family="Times New Roman",
              font=dict(size=20))
))

fig.show()
                            

The legend dict can contain the following keys with appropriate values

  • font
    • color. Accepted value: color (name or hex code)
    • family. Accepted value: name of font family. Eg: 'Arial', 'Times New Roman'
    • size. Accepted values: number greater than or equal to 1
  • bgcolor
  • bordercolor. Sets the color of the border enclosing the legend.
  • borderwidth: Default is 0. Accepted values: number greater than or equal to zero
  • grouptitlefont
    • color. Accepted value: color name or hex code.
    • family. Accepted value: name of font family. Eg: 'Arial', 'Times New Roman'
    • size. Accepted values: number greater than or equal to 1
  • title
    • font
      • color.
      • family
      • size
    • side. Determines the location of the legend's title with respect to the legend items. Defaults to "top" when `orientation` is "h" and to "left" when `orientation` is "v". The "top left" option can be used to expand the legend area in both the x and y directions. Accepted values: "top" | "left" | "top left"
    • text. Sets the title of the legend. Value: string
  • x. Sets the x position (in normalized coordinates) of the legend. Defaults to "1.02" for vertical legends and defaults to "0" for horizontal legends. Accepted value: number between or equal to -2 and 3
  • xanchor. Sets the legend's horizontal position anchor. Accepted values: one of ( "auto" | "left" | "center" | "right" )
  • y. Sets the y position (in normalized coordinates) of the legend. Defaults to "1" for vertical legends, defaults to "-0.1" for horizontal legends on graphs w/o range sliders and defaults to "1.1" for horizontal legends on graph with one or multiple range sliders. Accepted values: number between or equal to -2 and 3
  • yanchor. Sets the legend's vertical position anchor. This anchor binds the `y` position to the "top", "middle" or "bottom" of the legend. Value "auto" anchors legends at their bottom for `y` values less than or equal to 1/3, anchors legends at their top for `y` values greater than or equal to 2/3 and anchors legends with respect to their middle otherwise. Accepted values: one of ( "auto" | "top" | "middle" | "bottom" )
  • itemwidth=30. Sets the width (in px) of the legend item symbols (the part other than the title.text). Accepted values: number greater than or equal to 30.
  • orientation="v". Accepted values: 'v' or 'h'

Color Map

Plotly Express uses different color mapping schemes for categorical and continuous data

  • Categorical: discrete color map
  • Continuous: continuous color map

Default color map for categorical data

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", color="smoker",
                                                title="String 'smoker' values mean discrete colors")

                                fig.show()
                            

Examples of discrete color maps

Default color map for continuous data

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", color="size",
                                    title="Numeric 'size' values mean continuous color")
                                fig.show()
                            

Examples of continuous color maps

More examples

# df is assumed here to be a DataFrame of COVID-19 data with "date" and "total_cases" columns
fig = px.line(df, x="date", y="total_cases")
fig.update_layout(title="Number of Cumulative Cases of COVID-19 in India from January 1, 2021 to September 24, 2021", 
    title_x=0.5,
    xaxis_title="", 
    font_size=15,
    template="seaborn",
    yaxis = dict(ticks = "outside", tickcolor='white', ticklen=10, title="Number of cumulative cases (in millions)"),
    xaxis = dict(ticks = "outside", tickcolor='white', ticklen=10)
    )
fig.show()
                            
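# df is assumed here to be a DataFrame with 'Shares', 'Comments', 'Positive' and 'Negative' columns;
# with no x given, px.line plots each column against the DataFrame index
fig = px.line(df, y=['Shares', 'Comments', 'Positive', 'Negative'], width=1200, height=600)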
fig.update_layout(
    title="Weekly engagement trends from January 1, 2021 to September 24, 2021",
    margin=dict(
        t=120
    ),
    title_x=0.5,
    yaxis = dict(ticks = "outside", tickcolor='white', ticklen=10, title="Average engagement per post"),
    xaxis = dict(ticks = "outside", tickcolor='white', ticklen=10, title=""),
    font_size=14,
    legend=dict(
        orientation="h",
        yanchor="bottom",
        y=1.02,
        xanchor="right",
        x=1,
        title="",
    )
)
fig.update_yaxes(type="log", showgrid=True, dtick=1)
fig.show()
                            

Adjust figure margin

  • margin
    • autoexpand. Turns on/off margin expansion computations. Legends, colorbars, updatemenus, sliders, axis rangeselector and rangeslider are allowed to push the margins by default. Type: Boolean. Default: True
    • b. Sets the bottom margin (in px). Default: 80
    • l. Sets the left margin (in px). Default: 80
    • r. Sets the right margin (in px). Default: 80
    • t. Sets the top margin (in px). Default: 100
    • pad. Sets the amount of padding (in px) between the plotting area and the axis lines. Accepted values: number greater than or equal to 0. Default: 0

Types of plots

Plots can be categorized based on the following

  • Number of variables
  • Purpose of the plot

Plots based on number of variables

  • Univariate plots
  • Bivariate plots
  • Multivariate plots

Plots based on the purpose of plots

  • Distribution
  • Ranking
  • Part-to-whole
  • Correlation
  • Deviation
  • Magnitude
  • Change over time
  • Flow/movement
  • Spatial

Distribution Plots

The function of distribution charts is to show how data is spread across a group. This helps you spot outliers and commonalities, as well as see the shape of your data. For example, public policy officials might want to see the demographic or income characteristics of a certain population.

Bar Plots

We can use either px.histogram() or px.bar() to create bar plots. However, px.bar() has some specific parameters that can be used to customize a bar plot.

Bar Plot with Tidy Data

                                px.histogram(tips, x='smoker')
                            

Bar Plot with Long Form Data

Dataset - Long Form Data


                                long_df = px.data.medals_long()

                                fig = px.bar(long_df, x="nation", y="count", color="medal", title="Long-Form Input")

                                fig.show()

                            

Bar Plot with Wide Form Data

Dataset - Wide Form Data

                                wide_df = px.data.medals_wide()
                                fig = px.bar(wide_df, x="nation", y=["gold", "silver", "bronze"], title="Wide-Form Input")
                                fig.show()
                            

Grouped Bar Plot

                                df = px.data.tips()
                                fig = px.histogram(df, x="sex", y="total_bill",
                                            color='smoker', barmode='group',
                                            height=400)
                                fig.show()
                            

Histogram aggregation functions

px.histogram() will aggregate y values by summing them by default, but the histfunc argument can be used to set this to avg to create what is sometimes called a "barplot" which summarizes the central tendency of a dataset, rather than visually representing the totality of the dataset.

Histogram functions

  • sum
  • count
  • avg
                                df = px.data.tips()
                                fig = px.histogram(df, x="sex", y="total_bill",
                                            color='smoker', barmode='group',
                                            histfunc='avg',
                                            height=400)
                                fig.show()
                            

Bar plot with text

                                import plotly.express as px
                                df = px.data.medals_long()

                                fig = px.bar(df, x="medal", y="count", color="nation", text="nation")
                                fig.show()
                            

Format text for consistency

                                df = px.data.gapminder().query("continent == 'Europe' and year == 2007 and pop > 2.e6")
                                fig = px.bar(df, y='pop', x='country', text_auto='.2s',
                                            title="Controlled text sizes, positions and angles")
                                fig.update_traces(textfont_size=12, textangle=0, textposition="outside", cliponaxis=False)
                                fig.show()
                            

Faceted Bar Plots

                                df = px.data.tips()
                                fig = px.bar(df, x="sex", y="total_bill", color="smoker", barmode="group",
                                            facet_row="time", facet_col="day",
                                            category_orders={"day": ["Thur", "Fri", "Sat", "Sun"],
                                                            "time": ["Lunch", "Dinner"]})
                                fig.show()
                            

Histogram

                                df = px.data.tips()
                                fig = px.histogram(df, x="total_bill")
                                fig.show()
                            

Change number of bins

                                df = px.data.tips()
                                fig = px.histogram(df, x="total_bill", nbins=20)
                                fig.show()
                            

Histogram with date data

                                df = px.data.stocks()
                                fig = px.histogram(df, x="date")
                                fig.update_layout(bargap=0.2)
                                fig.show()
                            

Several histograms for the different values of one column

                                df = px.data.tips()
                                fig = px.histogram(df, x="total_bill", color="sex")
                                fig.show()
                            

Histogram with marginal plot

                                df = px.data.tips()
                                fig = px.histogram(df, x="total_bill", y="tip", color="sex", marginal="rug", hover_data=df.columns)
                                fig.show()
                            

Box Plots

                                df = px.data.tips()
                                fig = px.box(df, y="total_bill")
                                fig.show()
                            

If a column name is given as the x argument, a box plot is drawn for each value of x.

                                df = px.data.tips()
                                fig = px.box(df, x="time", y="total_bill")
                                fig.show()
                            

Box plots with points alongside

                                df = px.data.tips()
                                fig = px.box(df, x="time", y="total_bill", points="all")
                                fig.show()
                            

                                df = px.data.tips()
                                fig = px.box(df, x="time", y="total_bill", color="smoker",
                                notched=True, 
                                title="Box plot of total bill",
                                hover_data=["day"] 
                                )
                                fig.show()

Violin Plots

                            
                                df = px.data.tips()
                                fig = px.violin(df, y="total_bill")
                                fig.show()
                            

Violin plot with box plot and points

                                df = px.data.tips()
                                fig = px.violin(df, y="total_bill", box=True, # draw box plot inside the violin
                                                points='all', # can be 'outliers', or False
                                            )
                                fig.show()
                            

Multiple violin plots

                                df = px.data.tips()
                                fig = px.violin(df, y="tip", x="smoker", color="sex", box=True, points="all",
                                        hover_data=df.columns)
                                fig.show()
                            

Violin Plots with overlay

                                df = px.data.tips()
                                fig = px.violin(df, y="tip", color="sex",
                                                violinmode='overlay', # draw violins on top of each other
                                                # default violinmode is 'group' as in example above
                                                hover_data=df.columns)
                                fig.show()
                            

Strip Plot

                                df = px.data.tips()
                                fig = px.strip(df, x='day', y='tip')
                                fig.show()
                            

Part-to-whole Plots

The function of this type of plot is to show the relative frequency of each group in an easy-to-understand format.

Pie Chart

We can use px.pie() to create pie charts.

In px.pie, data visualized by the sectors of the pie is set in values. The sector labels are set in names.

                                df = px.data.gapminder().query("year == 2007").query("continent == 'Europe'")
                                df.loc[df['pop'] < 2.e6, 'country'] = 'Other countries' # Represent only large countries
                                fig = px.pie(df, values='pop', names='country', title='Population of European continent')
                                fig.show()
                            

Pie chart with repeated labels

                                # This dataframe has 244 rows, but only 4 distinct values for `day`,
                                # so px.pie groups the rows by day and sums their tips
                                df = px.data.tips()
                                fig = px.pie(df, values='tip', names='day')
                                fig.show()
                            

Styling Pie Charts

Setting the color of pie sectors with px.pie

                                df = px.data.tips()
                                fig = px.pie(df, values='tip', names='day', color_discrete_sequence=px.colors.sequential.RdBu)
                                fig.show()
                            

Using an explicit mapping for discrete colors

df = px.data.tips()
fig = px.pie(df, values='tip', names='day', color='day',
             color_discrete_map={'Thur':'lightcyan',
                                 'Fri':'cyan',
                                 'Sat':'royalblue',
                                 'Sun':'darkblue'})
fig.show()
                            

Text orientation inside pie sectors

Correlation Plots

Correlation plots, also known as correlograms when they cover more than two variables, help us visualize the correlation between continuous variables.

Scatter Plots

                                # x and y given as DataFrame columns
                                import plotly.express as px
                                df = px.data.iris() # iris is a pandas DataFrame
                                fig = px.scatter(df, x="sepal_width", y="sepal_length")
                                fig.show()
                            

Scatter Plot with Color Scheme

                                df = px.data.iris()
                                fig = px.scatter(df, x="sepal_width", y="sepal_length", color='petal_length')
                                fig.show()
                            

Scatter Plot with Different Symbols

The symbol argument can be mapped to a column as well. A wide variety of symbols are available.

                                df = px.data.iris()
                                fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species", symbol="species")
                                fig.show()
                            

Scatter Plot With Faceting

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", color="smoker", facet_col="sex", facet_row="time")
                                fig.show()
                            

Scatter plot with trend line

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", trendline="ols")
                                fig.show()
                            

Displaying a single trendline with multiple traces

                                df = px.data.tips()
                                fig = px.scatter(df, x="total_bill", y="tip", symbol="smoker", color="sex", trendline="ols", trendline_scope="overall")
                                fig.show()
                            

Trend line options

  • trendline_color_override="black"
  • trendline_scope="overall"
  • trendline="ols"
  • trendline_options=dict(log_x=True)

Trend line in log

df = px.data.gapminder(year=2007)
fig = px.scatter(df, x="gdpPercap", y="lifeExp", 
                    trendline="ols", trendline_options=dict(log_x=True),
                    title="Log-transformed fit on linear axes")
fig.show()
                            

Trend line and x axis in log

df = px.data.gapminder(year=2007)
fig = px.scatter(df, x="gdpPercap", y="lifeExp", log_x=True, 
                trendline="ols", trendline_options=dict(log_x=True),
                title="Log-scaled X axis and log-transformed fit")
fig.show()
                            

Dot Plot

Scatter plots where one axis is categorical are often known as dot plots.

                                df = px.data.medals_long()
                                fig = px.scatter(df, y="nation", x="count", color="medal", symbol="medal")
                                fig.update_traces(marker_size=10)
                                fig.show()
                            

Bubble Plot

Scatter plots with variable-sized circular markers are often known as bubble charts.

df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species",
                 size='petal_length', hover_data=['petal_width'])
fig.show()