Stackable Position Adjustments

In [1]:
import pandas as pd

from lets_plot import *
from lets_plot.mapping import *
In [2]:
LetsPlot.setup_html()
In [3]:
df = pd.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv")
print(df.shape)
df.head()
(234, 12)
Out[3]:
   Unnamed: 0  manufacturer  model  displ  year  cyl       trans  drv  cty  hwy  fl    class
0           1          audi     a4    1.8  1999    4    auto(l5)    f   18   29   p  compact
1           2          audi     a4    1.8  1999    4  manual(m5)    f   21   29   p  compact
2           3          audi     a4    2.0  2008    4  manual(m6)    f   20   31   p  compact
3           4          audi     a4    2.0  2008    4    auto(av)    f   21   30   p  compact
4           5          audi     a4    2.8  1999    6    auto(l5)    f   16   26   p  compact

1. Default Presentation of 'stack' and 'fill'

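With position='stack', the default for geom_bar(), geom_histogram() and geom_area(), groups that share an x position are drawn on top of one another. With position='fill', each stack is additionally rescaled to a total height of 1, which makes proportions easier to compare.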
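The arithmetic behind the two adjustments can be sketched in plain Python. This is only an illustration, not the lets-plot implementation; the helper names stack() and fill() and the counts are made up for the example.

```python
def stack(heights):
    """Return (ymin, ymax) for each segment when stacked from zero."""
    segments, base = [], 0.0
    for h in heights:
        segments.append((base, base + h))
        base += h
    return segments


def fill(heights):
    """Like stack(), but rescaled so the whole stack spans 0..1."""
    total = sum(heights)
    return [(lo / total, hi / total) for lo, hi in stack(heights)]


# Hypothetical counts for two groups sharing one x position.
counts = [30, 10]
print(stack(counts))  # [(0.0, 30.0), (30.0, 40.0)]
print(fill(counts))   # [(0.0, 0.75), (0.75, 1.0)]
```

The second group keeps its share (a quarter of the stack) in both cases; 'fill' only changes the scale on which that share is drawn.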

In [4]:
ggplot(df, aes(x="drv", fill=as_discrete("year"))) + \
    geom_bar() + \
    ggtitle("bar: position='stack' (default)")
Out[4]:
In [5]:
ggplot(df, aes(x="drv", fill=as_discrete("year"))) + \
    geom_bar(position='fill') + \
    ggtitle("bar: position='fill'")
Out[5]:
In [6]:
ggplot(df, aes(x="hwy", fill=as_discrete("year"))) + \
    geom_histogram() + \
    ggtitle("histogram: position='stack' (default)")
Out[6]:
In [7]:
ggplot(df, aes(x="hwy", fill=as_discrete("year"))) + \
    geom_histogram(position='fill') + \
    ggtitle("histogram: position='fill'")
Out[7]:
In [8]:
ggplot(df, aes(x="hwy", fill=as_discrete("year"))) + \
    geom_area(stat='density', color="black") + \
    ggtitle("area: position='stack' (default)")
Out[8]:
In [9]:
ggplot(df, aes(x="hwy", fill=as_discrete("year"))) + \
    geom_area(stat='density', position='fill', color="black") + \
    ggtitle("area: position='fill'")
Out[9]:

2. Stacking Order

Control the stacking order with the order parameter of as_discrete(): order=1 sorts the levels in ascending order, order=-1 in descending order.
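As a rough sketch of why the order matters, the snippet below stacks three groups in ascending and then descending order of their names. The function stack_in_order() and the heights are hypothetical, and placing the first level at the bottom is an assumption made for the illustration; the point is only that reversing the order mirrors the segments.

```python
def stack_in_order(heights_by_group, order=1):
    """Stack groups sorted by name; order=1 ascending, order=-1 descending."""
    names = sorted(heights_by_group, reverse=(order == -1))
    result, base = {}, 0
    for name in names:
        top = base + heights_by_group[name]
        result[name] = (base, top)  # (ymin, ymax) of this group's segment
        base = top
    return result


# Hypothetical heights for the three drive types at one x position.
heights = {"4": 3, "f": 5, "r": 2}
print(stack_in_order(heights))            # {'4': (0, 3), 'f': (3, 8), 'r': (8, 10)}
print(stack_in_order(heights, order=-1))  # {'r': (0, 2), 'f': (2, 7), '4': (7, 10)}
```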

In [10]:
ggplot(df, aes(x="hwy")) + \
    geom_area(aes(fill=as_discrete("drv")), stat='density', color="black") + \
    ggtitle("Default order of drive types")
Out[10]:
In [11]:
ggplot(df, aes(x="hwy")) + \
    geom_area(aes(fill=as_discrete("drv", order=-1)), stat='density', color="black") + \
    ggtitle("Backward order of drive types")
Out[11]:

3. Other Geometries

Let's have a look at geometries whose default position adjustment is neither 'stack' nor 'fill'.

When stacking across multiple layers, it's a good idea to set the group aesthetic explicitly. This ensures that all layers are stacked in the same way.

In [12]:
ggplot(df, aes(x="drv", color=as_discrete("year"))) + \
    geom_line(stat='count', position='stack') + \
    geom_point(stat='count', position='stack') + \
    ggtitle("Line and point")
Out[12]:
In [13]:
ggplot(df, aes(x="drv")) + \
    geom_area(aes(fill=as_discrete("year")), stat='count', position='stack', color="black") + \
    geom_line(aes(group="year"), stat='count', position='stack', color="black") + \
    ggtitle("Area and line")
Out[13]:

4. Parameter vjust

The vjust parameter of position_stack() shifts plot elements vertically within their stacked segment (vjust stands for vertical adjustment). The default is vjust=1.
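One way to picture vjust is as an interpolation between the bottom (0) and the top (1) of each stacked segment. The sketch below assumes exactly that behaviour; label_positions() and the counts are hypothetical, not part of lets-plot.

```python
def label_positions(heights, vjust=1.0):
    """y position of a label inside each stacked segment.

    Assumes vjust=0 means the segment's bottom and vjust=1 its top.
    """
    positions, base = [], 0.0
    for h in heights:
        positions.append(base + vjust * h)
        base += h
    return positions


# Hypothetical counts for two groups at one x position.
counts = [40, 60]
print(label_positions(counts))             # [40.0, 100.0]
print(label_positions(counts, vjust=0.5))  # [20.0, 70.0]
```

Under this assumption, vjust=0.5 puts each label in the middle of its own segment, which is usually the most readable choice for stacked areas and bars.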

In [14]:
ggplot(df, aes(x="drv")) + \
    geom_area(aes(fill=as_discrete("year")), stat='count',
              position='stack', color="black") + \
    geom_label(aes(label="..count..", group="year"), stat='count',
               position='stack', color="black") + \
    ggtitle("vjust=1 (default)")
Out[14]:
In [15]:
ggplot(df, aes(x="drv")) + \
    geom_area(aes(fill=as_discrete("year")), stat='count',
              position='stack', color="black") + \
    geom_label(aes(label="..count..", group="year"), stat='count',
               position=position_stack(vjust=.5), color="black") + \
    ggtitle("vjust=0.5")
Out[15]:

5. Parameter mode

By default, only objects from different groups are stacked on top of each other; objects within one group are positioned as with position='identity'.

This behaviour can be changed with the mode parameter.

mode='all' means that every object is shifted, including objects within the same group.
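The difference between the two modes can be sketched for the simple case where every point has height 1, as with geom_point(y=1). The helper stacked_y() and the sample points are hypothetical, and groups are assumed to be stacked in order of first appearance.

```python
from collections import defaultdict


def stacked_y(points, mode="groups"):
    """points: list of (x, group) pairs, each with height 1.

    Returns the stacked y of every point, in input order.
    """
    level = defaultdict(int)  # current stack height per x
    group_level = {}          # (x, group) -> level shared by that group
    ys = []
    for x, g in points:
        if mode == "all":
            # Every point gets its own level.
            level[x] += 1
            ys.append(level[x])
        else:
            # Points of one group at the same x share a single level.
            if (x, g) not in group_level:
                level[x] += 1
                group_level[(x, g)] = level[x]
            ys.append(group_level[(x, g)])
    return ys


# Hypothetical (hwy, class) pairs.
pts = [(29, "compact"), (29, "compact"), (29, "suv"), (31, "compact")]
print(stacked_y(pts))              # [1, 1, 2, 1]
print(stacked_y(pts, mode="all"))  # [1, 2, 3, 1]
```

In the 'groups' case the two compact points at x=29 overlap, so the column height counts groups; in the 'all' case it counts individual points.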

In [16]:
ggplot(df, aes(x="hwy", color="class")) + \
    geom_point(y=1, position=position_stack()) + \
    coord_fixed() + ylim(1, 35) + \
    ggtitle("mode='groups' (default)")
Out[16]:
In [17]:
ggplot(df, aes(x="hwy", color="class")) + \
    geom_point(y=1, position=position_stack(mode='all')) + \
    coord_fixed() + ylim(1, 35) + \
    ggtitle("mode='all'")
Out[17]:

6. Negative Values

Positive and negative values are stacked separately: positive values stack upward from zero, and negative values stack downward from zero.
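A minimal sketch of this rule, using the same y values as the small dataset in this section. The function stack_signed() is hypothetical and only illustrates the idea of keeping two running baselines; it is not how lets-plot is implemented.

```python
def stack_signed(values):
    """Stack values at one x: positives upward from 0, negatives downward from 0."""
    pos_base, neg_base, segments = 0, 0, []
    for v in values:
        if v >= 0:
            segments.append((pos_base, pos_base + v))
            pos_base += v
        else:
            segments.append((neg_base + v, neg_base))
            neg_base += v
    return segments


# y values of groups "a" and "b" at each x position.
print(stack_signed([4, 0]))   # x=0: [(0, 4), (4, 4)]
print(stack_signed([2, 1]))   # x=1: [(0, 2), (2, 3)]
print(stack_signed([3, -1]))  # x=2: [(0, 3), (-1, 0)]
```

At x=2 the negative bar of group "b" starts at zero and extends downward instead of being subtracted from the top of group "a".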

In [18]:
data = {
    "x": [0, 1, 2, 0, 1, 2],
    "y": [4, 2, 3, 0, 1, -1],
    "g": ["a", "a", "a", "b", "b", "b"],
}
In [19]:
ggplot(data, aes("x", "y", fill="g")) + \
    geom_bar(stat='identity') + \
    geom_hline(yintercept=0, color="black")
Out[19]:
In [20]:
ggplot(data, aes("x", "y")) + \
    geom_bar(aes(fill="g"), stat='identity', position='fill') + \
    geom_label(aes(label="g", group="g"), stat='identity',
               position=position_fill(vjust=.5)) + \
    geom_hline(yintercept=0, color="black")
Out[20]: