New Variables ..sumprop.. and ..sumpct.. in the count and count2d Statistics

The computed variables ..sumprop.. and ..sumpct.. give the share of observations at a given location relative to the total number of observations, as a proportion and as a percentage respectively.

This is in contrast to the computed variables ..prop.. and ..proppct.., which give the share of observations belonging to a given group relative to the number of observations at that location.
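
To make the two definitions concrete, the following pandas sketch recomputes both kinds of share by hand for the six observations used in this notebook's examples. It only illustrates the arithmetic described above and is not the library's implementation; the names `sample` and `cells` and the plain column names (`sum`, `sumprop`, `prop`, and so on) are chosen here for illustration.

```python
import pandas as pd

# The same six observations as in the notebook's sample data.
sample = pd.DataFrame({
    'x': ['a', 'a', 'a', 'a', 'b', 'b'],
    'group': ['A', 'A', 'A', 'B', 'A', 'B'],
})

# Number of observations in each (location, group) cell.
cells = sample.groupby(['x', 'group']).size().reset_index(name='count')

# Number of observations at each location, repeated on every row of that location.
cells['sum'] = cells.groupby('x')['count'].transform('sum')

# Share of the location in the whole dataset (what ..sumprop.. / ..sumpct.. describe).
cells['sumprop'] = cells['sum'] / len(sample)
cells['sumpct'] = 100 * cells['sumprop']

# Share of the group within its location (what ..prop.. / ..proppct.. describe).
cells['prop'] = cells['count'] / cells['sum']
cells['proppct'] = 100 * cells['prop']

print(cells)
```

Under these definitions, location 'a' holds 4 of the 6 observations (a share of about 0.667), while group 'A' accounts for 3 of the 4 observations at 'a' (a share of 0.75).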

In [1]:
import pandas as pd

from lets_plot import *
from lets_plot.mapping import as_discrete
In [2]:
LetsPlot.setup_html()
In [3]:
data = {
    'x': ['a', 'a', 'a', 'a', 'b', 'b'],
    'group': ['A', 'A', 'A', 'B', 'A', 'B'],
}
In [4]:
tooltip_options = layer_tooltips(["..sumprop..", "..sumpct..", "..prop..", "..proppct.."])

1. Use New ..sumprop.. and ..sumpct.. for Plots without Grouping

Note: compare the values shown in the tooltip.

In [5]:
ggplot(data, aes('x')) + \
    geom_bar(tooltips=tooltip_options, labels=layer_labels().line('@..sumprop.. (@..sumpct..)'))
Out[5]:

2. However, "..prop.." and "..proppct.." Are Better Suited for Grouped Plots

Note: again, compare the values shown in the tooltip.

In [6]:
ggplot(data, aes('x', fill='group')) + \
    geom_bar(tooltips=tooltip_options, labels=layer_labels().line('@..prop.. (@..proppct..)'))
Out[6]:

3. The count2d Stat Works Similarly

In [7]:
df = pd.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv")
print(df.shape)
df.head()
(234, 12)
Out[7]:
   Unnamed: 0  manufacturer  model  displ  year  cyl       trans  drv  cty  hwy  fl    class
0           1          audi     a4    1.8  1999    4    auto(l5)    f   18   29   p  compact
1           2          audi     a4    1.8  1999    4  manual(m5)    f   21   29   p  compact
2           3          audi     a4    2.0  2008    4  manual(m6)    f   20   31   p  compact
3           4          audi     a4    2.0  2008    4    auto(av)    f   21   30   p  compact
4           5          audi     a4    2.8  1999    6    auto(l5)    f   16   26   p  compact
In [8]:
ggplot(df, aes("drv", as_discrete("year"))) + \
    geom_pie(aes(fill="class", size='..sum..'), tooltips=tooltip_options) + \
    scale_size(guide='none')
Out[8]:
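
For a two-dimensional stat the only change to the arithmetic is that a location is an (x, y) pair rather than a single x value. The sketch below assumes the same definitions as in the introduction and applies them to the five rows printed by df.head() above, so that it runs without downloading the full dataset. `share_stats` is a hypothetical helper written for this illustration, not part of lets-plot.

```python
import pandas as pd


def share_stats(frame, location, group):
    """Hypothetical helper: per-cell counts with location-level and group-level shares."""
    # Number of observations in each (location, group) cell.
    cells = frame.groupby(location + [group]).size().reset_index(name='count')
    # Number of observations at each location.
    cells['sum'] = cells.groupby(location)['count'].transform('sum')
    # Share of the location in the whole dataset.
    cells['sumprop'] = cells['sum'] / len(frame)
    cells['sumpct'] = 100 * cells['sumprop']
    # Share of the group within its location.
    cells['prop'] = cells['count'] / cells['sum']
    cells['proppct'] = 100 * cells['prop']
    return cells


# Only the five rows shown by df.head(), restricted to the columns used in the plot.
head = pd.DataFrame({
    'drv': ['f'] * 5,
    'year': [1999, 1999, 2008, 2008, 1999],
    'class': ['compact'] * 5,
})

stats = share_stats(head, ['drv', 'year'], 'class')
print(stats)
```

In this small subset every row has the same class, so the group share is 1.0 at both locations, while the location shares are 3/5 for ('f', 1999) and 2/5 for ('f', 2008). On the full dataset the same helper would be called with the complete frame in place of `head`.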