Handling overplotting on a scatter plot: `stat="sum"`

The "sum" stat counts the number of observations at each location.
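To make the counting concrete, here is a minimal pandas sketch of the same kind of tally. It is only an approximation for illustration, not how lets-plot implements the stat. The toy DataFrame and the column names `n` and `prop` are assumptions made for this example, and the share is taken over all rows (no grouping).

```python
import pandas as pd

# Hypothetical toy data (not the mpg dataset), just to show the tally
df = pd.DataFrame({
    "class": ["compact", "compact", "compact", "suv", "suv", "pickup"],
    "drv":   ["f", "f", "4", "4", "4", "4"],
})

# Count the observations that share the same (class, drv) location
counts = df.groupby(["class", "drv"]).size().reset_index(name="n")

# Share of all observations found at each location (assumes no grouping)
counts["prop"] = counts["n"] / counts["n"].sum()

print(counts)
```

Each row of `counts` corresponds to one point on the plot, with `n` driving the point size.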

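Computed variables:

- `..n..` : number of observations at the location
- `..prop..` : share of observations at the location, a value in the range 0..1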

from lets_plot import *
from lets_plot.mapping import *
import pandas as pd

LetsPlot.setup_html()

mpg_df = pd.read_csv("https://raw.githubusercontent.com/JetBrains/lets-plot-docs/master/data/mpg.csv")
mpg_df.head()

p = ggplot(mpg_df, aes(x=as_discrete('class', order=1), y=as_discrete('drv', order=1)))

p + geom_point(stat='sum')

p + geom_point(aes(size='..prop..'), stat='sum')

Note: to compute `..prop..` within each class, group by "class".

p + geom_point(aes(size='..prop..', group='class'), stat='sum')
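The effect of grouping on the proportion can be sketched in pandas as follows. This assumes the stat normalizes counts within each group, which is what the grouped plot suggests. The toy data and the column names `n` and `prop` are hypothetical, and the code is an illustration rather than the library's implementation.

```python
import pandas as pd

# Hypothetical toy data (not the mpg dataset)
df = pd.DataFrame({
    "class": ["compact", "compact", "compact", "suv", "suv", "pickup"],
    "drv":   ["f", "f", "4", "4", "4", "4"],
})

# Count observations at each (class, drv) location
counts = df.groupby(["class", "drv"]).size().reset_index(name="n")

# Normalize within each class, so the shares inside a class add up to 1
class_totals = counts.groupby("class")["n"].transform("sum")
counts["prop"] = counts["n"] / class_totals

print(counts)
```

With this normalization, a class that has only one drive type gets a proportion of 1 at its single location, no matter how few cars it contains.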

	Unnamed: 0	manufacturer	model	displ	year	cyl	trans	drv	cty	hwy	fl	class
0	1	audi	a4	1.8	1999	4	auto(l5)	f	18	29	p	compact
1	2	audi	a4	1.8	1999	4	manual(m5)	f	21	29	p	compact
2	3	audi	a4	2.0	2008	4	manual(m6)	f	20	31	p	compact
3	4	audi	a4	2.0	2008	4	auto(av)	f	21	30	p	compact
4	5	audi	a4	2.8	1999	6	auto(l5)	f	16	26	p	compact
