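01. Introduction
In this post we use R-ggplot2 to draw a publication-style fitted scatter plot. Follow the official account and reply "资源分享" ("resource sharing") in the backend to get the data for this tutorial, together with the Python code and data for the other plotting tutorials.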
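02. Plotting with R-ggplot2
(1) Default format
We start by drawing the scatter plot with ggplot2's basic settings. Here the point shape shape=15 is a filled black square. The code is as follows: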
library(ggplot2)

# Basic scatter plot; shape = 15 draws filled squares
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  labs(
    title = "The scatter chart of Train data and Test data",
    subtitle = "scatter R-ggplot2 Exercise(no color)",
    caption = 'Visualization by DataCharm')
plot
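The result is as follows:
A plot like this is nowhere near publication quality. You could of course pick a suitable theme from the ggthemes package to polish it, but the aim of this post is to get familiar with the plotting functions themselves; besides, some statistical indicators have to be added by hand anyway.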
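The snippets in this post assume a data frame named scatter_data with two numeric columns, true_data and model01_estimated. The original data is not reproduced here, so the following is only a synthetic stand-in for trying the code out; the slope, noise level, and seed are arbitrary assumptions, and only the row count is taken from the "N = 4348" label used later in the post.

```r
# Synthetic stand-in for the tutorial's data set (NOT the author's data).
set.seed(42)
n <- 4348  # matches the "N = 4348" label used later in the post

true_data <- runif(n, min = 0, max = 2)

# Assumed relationship: roughly linear with Gaussian noise (illustrative only)
model01_estimated <- 0.05 + 0.9 * true_data + rnorm(n, sd = 0.15)

# Clamp to [0, 2] so no point is dropped by the axis limits used later
model01_estimated <- pmin(pmax(model01_estimated, 0), 2)

scatter_data <- data.frame(true_data, model01_estimated)
str(scatter_data)
```

With this object in the workspace, the plotting code in each subsection can be run as written.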
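(2) Adding a fitted line and a panel label
We refine the chart by adding a fitted line, a panel label, and other elements. The code is as follows: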
library(ggplot2)
library(ggthemes)  # provides theme_base()

# Assumes a font family named 'Times_New_Roman' is already registered in the R session
# Note: ggplot2 >= 3.4.0 prefers linewidth over size for lines
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  # Draw the fitted line (linear regression, no confidence band)
  geom_smooth(method = 'lm', se = FALSE, color = 'red', size = 1) +
  # Draw the diagonal
  geom_abline(slope = 1, intercept = 0, color = 'black', linetype = "dashed", size = 1) +
  # Set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated Value",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # Add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # A negative length makes the ticks point inward
        axis.ticks.length = unit(-0.22, "cm"),
        # Widen the panel border
        panel.border = element_rect(size = 1),
        plot.background = element_rect(color = "white"),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # Margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")))
plot
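Key points:
- Use geom_smooth() directly to draw the fitted line; the fitting method is linear regression (lm), and se is set to FALSE, which hides the confidence band;
- Use geom_text() to add text elements. The rest of the code is explained in its comments.
The result is as follows: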
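To see what the red line represents, it can help to fit the same model by hand. geom_smooth(method = 'lm') fits an ordinary least-squares regression of y on x by default, so a plain lm() call on the same two columns should give the same line. The sketch below uses a small synthetic data frame as a stand-in for the tutorial's data; its values are assumptions for illustration only.

```r
# Synthetic stand-in data (hypothetical; not the tutorial's data set)
set.seed(42)
scatter_data <- data.frame(true_data = runif(500, min = 0, max = 2))
scatter_data$model01_estimated <-
  0.05 + 0.9 * scatter_data$true_data + rnorm(500, sd = 0.15)

# Ordinary least-squares fit of the estimate on the true value
fit <- lm(model01_estimated ~ true_data, data = scatter_data)

coefs <- coef(fit)               # intercept and slope of the fitted line
r2 <- summary(fit)$r.squared     # coefficient of determination

cat(sprintf("y = %.2f + %.2f x, R2 = %.2f\n", coefs[1], coefs[2], r2))
```

Comparing the fitted slope and intercept with the dashed diagonal (slope 1, intercept 0) gives a quick numeric check of how far the model is from a perfect 1:1 agreement.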
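(3) Adding R2, error lines, error statistics, and other indicators
This is where the flexibility of R-ggplot2 shows: we use stat_cor() and stat_regline_equation() from the ggpubr package to draw R2 and the fitted equation directly. The code is as follows: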
# Load the required packages
library(ggpubr)    # stat_cor(), stat_regline_equation(); also attaches ggplot2
library(ggthemes)  # theme_base()

plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  geom_smooth(method = 'lm', se = FALSE, color = 'red', size = 1) +
  # Draw the diagonal: the ideal fit line
  geom_abline(slope = 1, intercept = 0, color = 'black', size = 1) +
  # Draw the upper error line
  geom_abline(slope = 1.15, intercept = .05, linetype = "dashed", size = 1) +
  # Draw the lower error line
  geom_abline(slope = .85, intercept = -.05, linetype = "dashed", size = 1) +
  # Use ggpubr to add the regression equation, R2, and p-value
  stat_regline_equation(label.x = .1, label.y = 1.8, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  stat_cor(aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
           label.x = .1, label.y = 1.6, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  geom_text(x = .1, y = 1.4, label = "N = 4348", size = 6, family = 'Times_New_Roman', hjust = 0) +
  # Set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # Add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  # Add the error statistics
  geom_text(x = 1.4, y = .4, label = 'Within EE = 52%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .3, label = 'Above EE = 39%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .2, label = 'Below EE = 9%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # A negative length makes the ticks point inward
        axis.ticks.length = unit(-0.22, "cm"),
        # Widen the panel border
        panel.border = element_rect(size = 1),
        plot.background = element_rect(color = "white"),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # Remove the legend title
        #legend.title = element_blank(),
        # Margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")))
plot
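Key points:
- Use geom_abline() with different slopes (slope) and intercepts (intercept) to draw the reference lines, and style each one as needed.
- Use the ggpubr package to add R2 and related elements. For details, see the official reference (https://rpkgs.datanovia.com/ggpubr/reference/stat_cor.html ).
- The rest of the code is explained in its comments.
The result is as follows: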
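The "N = 4348" and "Within/Above/Below EE" labels above are typed in as fixed strings. One way to derive such labels from the data is sketched below, using the same upper and lower error lines as the geom_abline() calls (slopes 1.15 and 0.85, intercepts 0.05 and -0.05). How the author actually computed the percentages is not stated, so treat this as one plausible approach; the data frame here is a synthetic stand-in, and the resulting numbers will not match the figures in the post.

```r
# Synthetic stand-in data (hypothetical; not the tutorial's data set)
set.seed(42)
scatter_data <- data.frame(true_data = runif(1000, min = 0, max = 2))
scatter_data$model01_estimated <-
  0.05 + 0.9 * scatter_data$true_data + rnorm(1000, sd = 0.15)

x <- scatter_data$true_data
y <- scatter_data$model01_estimated

# Error lines, matching the geom_abline() calls in the plot
upper <- 1.15 * x + 0.05
lower <- 0.85 * x - 0.05

n_total    <- nrow(scatter_data)
pct_above  <- 100 * mean(y > upper)   # share of points above the upper line
pct_below  <- 100 * mean(y < lower)   # share of points below the lower line
pct_within <- 100 - pct_above - pct_below

# Label strings that could be passed to geom_text() or annotate()
labels <- c(
  sprintf("N = %d", n_total),
  sprintf("Within EE = %.0f%%", pct_within),
  sprintf("Above EE = %.0f%%", pct_above),
  sprintf("Below EE = %.0f%%", pct_below)
)
print(labels)
```

Because each percentage is rounded separately for display, the three printed values may occasionally sum to 99 or 101.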
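(4) Style changes
As in the Python-matplotlib version (see the post "Python-matplotlib academic scatter plot"), we change the style of the scatter plot through custom settings. The code is as follows: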
library(ggpubr)  # stat_cor(), stat_regline_equation(); also attaches ggplot2

plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  geom_smooth(method = 'lm', se = FALSE, color = 'red', size = 1) +
  # Draw the diagonal: the ideal fit line
  geom_abline(slope = 1, intercept = 0, color = 'black', size = 1) +
  # Draw the upper error line
  geom_abline(slope = 1.15, intercept = .05, linetype = "dashed", size = 1) +
  # Draw the lower error line
  geom_abline(slope = .85, intercept = -.05, linetype = "dashed", size = 1) +
  # Use ggpubr to add the regression equation, R2, and p-value
  stat_regline_equation(label.x = .1, label.y = 1.8, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  stat_cor(aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
           label.x = .1, label.y = 1.6, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  geom_text(x = .1, y = 1.4, label = "N = 4348", size = 6, family = 'Times_New_Roman', hjust = 0) +
  # Set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # Add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  # Add the error statistics
  geom_text(x = 1.4, y = .4, label = 'Within EE = 52%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .3, label = 'Above EE = 39%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .2, label = 'Below EE = 9%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  #theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # Tick length (positive here, so the ticks point outward)
        axis.ticks.length = unit(0.22, "cm"),
        # Draw dotted horizontal grid lines
        panel.grid.major.y = element_line(linetype = 'dotted', color = 'black'),
        # Remove the y-axis ticks and axis line
        axis.ticks.y = element_blank(),
        axis.line.y = element_blank(),
        # Remove the panel background color
        panel.background = element_rect(fill = NA),
        panel.ontop = FALSE,
        # Widen the panel border
        #panel.border = element_rect(size=1),
        plot.background = element_rect(fill = NULL),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # Margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.3,0.3,0.3,0.3), "cm")))
plot
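The key steps in the code are all commented.
The result is as follows:
At this point, a correlation scatter plot suitable for academic publication is complete; I think all the chart elements it needs are covered.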
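One refinement to consider for the fixed labels such as "(a)" and "N = 4348": a geom_text() layer with a constant label still inherits the plot's data, so the same text is drawn once per row, which can make the text look jagged and the output file heavier. ggplot2's annotate() adds a single label instead. The sketch below shows the idea on a small synthetic data frame (demo_data is a hypothetical stand-in, and the font family is left at the default to keep the example portable).

```r
library(ggplot2)

# Hypothetical stand-in data; the tutorial's real scatter_data is not shown
set.seed(1)
demo_data <- data.frame(true_data = runif(200, min = 0, max = 2))
demo_data$model01_estimated <-
  pmin(pmax(demo_data$true_data + rnorm(200, sd = 0.15), 0), 2)

p <- ggplot(demo_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  # annotate() draws each label once, regardless of the number of rows
  annotate("text", x = 1.85, y = 1.85, label = "(a)",
           size = 9, fontface = "bold") +
  annotate("text", x = 0.1, y = 1.4,
           label = paste("N =", nrow(demo_data)),
           size = 6, hjust = 0)
p
```

The same swap should apply to the "Within/Above/Below EE" labels; the stat_cor() and stat_regline_equation() layers are unaffected because they compute one label per group.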
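03. Summary
Drawing academic correlation scatter plots with R-ggplot2 is quite convenient (after all, there are many excellent third-party packages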