Drawing Academic Scatter Plots with R-ggplot2

2021-02-22 15:16:36

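01. Introduction

In this post we use R-ggplot2 to draw an academic scatter plot with a fitted line. Follow the official account and reply "资源分享" (resource sharing) in the backend to get the data for this tutorial, along with the Python code and data for the other plotting tutorials.

02. Plotting with R-ggplot2

(1) Default format

We first draw the scatter plot with the basic ggplot2 settings. The point shape shape=15 is a filled square, drawn in black by default. The code is as follows: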

library(ggplot2)

# Basic scatter plot; shape = 15 draws filled squares
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  labs(
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm')
plot

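The result is as follows:

A plot like this is nowhere near ready for academic publication. You could of course pick a suitable theme from the ggthemes package to polish it, but the aim of this post is to become familiar with the plotting functions, and some statistical indicators have to be added by hand in any case.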
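The code in this post assumes a data frame named scatter_data with two numeric columns, true_data and model01_estimated; the data itself is only available from the author. For readers who want to run the snippets without it, here is a minimal sketch of a stand-in data set. The values are synthetic, and the generating model (slope, intercept, noise level, seed, sample size) is purely an assumption for illustration, not the author's data.

```r
# Synthetic stand-in for the author's data (all values are hypothetical).
set.seed(42)   # arbitrary seed, for reproducibility only
n <- 500       # arbitrary sample size

true_data <- runif(n, min = 0, max = 2)
# Assumed relationship: estimate = 0.9 * truth + 0.1 + Gaussian noise
model01_estimated <- 0.9 * true_data + 0.1 + rnorm(n, sd = 0.15)

scatter_data <- data.frame(
  true_data = true_data,
  model01_estimated = model01_estimated
)

# Keep only points inside the 0-2 range used for the axes later in this post
scatter_data <- subset(
  scatter_data,
  model01_estimated >= 0 & model01_estimated <= 2
)

str(scatter_data)
```

With a data frame of this shape in the session, the ggplot() calls in this post can be run as written.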

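(2) Adding a fitted line and a panel label

We refine the chart by adding a fitted line, a panel label, and other elements. The code is as follows: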

library(ggplot2)
library(ggthemes)  # provides theme_base()

# Note: 'Times_New_Roman' must match a font family registered in your R session.
# Note: ggplot2 >= 3.4.0 prefers linewidth over size for lines; size still works with a warning.
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  # draw the fitted line (linear regression, no confidence band)
  geom_smooth(method = 'lm', se = F, color = 'red', size = 1) +
  # draw the 1:1 diagonal
  geom_abline(slope = 1, intercept = 0, color = 'black', linetype = "dashed", size = 1) +
  # set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated Value",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # negative length makes the tick marks point inward
        axis.ticks.length = unit(-0.22, "cm"),
        # widen the panel border
        panel.border = element_rect(size = 1),
        plot.background = element_rect(color = "white"),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")))

plot

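Key points:

  • geom_smooth() draws the fitted line directly; the fitting method is linear regression (lm) and se is set to FALSE, so no confidence band is drawn;
  • geom_text() adds text elements. The rest of the code is explained in the comments.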
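geom_smooth(method = 'lm') fits an ordinary least-squares line of y on x, which is the same model that base R's lm() fits. To inspect the slope and intercept of the red line numerically, something like the following sketch may help. The small data frame here is made up for illustration; with real data, scatter_data would be passed instead.

```r
# Toy data (hypothetical values); replace with scatter_data in practice.
toy <- data.frame(
  true_data = c(0.2, 0.5, 0.8, 1.1, 1.4, 1.7),
  model01_estimated = c(0.30, 0.52, 0.85, 1.05, 1.38, 1.62)
)

# Same formula that geom_smooth(method = 'lm') uses by default: y ~ x
fit <- lm(model01_estimated ~ true_data, data = toy)

intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])

# Print the fitted equation
cat(sprintf("y = %.3f + %.3f x\n", intercept, slope))
```

Comparing the fitted slope with 1 and the intercept with 0 gives a quick numerical sense of how far the red line departs from the dashed diagonal.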

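The result is as follows:

(3) Adding R2, error lines, error statistics, and other indicators

This is where the flexibility of R-ggplot2 shows. We use stat_cor() and stat_regline_equation() from the ggpubr package to draw R2 and the regression equation directly. The code is as follows: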

# load the required packages
library(ggpubr)    # also attaches ggplot2
library(ggthemes)  # provides theme_base()

# Note: 'Times_New_Roman' must match a font family registered in your R session.
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  geom_smooth(method = 'lm', se = F, color = 'red', size = 1) +
  # draw the 1:1 diagonal: the ideal-fit line
  geom_abline(slope = 1, intercept = 0, color = 'black', size = 1) +
  # draw the upper error line
  geom_abline(slope = 1.15, intercept = .05, linetype = "dashed", size = 1) +
  # draw the lower error line
  geom_abline(slope = .85, intercept = -.05, linetype = "dashed", size = 1) +
  # use ggpubr to add the regression equation, R2 and p-value
  stat_regline_equation(label.x = .1, label.y = 1.8, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  stat_cor(aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
      label.x = .1, label.y = 1.6, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  geom_text(x = .1, y = 1.4, label = "N = 4348", size = 6, family = 'Times_New_Roman', hjust = 0) +
  # set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  # add the error statistics
  geom_text(x = 1.4, y = .4, label = 'Within EE = 52%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .3, label = 'Above EE = 39%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .2, label = 'Below EE = 9%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # negative length makes the tick marks point inward
        axis.ticks.length = unit(-0.22, "cm"),
        # widen the panel border
        panel.border = element_rect(size = 1),
        plot.background = element_rect(color = "white"),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # remove the legend title
        #legend.title = element_blank(),
        # margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")))

plot

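Key points:

  • geom_abline() draws lines with different slopes (slope) and intercepts (intercept), each of which can be styled individually.
  • The ggpubr package adds R2 and related elements. For details, see the official reference (https://rpkgs.datanovia.com/ggpubr/reference/stat_cor.html ).
  • The rest of the code is explained in the comments.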
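The labels N = 4348, Within EE = 52% and so on are typed in as fixed strings in the code above. Below is a minimal sketch of how such values might be derived from the data, assuming that "within EE" means a point lies between the two dashed lines y = 1.15x + 0.05 and y = 0.85x - 0.05 drawn with geom_abline(). The post does not define EE, so this reading is an assumption. The helper name ee_summary and the toy data are hypothetical.

```r
# Hypothetical helper: share of points within / above / below the envelope
# bounded by y = 1.15 * x + 0.05 (upper) and y = 0.85 * x - 0.05 (lower).
ee_summary <- function(x, y) {
  upper <- 1.15 * x + 0.05
  lower <- 0.85 * x - 0.05
  n <- length(x)
  c(
    n      = n,
    within = 100 * sum(y >= lower & y <= upper) / n,
    above  = 100 * sum(y > upper) / n,
    below  = 100 * sum(y < lower) / n
  )
}

# Toy data (made up); use the columns of scatter_data in practice.
x <- c(0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6)
y <- c(0.25, 0.70, 0.58, 0.50, 1.05, 1.50, 1.30, 1.20)

stats <- ee_summary(x, y)

# Build labels in the same form as the geom_text() calls above
labels <- c(
  sprintf("N = %d", as.integer(stats["n"])),
  sprintf("Within EE = %.0f%%", stats["within"]),
  sprintf("Above EE = %.0f%%", stats["above"]),
  sprintf("Below EE = %.0f%%", stats["below"])
)
print(labels)
```

Each string in labels could then be passed as the label argument of the corresponding geom_text() call, so that the annotations stay in step with the data.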

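The result is as follows:

(4) Style changes

As in the earlier post on drawing academic scatter plots with Python-matplotlib, we change the style of the scatter plot through custom theme settings. The code is as follows: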

library(ggpubr)  # also attaches ggplot2

# Note: 'Times_New_Roman' must match a font family registered in your R session.
plot <- ggplot(scatter_data, aes(x = true_data, y = model01_estimated)) +
  geom_point(shape = 15) +
  geom_smooth(method = 'lm', se = F, color = 'red', size = 1) +
  # draw the 1:1 diagonal: the ideal-fit line
  geom_abline(slope = 1, intercept = 0, color = 'black', size = 1) +
  # draw the upper error line
  geom_abline(slope = 1.15, intercept = .05, linetype = "dashed", size = 1) +
  # draw the lower error line
  geom_abline(slope = .85, intercept = -.05, linetype = "dashed", size = 1) +
  # use ggpubr to add the regression equation, R2 and p-value
  stat_regline_equation(label.x = .1, label.y = 1.8, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  stat_cor(aes(label = paste(..rr.label.., ..p.label.., sep = "~`,`~")),
      label.x = .1, label.y = 1.6, size = 6, family = 'Times_New_Roman', fontface = 'bold') +
  geom_text(x = .1, y = 1.4, label = "N = 4348", size = 6, family = 'Times_New_Roman', hjust = 0) +
  # set the axis limits and tick positions
  scale_x_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  scale_y_continuous(limits = c(0,2), breaks = seq(0,2,0.2), expand = c(0,0)) +
  labs(x = 'True Values', y = "Model Estimated",
       title = "The scatter chart of Train data and Test data",
       subtitle = "scatter R-ggplot2 Exercise(no color)",
       caption = 'Visualization by DataCharm') +
  # add the panel label (a)
  geom_text(x = 1.85, y = 1.85, label = '(a)', size = 9, family = 'Times_New_Roman', fontface = 'bold') +
  # add the error statistics
  geom_text(x = 1.4, y = .4, label = 'Within EE = 52%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .3, label = 'Above EE = 39%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  geom_text(x = 1.4, y = .2, label = 'Below EE = 9%', size = 5, family = 'Times_New_Roman', hjust = 0) +
  #theme_base() +
  theme(text = element_text(family = "Times_New_Roman", face = 'bold'),
        axis.text = element_text(family = 'Times_New_Roman', size = 12, face = 'bold'),
        # tick length (positive here, so the ticks point outward)
        axis.ticks.length = unit(0.22, "cm"),
        # draw dotted horizontal grid lines
        panel.grid.major.y = element_line(linetype = 'dotted', color = 'black'),
        # remove the y-axis ticks and axis line
        axis.ticks.y = element_blank(),
        axis.line.y = element_blank(),
        # remove the panel background color
        panel.background = element_rect(fill = NA),
        panel.ontop = F,
        # widen the panel border
        #panel.border = element_rect(size=1),
        plot.background = element_rect(fill = NULL),
        axis.line = element_line(size = .8),
        axis.ticks = element_line(size = .8),
        # margin around the tick labels
        axis.text.x = element_text(margin = unit(c(0.5,0.5,0.5,0.5), "cm")),
        axis.text.y = element_text(margin = unit(c(0.3,0.3,0.3,0.3), "cm")))

plot

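The key steps in the code are all explained in the comments.

The result is as follows:

With that, a correlation scatter plot suitable for academic publication is complete, and it should include all the chart elements such a figure needs.

03. Summary

Drawing academic correlation scatter plots with R-ggplot2 is quite convenient (after all, there are many excellent third-party packages).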
