R Language Topic 1: Strings

library(stringr) # load this package before you start

Topic 1. Strings

1.str_length() - get the length of a string

x <- "The birch canoe slid on the smooth planks."
x

## [1] "The birch canoe slid on the smooth planks."

# Compare str_length() with length() here
length(x) # counts the number of strings (elements) in the vector

## [1] 1

str_length(x) # counts the characters within a string (spaces included)

## [1] 42
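
Since str_length() is vectorized, it returns one count per element, which makes the contrast with length() easier to see. A minimal sketch, assuming a made-up vector `words` that is not part of the original example:

```r
library(stringr)

# `words` is an illustrative vector, not taken from the text above
words <- c("The", "birch", NA, "")

n_elements <- length(words)      # number of elements in the vector
n_chars    <- str_length(words)  # characters per element; NA stays NA

n_elements
n_chars
```

Here `n_elements` is a single number, while `n_chars` has one entry per element of `words`.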

2.str_split() - split a string

x <- "The birch canoe slid on the smooth planks."
str_split(x," ") # the second argument is the pattern: split the string at each space

## [[1]]
## [1] "The"     "birch"   "canoe"   "slid"    "on"      "the"     "smooth"  "planks."

class(str_split(x," ")) # class() shows that the result is a list

## [1] "list"

x2 = str_split(x," ")[[1]];x2 # since it is a list, subset it the list way, with [[ ]]

## [1] "The"     "birch"   "canoe"   "slid"    "on"      "the"     "smooth"  "planks."

# Now try a vector that contains several strings
y = c("jimmy 150","nicker 140","tony 152")
str_split(y," ") # the result has one list element per input string

## [[1]]
## [1] "jimmy" "150"  
## 
## [[2]]
## [1] "nicker" "140"   
## 
## [[3]]
## [1] "tony" "152"

class(str_split(y," "))

## [1] "list"

# Lists are awkward to work with, so set simplify = TRUE to get a character matrix instead
y2 = str_split(y," ",simplify = TRUE);y2

##      [,1]     [,2] 
## [1,] "jimmy"  "150"
## [2,] "nicker" "140"
## [3,] "tony"   "152"

class(y2)

## [1] "matrix" "array"
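
The simplified result is a character matrix, not a data.frame, so one more step is needed if a data.frame is what you want. A minimal sketch of that conversion, assuming the column names `name` and `value`, which are chosen here for illustration only:

```r
library(stringr)

y <- c("jimmy 150", "nicker 140", "tony 152")
m <- str_split(y, " ", simplify = TRUE)  # character matrix, 3 rows x 2 columns

# Column names `name` and `value` are illustrative assumptions
df <- data.frame(
  name  = m[, 1],
  value = as.numeric(m[, 2])  # the second column is text until converted
)
df
```

Converting the second column with as.numeric() matters because every cell of the matrix is a string, including the numbers.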

3.str_sub() - extract part of a string by position

x <- "The birch canoe slid on the smooth planks."
str_sub(x,5,9) # positions count spaces as well

## [1] "birch"
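
str_sub() also accepts negative positions, which count backwards from the end of the string. A short sketch using the same sentence; the variable names `last_word` and `first_word` are introduced here for illustration:

```r
library(stringr)

x <- "The birch canoe slid on the smooth planks."

last_word  <- str_sub(x, -7, -1)  # negative positions count from the end
first_word <- str_sub(x, 1, 3)    # positive positions count from the start

last_word
first_word
```

This is handy when the piece you need sits at a fixed distance from the end, such as a suffix of known length.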

4.str_detect() - detect a pattern

This part is important: you will need this function later when processing the grouping information of cohort datasets such as GEO.

# The first two lines were explained above; the vector x2 serves as the example here
x <- "The birch canoe slid on the smooth planks."
x2 = str_split(x," ")[[1]];x2

## [1] "The"     "birch"   "canoe"   "slid"    "on"      "the"     "smooth"  "planks."

# str_detect() returns a logical vector with one value per element of x2
str_detect(x2,'h') # does each element of x2 contain 'h'? TRUE if yes, FALSE if no

## [1]  TRUE  TRUE FALSE FALSE FALSE  TRUE  TRUE FALSE

# The pattern is a regular expression, so alternation with | works too
str_detect(x2,"h|s") # does each element of x2 contain 'h' or 's'? TRUE if yes, FALSE if no

## [1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE

# Related functions:
str_starts(x2,"T") # does each element start with "T"? TRUE if yes, FALSE if no

## [1]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

str_ends(x2,"e") # does each element end with "e"? TRUE if yes, FALSE if no

## [1]  TRUE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE
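
To make the grouping use case mentioned at the start of this section concrete, here is a rough sketch of how a logical vector from str_detect() can be turned into group labels. The sample titles and the labels "tumor" and "normal" are invented for illustration and do not come from any real GEO dataset:

```r
library(stringr)

# Hypothetical sample titles, standing in for a phenotype column
titles <- c("tumor_rep1", "normal_rep1", "tumor_rep2", "normal_rep2")

# TRUE where the title contains "tumor"; ifelse() maps that to a label
group <- ifelse(str_detect(titles, "tumor"), "tumor", "normal")
group <- factor(group, levels = c("normal", "tumor"))

table(group)
```

Setting the factor levels explicitly keeps the reference group first, which is often what downstream comparisons expect.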

5.str_replace() - replace a pattern

x <- "The birch canoe slid on the smooth planks."
x2 = str_split(x," ")[[1]];x2

## [1] "The"     "birch"   "canoe"   "slid"    "on"      "the"     "smooth"  "planks."

# Replace the first match
str_replace(x2,'o','A') # replaces the first 'o' in each element of x2 with 'A'

## [1] "The"     "birch"   "canAe"   "slid"    "An"      "the"     "smAoth"  "planks."

str_replace(x2,'o|s','A') # replaces the first 'o' or 's' in each element of x2 with 'A'

## [1] "The"     "birch"   "canAe"   "Alid"    "An"      "the"     "Amooth"  "plankA."

# Replace all matches
str_replace_all(x2,'o','A') # replaces every 'o' in each element of x2 with 'A'

## [1] "The"     "birch"   "canAe"   "slid"    "An"      "the"     "smAAth"  "planks."
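
Because the pattern is interpreted as a regular expression, characters such as "." have a special meaning. A small sketch of the difference, using stringr's fixed() to request a literal match; the variable names are illustrative:

```r
library(stringr)

w <- "planks."

as_regex   <- str_replace_all(w, ".", "!")         # "." matches any character
as_literal <- str_replace_all(w, fixed("."), "!")  # matches only the literal dot

as_regex
as_literal
```

Wrapping the pattern in fixed() is one way to avoid surprises when the text to replace contains punctuation.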

6.str_remove() - remove a pattern

x <- "The birch canoe slid on the smooth planks."
# Remove the first match
str_remove(x,' ') # using a space as the example

## [1] "Thebirch canoe slid on the smooth planks."

# Remove all matches
str_remove_all(x,' ')

## [1] "Thebirchcanoeslidonthesmoothplanks."
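
Removing a pattern gives the same result as replacing it with an empty string, and an anchored pattern can restrict the removal to one position. A brief sketch under those assumptions; the variable names `no_period` and `same_as` are made up for this example:

```r
library(stringr)

x <- "The birch canoe slid on the smooth planks."

no_period <- str_remove(x, "\\.$")        # drop only a period at the very end
same_as   <- str_replace_all(x, " ", "")  # same result as str_remove_all(x, " ")

no_period
same_as
```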

Source: 生信技能树
