R Language Study Notes (21): Metacharacters in String Processing (Code Demonstrations)

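Metacharacters are characters that carry a special meaning inside a regular expression instead of matching themselves.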

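[ ] matches any single character listed inside the brackets.

states <- state.name  # assumption: `states` is not defined in these notes; R's built-in state names are used here
grep(pattern = "[wW]", x = states, value = TRUE)
grep(pattern = "w", ignore.case = TRUE, x = states, value = TRUE)  # same result via case-insensitive matching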
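As a small self-contained check of the character class idea, the sketch below uses a made-up vector `words` (hypothetical sample data, not from these notes) and base R only. It shows that `[wW]` and `ignore.case = TRUE` select the same elements.

```r
# Hypothetical sample data for illustration only.
words <- c("Washington", "Iowa", "Texas", "wyoming")

# TRUE where the string contains either "w" or "W".
has_w_class <- grepl("[wW]", words)

# The same test written with a case-insensitive match instead of a class.
has_w_nocase <- grepl("w", words, ignore.case = TRUE)

matched <- words[has_w_class]
print(matched)
print(identical(has_w_class, has_w_nocase))
```

The character class is the more portable choice when only one letter needs to be case-insensitive; `ignore.case = TRUE` applies to the whole pattern.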

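\ escapes a metacharacter so that it is matched literally. Inside an R string the backslash itself has to be doubled, which is why a literal dot is written "\\.".

library(stringr)  # provides str_extract_all()
strsplit("strsplit.also.uses", split = ".")    # unescaped: . matches every character, so only empty strings are returned
strsplit("strsplit.also.uses", split = "\\.")  # escaped: splits on the literal dots
str_extract_all("me credit card: 334", pattern = "\\d")  # \\d matches a single digit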
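The difference between an escaped and an unescaped dot can be verified on a shorter string. This is a minimal base R sketch; the input "a.b.c" is made up for illustration, and the `fixed = TRUE` variant is an alternative not mentioned in the notes that turns off regular expression interpretation altogether.

```r
# Unescaped: every character matches ".", so each piece is an empty string.
parts_unescaped <- strsplit("a.b.c", split = ".")[[1]]

# Escaped: only the literal dots act as separators.
parts_escaped <- strsplit("a.b.c", split = "\\.")[[1]]

# Alternative: treat the separator as plain text rather than a regex.
parts_fixed <- strsplit("a.b.c", split = ".", fixed = TRUE)[[1]]

print(parts_unescaped)
print(parts_escaped)
print(parts_fixed)
```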

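^ matches the start of a string. Placed first inside a character class, it negates the class instead: [^5] matches any character except "5".

test_vector <- c("123", "456", "321")
str_extract_all(test_vector, "3")     # every "3"
str_extract_all(test_vector, "^3")    # a "3" only at the start of the string
str_extract_all(test_vector, "[^3]")  # every character that is not "3"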
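The two roles of ^ can be compared side by side without stringr. The sketch below reuses the same three strings and relies only on base R's `grepl()` and `gsub()`; the variable names are chosen here for illustration.

```r
test_vector <- c("123", "456", "321")

# Anchor: TRUE only where the string begins with "3".
starts_with_3 <- grepl("^3", test_vector)

# Negated class: delete every character that is not "3",
# which leaves only the 3s of each string.
only_threes <- gsub("[^3]", "", test_vector)

print(starts_with_3)
print(only_threes)
```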

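$ matches the end of a string. Inside a character class it loses its special meaning: [akm$] matches a, k, m, or a literal $.

str_extract_all(test_vector, "3$")    # a "3" only at the end of the string
str_extract_all(test_vector, "[3$]")  # any "3" or literal "$", anywhere in the string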
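To see that $ inside brackets really is a literal character, the following base R sketch tests a made-up vector `prices` (hypothetical data, not from the notes) that contains an actual dollar sign.

```r
test_vector <- c("123", "456", "321")

# Anchor: TRUE only where the string ends with "3".
ends_with_3 <- grepl("3$", test_vector)

# Hypothetical strings for illustration; only the first contains "$".
prices <- c("cost: $5", "123", "abc")

# Inside a class, "$" is an ordinary character: match "3" or "$".
has_3_or_dollar <- grepl("[3$]", prices)

print(ends_with_3)
print(has_3_or_dollar)
```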

. matches any single character except a newline.

str_extract_all(string = c("regular.exp\n", "\n"), pattern = ".")  # the "\n" characters are not extracted

| means alternation ("or"): the pattern matches if any one of the alternatives matches.

str_extract_all(string = "we23", pattern = "b|w|3")

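? makes the preceding character (or group) optional: it is matched at most once.

str_extract_all(string = c("abc", "bc", "ac"), pattern = "ab?c")  # "bc" has no leading "a", so it yields no match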
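"At most once" is easiest to check with a whole-string test. This base R sketch anchors the pattern with ^ and $ and adds "abbc", an extra input assumed here for illustration, to show that two b's are rejected.

```r
# "abbc" is an added example string, not part of the original notes.
candidates <- c("abc", "ac", "abbc", "bc")

# The whole string must be "a", then zero or one "b", then "c".
optional_b <- grepl("^ab?c$", candidates)

print(setNames(optional_b, candidates))
```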

( ) forms a group: the pattern inside the parentheses is treated as a single unit.

str_extract_all(string = c("abc", "ac", "cde"), pattern = "(ab)c")
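Grouping matters most when it is combined with other metacharacters such as |. The base R sketch below is one way to show this; the vector `spellings` is hypothetical sample data. Without parentheses the alternation splits the entire pattern in two, so the anchors apply to different halves.

```r
# Hypothetical sample data for illustration only.
spellings <- c("gray", "grey", "grab")

# Grouped: "gr", then "a" or "e", then "y", spanning the whole string.
grouped <- grepl("^gr(a|e)y$", spellings)

# Ungrouped: either "starts with gra" or "ends with ey".
ungrouped <- grepl("^gra|ey$", spellings)

print(grouped)
print(ungrouped)
```

Here "grab" passes the ungrouped pattern because it starts with "gra", even though it is not a spelling of the intended word.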

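* matches the preceding character (or group) zero or more times.

str_extract_all(string = c("abab", "abc", "ac"), pattern = "(ab)*")  # zero repetitions are allowed, so empty matches can appear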

+ matches the preceding character (or group) one or more times.

str_extract_all(string = c("abbab", "abc", "ac"), pattern = "ab+")  # "a" followed by at least one "b"
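The only difference between * and + is whether zero repetitions count. A minimal base R sketch, assuming whole-string matching with ^ and $ and a made-up input vector, makes that visible on the empty string.

```r
# Made-up inputs; the first element is the empty string.
inputs <- c("", "ab", "abab", "aba")

# Zero or more repetitions of the group "ab".
star_match <- grepl("^(ab)*$", inputs)

# One or more repetitions of the group "ab".
plus_match <- grepl("^(ab)+$", inputs)

print(star_match)
print(plus_match)
```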

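{n,m} repeats the preceding character (or group) at least n and at most m times. {n} means exactly n times, and {n,} means n or more times.

str_extract_all(string = c("abababab", "ababc", "abc"), pattern = "(ab){2}")    # exactly two repetitions per match
str_extract_all(string = c("abababab", "ababc", "abc"), pattern = "(ab){2,}")   # two or more repetitions
str_extract_all(string = c("abababab", "ababc", "abc"), pattern = "(ab){2,3}")  # two to three repetitions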
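Because quantifiers are greedy, each form takes as many repetitions as its upper bound allows. The sketch below extracts only the first match from "abababab" with base R; `first_match()` is a small helper defined here for illustration, not a function from any package.

```r
x <- "abababab"

# Hypothetical helper: return the first match of `pattern` in `text`.
first_match <- function(pattern, text) {
  regmatches(text, regexpr(pattern, text))
}

exactly_two  <- first_match("(ab){2}", x)    # stops after two repetitions
two_or_more  <- first_match("(ab){2,}", x)   # takes all four repetitions
two_to_three <- first_match("(ab){2,3}", x)  # takes at most three

print(c(exactly_two, two_or_more, two_to_three))
```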
