
Using gsub() to Replace Characters in R

2021-03-11 10:00:25

gsub() can delete, append, replace, and cut parts of a string. It works on a single string or on a character vector of strings.

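Basic usage: gsub("target", "replacement", x), where x is a string or a character vector. By default the target is interpreted as a regular expression.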

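Every operation in gsub() works by substituting the replacement for each match of the target. Setting the replacement to "" deletes the match, and setting it to the target followed by extra content appends to it. Replacing and cutting follow the same pattern.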
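As a minimal sketch of the deletion and appending cases described above (the variable names `deleted` and `appended` and the suffix `_extra` are illustrative, not from the original article):

```r
text <- "AbcdEfgh . Ijkl MNM"

# Deletion: replace the match with an empty string.
deleted <- gsub("Efgh", "", text)
print(deleted)   # "Abcd . Ijkl MNM"

# Appending: replace the match with itself plus the extra content.
appended <- gsub("Ijkl", "Ijkl_extra", text)
print(appended)  # "AbcdEfgh . Ijkl_extra MNM"
```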

> text <- "AbcdEfgh . Ijkl MNM"
> gsub("Efg", "AAA", text) # replace Efg with AAA; matching is case-sensitive
[1] "AbcdAAAh . Ijkl MNM"
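Because matching is case-sensitive by default, a lowercase target does not match here. The sketch below shows one way to relax that using the `ignore.case` argument of gsub(); the variable names are illustrative.

```r
text <- "AbcdEfgh . Ijkl MNM"

# "efg" does not match "Efg", so the string comes back unchanged.
no_match <- gsub("efg", "AAA", text)
print(no_match)   # "AbcdEfgh . Ijkl MNM"

# ignore.case = TRUE makes the match case-insensitive.
with_flag <- gsub("efg", "AAA", text, ignore.case = TRUE)
print(with_flag)  # "AbcdAAAh . Ijkl MNM"
```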

Any character can be matched, including spaces, tabs, and newlines.

> gsub(" I", "i", text)  # the space is part of the match
[1] "AbcdEfgh .ijkl MNM"
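The same idea applies to tabs and newlines. The following sketch uses a made-up string, `messy`, to show a literal tab being replaced and then every run of whitespace being collapsed with the POSIX class `[[:space:]]`.

```r
messy <- "a\tb\nc d"

# Replace only the tab character with a space.
tab_fixed <- gsub("\t", " ", messy)
print(tab_fixed)   # "a b\nc d"

# Replace every run of whitespace (tab, newline, space) with an underscore.
collapsed <- gsub("[[:space:]]+", "_", messy)
print(collapsed)   # "a_b_c_d"
```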

gsub() replaces every occurrence of the target, not just the first, so a single call performs a batch replacement.

> gsub("M", "N", text) 
[1] "AbcdEfgh . Ijkl NNN" 
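To illustrate the batch behavior on a vector, and to contrast it with sub(), which replaces only the first match in each element, here is a small sketch. The vector `words` is an invented example.

```r
words <- c("MNM", "Mango", "plum")

# gsub() replaces every match in every element of the vector.
all_replaced <- gsub("M", "N", words)
print(all_replaced)  # "NNN" "Nango" "plum"

# sub() replaces only the first match in each element.
first_only <- sub("M", "N", words)
print(first_only)    # "NNM" "Nango" "plum"
```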

Beyond literal text, gsub() also accepts regular expressions, which allow other kinds of batch operations:

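> gsub("^.* ", "a", text) # replace everything from the start through the last space with a
[1] "aMNM"
> gsub("^.* I(j).*$", "\\1", text) # keep only the captured j; \\1 refers to the first group
[1] "j"
> gsub(" .*$", "b", text) # replace everything from the first space to the end with b
[1] "AbcdEfghb"
> gsub("\\.", "+", text) # . is a metacharacter in the pattern (as is +); escape it with \\ to match it literally
[1] "AbcdEfgh + Ijkl MNM"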
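Two related points may be worth a sketch: what happens when the period is left unescaped, how `fixed = TRUE` avoids escaping altogether, and how more than one capture group can be reused in the replacement. The string "hello world" and the variable names are illustrative.

```r
text <- "AbcdEfgh . Ijkl MNM"

# Unescaped, "." matches every character, so the whole string becomes plus signs.
every_char <- gsub(".", "+", text)
print(every_char)

# fixed = TRUE treats the pattern as literal text, so no escaping is needed.
literal_dot <- gsub(".", "+", text, fixed = TRUE)
print(literal_dot)  # "AbcdEfgh + Ijkl MNM"

# Capture groups can be reordered in the replacement with \\1, \\2, ...
swapped <- gsub("(\\w+) (\\w+)", "\\2 \\1", "hello world")
print(swapped)      # "world hello"
```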

The table below lists common regular expression syntax. In an R string literal each backslash must be doubled, so \d is written "\\d". The [:name:] classes are only valid inside a bracket expression, e.g. "[[:alnum:]]".

Syntax        Description
\d            Digit: 0, 1, 2 ... 9
\D            Not a digit
\s            Whitespace
\S            Not whitespace
\w            Word character (letter, digit, or underscore)
\W            Not a word character
\t            Tab
\n            Newline
^             Beginning of the string
$             End of the string
\             Escapes a special character, e.g. \\ matches "\" and \+ matches "+"
|             Alternation, e.g. (e|d)n matches "en" and "dn"
.             Any single character; with perl = TRUE it does not match \n
[ab]          a or b
[^ab]         Any character except a and b
[0-9]         Any digit
[A-Z]         Any uppercase letter A to Z
[a-z]         Any lowercase letter a to z
[A-Za-z]      Any uppercase or lowercase letter (avoid [A-z], which in ASCII order also matches [ \ ] ^ _ `)
i+            i one or more times
i*            i zero or more times
i?            i zero or one time
i{n}          i exactly n times in sequence
i{n1,n2}      i between n1 and n2 times in sequence
i{n1,n2}?     Non-greedy match: as few repetitions as possible
i{n,}         i at least n times
[:alnum:]     Alphanumeric characters: [:alpha:] and [:digit:]
[:alpha:]     Alphabetic characters: [:lower:] and [:upper:]
[:blank:]     Blank characters: space and tab
[:cntrl:]     Control characters
[:digit:]     Digits: 0 1 2 3 4 5 6 7 8 9
[:graph:]     Graphical characters: [:alnum:] and [:punct:]
[:lower:]     Lowercase letters in the current locale
[:print:]     Printable characters: [:alnum:], [:punct:] and space
[:punct:]     Punctuation characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ ` { | } ~
[:space:]     Space characters: tab, newline, vertical tab, form feed, carriage return, space
[:upper:]     Uppercase letters in the current locale
[:xdigit:]    Hexadecimal digits: 0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f
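A few of the table entries can be tried out as follows. This is a sketch on an invented string `s`, covering the doubled backslash for \d, the bracketed POSIX class, a counted quantifier, and the difference between greedy and non-greedy matching.

```r
s <- "Order 66: <b>bold</b> and <i>italic</i>"

# \d is written "\\d" in an R string; [[:digit:]] is the POSIX equivalent.
no_digits_a <- gsub("\\d", "", s)
no_digits_b <- gsub("[[:digit:]]", "", s)
print(no_digits_a)  # "Order : <b>bold</b> and <i>italic</i>"

# o{2,} matches two or more consecutive o characters.
zeros <- gsub("o{2,}", "0", "foo fooo fo")
print(zeros)        # "f0 f0 fo"

# Greedy: .+ runs from the first "<" to the last ">".
greedy <- gsub("<.+>", "", s)
print(greedy)       # "Order 66: "

# Non-greedy: .+? stops at the nearest ">", so each tag is removed separately.
lazy <- gsub("<.+?>", "", s)
print(lazy)         # "Order 66: bold and italic"
```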
