Replace multiple letters with accents with gsub

programtip 2020. 11. 16. 22:06

Of course, we can replace specific characters one at a time, like this:

    mydata=c("á","é","ó")
    mydata=gsub("á","a",mydata)
    mydata=gsub("é","e",mydata)
    mydata=gsub("ó","o",mydata)
    mydata

But surely there is an easier way to do all of this in one line. I could not find anything very comprehensive on this in the gsub help.

Use the character translation function:

chartr("áéó", "aeo", mydata)
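
As a small sketch of how this behaves (the `words` vector is a made-up example, and a UTF-8 session is assumed): chartr works across a whole character vector, and the old and new strings are paired up position by position.

```r
# Hypothetical input; assumes the R session uses a UTF-8 locale.
words <- c("canción", "café", "él")

# Each character of `old` is mapped to the character at the same
# position in `new`; characters that are not listed stay untouched.
plain <- chartr("áéó", "aeo", words)
print(plain)
```

Because the mapping is strictly one character to one character, this approach cannot expand a single character into several (for example "ß" to "ss"); a gsub-based approach is needed for that.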

Interesting question! I think the simplest option is to devise a special function, something like a "multi" gsub():

mgsub <- function(pattern, replacement, x, ...) {
  if (length(pattern)!=length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result <- x
  for (i in seq_along(pattern)) {  # seq_along() is safe for zero-length input
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}

Which gives me:

> mydata <- c("á","é","ó")
> mgsub(c("á","é","ó"), c("a","e","o"), mydata)
[1] "a" "e" "o"
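
Because the extra arguments in `...` are passed straight through to gsub(), options such as fixed = TRUE can be used when the patterns contain regex metacharacters. A minimal sketch, repeating the function so that it runs on its own (the input string and replacements are made up for illustration):

```r
mgsub <- function(pattern, replacement, x, ...) {
  if (length(pattern) != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result <- x
  for (i in seq_along(pattern)) {
    # Each pass works on the output of the previous pass.
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}

# Without fixed = TRUE, "." would be a regex that matches any character.
out <- mgsub(c(".", "+"), c(" dot ", " plus "), "a.b+c", fixed = TRUE)
print(out)
```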

Maybe this can be useful:

iconv('áéóÁÉÓçã', to="ASCII//TRANSLIT")
[1] "aeoAEOca"

You can use the stringi package to replace these characters.

> library(stringi)
> stri_trans_general(c("á","é","ó"), "latin-ascii")
[1] "a" "e" "o"

Another mgsub implementation, using Reduce:

mystring = 'This is good'
myrepl = list(c('o', 'a'), c('i', 'n'))

mgsub2 <- function(myrepl, mystring){
  gsub2 <- function(l, x){
    do.call('gsub', list(x = x, pattern = l[1], replacement = l[2]))
  }
  Reduce(gsub2, myrepl, init = mystring, right = TRUE)
}
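
The answer does not show the call itself, so here is a usage sketch. It repeats the definition so that it runs on its own, with do.call swapped for a direct gsub() call, which should behave the same way here.

```r
mgsub2 <- function(myrepl, mystring) {
  # l is one c(pattern, replacement) pair; x is the string built so far.
  gsub2 <- function(l, x) {
    gsub(pattern = l[1], replacement = l[2], x = x)
  }
  # With right = TRUE, Reduce calls gsub2(pair, accumulated_string),
  # starting from the last pair in the list.
  Reduce(gsub2, myrepl, init = mystring, right = TRUE)
}

myrepl <- list(c("o", "a"), c("i", "n"))
res <- mgsub2(myrepl, "This is good")
print(res)
```

Note that the pairs are applied from the last to the first, which only matters when one replacement can produce text that another pattern matches.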

This is very similar to @kith's answer, but in function form and covering the most common diacritics:

removeDiscritics <- function(string) {
  chartr(
     "ŠŽšžŸÀÁÂÃÄÅÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖÙÚÛÜÝàáâãäåçèéêëìíîïðñòóôõöùúûüýÿ"
    ,"SZszYAAAAAACEEEEIIIIDNOOOOOUUUUYaaaaaaceeeeiiiidnooooouuuuyy"
    , string
  )
}


removeDiscritics("test áéíóú")

"test aeiou"

A problem with some of the implementations above (e.g., Theodore Lytras's) is that if the patterns are multiple characters, they may conflict in the case that one pattern is a substring of another. A way to solve this is to create a copy of the object and perform the pattern replacement in that copy. This is implemented in my package bayesbio, available on CRAN.

mgsub <- function(pattern, replacement, x, ...) {
  n = length(pattern)
  if (n != length(replacement)) {
    stop("pattern and replacement do not have the same length.")
  }
  result = x
  for (i in seq_len(n)) {
    result[grep(pattern[i], x, ...)] = replacement[i]
  }
  return(result)
}

Here is a test case:

  asdf = c(4, 0, 1, 1, 3, 0, 2, 0, 1, 1)

  res = mgsub(c("0", "1", "2"), c("10", "11", "12"), asdf)
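
To make the conflict concrete, here is a sketch comparing the two strategies on the same patterns. The names `mgsub_seq` and `mgsub_whole` are made up for this comparison; they are not functions from bayesbio.

```r
# Sequential version: each gsub() sees the output of the previous one.
mgsub_seq <- function(pattern, replacement, x, ...) {
  result <- x
  for (i in seq_along(pattern)) {
    result <- gsub(pattern[i], replacement[i], result, ...)
  }
  result
}

# Whole-element version, as in the function above: matches are looked up
# in the untouched original `x`, and the matching element is overwritten.
mgsub_whole <- function(pattern, replacement, x, ...) {
  result <- x
  for (i in seq_along(pattern)) {
    result[grep(pattern[i], x, ...)] <- replacement[i]
  }
  result
}

x <- c("0", "1", "2")
seq_res   <- mgsub_seq(c("0", "1", "2"), c("10", "11", "12"), x)
whole_res <- mgsub_whole(c("0", "1", "2"), c("10", "11", "12"), x)
print(seq_res)    # "0" became "10", and then its "1" was replaced again
print(whole_res)
```

The whole-element version overwrites the entire matching element rather than a substring, so it suits vectors where each element is itself the value to be replaced, not patterns embedded in longer text.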

Not so elegant, but it works and does what you want:

> diag(sapply(1:length(mydata), function(i, x, y) {
+   gsub(x[i],y[i], x=x)
+ }, x=mydata, y=c('a', 'b', 'c')))
[1] "a" "b" "c"

You can use the match function. Here match(x, y) returns, for each element of x, the position of its first match in y. You can then use the returned indices to subset another vector (say z) that holds the replacement values, ordered to line up with y. In your case:

mydata <- c("á","é","ó")
desired <- c('a', 'e', 'o')

desired[match(mydata, mydata)]

As a simpler example, consider the situation below, where I wanted to replace 'a' with 'alpha', 'b' with 'beta', and so forth.

x <- c('a', 'a', 'b', 'c', 'b', 'c', 'e', 'e', 'd')

y <- c('a', 'b', 'c', 'd', 'e')
z <- c('alpha', 'beta', 'gamma', 'delta', 'epsilon')

z[match(x, y)]
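
One thing to watch for: match() returns NA for values that have no entry in the lookup vector, so those elements become NA in the result. A minimal sketch of one way to keep such values unchanged instead (the input with the unmatched 'f' is made up for illustration):

```r
x <- c("a", "b", "f", "c")   # "f" has no entry in the lookup vector y
y <- c("a", "b", "c", "d", "e")
z <- c("alpha", "beta", "gamma", "delta", "epsilon")

idx <- match(x, y)           # NA where x has no match in y
# Fall back to the original value wherever there is no replacement.
out <- ifelse(is.na(idx), x, z[idx])
print(out)
```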

Related to Justin's answer:

> m <- c("á"="a", "é"="e", "ó"="o")
> m[mydata]
  á   é   ó 
"a" "e" "o"

And you can get rid of the names with names(*) <- NULL if you want.

It doesn't make much sense in this case, but if there are only two replacements, you can also nest the gsub calls:

mydata <- gsub("á","a", gsub("é","e",mydata))

Reference URL: https://stackoverflow.com/questions/15253954/replace-multiple-letters-with-accents-with-gsub
