Extracting the last n characters from a string

How can I get the last n characters from a string in R? Is there a function like SQL's RIGHT?

Reposted from: https://stackoverflow.com/questions/7963898/extracting-the-last-n-characters-from-a-string-in-r

14 answers

I'm not aware of anything in base R, but it's straightforward to write a function that does this using substr and nchar:

x <- "some text in a string"

substrRight <- function(x, n){
  substr(x, nchar(x)-n+1, nchar(x))
}

substrRight(x, 6)
[1] "string"

substrRight(x, 8)
[1] "a string"

This is vectorised, as @mdsumner points out. Consider:

x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
csdnceshi68
local-host Would it be more efficient to avoid calling nchar(x) twice by assigning it to a local variable?
about 3 years ago
csdnceshi77
狐狸.fox Use the stringi package. It works fine with NAs and all encodings :)
more than 6 years ago
csdnceshi71
Memor.の And watch out for NAs...
almost 9 years ago
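
The comments above raise two points: calling nchar(x) only once, and handling NA. A minimal sketch of both follows. The name substr_right and the test strings are assumptions made for illustration, not part of the original answer, and the behaviour noted in the comments is that of base R's nchar and substr in recent R versions.

```r
# Hypothetical variant of substrRight that stores nchar(x) in a local variable
substr_right <- function(x, n) {
  len <- nchar(x)              # computed once; NA_integer_ for NA strings
  substr(x, len - n + 1, len)  # substr treats a start below 1 as 1
}

res <- substr_right(c("some text in a string", NA, "ab"), 6)
print(res)
# [1] "string" NA       "ab"
```

An NA input stays NA, and a string shorter than n is returned whole rather than raising an error. Whether caching nchar(x) is measurably faster would need to be timed on real data.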

If you don't mind using the stringr package, str_sub is handy because you can use negative indices to count backward from the end:

library(stringr)
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"

Or, as Max points out in a comment to this answer,

str_sub(x, start= -6)
[1] "string"
csdnceshi75
衫裤跑路 I believe stringr has been reimplemented using stringi as a backend, so it should work with NAs etc. now.
more than 4 years ago
weixin_41568131
10.24 stringr doesn't work well with NA values or with all encodings. I strongly recommend the stringi package :)
more than 6 years ago
csdnceshi59
ℙℕℤℝ Also, str_sub(x, start = -n) gets the last n characters.
almost 9 years ago

UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.

And if you want a vectorised version (based on Andrie's code)

substrRight <- function(x, n){
  sapply(x, function(xx)
         substr(xx, (nchar(xx)-n+1), nchar(xx))
         )
}

> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
 "45"  "DE"

Note that I have changed (nchar(x)-n) to (nchar(x)-n+1) to get n characters.

csdnceshi56
lrony* sapply != vectorized
more than 6 years ago
csdnceshi60
℡Wang Yan Andrie's is already vectorized.
almost 9 years ago
csdnceshi72
谁还没个明天 I think you mean "(nchar(x)-n) to (nchar(x)-n+1)"
almost 9 years ago
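
To illustrate the point made in the comments, here is a small check (a sketch, with variable names chosen for this example) that calling substr directly on a character vector gives the same values as the sapply wrapper. The only difference is that sapply attaches the input strings as names.

```r
x <- c("12345", "ABCDE")
n <- 2

# substr and nchar both operate element-wise on character vectors
direct <- substr(x, nchar(x) - n + 1, nchar(x))

# The sapply version loops in R and names the result after the inputs
looped <- sapply(x, function(xx) substr(xx, nchar(xx) - n + 1, nchar(xx)))

print(direct)
# [1] "45" "DE"
print(names(looped))
# [1] "12345" "ABCDE"
print(identical(direct, unname(looped)))
# [1] TRUE
```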

Someone above used a similar solution to mine, but I find it easier to think about it as follows:

> text <- "some text in a string" # we want only the last word, "string", which has 6 letters
> n <- 5 # the character at position nchar(text) is included, so we subtract 1 from the 6 we want
> substr(x = text, start = nchar(text) - n, stop = nchar(text))
[1] "string"

This returns the last n + 1 characters, here the 6 characters of "string".

I used the following code to get the last character of a string:

    substr(stringOfInterest, nchar(stringOfInterest), nchar(stringOfInterest))

You can lower the start position, nchar(stringOfInterest), to get the last few characters instead.

Use the stri_sub function from the stringi package. To take a substring from the end, use negative indices. See the examples below:

library(stringi)
stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"

The package source is on GitHub: https://github.com/Rexamine/stringi

It is now also available on CRAN. To install it, simply type

install.packages("stringi")

An alternative to substr is to split the string into a list of single characters and process that:

N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)
csdnceshi54
hurriedly% I sense a system.time() battle brewing :-)
almost 9 years ago
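
In the spirit of that comment, here is a rough sketch of how the two approaches could be compared. The helper names via_substr and via_split and the random test data are assumptions made for this example. No timings are shown because they depend on the machine and the R version.

```r
set.seed(1)
# Hypothetical test data: 10,000 random 20-letter strings
x <- replicate(10000, paste(sample(letters, 20, replace = TRUE), collapse = ""))
N <- 2

# substr-based approach from the earlier answers
via_substr <- function(x, n) substr(x, nchar(x) - n + 1, nchar(x))

# strsplit-based approach from this answer
via_split <- function(x, n) {
  sapply(strsplit(x, ""), function(ch) paste(tail(ch, n), collapse = ""))
}

# Both approaches should agree before their speed is compared
same <- identical(via_substr(x, N), via_split(x, N))
print(same)

print(system.time(via_substr(x, N)))
print(system.time(via_split(x, N)))
```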

A simple base R solution using the substring() function (who knew this function even existed?):

RIGHT = function(x,n){
  substring(x,nchar(x)-n+1)
}

This works because substring() is essentially substr() underneath, but its end position defaults to 1,000,000, so only the start position has to be supplied.

Examples:

> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"
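
One consequence worth noting, shown here as a small sketch rather than as part of the original answer: substring() recycles its arguments to the longest one, so the same RIGHT function also accepts a vector of lengths for a single string.

```r
RIGHT <- function(x, n) {
  # last is left at substring()'s default, so only the start is computed
  substring(x, nchar(x) - n + 1)
}

tails <- RIGHT("Hello World!", 1:3)
print(tails)
# [1] "!"   "d!"  "ld!"
```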

Try this:

x <- "some text in a string"
n <- 5
substr(x, nchar(x)-n, nchar(x))

It should give:

[1] "string"
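
Note that with n <- 5 this returns 6 characters, because both the start and the stop positions are inclusive. A quick check of the two start offsets used across the answers (a sketch using the same x):

```r
x <- "some text in a string"
n <- 5

with_offset    <- substr(x, nchar(x) - n, nchar(x))      # start at nchar(x) - n
without_offset <- substr(x, nchar(x) - n + 1, nchar(x))  # start at nchar(x) - n + 1

print(with_offset)
# [1] "string"
print(nchar(with_offset))
# [1] 6
print(without_offset)
# [1] "tring"
print(nchar(without_offset))
# [1] 5
```

So to get exactly n characters, the start has to be nchar(x) - n + 1.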

I take a different approach, using strsplit and tail instead of substr. I want to extract the last 6 characters of "Give me your food." Here are the steps:

(1) Split the characters

splits <- strsplit("Give me your food.", split = "")

(2) Extract the last 6 characters

tail(splits[[1]], n=6)

Output:

[1] " " "f" "o" "o" "d" "."

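To access each of these characters by index, store the result first, e.g. last6 <- tail(splits[[1]], n = 6), and then use last6[x], where x is 1 to 6. (splits[[1]][x] would index from the start of the full string instead.)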
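
If a single string is wanted instead of a vector of characters, the pieces can be joined again with paste. A minimal sketch, where the variable name last6 is an assumption made for this example:

```r
splits <- strsplit("Give me your food.", split = "")
last6 <- tail(splits[[1]], n = 6)   # character vector of length 6

joined <- paste(last6, collapse = "")
print(joined)
# [1] " food."
print(last6[2])
# [1] "f"
```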
