Extracting the last n characters from a string

How can I get the last n characters from a string in R? Is there a function like SQL's RIGHT?
I'm not aware of anything in base R, but it's straightforward to write a function for this using substr and nchar:
x <- "some text in a string"
substrRight <- function(x, n){
  substr(x, nchar(x)-n+1, nchar(x))
}
substrRight(x, 6)
[1] "string"
substrRight(x, 8)
[1] "a string"
This is vectorised, as @mdsumner points out. Consider:
x <- c("some text in a string", "I really need to learn how to count")
substrRight(x, 6)
[1] "string" " count"
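To see how this behaves on edge cases, here is a minimal sketch. The name last_n is hypothetical and only stands in for the function above. It relies on substr treating a start position below 1 as 1, so asking for more characters than a string has simply returns the whole string.

```r
# last_n is a hypothetical name used only for this illustration
last_n <- function(x, n) {
  len <- nchar(x)
  # substr clamps a start below 1 to 1, so short strings come back whole
  substr(x, len - n + 1, len)
}

res <- last_n(c("string", "abc", ""), 4)
res
# "ring" "abc" ""
```

So there is no need to guard against n being larger than the string length, at least for this use.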
If you don't mind using the stringr package, str_sub is handy because you can use negative positions to count backward from the end:
x <- "some text in a string"
str_sub(x,-6,-1)
[1] "string"
Or, as Max points out in a comment to this answer,
str_sub(x, start= -6)
[1] "string"
localhost: I believe stringr has been rebuilt using stringi as a backend, so it should work with NAs etc. now. (more than 4 years ago)
℡Wang Yan: stringr doesn't work well with NA values and all encodings. I strongly recommend the stringi package :) (more than 6 years ago)
必承其重  欲带皇冠: also, str_sub(x, start=-n) gets the last n characters. (almost 9 years ago)
Use the stri_sub function from the stringi package. To get a substring from the end, use negative numbers. See the examples below:
stri_sub("abcde",1,3)
[1] "abc"
stri_sub("abcde",1,1)
[1] "a"
stri_sub("abcde",-3,-1)
[1] "cde"
You can install this package from github: https://github.com/Rexamine/stringi
It is available on CRAN now, simply type
install.packages("stringi")
to install this package.
str = 'This is an example'
n = 7
result = substr(str,(nchar(str)+1)-n,nchar(str))
print(result)
[1] "example"
UPDATE: as noted by mdsumner, the original code is already vectorised because substr is. Should have been more careful.
And if you want a vectorised version (based on Andrie's code)
substrRight <- function(x, n){
  sapply(x, function(xx)
    substr(xx, (nchar(xx)-n+1), nchar(xx))
)
}
> substrRight(c("12345","ABCDE"),2)
12345 ABCDE
"45" "DE"
Note that I have changed (nchar(x)-n) to (nchar(x)-n+1) to get n characters.
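For what it's worth, the sapply wrapper and the plain substr call return the same values; the only difference I can see is that sapply attaches the inputs as names. A small sketch, using the same two example strings:

```r
x <- c("12345", "ABCDE")
n <- 2

direct  <- substr(x, nchar(x) - n + 1, nchar(x))
wrapped <- sapply(x, function(xx) substr(xx, nchar(xx) - n + 1, nchar(xx)))

names(direct)                        # NULL
names(wrapped)                       # "12345" "ABCDE"
identical(direct, unname(wrapped))   # TRUE
```

If the names are unwanted, unname() or sapply(..., USE.NAMES = FALSE) drops them.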
Another reasonably straightforward way is to use regular expressions and sub:
sub('.*(?=.$)', '', string, perl=T)
So, "get rid of everything followed by one character". To grab more characters off the end, add however many dots in the lookahead assertion:
sub('.*(?=.{2}$)', '', string, perl=T)
where .{2} means .., or "any two characters", so the pattern means "get rid of everything followed by two characters".
sub('.*(?=.{3}$)', '', string, perl=T)
for three characters, etc. You can set the number of characters to grab with a variable, but you'll have to paste the variable value into the regular expression string:
n = 3
sub(paste('.+(?=.{', n, '})', sep=''), '', string, perl=T)
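The regex approach can be wrapped in a small function. This is only a sketch: last_n_regex is a hypothetical helper name, and it builds the pattern with sprintf rather than paste, anchored with $ as in the earlier examples.

```r
# last_n_regex is a hypothetical helper name, not part of base R
last_n_regex <- function(x, n) {
  # Builds e.g. ".*(?=.{3}$)": everything that is followed by exactly n
  # characters and then the end of the string
  pattern <- sprintf(".*(?=.{%d}$)", n)
  sub(pattern, "", x, perl = TRUE)
}

out <- last_n_regex(c("some text in a string", "ab"), 6)
out
# "string" "ab"
```

Strings shorter than n do not match the pattern at all, so sub returns them unchanged.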
An alternative to substr is to split the string into a list of single characters and process that:
N <- 2
sapply(strsplit(x, ""), function(x, n) paste(tail(x, n), collapse = ""), N)
A simple base R solution using the substring() function (who knew this function even existed?):
RIGHT = function(x,n){
  substring(x,nchar(x)-n+1)
}
This works because substring() is basically substr() underneath, but its end argument defaults to 1,000,000, so it can be left out.
Examples:
> RIGHT('Hello World!',2)
[1] "d!"
> RIGHT('Hello World!',8)
[1] "o World!"
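A quick way to convince yourself that the default end makes no difference here is to compare the two calls directly. The variable names below are just for illustration:

```r
s <- "Hello World!"
n <- 8

a <- substring(s, nchar(s) - n + 1)          # end left at its default
b <- substr(s, nchar(s) - n + 1, nchar(s))   # end given explicitly

identical(a, b)
# TRUE
```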
I use substr too, but in a different way. I want to extract the last 6 characters of "Give me your food." Here are the steps:
(1) Split the characters
splits <- strsplit("Give me your food.", split = "")
(2) Extract the last 6 characters
tail(splits[[1]], n=6)
Output:
[1] " " "f" "o" "o" "d" "."
Each of these characters can be accessed by indexing the result of tail(), e.g. tail(splits[[1]], n=6)[x], where x is 1 to 6.
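If you want a single string back rather than separate characters, the pieces can be pasted together again. A minimal sketch, where last6 is a name introduced only for this example:

```r
splits <- strsplit("Give me your food.", split = "")
last6  <- tail(splits[[1]], n = 6)   # last6 is an illustrative name

last6[2]                             # "f", the second of the six characters
joined <- paste(last6, collapse = "")
joined
# " food."
```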
Someone above uses a similar solution to mine, but I find it easier to think of it as below:
> text <- "some text in a string" # we want only the last word, "string", which has 6 letters
> n <- 5 # the character at position nchar(text) is itself included, so we use 6 - 1
> substr(x=text, start=nchar(text)-n, stop=nchar(text))
This returns the last characters as desired.
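The off-by-one can be checked by counting positions. In this sketch k is a name I am introducing for the number of characters actually wanted:

```r
text  <- "some text in a string"
k     <- 6                          # number of characters wanted
start <- nchar(text) - (k - 1)      # 21 - 5 = 16

last_k <- substr(text, start, nchar(text))
last_k
# "string"

nchar(text) - start + 1             # 6: positions 16 through 21 inclusive
```

Because both the start and stop positions are included, a span from start to stop covers stop - start + 1 characters, which is why 5 is subtracted to get 6.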
