What is the correct way to check whether a rune is in the Basic Multilingual Plane?

I want to check whether a given rune is in the Basic Multilingual Plane.

That is, what should go in this function? https://play.golang.org/p/3szTn8pP7xe

package main

import (
    "fmt"
)

func isBMP(r rune) bool {
    // ???
    return false
}

func main() {
    fmt.Println(isBMP(rune('պ'))) // expect true
    fmt.Println(isBMP(rune('

2 answers

The Basic Multilingual Plane has the following code point ranges allocated:

0000–0FFF    8000–8FFF
1000–1FFF    9000–9FFF
2000–2FFF    A000–AFFF
3000–3FFF    B000–BFFF
4000–4FFF    C000–CFFF
5000–5FFF    D000–DFFF
6000–6FFF    E000–EFFF
7000–7FFF    F000–FFFF

So to tell if a rune falls in the basic multilingual plane, just check if it falls inside any of these ranges. Since these ranges cover all values between 0 and 0xffff (both inclusive), just check it like this:

func isBMP(r rune) bool {
    return r >= 0 && r <= 0xffff
}

Note that since rune is an alias for int32, it can hold negative values, so it is important to also check that the value is not negative.

This will output your expected result. Try it on the Go Playground.
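
As a minimal self-contained sketch of the check above: the question's second test rune was lost, so U+1F600 is used here purely as an assumed stand-in for a rune outside the BMP, and the negative input is added only to illustrate the sign check.

```go
package main

import "fmt"

// isBMP reports whether r lies in the Basic Multilingual Plane (U+0000..U+FFFF).
func isBMP(r rune) bool {
	return r >= 0 && r <= 0xffff
}

func main() {
	fmt.Println(isBMP('պ'))          // U+057A, inside the BMP: true
	fmt.Println(isBMP('\U0001F600')) // assumed stand-in for a supplementary-plane rune: false
	fmt.Println(isBMP(-1))           // negative values are not code points: false
}
```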

Note #2: when you iterate over the runes of a string that contains invalid UTF-8 bytes, each invalid byte is reported as the Unicode replacement character, 0xfffd. If you want to exclude those from your test, you could modify it like this:

func isBMP(r rune) bool {
    return r >= 0 && r <= 0xffff && r != 0xfffd
}
dongpo3957: Note: for non-trivial ranges there is also unicode.RangeTable, which can be used with things like unicode.In. The golang.org/x/text/unicode/rangetable package provides some utilities for creating and inspecting unicode.RangeTables.
12 months ago
douzhou7037: Done.
12 months ago
dongzhuo0895: I edited the question; you can edit your answer.
12 months ago
duansao6776: I copy-pasted the wrong letter, sorry. I will edit the question.
12 months ago
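
The unicode.RangeTable approach mentioned in the first comment could look roughly like the sketch below. The table bmpTable is hand-built for this illustration and is not something the standard library provides. For a single contiguous range the plain comparison in the answer is simpler; a table mainly pays off when the set of ranges is irregular.

```go
package main

import (
	"fmt"
	"unicode"
)

// bmpTable is a hand-built table (an assumption for illustration)
// covering the single 16-bit range U+0000..U+FFFF.
var bmpTable = &unicode.RangeTable{
	R16: []unicode.Range16{{Lo: 0x0000, Hi: 0xffff, Stride: 1}},
}

// isBMP guards against negative values explicitly and then consults the table.
func isBMP(r rune) bool {
	return r >= 0 && unicode.In(r, bmpTable)
}

func main() {
	fmt.Println(isBMP('պ'))     // true
	fmt.Println(isBMP(0x10000)) // false: first code point beyond the BMP
}
```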

I'm not very familiar with Go. However, a bit of Googling suggests that a rune is in fact an int32, and since anything in the Basic Multilingual Plane has a code point between 0 and 65535, you should be able to do this:

func isBMP(r rune) bool {
    // rune is a signed int32, so negative values must be rejected too.
    return r >= 0 && r <= 65535
}