String to UCS-2

I want to translate my Python program into Go. It converts a Unicode string to a UCS-2 hex string.

In Python, it's quite simple:

u"Bien joué".encode('utf-16-be').encode('hex')
-> 004200690065006e0020006a006f007500e9

I am a beginner in Go and the simplest way I found is:

package main

import (
    "fmt"
    "strings"
)

func main() {
    str := "Bien joué"
    fmt.Printf("str: %s\n", str)

    ucs2HexArray := []rune(str)
    s := fmt.Sprintf("%U", ucs2HexArray)
    a := strings.Replace(s, "U+", "", -1)
    b := strings.Replace(a, "[", "", -1)
    c := strings.Replace(b, "]", "", -1)
    d := strings.Replace(c, " ", "", -1)
    fmt.Printf("->: %s", d)
}

str: Bien joué
->: 004200690065006E0020006A006F007500E9
Program exited.

I don't think this is efficient at all. How can I improve it?

Thank you

3 answers

douzhulan1815 2015-05-31 13:24
Make this conversion a function; then you can easily improve the conversion algorithm in the future. For example,

package main

import (
    "fmt"
    "strings"
    "unicode/utf16"
)

func hexUTF16FromString(s string) string {
    hex := fmt.Sprintf("%04x", utf16.Encode([]rune(s)))
    return strings.Replace(hex[1:len(hex)-1], " ", "", -1)
}

func main() {
    str := "Bien joué"
    fmt.Println(str)
    hex := hexUTF16FromString(str)
    fmt.Println(hex)
}

Output:

Bien joué
004200690065006e0020006a006f007500e9
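As one possible later improvement to the function, here is a minimal sketch that skips the formatted string and the bracket and space stripping. It writes each UTF-16 code unit into a byte slice in big-endian order and hex-encodes the bytes. The name hexUTF16BE is made up for this illustration; the packages used are all from the Go standard library.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"unicode/utf16"
)

// hexUTF16BE (hypothetical name) returns the big-endian UTF-16
// encoding of s as a lowercase hex string.
func hexUTF16BE(s string) string {
	units := utf16.Encode([]rune(s)) // UTF-16 code units
	buf := make([]byte, 2*len(units))
	for i, u := range units {
		// Each code unit takes two bytes, high byte first.
		binary.BigEndian.PutUint16(buf[2*i:], u)
	}
	return hex.EncodeToString(buf)
}

func main() {
	fmt.Println(hexUTF16BE("Bien joué")) // 004200690065006e0020006a006f007500e9
}
```

Because the byte order is written explicitly, this matches Python's 'utf-16-be' rather than depending on how a slice of integers happens to be formatted.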

NOTE:

You say "convert an unicode string to a UCS-2 string" but your Python example uses UTF-16:

u"Bien joué".encode('utf-16-be').encode('hex')

The Unicode Consortium

UTF-16 FAQ

Q: What is the difference between UCS-2 and UTF-16?

A: UCS-2 is obsolete terminology which refers to a Unicode implementation up to Unicode 1.1, before surrogate code points and UTF-16 were added to Version 2.0 of the standard. This term should now be avoided.

UCS-2 does not describe a data format distinct from UTF-16, because both use exactly the same 16-bit code unit representations. However, UCS-2 does not interpret surrogate code points, and thus cannot be used to conformantly represent supplementary characters.

Sometimes in the past an implementation has been labeled "UCS-2" to indicate that it does not support supplementary characters and doesn't interpret pairs of surrogate code points as characters. Such an implementation would not handle processing of character properties, code point boundaries, collation, etc. for supplementary characters.
This answer was selected by the asker as the best answer.
