dsavz66262
2015-01-05 13:58

What does "too short multibyte code string in regex" mean?

  • sublimetext3
  • regex
Accepted

I am creating a sublime text highlighting file. However, I am stuck with an error I don't fully understand. I have the following regex:

\x([0-9]|[A-F]|[a-f])([0-9]|[A-F]|[a-f])

When I try to load the file in sublime text, I get the error:

Error in regex: too short multibyte code string in regex \x([0-9]|[A-F]|[a-f])([0-9]|[A-F]|[a-f])

I have tried Googling to understand what this error means; the only relevant things I have come across are the following links:

0. github issue of the rubinius project

1. stackoverflow thread

2. reddit thread

Unfortunately, from those links I could only determine that the error is likely caused by a character encoding problem [from 1 & 2]. I now suspect that "\x" might be the problem, as everything else in that regex is fine. How does one escape that character and all others like it? In particular, can a golang script be used to sanitize regexes to get rid of such problems?

1 answer

  • douyalin2258 6 years ago

    It means that you forgot to escape the \ in \x.
    Therefore, the engine tries to parse a character escape of the form \x1234 and does not find enough hex digits. To match a literal backslash followed by x, write \\x in the regex.