dream6120 2012-02-04 17:29

How can I detect the character set of a text?

I have a text whose diacritic characters are displayed incorrectly, like this: ¤ or ˇ or ˘. I don't know which charset the text was originally in. Is there an easy way to figure it out? An online charset detector or a charset conversion previewer would be nice. I'm thinking of an application that shows how specific diacritic characters look when they are malformed in each available encoding, so I could track down the one that matches the characters I have in the text.

Any ideas?


2 answers

  • doupixian1436 2012-02-04 17:39

    In Windows PowerShell:

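    # Read the raw bytes so that no decoding is applied up front
    $bytes = [IO.File]::ReadAllBytes('some file.txt')
    # Decode the same bytes with every encoding known to .NET and
    # attach the result to each EncodingInfo object as a Text property
    [Text.Encoding]::GetEncodings() |
      ForEach-Object {
        $_ | Add-Member -MemberType NoteProperty -Name Text -Value ($_.GetEncoding().GetString($bytes)) -PassThru
      } | Format-List Name,CodePage,Text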
    

    Adjust the path to the file and browse the results until you see something that looks correct ;-)

    This simply iterates through all encodings known to .NET and decodes the file's bytes into a string with each of them.
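
    For readers without PowerShell, the same brute-force idea can be sketched in Python. This is only an illustration, not part of the original answer: the function name `preview_decodings` and the shortlist in `CANDIDATES` are assumptions made for the example, and the shortlist should be extended with whatever encodings are plausible for the language of the text.

```python
# Hypothetical shortlist of candidate encodings (an assumption for this
# sketch); all names are codecs that ship with the Python standard library.
CANDIDATES = ["utf-8", "cp1250", "cp1252", "iso8859_2", "latin_1", "cp852"]


def preview_decodings(data: bytes, candidates=CANDIDATES) -> dict:
    """Decode the same bytes with each candidate encoding.

    Returns a mapping of encoding name to decoded text. Encodings that
    cannot decode the bytes at all are left out of the result.
    """
    previews = {}
    for name in candidates:
        try:
            previews[name] = data.decode(name)
        except (UnicodeError, LookupError):
            # UnicodeError: the bytes are not valid in this encoding.
            # LookupError: Python does not know a codec with this name.
            continue
    return previews
```

    To use it on a file, pass the raw bytes, for example `preview_decodings(pathlib.Path('some file.txt').read_bytes())`, and print the results side by side. As with the PowerShell version, a successful decode does not mean the encoding is right: single-byte encodings accept almost any input, so the output still has to be judged by eye.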

    This answer was selected by the asker as the best answer.
