drsdvwsvo78320812 2013-07-22 16:10

MySQL line break formatting

I have an application that has always worked with no issues. Fast forward to today: all formatting is broken. Basically, I am inserting plain-text emails into a MySQL db, something that has worked for more than 5 years, and nothing has changed since. In my PHP code the plain text looked like this:

hello [name],

How are you?

This is a test.

Thank you.

Ceo

Today I looked at the same PHP code containing the email, which has just been sitting there as a file. Then I looked at the existing plain text of the email, which has always been in the database, and they both look like this:

hello [name],

Ã¯Â¿Â½How are you?

Ã¯Â¿Â½This is a test.

Ã¯Â¿Â½Thank you.

Ã¯Â¿Â½
Ceo

Now before I pull all my hair out, do you all know what happened in mysql db, on the browser, the server? (Oh and due to this, I am unable to get emails too.)

The glories of Monday.


  • doulongsha5478 2013-07-22 16:50

    "Ã¯Â¿Â½" consists of the following Latin-1 (ISO-8859-1) characters:

       Oct  Dec  Hex  Char  Description
       303  195  C3    Ã    LATIN CAPITAL LETTER A WITH TILDE
       257  175  AF    ¯    MACRON
       302  194  C2    Â    LATIN CAPITAL LETTER A WITH CIRCUMFLEX
       277  191  BF    ¿    INVERTED QUESTION MARK
       275  189  BD    ½    VULGAR FRACTION ONE HALF
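
The table above can be reproduced with a few lines of Python. This is only a sketch: it assumes the six characters seen on screen were "Ã¯Â¿Â½", and the names `garbled` and `describe` are made up for illustration.

```python
import unicodedata

# The six characters as they appeared on screen (assumed to be "Ã¯Â¿Â½"),
# written as escapes so the source file's own encoding does not matter.
garbled = "\u00c3\u00af\u00c2\u00bf\u00c2\u00bd"


def describe(text):
    """Return (octal, decimal, hex, char, name) rows, one per character."""
    rows = []
    for ch in text:
        code = ord(ch)
        rows.append((f"{code:o}", code, f"{code:02X}", ch, unicodedata.name(ch)))
    return rows


rows = describe(garbled)
for row in rows:
    print(*row)

# Latin-1 maps code points 0-255 straight to byte values, so encoding the
# text as Latin-1 gives back the raw bytes behind those characters.
raw_bytes = garbled.encode("latin-1")
print(raw_bytes.hex(" ").upper())
```

The last line prints the bytes as space-separated hex pairs (the separator argument to `bytes.hex` needs Python 3.8 or later).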
    

    The byte sequence, then, is C3 AF C2 BF C2 BD. This "smells" like UTF-8. Decoding (per https://en.wikipedia.org/wiki/UTF-8), we turn these into bit patterns:

    • 11000011
    • 10101111
    • 11000010
    • 10111111
    • 11000010
    • 10111101

    That first one (110xxxxx) indicates it's the first byte in a two-byte character, and stripping the marker bits from 11000011 10101111 yields ...00011 ..101111 or 00000000 00000000 00000000 11101111 == U+000000EF.

    Similarly, the next two make ...00010 ..111111 or U+000000BF.

    Then ...00010 ..111101 or U+000000BD.

    U+00EF U+00BF U+00BD (per https://en.wikibooks.org/wiki/Unicode/Character_reference/0000-0FFF) are "ï¿½", which is clearly not right.
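
The bit stripping done by hand above can be sketched in Python as follows. The helper name `decode_two_byte` is hypothetical, and it handles only two-byte sequences (it does not reject overlong encodings), so treat it as a way to check the arithmetic rather than as a real decoder.

```python
def decode_two_byte(lead, cont):
    """Decode one two-byte UTF-8 sequence by stripping the marker bits."""
    # The lead byte must look like 110xxxxx, the continuation like 10xxxxxx.
    if lead & 0b11100000 != 0b11000000:
        raise ValueError(f"{lead:#04x} is not a two-byte lead byte")
    if cont & 0b11000000 != 0b10000000:
        raise ValueError(f"{cont:#04x} is not a continuation byte")
    # Keep the low 5 bits of the lead and the low 6 bits of the continuation.
    return ((lead & 0b00011111) << 6) | (cont & 0b00111111)


data = bytes.fromhex("C3 AF C2 BF C2 BD")
code_points = [
    decode_two_byte(data[i], data[i + 1]) for i in range(0, len(data), 2)
]
print([f"U+{cp:04X}" for cp in code_points])

# Cross-check the manual result against Python's built-in UTF-8 decoder.
assert "".join(map(chr, code_points)) == data.decode("utf-8")
```

The cross-check at the end compares the manual result with `bytes.decode("utf-8")`, so any slip in the masks would show up as an `AssertionError`.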

    However, this answer — https://stackoverflow.com/a/6544206/1105015 — seems to provide some insight. EF BF BD is the UTF-8 representation of the "replacement character" U+FFFD. So it looks like something way up the line got a character that confused your system, it was stored as the replacement character, and then eventually re-rendered as latin-1.
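
One reading that is consistent with the bytes above is that the replacement character went through two rounds of "encode as UTF-8, decode as Latin-1". The sketch below reproduces that chain; the function names `misread` and `undo_misread` are made up for illustration, and nothing here is confirmed about where in the asker's stack each round happened.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def misread(text):
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def undo_misread(text):
    """Reverse one misread() step: back to bytes via Latin-1, then UTF-8."""
    return text.encode("latin-1").decode("utf-8")


once = misread(REPLACEMENT)  # 3 characters: U+00EF U+00BF U+00BD
twice = misread(once)        # 6 characters: the ones listed in the table

print(REPLACEMENT.encode("utf-8").hex(" "))  # bytes of U+FFFD in UTF-8
print(twice.encode("latin-1").hex(" "))      # bytes behind the 6 characters

# Undoing both rounds only gets back to U+FFFD, not to the original text.
recovered = undo_misread(undo_misread(twice))
```

Note that reversing the chain recovers only U+FFFD itself: whatever character was replaced in the first place is already lost by then, which is why the fix has to happen where the data is written.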

    What I'd suggest looking at closely at this point is the encoding you use when inserting into the db. Maybe the only thing that changed is the MySQL client used for that?

