douge3113 2013-05-01 14:36

What is the conventional way to create a unique ID for a POP3 email?

IMAP messages have a UID for which we all rejoice. However, I'm trying to figure out how to generate a unique ID for a POP3 message and having trouble (old systems like hotmail.com only allow POP3).

Available messages to the client are fixed when a POP session opens the maildrop, and are identified by message-number local to that session or, optionally, by a unique identifier assigned to the message by the POP server. This unique identifier is permanent and unique to the maildrop and allows a client to access the same message in different POP sessions. Mail is retrieved and marked for deletion by message-number. When the client exits the session, the mail marked for deletion is removed from the maildrop. - wikipedia

It seems, however, that the basic LIST command simply returns a list of temporary, session-local message numbers (with sizes) that let you fetch the email. Those numbers are in no way unique across sessions, so an optional command called UIDL was added; servers can advertise support for it through CAPA (the POP3 Extension Mechanism).
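To make the difference between message numbers and unique-ids concrete, here is a minimal sketch that turns a multi-line UIDL listing into a mapping from session-local message number to unique-id. It assumes the lines arrive as byte strings of the form `<msg-number> <unique-id>`, which is the shape Python's `poplib.POP3.uidl()` returns in the second element of its result; the function name `parse_uidl_listing` and the sample values are illustrative only.

```python
def parse_uidl_listing(lines):
    """Map session-local message numbers to server-assigned unique-ids.

    `lines` is a list of byte strings, each b"<msg-number> <unique-id>",
    as found in a multi-line UIDL response.
    """
    mapping = {}
    for raw in lines:
        number, _, unique_id = raw.decode("ascii").strip().partition(" ")
        if not number.isdigit() or not unique_id:
            continue  # skip malformed lines rather than guess at their meaning
        mapping[int(number)] = unique_id
    return mapping


# Illustrative listing; a real one would come from the server.
sample = [b"1 whqtswO00WBw418f9t5JxYwZ", b"2 QhdPYR:00WBw1Ph7x7"]
print(parse_uidl_listing(sample))
```

The message numbers (the keys) are only meaningful inside the current session; the unique-ids (the values) are what a client would store between sessions.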

POP3 states that a UIDL is unique as long as the message exists.

The unique-id of a message is an arbitrary server-determined string, consisting of one to 70 characters in the range 0x21 to 0x7E, which uniquely identifies a message within a maildrop and which persists across sessions. This persistence is required even if a session ends without entering the UPDATE state. The server should never reuse an unique-id in a given maildrop, for as long as the entity using the unique-id exists.

Note that messages marked as deleted are not listed.

While it is generally preferable for server implementations to store arbitrarily assigned unique-ids in the maildrop, this specification is intended to permit unique-ids to be calculated as a hash of the message. Clients should be able to handle a situation where two identical copies of a message in a maildrop have the same unique-id.

Which makes me think that it's possible that I might download another message a year later (after the first one was deleted) which has the same UIDL and might clash in my system.

Should I just hash the whole message body and use that as an ID?

Rather than fetching the whole email to hash it, perhaps I should just use TOP [id] 1 to hash the headers (and the first line of the body). That should never match an existing email, since the receiving server will always add some kind of information, correct? So an attacker could never cause a collision, since the Received header or something similar would have been modified, right?
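The idea of hashing the TOP output could look roughly like the sketch below. It assumes the header lines (plus the first body line) are already available as a list of byte strings without line terminators, which is how Python's `poplib.POP3.top(which, 1)` hands them back; `fingerprint_top_lines` is a hypothetical helper name, and SHA-256 is just one reasonable choice of hash. Whether relay-added headers really differ for every delivery is the assumption the question itself is asking about, not something this code guarantees.

```python
import hashlib


def fingerprint_top_lines(lines):
    """Return a SHA-256 hex digest over the lines returned by TOP.

    Line endings are normalised to a single "\n" so that the digest
    does not depend on how the transport split or terminated the lines.
    """
    digest = hashlib.sha256()
    for line in lines:
        digest.update(line.rstrip(b"\r\n"))
        digest.update(b"\n")
    return digest.hexdigest()


# Two made-up deliveries that differ only in the Received header.
first = [b"Received: from relay-a.example", b"Subject: Hello", b"", b"Hi there"]
second = [b"Received: from relay-b.example", b"Subject: Hello", b"", b"Hi there"]
print(fingerprint_top_lines(first) == fingerprint_top_lines(second))
```

Because every header takes part in the digest, any header the server rewrites later would also change the fingerprint, which is the same effect described for MDaemon below.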

The MDaemon program seems to tackle the issue with partial header hashing:

MDaemon constructs the UIDL results using the message name, date stamp, size, and a few other details about the messages. As a result, if a message is modified on the server, it will appear as “new” to mail clients even if you don’t rename it.

What is the proper way to make an ID for a POP3 email?

Note: Emails often contain a Message-ID header, but I can't rely on it because it could be used as an attack vector to confuse my system. Some email clients also leave it out.

3 answers

  • dongxie559554 2013-05-09 20:49

    Personally, I would just hash a small subset of the email headers: something like Date, From, Subject, and Message-ID if available.

    I often subscribe to mailing lists where you tend to receive multiple copies of the same message when someone replies to you - one that comes directly from them, and another via the mail server. Under those circumstances, many of the headers are different, but I'd really rather not receive two copies of the message.

    And the chance of me receiving two different emails, from the same person at the same time, with the same subject and the same message-id seems extremely unlikely.

    Of course, it's not impossible. They might not generate message-ids, they might have a blank subject line, they might have a broken clock, and they might have all of those things at the same time. But then again, the router through which their email is passing might be wiped out by a giant meteor from space.

    Frankly, the most likely scenario is that the email will be caught by a spam filter and I'll never see it anyway. Email just isn't that reliable a form of communication. You need something that works reasonably well, but if it doesn't handle that 1 in a million edge case, you'll probably still be OK.
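A minimal sketch of the header-subset approach described in this answer, using only the Python standard library. The field list follows the answer (Date, From, Subject, and Message-ID if available); the helper name `header_subset_id`, the whitespace normalisation, and the choice of SHA-256 are assumptions made for illustration, not part of the answer.

```python
import hashlib
from email import message_from_bytes

# Headers suggested in the answer; a missing header contributes an empty value.
FIELDS = ("Date", "From", "Subject", "Message-ID")


def header_subset_id(raw_message):
    """Hash a small, stable subset of headers from a raw message (bytes)."""
    msg = message_from_bytes(raw_message)
    parts = []
    for name in FIELDS:
        value = str(msg.get(name, ""))
        # Collapse folding and repeated whitespace so re-wrapped headers match.
        parts.append(f"{name.lower()}:{' '.join(value.split())}")
    joined = "\n".join(parts)
    return hashlib.sha256(joined.encode("utf-8", "backslashreplace")).hexdigest()


# Two made-up copies of one message that took different routes.
common = (
    b"Date: Wed, 1 May 2013 14:36:00 +0000\r\n"
    b"From: someone@example.com\r\n"
    b"Subject: Re: list question\r\n"
    b"Message-ID: <1234@example.com>\r\n"
    b"\r\nbody\r\n"
)
direct_copy = b"Received: from sender.example\r\n" + common
list_copy = b"Received: from lists.example\r\n" + common
print(header_subset_id(direct_copy) == header_subset_id(list_copy))
```

Since Received and other routing headers are ignored, the direct copy and the mailing-list copy collapse to one ID, which is the deduplication behaviour the answer wants. The trade-off is that anyone who can control these four headers can deliberately produce a matching ID.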

    This answer was selected by the asker as the best answer.