douge3113 2013-05-01 14:36

What is the usual way to create a unique ID for a POP3 email?

IMAP messages have a UID for which we all rejoice. However, I'm trying to figure out how to generate a unique ID for a POP3 message and having trouble (old systems like hotmail.com only allow POP3).

Available messages to the client are fixed when a POP session opens the maildrop, and are identified by message-number local to that session or, optionally, by a unique identifier assigned to the message by the POP server. This unique identifier is permanent and unique to the maildrop and allows a client to access the same message in different POP sessions. Mail is retrieved and marked for deletion by message-number. When the client exits the session, the mail marked for deletion is removed from the maildrop. - wikipedia

It seems, however, that the basic LIST command simply returns session-local message numbers that let you fetch each email. Those numbers are in no way unique across sessions, so POP3 also defines an optional UIDL command, which servers advertise through CAPA (the POP3 Extension Mechanism).

POP3 states that a UIDL is unique as long as the message exists.

The unique-id of a message is an arbitrary server-determined string, consisting of one to 70 characters in the range 0x21 to 0x7E, which uniquely identifies a message within a maildrop and which persists across sessions. This persistence is required even if a session ends without entering the UPDATE state. The server should never reuse an unique-id in a given maildrop, for as long as the entity using the unique-id exists.

Note that messages marked as deleted are not listed.

While it is generally preferable for server implementations to store arbitrarily assigned unique-ids in the maildrop, this specification is intended to permit unique-ids to be calculated as a hash of the message. Clients should be able to handle a situation where two identical copies of a message in a maildrop have the same unique-id.
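
For reference, here is a minimal sketch of how the UIDL listing quoted above could be read from Python. It assumes the standard library's `poplib`, whose `POP3.uidl()` returns a `(response, lines, octets)` tuple with each line as bytes of the form `b"<msgnum> <unique-id>"`. The helper name `parse_uidl_listing`, the hostname, and the credentials are hypothetical.

```python
def parse_uidl_listing(lines):
    """Map session-local message numbers to server-assigned unique-ids.

    `lines` is the list of byte strings that poplib's POP3.uidl() returns
    as the second element of its result tuple.
    """
    mapping = {}
    for raw in lines:
        # Unique-ids are printable ASCII (0x21-0x7E), so ASCII decoding is safe.
        number, uid = raw.decode("ascii").split(" ", 1)
        mapping[int(number)] = uid
    return mapping


# Usage against a live server (hostname and credentials are placeholders):
#   import poplib
#   conn = poplib.POP3_SSL("pop.example.com")
#   conn.user("alice")
#   conn.pass_("secret")
#   _, lines, _ = conn.uidl()
#   uids = parse_uidl_listing(lines)
#   conn.quit()
```

The message numbers in the resulting dictionary are only valid for the current session; the unique-ids are the part meant to persist.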

This makes me think that, a year later (after the first message was deleted), I might download another message that has the same UIDL and clashes in my system.

Should I just hash the whole message body and use that as an ID?

Rather than fetching the whole email to hash it, perhaps I should just use TOP [id] 1 and hash the headers (plus the first body line). That should never match an existing email, since the receiving server will always add some information of its own, correct? So an attacker could never cause a collision, because a Received header or something similar will have been added or modified, right?
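
A rough sketch of the TOP-based idea, assuming Python's `poplib`, where `POP3.top(msgnum, 1)` returns a `(response, lines, octets)` tuple and the lines arrive as bytes with their line endings stripped. The function name `top_fingerprint` is made up for illustration, SHA-256 is just one reasonable choice of hash, and whether every server really adds distinguishing headers is the assumption from the question, not something this code checks.

```python
import hashlib


def top_fingerprint(top_lines):
    """Return a SHA-256 hex digest over the lines returned by TOP.

    `top_lines` is the list of byte strings from POP3.top(msgnum, 1):
    the headers, a blank separator line, and at most one body line.
    """
    digest = hashlib.sha256()
    for line in top_lines:
        digest.update(line)
        digest.update(b"\r\n")  # restore the line ending that poplib strips
    return digest.hexdigest()
```

Because every header takes part in the hash, two copies of one message that arrived by different routes get different fingerprints, and any server-side change to the headers makes the message look new.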

The MDaemon program seems to tackle the issue with partial header hashing:

MDaemon constructs the UIDL results using the message name, date stamp, size, and a few other details about the messages. As a result, if a message is modified on the server, it will appear as “new” to mail clients even if you don’t rename it.

What is the proper way to make an ID for a POP3 email?

Note: Emails often contain a Message-ID header, but I can't rely on it because it could be used as an attack vector to confuse my system. It is also left out by some email clients.


3 answers

  • dongxie559554 2013-05-09 20:49

    Personally, I would just hash a small subset of the email headers: something like Date, From, Subject, and Message-ID if available.

    I often subscribe to mailing lists where you tend to receive multiple copies of the same message when someone replies to you: one that comes directly from them, and another via the list server. Under those circumstances many of the headers differ, but I'd really rather not receive two copies of the message.

    And the chance of me receiving two different emails, from the same person at the same time, with the same subject and the same message-id seems extremely unlikely.

    Of course, it's not impossible. They might not generate message-ids, they might have a blank subject line, they might have a broken clock, and they might have all of those things at the same time. But then again, the router through which their email is passing might be wiped out by a giant meteor from space.

    Frankly, the most likely scenario is that the email gets caught by a spam filter and I never see it anyway. Email just isn't that reliable a form of communication. You need something that works reasonably well, but if it doesn't handle that 1-in-a-million edge case, you'll probably still be OK.
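
The header-subset approach from the start of this answer could look roughly like the sketch below. It assumes Python's standard `email` and `hashlib` modules; the function name `header_subset_id`, the choice of SHA-256, and the NUL separators are illustrative decisions, not part of the original suggestion.

```python
import hashlib
from email.parser import BytesHeaderParser

# The fields named in this answer; a missing field hashes as an empty value.
FIELDS = ("Date", "From", "Subject", "Message-ID")


def header_subset_id(raw_headers):
    """Hash a small, fixed subset of headers from raw header bytes."""
    msg = BytesHeaderParser().parsebytes(raw_headers)
    digest = hashlib.sha256()
    for name in FIELDS:
        value = str(msg.get(name, "")).strip()
        digest.update(name.lower().encode("ascii"))
        digest.update(b"\x00")  # separator so adjacent fields cannot blur together
        digest.update(value.encode("utf-8", "backslashreplace"))
        digest.update(b"\x00")
    return digest.hexdigest()
```

Since Received and other transport headers are ignored, the direct copy and the mailing-list copy of a message map to the same ID, which is the deduplication behaviour described above. The trade-off is that a sender controls all four fields, so this is not resistant to deliberate collisions.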

    Accepted by the asker as the best answer.