doucao8982 2016-01-21 23:52

Parsing email headers whose EOLs have been stripped

I am working on a JS/jQuery-based tool to review and analyze Internet email headers. The tools I have created before relied on the presence of a newline delimiter. From time to time, a customer sends in Internet headers that have been copied and pasted so many times that the \r\n or \n line endings are no longer present; they have often been replaced by \s\s or just \s. As a result, there is no simple or reliable way to find each of the lines.

That is exactly what I am trying to do. Given Internet headers that may or may not contain proper EOLs, how can I capture all the lines, or at least capture the various elements (Received:, X-Headers:, Message-ID:, To:, From:, Subject:, Date:)?

Here is a Fiddle I have been working on: https://jsfiddle.net/Twisty/0n5tmm6L/

Snippet

  $("#clean-header").click(function(e) {
    e.preventDefault();
    if ($("#header-1").val() === "") {
      $("#error").html("No headers submitted.");
      return false;
    }
    $("#error").html("");
    var textLines = $("#header-1").val().split('\n');
    if (textLines.length > 1) {
      console.log("Found " + textLines.length + " Lines ('\\n') in headers.");
      return false;
    } else {
      console.log("No EOL found in Headers. Seeking 'Received:'.");
      var s1 = /(Received):\s?from\s(.+?)\s(by.+?);\s(.+?,\s[0-9]{2}\s[a-z]{3}\s[0-9]{4}\s\d{2}:\d{2}:\d{2}.+?)\s{2}/ig;
      var match, received = [];
      var line = $("#header-1").val();
      while (match = s1.exec(line)) {
        received.push({
          "from": match[2],
          "by": match[3],
          "stamp": match[4]
        });
      }
      console.log("Found ", received.length, " Received Lines.");
    }
  });
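
For the case where EOLs did survive, the split in the snippet above only looks for a single newline character. Below is a minimal sketch, not part of the Fiddle, of a helper that accepts \r\n, bare \r, or \n and also joins folded continuation lines (a line break followed by a space or tab) back onto their header. The name splitHeaderLines is hypothetical.

```javascript
// Hypothetical helper: split raw header text into logical header lines.
function splitHeaderLines(raw) {
  return raw
    // Normalize CRLF and bare CR line endings to LF.
    .replace(/\r\n|\r/g, "\n")
    // Unfold continuation lines: a line break followed by spaces or tabs
    // is collapsed to a single space so the header stays on one line.
    .replace(/\n[ \t]+/g, " ")
    .split("\n")
    // Drop empty lines, such as the one left by a trailing line break.
    .filter(function (line) {
      return line.trim() !== "";
    });
}
```

With this in place, a result of length 1 is a reasonable signal that the EOLs were lost and that the regex-based fallback is needed.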

I'm not afraid to push this to PHP with Pear and use the IMAP library to do this. I was just hoping to do it in the browser without having to involve the server.

In the absence of EOL characters, I was hoping to make a few passes with regular expressions like the following:

/(Received):\s?(from.+?)\s(by.+?);\s(.+?,\s[0-9]{2}\s[a-z]{3}\s[0-9]{4}\s\d{2}:\d{2}:\d{2}.+?)\s{2}/ig
/(X-.+?):(.+?)\s\s/ig
/(Reply-To|Return-Path|From):.+?<(.+?)>\s{2}/ig
/(To):.+?<(.+?)>\s{2}/ig
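
An alternative to one regex per header type could be a single pass that treats every known header name as a boundary, so the pattern no longer depends on how many spaces replaced the EOL. The following is only a sketch under assumptions: KNOWN_HEADERS is an illustrative, incomplete list, and splitFlattenedHeaders is a hypothetical name.

```javascript
// Assumed list of header names to look for; extend it as needed.
var KNOWN_HEADERS = [
  "Received", "Return-Path", "Reply-To", "Message-ID", "From", "To",
  "Cc", "Subject", "Date", "MIME-Version", "Content-Type"
];

// Hypothetical helper: split a flattened header string into name/value pairs.
function splitFlattenedHeaders(flat) {
  // Known names plus any X-* header.
  var names = KNOWN_HEADERS.join("|") + "|X-[A-Za-z0-9-]+";
  // A boundary is a header name at the start of the string or after
  // whitespace, immediately followed by a colon.
  var boundary = new RegExp("(?:^|\\s)(" + names + "):", "gi");
  var marks = [];
  var m;
  while ((m = boundary.exec(flat)) !== null) {
    marks.push({
      name: m[1],
      matchStart: m.index,            // where this header begins
      valueStart: boundary.lastIndex  // first character after the colon
    });
  }
  // Each value runs from its colon up to the start of the next header.
  return marks.map(function (mark, i) {
    var end = i + 1 < marks.length ? marks[i + 1].matchStart : flat.length;
    return { name: mark.name, value: flat.slice(mark.valueStart, end).trim() };
  });
}
```

Each Received value can then be fed to a narrower regex to pull out the from, by, and timestamp parts. Note the limitation: a value that itself contains a known name followed by a colon (for example a Subject containing "To:") would be split incorrectly, so the output may need a sanity check.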

I never know what the EOL will be, or whether there will be one at all. I have also searched for add-ons or libraries that already do this, but so far I have not found any. Suggestions and alternatives are very welcome.
