douqian3712 2008-10-09 12:42

Replace or remove new lines with PHP in a CSV file, but only between single or double quotes

I have a CSV file that holds about 200,000 - 300,000 records. Most of the records can be separated and inserted into a MySQL database with a simple

$line = explode("\n", $fileData);

and then the values separated with

$lineValues = explode(',', $line);

and then inserted into the database using the proper data type i.e int, float, string, text, etc.

However, some of the records have a text column that includes a \n in the string, which breaks the $line = explode("\n", $fileData); method. Each line of data that needs to be inserted into the database has approximately 216 columns. Not every line has a record with a \n in the string. However, each time a \n is found in a line, it is enclosed between a pair of single quotes (').

Each line is set up in the following format:

id,data,data,data,text,more data

example:

1,0,0,0,'Hello World',0
2,0,0,0,'Hello
    World',0
3,0,0,0,'Hi',0
4,0,0,0,,0

As you can see from the example, most records can be easily split with the methods shown above. It's the second record in the example that causes the problem.

New lines are only \n, and the file does not include \r at all.


  • duane9322 2008-10-09 12:48

    If the csv data is in a file, you can just use fgetcsv() as others have pointed out. fgetcsv handles embedded newlines correctly.
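
    For the file case, a minimal sketch of the fgetcsv() approach might look like the following. It assumes the single quote is the enclosure character, as in the question's sample, and it uses a php://temp stream as a stand-in for the real file so that the sketch is self-contained. The variable names are illustrative only.

```php
<?php
// Sample data copied from the question; record 2 contains an embedded "\n".
$fileData = "1,0,0,0,'Hello World',0\n"
          . "2,0,0,0,'Hello\nWorld',0\n"
          . "3,0,0,0,'Hi',0\n"
          . "4,0,0,0,,0\n";

// Assumption: an in-memory stream stands in for fopen() on the real CSV file.
$handle = fopen('php://temp', 'r+');
fwrite($handle, $fileData);
rewind($handle);

$rows = array();
// Arguments: length 0 (no limit), separator ',', enclosure "'", escape '\'.
while (($row = fgetcsv($handle, 0, ',', "'", '\\')) !== false) {
    if ($row === array(null)) {
        continue; // fgetcsv() reports a blank line as array(null)
    }
    $rows[] = $row;
}
fclose($handle);
```

    With this input, $rows should hold four records, and the fifth field of the second record keeps its embedded newline instead of splitting the record in two.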

    However, if your CSV data is in a string (like $fileData in your example), the following method may be useful, as str_getcsv() only works on one row at a time and cannot split a whole file into records.

    You can detect the embedded newlines by counting the quotes in each line. If there are an odd number of quotes, you have an incomplete line, so concatenate this line with the following line. Once you have an even number of quotes, you have a complete record.

    Once you have a complete record, split it at the quotes (again using explode()). Counting the resulting chunks from zero, odd-numbered chunks are quoted (thus embedded commas are not special), even-numbered chunks are not.
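
    To illustrate just the split-at-quotes step in isolation, here is a small sketch on a made-up record (the record string and the $quoted array are hypothetical, not part of the question's data). It only shows which chunks explode() yields and which of them came from inside quotes.

```php
<?php
// A complete record (even number of quotes); the quoted field holds a comma.
$record = "3,0,'Hi, there',0";

// Splitting at the quote character alternates unquoted and quoted text.
$chunks = explode("'", $record);
// $chunks[0] is "3,0,"       (unquoted: commas are separators)
// $chunks[1] is "Hi, there"  (quoted: the comma is data)
// $chunks[2] is ",0"         (unquoted again)

// Collect the chunks that came from inside quotes (odd zero-based indexes).
$quoted = array();
foreach ($chunks as $i => $chunk) {
    if ($i % 2 == 1) {
        $quoted[] = $chunk;
    }
}
```

    This is why the example code below only unescapes commas in the even-indexed chunks: those are the only places where a comma can be a field separator.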

    Example:

    # Split file into physical lines (records may span lines)
    $lines = explode("\n", $fileData);
    
    # Re-assemble records
    $records = array ();
    $record = '';
    $lineSep = '';
    foreach ($lines as $line) {
      # Escape @ symbol so we can use it as a marker (as it does not conflict with
      # any special CSV character.)
      $line = str_replace('@', '@a', $line);
    
      # Escape commas as we don't yet know which ones are separators
      $line = str_replace(',', '@c', $line);
    
      # Escape quotes in a form that uses no special characters
      $line = str_replace("\\'", '@q', $line);
      $line = str_replace('\\', '@b', $line);
    
      $record .= $lineSep . $line;
      $lineSep = "\n";
    
      # Must have an even number of quotes in a complete record!
      if (substr_count($record, "'") % 2 == 0) {
        $records[] = $record;
        $record = '';
        $lineSep = '';
      }
    }
    if (strlen($record) > 0) {
      $records[] = $record;
    }
    
    $rows = array ();
    
    foreach ($records as $record) {
      $chunks_in = explode("'", $record);
      $chunks_out = array ();
    
      # Decode escaped quotes/backslashes.
      # Decode field-separating commas (unless quoted)
      foreach ($chunks_in as $i => $chunk) {
        # Unescape quotes & backslashes
        $chunk = str_replace('@q', "'", $chunk);
        $chunk = str_replace('@b', '\\', $chunk);
        if ($i % 2 == 0) {
          # Unescape commas
          $chunk = str_replace('@c', ',', $chunk);
        }
        $chunks_out[] = $chunk;
      }
    
      # Join back together, discarding unescaped quotes
      $record = join('', $chunks_out);
    
      $chunks_in = explode(',', $record);
      $row = array ();
      foreach ($chunks_in as $chunk) {
        $chunk = str_replace('@c', ',', $chunk);
        $chunk = str_replace('@a', '@', $chunk);
        $row[] = $chunk;
      }
      $rows[] = $row;
    }
    