doudao1922 2013-08-05 12:27

Does file_get_contents work for some .txt files and not others?

I have a script written by a friend that gets all the contents from a directory of .txt files and uploads them into a database alongside some other information.

i.e.: filename | contents

Each file's contents (simple text info) are stored in a corresponding database entry. It's been working very well so far, but the contents of a new batch of text files simply aren't being read. The filenames are read fine and that info is imported into the database without trouble; only the actual contents are missing. Old .txt files that I imported previously are still imported perfectly.

Examples of files are here: Working / Not-Working

Long story short - does anyone know why the contents of some .txt files can be read and not others? Encoding issues possibly, etc? (though they're from the same person and look identical) I'm losing my mind.

Thanks!

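$dir = 'text';
// initialise accumulators so the first .= / foreach does not hit an undefined variable
$echo = array();   // per-volume VALUES fragments
$excl = array();   // volumes seen in this run
$volumes = '';
$echot = '';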
//createxml(10);exit;
$time_start = microtime(true);
$files = scandir($dir);
natsort($files);
foreach ($files as $v) {
    if ($v != "." && $v != ".." && $v != "thumbs" && $v != ".DS_Store") {
        //get work done
        $text = file_get_contents($dir.'/'.$v);
        //get volume, page, county
        $ta = explode('.',$v);
        $ma = explode('_',$ta[0]);
        $last = count($ma)-1;
        $volume = '';
        $year = '1999';
        for ($i = 0; $i < $last; ++$i)
        {
            $volume .= $ma[$i].'_';
        }
        $volume = $mysqli->real_escape_string(rtrim($volume,'_'));

        $pagenr = $mysqli->real_escape_string($ma[$last]);
        $ntext  = $mysqli->real_escape_string(getmtext($text));
        $pdf    = 'http://griffiths.****.ie/gv4/thoms/'.$volume.'/'.$volume.'_pg'.str_pad($pagenr, 4, "0", STR_PAD_LEFT).'.pdf';
        $thumb  = 'http://griffiths.****.ie/gv4/thoms/'.$volume.'/thumbs2/'.$volume.'_'.str_pad($pagenr, 4, "0", STR_PAD_LEFT).'.jpg';

        //create sql
        $echo[$volume] .= "('','$year','$pagenr','$volume','$ntext','$pdf','$thumb'),";
        $excl[$volume]=true;
    }
}
// check if there is volume already in DB
foreach ($excl as $k => $v) {
    $volumes .= "'$k',";
}
$volumes = rtrim($volumes,',');
$excls ='';

if ($result = $mysqli->query("SELECT DISTINCT volume FROM thoms_copy2 WHERE volume in ($volumes)")) {
    //found volumes already in DB
    while ($r = $result->fetch_array(MYSQLI_NUM)) {
        //we only need the new volumes, so we will ignore the rest
        unset($echo[$r[0]]);
    }
    $result->close();
}

//create mysql string
foreach ($echo as $k => $v) {
    // each $v already ends with ',', so adding another would produce ",," between volumes
    $echot .= $v;
}
$echot = rtrim($echot,',');
if ($echot) {// if i have something to insert
    //insert into DB
    $sql = "INSERT INTO `thoms_copy2` (`id`,`year`,`main_page`, `volume`, `texty`, `pdf`, `thumb`) VALUES $echot";
    if ($result = $mysqli->query($sql)) {
        echo "Done.";
        //create the XML file       
        createxml($mysqli->affected_rows);
    } else {
        printf("Error message: %s\n", $mysqli->error);
        echo "<br><br>$sql";
    }
} else { echo "Done. Nothing new."; }
$time_end = microtime(true);
$time = $time_end - $time_start;    
echo "<br>$time";

//functions ===============================================================
function getmtext($str) {
    $text = '';
    $words = str_word_count($str, 1);
    foreach ($words as $word) {
        if ($word[0] >= 'A' && $word[0] <= 'Z') 
            if (strlen($word)>1) 
                $text .= $word.' ';
    }
    return $text;
}

1 answer

  • douwen9534 2013-08-05 12:32

    No. file_get_contents is equivalent to a combination of fopen + fread + fclose, so it returns the raw bytes of the file. A wrong charset does not change the fact that your file consists of bytes, and those bytes are what file_get_contents returns. Since you are not the script's author, it is difficult to say where the problem lies, but you should make sure that your files are accessible to your script (for example, that they have the correct permissions).
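
As a rough way to check the two points above (whether the file is actually readable, and which bytes come back), here is a minimal sketch. `diagnose_text_file` is a hypothetical helper, not part of the original script, and the byte-order-mark check is only a heuristic: it assumes the non-working files might have been saved in a different encoding such as UTF-16, which the question does not confirm.

```php
<?php
// Hypothetical helper (not in the original script): reports whether a file can
// be read, how many bytes file_get_contents() returns, and a guess at its encoding.
function diagnose_text_file($path)
{
    $report = array(
        'readable' => is_readable($path),
        'bytes'    => null,
        'encoding' => 'unknown',
    );
    if (!$report['readable']) {
        return $report; // wrong path or missing permissions
    }

    $raw = file_get_contents($path);
    if ($raw === false) {
        return $report; // the read itself failed
    }
    $report['bytes'] = strlen($raw);

    // Heuristic only: look for a byte-order mark, then for stray NUL bytes.
    if (substr($raw, 0, 3) === "\xEF\xBB\xBF") {
        $report['encoding'] = 'UTF-8 (BOM)';
    } elseif (substr($raw, 0, 2) === "\xFF\xFE") {
        $report['encoding'] = 'UTF-16LE (BOM)';
    } elseif (substr($raw, 0, 2) === "\xFE\xFF") {
        $report['encoding'] = 'UTF-16BE (BOM)';
    } elseif (strpos($raw, "\0") !== false) {
        $report['encoding'] = 'contains NUL bytes (possibly UTF-16/32 without BOM)';
    } else {
        $report['encoding'] = 'no BOM, no NUL bytes (likely ASCII/UTF-8/8-bit)';
    }
    return $report;
}
```

Calling `print_r(diagnose_text_file($dir.'/'.$v));` inside the existing `foreach` loop, for one working and one non-working file, should show whether the bytes differ. If the new files do turn out to be UTF-16, every letter is followed by a NUL byte, so `str_word_count` in `getmtext` would likely see only one-character words, and `getmtext` drops those. Converting the text first, for example with `mb_convert_encoding($text, 'UTF-8', 'UTF-16')`, would then be one thing to try.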
