duanqiao1961 2015-07-10 02:07
浏览 35
已采纳

如何减少文本文件大小?

I have a php code that gets product IDs from my database (23436 unique records).

I fetch each product ID and check if it has been set in feature_product table by comparing productIDs.

If no records are found under that ID in features tables than I get check a trial.txt file for the missing features of the products again by comparing the productID in the text file with the productID that was not present in the feature_product table.

The problem is the trial.txt file has 593262 lines in it and it takes forever to match productID in this file. I run out of memory. I took me 15 hours to actually get all the data off the file and that too in parts manually. Is there any way to make it faster or not run out of time and memory?

I tried increasing the max execution time in my php.ini file as suggested by some posts on sites. But it keeps running out of memory or max execution time. I will be using mysqli once i get this right, as mysql is not used any more. I thought of dividing the product IDs so i can loop only say 5000 at a time but I don't think it would help in execution time.

<?php
$conn = mysql_connect("localhost", "dbuser", "pwd");  

 //loop through the 1st line to avoid the headers in csv
 if (!$conn){ 
 die('Could not connect : ' . mysql_error()); 
 echo mysql_error();
 }  
 echo '<p>Connected!';

 mysql_select_db("mydb") or die( "Unable to select database");

//Select all product ids from product table into product array
 $pArray = mysql_query("SELECT `id_product` from `product`",$conn);

 //loop through each product id
 while($row = mysql_fetch_assoc($pArray)) {

 //get product ID to check if it exists in features table
 $productID = $row["id_product"];

//check whether product id exists in feature table where product_id matches both product table and features table
 $fArray = mysql_query("SELECT * from `feature_product` WHERE `id_product`=$productID");

//if product Id does not have entry in feature table than call a function to get check if product id has features in text file
if(mysql_num_rows($fArray) ==0)
 {
 checkFeatures($productID);
 }
 else continue;
}

function checkFeatures($productID){
//trial.txt contains features of the products that are missing in features table but the products are in products table
$fd = fopen('trial.txt', 'r');
$fheader = fgets($fd); 

//creates a new text file to save all features(multiple records per product) separated by ',' for future use
$my_file = 'file.txt';
$handle = fopen($my_file, 'a') or die('Cannot open file:  '.$my_file);

while (($data = fgetcsv($fd,0, "~")) !== FALSE) {
//Since this text file has many products i only get the ones that are missing in the features table by comparing product ID which is the 1st element of data array
     if($data[0]==$productID){
     $d= $data[0].",".$data[1].",".$data[2].$data[3]."
";
     echo $d."<BR/>";
     fwrite($handle, $d);
     }  
}
fclose($fd);
fclose($handle);

   }
?>  

example of product table

id_product,shop,manufacutrer,category  
1000010,1,41,1112,1  
1000011,1,7,1721,1  
1000012,1,7,1721,1  

example of feature table

feature_id,id_product,value  
1,1000010,1  
3,1000010,2  
6,1000011,5  
11,1931555,1 

sample trial.txt

IMSKU~AttributeID~Value~Unit~StoredValue~StoredUnit  
1000006~16121~2-25~~~  
1000006~3897~* McAfee Protection Suite~~~  
1000006~3933~* 1yr Subscription~~~  
1000010~1708~Feb 2011~~~  
1000010~1710~Cisco~~0.00~  
1000010~1711~http://www.cisco.com~~~  
1000011~2852~1~~0.00~  
1000011~2855~Light Cyan~~0.00~  
1000012~2840~May 2010~~~  
1000012~2842~HP~~0.00~  

I tried to load the text file as a table in sql as suggested by the users

<?php  $con=mysqli_connect("localhost","username","pwd","db");
// Check connection
if (mysqli_connect_errno())
{
echo "Failed: " . mysqli_connect_error();
}

mysqli_query($con,"CREATE TABLE IF NOT EXISTS `add_features` (`id_product` INT(10) NOT NULL, `id_feature` INT(10) NOT NULL, `value` varchar(255),`unit` varchar(20),`s_value` varchar(20),`s_unit` varchar(20))");

$sql = "LOAD DATA INFILE 'trial.txt'
INTO TABLE `add_features`
FIELDS TERMINATED BY '~'
";
if ($con->query($sql) === TRUE) {
echo "OK!";
} else {
echo "Error: " . $sql . "<br>" . $con->error;
} 
$result = mysqli_query($con,"SELECT * FROM `add_features`");

echo "<table class='add_features'>
<tr class='titles'>
<th>Product_id</th>
<th>feature_id</th>
<th>value</th>
<th>Unit</th>
</tr>";

while($row = mysqli_fetch_array($result))
{
echo "<tr>";
echo "<td>" . $row['id_product'] . "</td>";
echo "<td>" . $row['id_feature'] . "</td>";
echo "<td>" . $row['value'] . "</td>";
echo "<td>" . $row['unit'] . "</td>";
echo "</tr>";
}
echo "</table>";

mysqli_close($con);
?>  

But i am getting an error
Error: LOAD DATA INFILE 'trial.txt' INTO TABLE add_features FIELDS TERMINATED BY '~'

  • 写回答

1条回答 默认 最新

  • douye2572 2015-07-10 02:15
    关注

    If the trial.txt file is static, I would process it / parse it into either separate smaller files based on some logical divider, or import it into a new database table (preferable) where searching it will be instant. It's a one time import and then it's done.

    If it's not static, how often does it change?

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 求差集那个函数有问题,有无佬可以解决
  • ¥15 【提问】基于Invest的水源涵养
  • ¥20 微信网友居然可以通过vx号找到我绑的手机号
  • ¥15 寻一个支付宝扫码远程授权登录的软件助手app
  • ¥15 解riccati方程组
  • ¥15 display:none;样式在嵌套结构中的已设置了display样式的元素上不起作用?
  • ¥15 使用rabbitMQ 消息队列作为url源进行多线程爬取时,总有几个url没有处理的问题。
  • ¥15 Ubuntu在安装序列比对软件STAR时出现报错如何解决
  • ¥50 树莓派安卓APK系统签名
  • ¥65 汇编语言除法溢出问题