dongzhuifeng1843 2013-05-27 11:15
浏览 27
已采纳

优化将数据插入到mysql中

function generateRandomData(){
@ $db = new mysqli('localhost','XXX','XXX','scores');
if(mysqli_connect_errno()) {
    echo 'Failed to connect to database. Please try again later.';
    exit;
}
$query = "insert into scoretable values(?,?,?)";
for($a = 0; $a < 1000000; $a++)
{
$stmt = $db->prepare($query);

$id = rand(1,75000);

$score = rand(1,100000);

$time = rand(1367038800 ,1369630800);
$stmt->bind_param("iii",$id,$score,$time);
$stmt->execute();
}

}

I am trying to populate a data table in mysql with a million rows of data. However, this process is extremely slow. Is there anything obvious I'm doing wrong that I could fix in order to make it run faster?

  • 写回答

4条回答 默认 最新

  • dqbjvg2518 2013-05-27 12:17
    关注

    As hinted in the comments, you need to reduce the number of queries by catenating as many inserts as possible together. In PHP, it is easy to achieve that:

    $query = "insert into scoretable values";
    for($a = 0; $a < 1000000; $a++) {
        $id = rand(1,75000);
        $score = rand(1,100000);
        $time = rand(1367038800 ,1369630800);
        $query .= "($id, $score, $time),";
    }
    $query[strlen($query)-1]= ' ';
    

    There is a limit on the maximum size of queries you can execute, which is directly related to the max_allowed_packet server setting (This page of the mysql documentation describes how to tune that setting to your advantage).

    Therfore, you will have to reduce the loop count above to reach an appropriate query size, and repeat the process to reach the total number you want to insert, by wrapping that code with another loop.

    Another practice is to disable check constraints on the table you wish to do bulk insert:

    ALTER TABLE yourtablename DISABLE KEYS;
    SET FOREIGN_KEY_CHECKS=0;
    -- bulk insert comes here
    SET FOREIGN_KEY_CHECKS=1;
    ALTER TABLE yourtablename ENABLE KEYS;
    

    This practice however must be done carefully, especially in your case since you generate the values randomly. If you have any unique key within the columns you generate, you cannot use that technique with your query as it is, as it may generate a duplicate key insert. You probably want to add a IGNORE clause to it:

    $query = "insert INGORE into scoretable values";
    

    This will cause the server to silently ignore duplicate entries on unique keys. To reach the total number of requiered inserts, just loop as many time as needed to fill up the remaining missing lines.

    I suppose that the only place where you could have a unique key constraint is on the id column. In that case, you will never be able to reach the number of lines you wish to have, since it is way above the range of random values you generate for that field. Consider raising that limit, or better yet, generate your ids differently (perhaps simply by using a counter, which will make sure every record is using a different key).

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(3条)

报告相同问题?

悬赏问题

  • ¥20 软件测试决策法疑问求解答
  • ¥15 win11 23H2删除推荐的项目,支持注册表等
  • ¥15 matlab 用yalmip搭建模型,cplex求解,线性化处理的方法
  • ¥15 qt6.6.3 基于百度云的语音识别 不会改
  • ¥15 关于#目标检测#的问题:大概就是类似后台自动检测某下架商品的库存,在他监测到该商品上架并且可以购买的瞬间点击立即购买下单
  • ¥15 神经网络怎么把隐含层变量融合到损失函数中?
  • ¥15 lingo18勾选global solver求解使用的算法
  • ¥15 全部备份安卓app数据包括密码,可以复制到另一手机上运行
  • ¥20 测距传感器数据手册i2c
  • ¥15 RPA正常跑,cmd输入cookies跑不出来