dongwei9771 2013-07-31 23:03
浏览 21

SQL:只要没有更新的匹配记录,只删除旧记录?

I've got a really big collection of data in a postgres database where I'd like to nuke data past a particular age... but I do not want it nuking the latest iteration of data from any given location & site combination.

Basically, I've got a really big table that has a location (bigint), site (bigint), readdate (bigint), and a little accompanying data (note: there will be multiple entries for a given site, location, and readdate - but anything on the same readdate is considered part of the same scan, and needs to be kept for a given location).

Currently, I've just got it set to get rid of all old records... but the possibility exists that a particular site and location combination will stop giving out data for a while, and I'd like to preserve the final state if that happens. I'm doing the SQL queries from php, so I'm pretty sure I could hack together some highly ugly code that finds the latest readdate for any given site & location combination, then either deletes stuff younger than that for that location, or deletes based on the calender limit (whichever gives the lesser date), but I'd prefer to put the decision-making workload in the SQL query, rather than having to first get a list of all location, site, and max(readdate) entries, then iterate over them in php making individual delete queries.

My current query (which doesn't do what I want, as it deletes everything before $limit) is declared by:

$query="DELETE FROM votwdata WHERE readdate < '".$limit."';";
  • any ideas for a good revision?
  • 写回答

1条回答 默认 最新

  • doucitao2944 2013-07-31 23:49
    关注

    If I understand what you are trying to do, you have a number of fields that might be the same, and you want to keep the most recent record. Assuming you have a sequential ID or a created_at on each record, you can run a subquery to identify the records you want to delete. For example:

    select max(id),data1,data2 from table group by data1,data2;

    That will pull the most recent record for a unique data1 and data2. You can run that as an inline query, joining it back to the original table.

    select t.* from table t, (select max(id) "id",data1,data2 from table group by data1,data2) t2 where t.id=t2.id;

    That will give you the most recent records. You can do an left join and look at the null values to delete anything that you don't like.

    select t.id,t2.id from table t left join (select max(id) "id",data1,data2 from table group by 2,3) t2 on t.id=t2.id where t2.id is null;

    That gives you all the records that you want to delete.

    Okay, that's the dirty way - refactor away.

    评论

报告相同问题?

悬赏问题

  • ¥15 在获取boss直聘的聊天的时候只能获取到前40条聊天数据
  • ¥20 关于URL获取的参数,无法执行二选一查询
  • ¥15 液位控制,当液位超过高限时常开触点59闭合,直到液位低于低限时,断开
  • ¥15 marlin编译错误,如何解决?
  • ¥15 有偿四位数,节约算法和扫描算法
  • ¥15 VUE项目怎么运行,系统打不开
  • ¥50 pointpillars等目标检测算法怎么融合注意力机制
  • ¥20 Vs code Mac系统 PHP Debug调试环境配置
  • ¥60 大一项目课,微信小程序
  • ¥15 求视频摘要youtube和ovp数据集