ds211107 2014-07-17 16:13

How to select the earliest instance [min(timestamp)] for each unique row

I am working with an XML dump of featured Wikipedia articles (including revisions and extracted citations).

My current query joins three tables to return all the citations for all revisions of featured articles, sorted by page title, author, year, and timestamp, something like this:

ROW  TIMESTAMP    PAGE_TITLE  AUTHOR  YEAR  TITLE
1    20110801...  AARDVARK    BLAND   2010  MAJESTIC AARDVARKS
2    20110910...  AARDVARK    BLAND   2010  MAJESTIC AARDVARKS
3    20120101...  AARDVARK    BLAND   2012  AARDVARK BEHAVIOUR
4    20070601...  AARDVARK    SMITH   2005  BREEDING HABITS OF
5    20090602...  AARDVARK    SMITH   2005  BREEDING HABITS OF

Ideally, my query would return only the earliest instance [min(timestamp)] for each unique citation. In other words, I would like a query that returns just rows 1, 3, and 4. I still need duplicates in terms of page_title, author, year, as there are multiple citations per page and possibly several by the same author.

Thanks very much in advance for your help!


1 answer

  • douyanzhou1450 2014-07-17 16:46

    The following query should help. It uses the GROUP_CONCAT function and assumes a MySQL environment.

    -- For simplicity, assume the data shown above is in temp_table
    SELECT title,
           MIN(timestamp) AS min_ts,
           GROUP_CONCAT(page_title) AS pts,
           GROUP_CONCAT(author) AS authors,
           GROUP_CONCAT(year) AS ys
    FROM temp_table
    GROUP BY title;
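
If a citation is identified by page title, author, year, and title together, grouping by all four columns and taking MIN(timestamp) keeps one row per unique citation instead of concatenating values. The following is a minimal sketch of that idea, not a tested solution for the real schema: it uses Python's sqlite3 module so it can run without a MySQL server, the table name `citations` and the function name `earliest_citations` are hypothetical, and the sample timestamps are only the date prefixes visible in the question.

```python
import sqlite3

# Sample rows copied from the question. Timestamps are shortened to the
# visible date prefix; the table and column names are assumptions.
SAMPLE_ROWS = [
    ("20110801", "AARDVARK", "BLAND", 2010, "MAJESTIC AARDVARKS"),
    ("20110910", "AARDVARK", "BLAND", 2010, "MAJESTIC AARDVARKS"),
    ("20120101", "AARDVARK", "BLAND", 2012, "AARDVARK BEHAVIOUR"),
    ("20070601", "AARDVARK", "SMITH", 2005, "BREEDING HABITS OF"),
    ("20090602", "AARDVARK", "SMITH", 2005, "BREEDING HABITS OF"),
]


def earliest_citations(rows):
    """Return one row per unique citation, carrying its earliest timestamp."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE citations ("
            "timestamp TEXT, page_title TEXT, author TEXT, year INTEGER, title TEXT)"
        )
        conn.executemany("INSERT INTO citations VALUES (?, ?, ?, ?, ?)", rows)
        # Group by every column that identifies a citation, so that different
        # citations on the same page or by the same author stay separate.
        cursor = conn.execute(
            """
            SELECT MIN(timestamp) AS first_seen, page_title, author, year, title
            FROM citations
            GROUP BY page_title, author, year, title
            ORDER BY page_title, author, year, first_seen
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in earliest_citations(SAMPLE_ROWS):
        print(row)
```

On the sample data this keeps the rows numbered 1, 3, and 4 in the question. The SELECT statement uses only standard GROUP BY and MIN, so it should carry over to MySQL with the real table and column names substituted.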
    

    Including the queries you already have helps the person answering, who otherwise has to recreate the problem context in order to test. It also helps to state which DBMS you are using (MySQL, PostgreSQL, etc.).
