yaycici 2014-07-22 09:03
浏览 3301

join:file is not in sorted order

linux下使用join关联两个文件
两文件1和2内容如下:
[c7@db1 shconfig]$ cat 1
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
10|1|sd|好的|0
11|2|恩呢| |2

[c7@db1 shconfig]$ cat 2
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

希望经过关联文件1的第一列和文件2的第三列之后的结果是这样的:
1|1|目3|的|0|35|2|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
1|1|目3|的|0|37|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
10|1|sd|好的|0|34|1|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

但是对文件1和2进行排序后以join关联结果如下:
[c7@db1 shconfig]$ sort -k1 -t'|' 1 > 3
[c7@db1 shconfig]$ sort -k3 -t'|' 2 > 4
[c7@db1 shconfig]$ join -j1 1 -j2 3 -t'|' 3 4
10|1|sd|好的|0|34|1|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|
join: file 1 is not in sorted order
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
附:[c7@db1 shconfig]$ cat 3
10|1|sd|好的|0
11|2|恩呢| |2
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
[c7@db1 shconfig]$ cat 4
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|

再用以数字排序后进行关联的方式:
[c7@db1 shconfig]$ sort -n -k1 -t'|' 1 > 5
[c7@db1 shconfig]$ sort -n -k3 -t'|' 2 > 6
[c7@db1 shconfig]$ join -j1 1 -j2 3 -t'|' 5 6
1|1|目3|的|0|35|2|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
1|1|目3|的|0|37|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
join: file 1 is not in sorted order

附:
[c7@db1 shconfig]$ cat 5
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
10|1|sd|好的|0
11|2|恩呢| |2
[c7@db1 shconfig]$ cat 6
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

实在搞不懂为什么,如果大家有解决办法请务必告知,谢谢。。

附:版本及LANG
[c7@db1 shconfig]$ sort --version
sort (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.
[c7@db1 shconfig]$ join --version
join (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel.
[c7@db1 shconfig]$ echo $LANG
C

  • 写回答

0条回答 默认 最新

    报告相同问题?

    悬赏问题

    • ¥15 求daily translation(DT)偏差订正方法的代码
    • ¥15 js调用html页面需要隐藏某个按钮
    • ¥15 ads仿真结果在圆图上是怎么读数的
    • ¥20 Cotex M3的调试和程序执行方式是什么样的?
    • ¥20 java项目连接sqlserver时报ssl相关错误
    • ¥15 一道python难题3
    • ¥15 牛顿斯科特系数表表示
    • ¥15 arduino 步进电机
    • ¥20 程序进入HardFault_Handler
    • ¥15 关于#python#的问题:自动化测试