How to join (merge) data frames (inner, outer, left, right)?

Given two data frames:

df1 = data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 = data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

df1
#  CustomerId Product
#           1 Toaster
#           2 Toaster
#           3 Toaster
#           4   Radio
#           5   Radio
#           6   Radio

df2
#  CustomerId   State
#           2 Alabama
#           4 Alabama
#           6    Ohio

How can I do database-style (i.e., SQL-style) joins? That is, how do I get:

An inner join of df1 and df2:
Return only the rows of the left table that have matching keys in the right table.
An outer join of df1 and df2:
Return all rows from both tables, joining the records from the left table that have matching keys in the right table.
A left outer join (or simply left join) of df1 and df2:
Return all rows from the left table, and any rows with matching keys from the right table.
A right outer join of df1 and df2:
Return all rows from the right table, and any rows with matching keys from the left table.

Extra credit:

How can I do a SQL-style select statement?

Reposted from: https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right

胖鸭 2009-08-19 15:15
By using the merge function and its optional parameters:

Inner join: merge(df1, df2) works for these examples because R automatically joins the frames on their common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId") to make sure that you match only on the fields you intend. You can also use the by.x and by.y parameters if the matching variables have different names in the two data frames.

Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)

Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)

Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)

Cross join: merge(x = df1, y = df2, by = NULL)

Just as with the inner join, you would probably want to explicitly pass "CustomerId" to R as the matching variable. I think it's almost always best to explicitly state the identifiers on which you want to merge; it's safer if the input data.frames change unexpectedly and easier to read later on.
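As a rough check of the calls listed above, here is a minimal sketch that runs each join on the question's df1 and df2 and looks at the row counts. The result names (inner, outer, left, right, cross) are only illustrative.

```r
# Data frames from the question.
df1 <- data.frame(CustomerId = c(1:6), Product = c(rep("Toaster", 3), rep("Radio", 3)))
df2 <- data.frame(CustomerId = c(2, 4, 6), State = c(rep("Alabama", 2), rep("Ohio", 1)))

# One merge() call per join type, always naming the key explicitly.
inner <- merge(df1, df2, by = "CustomerId")                # customers 2, 4, 6 only
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)    # customers 1 to 6, State is NA where unmatched
left  <- merge(df1, df2, by = "CustomerId", all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)  # every row of df2
cross <- merge(df1, df2, by = NULL)                        # every pairing of a df1 row with a df2 row

# Row counts for each result.
sapply(list(inner = inner, outer = outer, left = left, right = right, cross = cross), nrow)
```

With this data every key in df2 also occurs in df1, so the outer and left joins happen to return the same six rows, and the right join returns the same three rows as the inner join. The cross join ignores the keys entirely and returns 6 * 3 = 18 rows.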

You can merge on multiple columns by passing a vector to by, e.g., by = c("CustomerId", "OrderId").
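A small sketch of a two-column key follows. The data frames orders and shipments, and the OrderId values in them, are hypothetical and made up for illustration; they are not part of the question's data.

```r
# Hypothetical data: OrderId does not exist in the question's df1 and df2.
orders <- data.frame(
  CustomerId = c(1, 1, 2),
  OrderId    = c(10, 11, 10),
  Product    = c("Toaster", "Radio", "Radio")
)
shipments <- data.frame(
  CustomerId = c(1, 2, 2),
  OrderId    = c(11, 10, 12),
  State      = c("Ohio", "Alabama", "Ohio")
)

# A row matches only when BOTH CustomerId and OrderId agree.
both_keys <- merge(orders, shipments, by = c("CustomerId", "OrderId"))
both_keys
```

Only the pairs (CustomerId 1, OrderId 11) and (CustomerId 2, OrderId 10) appear in both frames, so the inner join keeps two rows. Merging on CustomerId alone would instead pair every order of a customer with every shipment of that customer.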

If the column names to merge on are not the same, you can specify, e.g., by.x = "CustomerId_in_df1", by.y = "CustomerId_in_df2" where CustomerId_in_df1 is the name of the column in the first data frame and CustomerId_in_df2 is the name of the column in the second data frame. (These can also be vectors if you need to merge on multiple columns.)
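For illustration, a minimal sketch of by.x and by.y using the placeholder column names from the paragraph above. The data frames customers and states are hypothetical.

```r
# Hypothetical frames whose key columns have different names.
customers <- data.frame(
  CustomerId_in_df1 = 1:3,
  Product           = c("Toaster", "Toaster", "Radio")
)
states <- data.frame(
  CustomerId_in_df2 = c(2, 3, 4),
  State             = c("Alabama", "Ohio", "Ohio")
)

# by.x names the key in the first frame, by.y the key in the second.
joined <- merge(customers, states,
                by.x = "CustomerId_in_df1", by.y = "CustomerId_in_df2")
names(joined)
```

The merged key column keeps the name from the first data frame, so the result has the columns CustomerId_in_df1, Product, and State.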
