Hello, you can use the following R code to solve your problem:
# Part 1
# 1. Create the data directly in R rather than entering it in Excel and importing it. (Column names may be written as English strings.)
set.seed(123)  # fix the random seed so the simulated scores are reproducible
data1 <- data.frame(
  Department = rep(c("A","B","C","D","E","F","G","H"), each = 3),
  ID = paste(rep(c("1","2","3"), 8), rep(c("A","B","C"), 8), sep = ""),
  Name = paste(rep(c("John","Mary","Tom"), 8), rep(c("A","B","C"), 8), sep = ""),
  InitialScore = round(rnorm(24, 280, 20), 1),
  InterviewScore = round(rnorm(24, 80, 20), 1),
  OperationalAbility = round(rnorm(24, 70, 15), 1)
)
# data.frame() cannot refer to columns created in the same call, so the composite score is computed afterwards
data1$ComprehensiveScore <- round(data1$InitialScore * 0.7 + data1$InterviewScore * 0.3, 1)
# 2. In code: the candidate with the highest composite score in each department is hired; mark hired as 1 and not hired as 0.
data1$Recruitment <- ifelse(data1$ComprehensiveScore ==
ave(data1$ComprehensiveScore, data1$Department, FUN = max), 1, 0)
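# A minimal sketch of how ave() treats ties, on a small hypothetical data frame
# (demo_ties and its values are illustrative and not part of the assignment data).
# ave(x, g, FUN = max) repeats each group's maximum on every row of that group,
# so every tied top scorer in a department is flagged as hired.
demo_ties <- data.frame(
  Department = c("A", "A", "B", "B", "B"),
  ComprehensiveScore = c(80, 90, 85, 85, 70)
)
demo_ties$GroupMax <- ave(demo_ties$ComprehensiveScore, demo_ties$Department, FUN = max)
demo_ties$Recruitment <- ifelse(demo_ties$ComprehensiveScore == demo_ties$GroupMax, 1, 0)
# If exactly one hire per department is required, keeping only the first top scorer is one option.
# This tie-break rule is an assumption; the task does not specify one.
demo_ties$FirstOnly <- ave(demo_ties$ComprehensiveScore, demo_ties$Department,
                           FUN = function(x) as.numeric(seq_along(x) == which.max(x)))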
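# 3. Save the generated data together with the hiring result as a new data frame on disk, in CSV or Excel format, with the file named after your own student ID and name.
# Replace the placeholder file name with your own student ID and name; row.names = FALSE keeps the row index out of the file.
write.csv(data1, file = "StudentID_Name.csv", row.names = FALSE)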
# 4. Convert operational ability into four grades: values >= 90 map to A, [80,90) to B, [70,80) to C, and everything else to D.
# Each nested ifelse() is only reached when the previous test failed, so the upper bounds are implied.
data1$Level <- ifelse(data1$OperationalAbility >= 90, "A",
                 ifelse(data1$OperationalAbility >= 80, "B",
                   ifelse(data1$OperationalAbility >= 70, "C", "D")))
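# The same grading can also be sketched with cut(), which bins a numeric vector in one call.
# demo_scores is a hypothetical vector chosen only to exercise the interval boundaries.
# right = FALSE closes the intervals on the left: [70,80), [80,90), [90,Inf).
demo_scores <- c(95, 90, 85, 80, 75, 70, 60)
demo_levels <- as.character(cut(demo_scores,
                                breaks = c(-Inf, 70, 80, 90, Inf),
                                labels = c("D", "C", "B", "A"),
                                right = FALSE))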
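# 5. For each grade, compute the mean interview score and the mean composite score, as well as the mean composite score of hired and of non-hired candidates.
# Grouping by Level and Recruitment gives one row of means per grade and hiring status.
data1_mean <- aggregate(cbind(InterviewScore, ComprehensiveScore) ~ Level + Recruitment, data = data1, mean)
View(data1_mean)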
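# A sketch of the requested averages laid out per grade, on a small hypothetical data frame
# (demo5 and its values are illustrative only; the column names mirror data1).
demo5 <- data.frame(
  Level = c("A", "A", "B", "B", "B"),
  Recruitment = c(1, 0, 1, 0, 0),
  InterviewScore = c(90, 70, 80, 60, 70),
  ComprehensiveScore = c(250, 230, 240, 200, 220)
)
# Mean interview and composite score for each grade, regardless of hiring status
by_level <- aggregate(cbind(InterviewScore, ComprehensiveScore) ~ Level, data = demo5, FUN = mean)
# Mean composite score for each grade split by hiring status (column "0" = not hired, "1" = hired)
by_status <- tapply(demo5$ComprehensiveScore, list(demo5$Level, demo5$Recruitment), mean)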
# Part 2: Choose one of R's built-in datasets and analyze the variables that interest you as follows (40 points)
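# 1. Draw different types of scatter plots for the variables, labelling the axes, title, legend and so on as fully as possible.
# Using the mtcars dataset as an example, plot mpg (miles per gallon) against disp (displacement)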
library(datasets)
data(mtcars)
plot(mtcars$mpg,mtcars$disp,
main = "Scatterplot of MPG vs. Displacement",
xlab = "Miles per Gallon",
ylab = "Displacement",
pch = 16)
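# A sketch of a second kind of scatter plot with a legend. Grouping by cylinder count (cyl)
# and the colour names are illustrative choices, not something the task prescribes.
cyl_groups <- factor(mtcars$cyl)
cyl_colors <- c("steelblue", "darkorange", "firebrick")  # one colour per cyl level (4, 6, 8)
plot(mtcars$disp, mtcars$mpg,
     col = cyl_colors[cyl_groups],  # a factor indexes by its integer codes
     pch = 16,
     main = "MPG vs. Displacement by Number of Cylinders",
     xlab = "Displacement (cu. in.)",
     ylab = "Miles per Gallon")
legend("topright", legend = levels(cyl_groups), col = cyl_colors, pch = 16, title = "Cylinders")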
# 2. Carry out basic statistical analysis of the selected variables.
# Using the mtcars dataset as an example, compute summary statistics for mpg and disp
summary(mtcars$mpg)
summary(mtcars$disp)
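# summary() reports quantiles and the mean but not spread or association, so a short sketch
# of some further descriptive statistics follows (the variable names are illustrative).
mpg_sd <- sd(mtcars$mpg)
disp_sd <- sd(mtcars$disp)
mpg_disp_cor <- cor(mtcars$mpg, mtcars$disp)  # Pearson correlation; negative for this pair
print(c(mpg_sd = mpg_sd, disp_sd = disp_sd, mpg_disp_cor = mpg_disp_cor))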
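# 3. Fit a simple or multiple regression to the data; try regressing on different variables.
# Using the mtcars dataset as an example, regress mpg on disp
fit <- lm(mpg ~ disp, data = mtcars)
summary(fit)  # coefficients, standard errors, R-squared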
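# A sketch of a multiple regression for comparison. The extra predictors wt (weight) and
# hp (horsepower) are an illustrative choice; other mtcars columns could be tried the same way.
fit_simple <- lm(mpg ~ disp, data = mtcars)  # refit here so the sketch stands alone
fit_multi <- lm(mpg ~ disp + wt + hp, data = mtcars)
summary(fit_multi)
# F-test comparing the nested models: does adding wt and hp improve on disp alone?
anova(fit_simple, fit_multi)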
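# 4. Run diagnostics on the regression: draw a residual plot, then the scatter plot of the data with the regression line.
# Residual plot: residuals against fitted values, with a reference line at zero
plot(fitted(fit), resid(fit),
     main = "Residuals vs. Fitted Values",
     xlab = "Fitted MPG",
     ylab = "Residuals",
     pch = 16)
abline(h = 0, lty = 2)
# Scatter plot with the regression line; the model is mpg ~ disp, so disp goes on the x-axis and mpg on the y-axis
plot(mtcars$disp, mtcars$mpg,
     main = "Scatterplot of MPG vs. Displacement",
     xlab = "Displacement (cu. in.)",
     ylab = "Miles per Gallon",
     pch = 16)
abline(fit, col = "red")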
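# As a further diagnostic sketch, the plot() method for lm objects draws four standard panels:
# residuals vs. fitted, normal Q-Q, scale-location, and residuals vs. leverage.
diag_fit <- lm(mpg ~ disp, data = mtcars)  # refit here so the sketch stands alone
old_par <- par(mfrow = c(2, 2))            # 2 x 2 grid; keep the previous settings
plot(diag_fit)
par(old_par)                               # restore the previous graphics settings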