R question: preparing data before a differential analysis fails with Error in data[, sampleName1] : subscript out of bounds. How can I fix it?

While running a differential analysis on microarray data, after replacing the illegal characters in data, the script reaches

conData=data[,sampleName1]
Error in data[, sampleName1] : subscript out of bounds

and stops at this step with the error shown above.
How can I fix it?
The full code is below:

rm(list = ls())
# load packages
library(limma)
library(pheatmap)
library(ggplot2)

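inputFile="GSE40611_series_matrix"     # expression data file
conFile="s1.txt"               # sample information file for the control group
treatFile="s2.txt"             # sample information file for the treatment group
logFCfilter=0.1                # logFC cutoff (logFC=0.585 is a 1.5-fold change; logFC=1 is 2-fold; logFC=2 is 4-fold)
P.Val.Filter=0.05          # cutoff for the adjusted p-value

geoID="GSE40611"                # GEO series ID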
getwd()

# read the input file and tidy it up
rt=read.table('GSE40611gene1.txt',sep = "\t",
              comment.char = "!", stringsAsFactors = F,header = T, fill=TRUE)
rt=as.matrix(rt)
rownames(rt)=rt[,1]
exp=rt[,2:ncol(rt)]
dimnames=list(rownames(exp),colnames(exp))
data=matrix(as.numeric(as.matrix(exp)),nrow=nrow(exp),dimnames=dimnames)
rt=avereps(data)

# if the data are not yet log2-transformed, apply log2 automatically
qx=as.numeric(quantile(rt, c(0, 0.25, 0.5, 0.75, 0.99, 1.0), na.rm=T))
LogC=( (qx[5]>100) || ( (qx[6]-qx[1])>50 && qx[2]>0) )
if(LogC){
  rt[rt<0]=0
  rt=log2(rt+1)}
data=normalizeBetweenArrays(rt)

# read the sample information files (control and treatment groups)
sample1=read.table("s1.txt",sep = "\t",
                   comment.char = "!", stringsAsFactors = F,header = T, fill=TRUE)
sample2=read.table("s2.txt",sep = "\t",
                   comment.char = "!", stringsAsFactors = F,header = T, fill=TRUE)
sampleName1=gsub("^ | $", "", as.vector(sample1[,1]))
sampleName2=gsub("^ | $", "", as.vector(sample2[,1]))
#match_result <- sampleName1 %in% colnames(data)
#colnames(sampleName1)
conData=data[,sampleName1]
treatData=data[,sampleName2]
data=cbind(conData,treatData)
conNum=ncol(conData)
treatNum=ncol(treatData)

Answer by 专家-赤兔, 2024-03-29 17:29 (quoted from the free WeChat mini-program 皆我百晓生):

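In R, the error Error in data[, sampleName1] : subscript out of bounds can be tracked down with the following steps.

First, make sure limma and ggplot2 are installed before loading them. limma is distributed through Bioconductor rather than CRAN, so install it with BiocManager:

install.packages(c("BiocManager", "ggplot2"))
BiocManager::install("limma")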

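Then confirm that every entry of sampleName1 is the name of a column in data. When a matrix is indexed by character names, R raises subscript out of bounds as soon as one of those names is absent from colnames(data), so the error means that at least one sample name in s1.txt does not match any column name.

sampleName1 is a character vector of sample names, not a position inside data. To locate the mismatch, compare it with colnames(data): setdiff(sampleName1, colnames(data)) lists the names that are missing, and the commented-out check in the question, sampleName1 %in% colnames(data), flags each name as found or not found.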
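The comparison against colnames(data) can be sketched on a toy matrix. The matrix toy and the sample names below are made up for illustration and are not taken from the GSE40611 data; only base R functions are used.

```r
# Toy expression matrix: 3 genes x 3 samples (hypothetical names)
toy <- matrix(1:9, nrow = 3,
              dimnames = list(c("g1", "g2", "g3"),
                              c("GSM1", "GSM2", "GSM3")))

# One requested name ("GSM9") is not a column of toy
wanted <- c("GSM1", "GSM9")

# Indexing by a missing name reproduces the error from the question;
# tryCatch() captures the message instead of stopping the script
err <- tryCatch(toy[, wanted], error = function(e) conditionMessage(e))
print(err)

# Which requested names are present, and which are not?
present <- wanted %in% colnames(toy)
not_found <- setdiff(wanted, colnames(toy))
print(present)
print(not_found)
```

Running the same two checks with sampleName1 and colnames(data) should point at the offending sample names directly.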

A frequent cause is stray whitespace around the names. The gsub("^ | $", "", ...) call in the question removes at most one space from each end of a name and leaves tabs untouched. trimws() removes all leading and trailing whitespace:

sampleName1 <- trimws(sampleName1)

Apply the same cleanup to the second group:

sampleName2 <- trimws(sampleName2)

If whitespace was the only difference, the cleaned names now match colnames(data) exactly and can be used to select the columns for the comparison.
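To see why whitespace handling matters here, the following sketch compares the gsub() pattern used in the question with base R's trimws(). The sample names in raw are invented for the demonstration.

```r
# Hypothetical sample names with messy whitespace
raw <- c(" GSM1", "GSM2  ", "\tGSM3", "GSM4")

# The pattern from the question strips at most one space per end
# and does not touch tabs
by_gsub <- gsub("^ | $", "", raw)

# trimws() strips all leading and trailing spaces, tabs and newlines
by_trimws <- trimws(raw)

print(by_gsub)
print(by_trimws)

# Names that still differ from their fully trimmed form after gsub()
print(raw[by_gsub != by_trimws])
```

Any name left with hidden whitespace will fail to match colnames(data), which is enough to trigger the error.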

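If that does not solve the problem, the column names of data probably differ from the names in the sample files. The error is unrelated to negative subscripts. A likely cause is the import itself: read.table() with header = T converts header entries into syntactically valid R names by default (check.names = TRUE), so a name that contains - or starts with a digit no longer equals the name listed in s1.txt. Reading the expression file with check.names = FALSE keeps the original names.

Corrected code example:

not_found <- setdiff(sampleName1, colnames(data))   # names in s1.txt that are not columns of data
sampleName1 <- intersect(sampleName1, colnames(data))   # keep only the names that exist in data

With only matching names left, data[, sampleName1] no longer raises subscript out of bounds. Inspect not_found first, because dropping names silently shrinks the control group.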
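One further cause worth ruling out is that read.table() rewrites header entries that are not syntactically valid R names, so the names in the sample files no longer match colnames(data). The sketch below shows this on a small made-up table; the header names are hypothetical.

```r
# A tiny tab-separated table whose header has names R considers invalid
# (hypothetical data: the second name contains "-", the third starts with a digit)
txt <- "ID\tGSM-1\t2sample\ngeneA\t1.5\t2.5\ngeneB\t3.5\t4.5"

# Default import: header names are passed through make.names()
default_import <- read.table(text = txt, sep = "\t", header = TRUE)

# check.names = FALSE keeps the header exactly as written in the file
raw_import <- read.table(text = txt, sep = "\t", header = TRUE,
                         check.names = FALSE)

print(colnames(default_import))
print(colnames(raw_import))
```

If the names in s1.txt and s2.txt are the original ones, adding check.names = FALSE to the read.table() call for the expression file is one way to keep both sides consistent.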

Save the code as an .R file (for example, diff_analysis.R) and run it with:

source("diff_analysis.R")

The script defines no functions; it runs from top to bottom. It reads the expression table, averages rows that share a name with avereps(), applies log2 if the values are not yet on a log scale, normalizes the arrays with normalizeBetweenArrays(), and then reads the two sample lists and selects the matching columns into conData and treatData.

If some sample names still cannot be found in colnames(data), print them and correct them by hand, either in s1.txt and s2.txt or in the header of the expression file.

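Note: before running the code, make sure limma, ggplot2 and pheatmap, the three packages the script loads, are installed. No environment variable has to be set for this, because library() looks packages up in R's library paths (see .libPaths()). Missing packages can be installed with:

install.packages(c("BiocManager", "ggplot2", "pheatmap"))
BiocManager::install("limma")

Then load the packages. Package names are case-sensitive, so write limma and ggplot2 in lowercase:

library(limma)
library(pheatmap)
library(ggplot2)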

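The question's script does not define an analyze_diff function, so once the names match, continue with the script's own subsetting step:

conData=data[,sampleName1]
treatData=data[,sampleName2]
data=cbind(conData,treatData)

When these lines run without error, conData and treatData hold the control and treatment columns, and the differential analysis that uses logFCfilter and P.Val.Filter, together with plots such as a heatmap, can proceed on the combined data.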
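As a possible safeguard for the subsetting step, the sketch below wraps it in a small helper that reports the missing names instead of failing with subscript out of bounds. The function subset_samples and the toy matrix are hypothetical and are not part of limma or of the question's script.

```r
# Hypothetical helper: select columns by sample name and fail with a
# message that lists the names that could not be found
subset_samples <- function(mat, sample_names) {
  sample_names <- trimws(sample_names)               # drop stray whitespace
  not_found <- setdiff(sample_names, colnames(mat))  # names absent from mat
  if (length(not_found) > 0) {
    stop("samples not found in data: ", paste(not_found, collapse = ", "))
  }
  # drop = FALSE keeps a matrix even when only one sample is selected,
  # so ncol() still works on the result
  mat[, sample_names, drop = FALSE]
}

# Toy matrix with made-up gene and sample names
toy <- matrix(1:6, nrow = 2,
              dimnames = list(c("g1", "g2"), c("GSM1", "GSM2", "GSM3")))

# A trailing space in "GSM1 " is cleaned up before matching
con <- subset_samples(toy, c("GSM1 ", "GSM3"))
print(colnames(con))

# A name that does not exist produces a readable error message
msg <- tryCatch(subset_samples(toy, "GSM9"),
                error = function(e) conditionMessage(e))
print(msg)
```

In the question's script this would replace the two direct indexing lines, for example conData <- subset_samples(data, sampleName1).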
Question events: created on March 29; a 15-yuan bounty was sponsored on March 29; closed by the system on April 6.