Lotus@ 2012-08-30 09:27

How to write tryCatch in R

I want to write tryCatch code to deal with errors when downloading from the web.

url <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz")
y <- mapply(readLines, con=url)

These two statements run successfully. Below, I create a non-existent web address:

url <- c("xxxxx", "http://en.wikipedia.org/wiki/Xz")

url[1] does not exist. How does one write a tryCatch loop (function) so that:

  1. When the URL is wrong, the output will be: "web URL is wrong, can't get".
  2. When the URL is wrong, the code does not stop, but continues to download until the end of the list of URLs?

Reposted from: https://stackoverflow.com/questions/12193779/how-to-write-trycatch-in-r

  • 叼花硬汉 2013-10-17 16:04

    Well then: welcome to the R world ;-)

    Here you go

    Setting up the code

    urls <- c(
        "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
        "http://en.wikipedia.org/wiki/Xz",
        "xxxxx"
    )
    readUrl <- function(url) {
        out <- tryCatch(
            {
                # To evaluate more than one R expression in the "try" part,
                # wrap them in curly brackets. If the "try" part completes
                # successfully, tryCatch() returns its last evaluated
                # expression.

                message("This is the 'try' part")

                readLines(con=url, warn=FALSE)
                # The return value of readLines() is what tryCatch() returns
                # when no condition (e.g. warning or error) is signaled.
                # No explicit return() is needed here, because the "try" part
                # is not wrapped inside a function (unlike the condition
                # handlers for warnings and errors below).
            },
            error=function(cond) {
                message(paste("URL does not seem to exist:", url))
                message("Here's the original error message:")
                message(conditionMessage(cond))
                # Choose a return value in case of error
                return(NA)
            },
            warning=function(cond) {
                message(paste("URL caused a warning:", url))
                message("Here's the original warning message:")
                message(conditionMessage(cond))
                # Choose a return value in case of warning
                return(NULL)
            },
            finally={
                # Everything here is executed at the end, regardless of
                # success or error. To run more than one expression, wrap
                # them in curly brackets ({...}); otherwise you could just
                # write 'finally=<expression>'.
                message(paste("Processed URL:", url))
                message("Some other message at the end")
            }
        )
        return(out)
    }
    

    Applying the code

    > y <- lapply(urls, readUrl)
    Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
    Some other message at the end
    Processed URL: http://en.wikipedia.org/wiki/Xz
    Some other message at the end
    URL does not seem to exist: xxxxx
    Here's the original error message:
    cannot open the connection
    Processed URL: xxxxx
    Some other message at the end
    Warning message:
    In file(con, "r") : cannot open file 'xxxxx': No such file or directory
    

    Investigating the output

    > head(y[[1]])
    [1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"      
    [2] "<html><head><title>R: Functions to Manipulate Connections</title>"      
    [3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
    [4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"             
    [5] "</head><body>"                                                          
    [6] ""    
    
    > length(y)
    [1] 3
    
    > y[[3]]
    [1] NA
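
    Once y is built this way, failed downloads can be told apart from successful ones by testing for the NA sentinel. A minimal sketch, assuming the error handler returns NA as above; the list y_demo and the helper is_failed are hypothetical stand-ins so the snippet runs without network access:

```r
# Hypothetical stand-in for the result of lapply(urls, readUrl):
# two "downloads" that worked and one that hit the error handler.
y_demo <- list(
    c("<html>", "</html>"),
    c("<!DOCTYPE html>", "<html>", "</html>"),
    NA
)

# A failed element is exactly a single NA; a successful one is a
# character vector of lines.
is_failed <- function(x) length(x) == 1L && is.na(x)

failed <- vapply(y_demo, is_failed, logical(1))
which(failed)              # index of the failed download(s)
good <- y_demo[!failed]    # keep only the successful results
length(good)
```

    Testing length(x) == 1L first keeps the check to a single TRUE/FALSE even when an element holds many lines.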
    

    Additional remarks

    tryCatch

    tryCatch returns the value of evaluating expr unless an error or a warning occurs. When one does, the return value comes from the corresponding handler function instead (see return(NA) above, and the arguments error and warning in ?tryCatch). Handlers can be functions that already exist, or you can define them within tryCatch() (as I did above).
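
    As a small illustration of passing an existing function as a handler, here is a sketch. The names onError and safeLog are hypothetical, and log() of a character string is used only because it fails reliably without any network access:

```r
# Hypothetical reusable handler: report the problem, return NA.
onError <- function(cond) {
    message("Caught an error: ", conditionMessage(cond))
    NA
}

# Hypothetical wrapper that passes the existing function as the handler.
safeLog <- function(x) {
    tryCatch(log(x), error = onError)
}

ok  <- safeLog(1)      # no condition: the value of log(1) is returned
bad <- safeLog("a")    # log("a") is an error: the handler's value is returned
```

    Defining the handler once keeps the message and the fallback value consistent across every call that uses it.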

    The implications of choosing specific return values of the handler functions

    As we've specified that NA should be returned in case of an error, the third element in y is NA. If we had chosen NULL as the return value instead, y would still have length 3, because lapply() returns one element per input; the third element would simply be NULL, and it would drop out only if you later flatten or filter the list (e.g. with unlist()). Also note that if you don't specify an explicit return value via return(), the handler functions above return the value of their last expression, here message(), which is NULL.
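
    How lapply() treats a NULL return value is easy to check directly. In this sketch (f is a hypothetical stand-in for a handler-wrapped function), the result still has one element per input, and the NULL has to be filtered out explicitly:

```r
# Hypothetical worker: returns NULL for the third input, a value otherwise.
f <- function(i) if (i == 3) NULL else i * 10

res <- lapply(1:3, f)
length(res)            # lapply() returns one element per input
is.null(res[[3]])      # the NULL is kept as a list element

# Dropping the NULL entries has to be done explicitly, e.g. with Filter():
kept <- Filter(Negate(is.null), res)
length(kept)
```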

    "Undesired" warning message

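    The warn argument of readLines() only controls warnings about the file content (a missing final EOL or embedded nuls), so warn=FALSE has no effect on the warning raised when the connection cannot be opened. An alternative way to suppress that warning (which in this case isn't really of interest) is to use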

    suppressWarnings(readLines(con=url))
    

    instead of

    readLines(con=url, warn=FALSE)
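
    The effect of suppressWarnings() inside tryCatch() can be seen without a network connection. The sketch below assumes a hypothetical function flaky() that, like a failed connection, first signals a warning and then an error; handle() is likewise a hypothetical wrapper. When a warning handler is present, the first condition signaled decides which handler runs:

```r
# Hypothetical function: signals a warning first, then an error.
flaky <- function() {
    warning("cannot open file")
    stop("cannot open the connection")
}

# Hypothetical wrapper with both a warning and an error handler.
handle <- function(expr) {
    tryCatch(
        expr,
        warning = function(w) "warning handler",
        error   = function(e) "error handler"
    )
}

# The warning is signaled first, so the warning handler runs.
r1 <- handle(flaky())

# With the warning muffled, evaluation continues to the error.
r2 <- handle(suppressWarnings(flaky()))
```

    So wrapping the call in suppressWarnings() not only hides the warning text, it also lets the error handler (and its NA return value) take effect when a warning handler is defined.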
    

    Multiple expressions

    Note that you can also place multiple expressions in the "actual expressions part" (argument expr of tryCatch()) if you wrap them in curly brackets (just like I illustrated in the finally part).

    This answer was selected as the best answer by the asker.