Document doc = Jsoup.connect(website).get();
where website = "http://www.huxiu.com/photo".
The URL itself can be opened without problems.
But fetching and parsing it with Jsoup throws this error:
org.jsoup.HttpStatusException: HTTP error fetching URL. Status=404, URL=http://m.huxiu.com/photo
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:435)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:446)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:410)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:164)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:153)
at com.coship.crawler.crawler.parser.huxiu.HuxiuHomeProcessor.processor(HuxiuHomeProcessor.java:38)
at com.coship.crawler.crawler.work.FetchWorker.startDealJob(FetchWorker.java:76)
at com.coship.crawler.crawler.work.FetchWorker.run(FetchWorker.java:37)
at java.lang.Thread.run(Thread.java:662)
So here is the question: the URL I passed in is clearly "http://www.huxiu.com/photo", so how did it turn into "http://m.huxiu.com/photo"?
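
One way to narrow this down: Jsoup follows redirects by default, so the URL in the exception is the one that was actually fetched last, not necessarily the one passed to connect(). A plausible cause (an assumption, not confirmed by the trace) is that the server redirects clients it does not recognise as desktop browsers to the mobile host, based on the User-Agent header. The sketch below uses only the JDK to send a single request without following redirects and print the Location header for two different User-Agent strings. The class name RedirectProbe, the method redirectTarget, and both User-Agent values are made up for illustration.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class RedirectProbe {

    /**
     * Sends one GET request without following redirects and returns the
     * Location header of a 3xx response, or null if there was no redirect.
     */
    public static String redirectTarget(String url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try {
            // Stop at the first response so the redirect itself is visible.
            conn.setInstanceFollowRedirects(false);
            conn.setRequestMethod("GET");
            conn.setRequestProperty("User-Agent", userAgent);
            conn.setConnectTimeout(10000);
            conn.setReadTimeout(10000);

            int status = conn.getResponseCode();
            if (status >= 300 && status < 400) {
                // Redirect responses carry the target in the Location header.
                return conn.getHeaderField("Location");
            }
            return null;
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        String website = "http://www.huxiu.com/photo";
        // Illustrative User-Agent strings (assumptions, not taken from the post).
        String javaLikeUa = "Java/1.6.0";
        String desktopUa = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)";

        System.out.println("Java-like UA -> " + redirectTarget(website, javaLikeUa));
        System.out.println("desktop UA   -> " + redirectTarget(website, desktopUa));
    }
}
```

If the two lines differ, with only the first pointing at m.huxiu.com, the User-Agent is what triggers the redirect. In that case the Jsoup counterpart would be to set a browser-like value via Jsoup.connect(website).userAgent(...) before calling get(); followRedirects(false) is also available on the Jsoup connection if you want to inspect the redirect there.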