public static String getData() throws HttpException, IOException {
    HttpClient client = new HttpClient();
    GetMethod getMethod = new GetMethod("/Default.aspx?Page=MemDirsCompGroup&cGroupID=11&p=1");
    client.getHostConfiguration().setHost("www.iranrd.net", 80, "http");
    System.out.println("charset=>" + getMethod.getResponseCharSet());
    client.executeMethod(getMethod);
    try {
        InputStream in;
        in = getMethod.getResponseBodyAsStream();
        BufferedReader br = new BufferedReader(new InputStreamReader(in, "ISO-8859-1"));
        String tempbf;
        StringBuffer res = new StringBuffer(500);
        while ((tempbf = br.readLine()) != null) {
            res.append(tempbf + "\n");
        }
        System.out.println("Response = " + res.toString());
        getMethod.releaseConnection();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
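A question for everyone: the Response object of getMethod reports ISO-8859-1 as its character encoding, but the data I read with that encoding still comes out garbled. Looking at the returned data, I found this line in it:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />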
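To illustrate the symptom, here is a minimal sketch, assuming the page bytes really are UTF-8 as the meta tag claims. The class name MojibakeDemo and the sample Persian word are made up for the demonstration and are not taken from the site. It shows that decoding UTF-8 bytes as ISO-8859-1 yields garbled text, and that the garbling is reversible because ISO-8859-1 maps every byte to exactly one char.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode bytes with the wrong charset (ISO-8859-1), like the InputStreamReader in the question.
    static String decodeAsLatin1(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 is a one-byte-per-char mapping, so the original bytes can be
    // recovered from the garbled string and decoded again as UTF-8.
    static String repair(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical page content: a four-letter Persian word.
        String original = "\u0633\u0644\u0627\u0645";
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        String garbled = decodeAsLatin1(wire);
        System.out.println("garbled equals original:  " + garbled.equals(original));
        System.out.println("repaired equals original: " + repair(garbled).equals(original));
    }
}
```

If this is what is happening here, passing "UTF-8" instead of "ISO-8859-1" to the InputStreamReader would likely be enough, but that is a guess based only on the meta tag.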
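Quote:
For files such as XML or HTML, the author is allowed to specify the encoding directly in the page. For example, HTML may contain a tag like <meta http-equiv="Content-Type" content="text/html; charset=gb2312"/>, and XML may contain a declaration like <?xml version="1.0" encoding="gb2312"?>. In these cases the declared encoding may conflict with the encoding information returned in the HTTP header, and the user has to decide which encoding is the real one. Source: http://www.ibm.com/developerworks/cn/opensource/os-httpclient/#ibm-pcon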
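One way to handle the conflict described in the quote could be sketched as follows. This is only an illustration: the class CharsetSniffer, its method names, and the 1024-byte scan window are my own assumptions, and preferring the in-page declaration over the HTTP header is a policy choice for this case, not a general rule. The idea is to read the body as raw bytes, look for a charset or encoding declaration near the start, and fall back to the header value if none is found.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CharsetSniffer {
    // Matches charset=... in a meta tag or encoding="..." in an XML declaration.
    // Deliberately simple; a real HTML parser would be more robust.
    private static final Pattern DECLARED = Pattern.compile(
            "(?:charset|encoding)\\s*=\\s*[\"']?([A-Za-z0-9][A-Za-z0-9_.:\\-]*)",
            Pattern.CASE_INSENSITIVE);

    // Returns the charset name declared in the first bytes of the body, or null.
    static String declaredCharset(byte[] body) {
        int n = Math.min(body.length, 1024); // assumed scan window
        // ISO-8859-1 never fails to decode, and the declaration itself is ASCII.
        String head = new String(body, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = DECLARED.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    // Decodes the body, preferring the in-page declaration over the header value.
    static String decode(byte[] body, String headerCharset) {
        String name = declaredCharset(body);
        if (name == null) {
            name = headerCharset;
        }
        Charset cs;
        try {
            cs = Charset.forName(name);
        } catch (IllegalArgumentException e) {
            // Null, illegal, or unsupported name: fall back to ISO-8859-1.
            cs = StandardCharsets.ISO_8859_1;
        }
        return new String(body, cs);
    }

    public static void main(String[] args) {
        // Hypothetical page with the same meta tag as in the question.
        String html = "<html><head><meta http-equiv=\"Content-Type\" "
                + "content=\"text/html; charset=utf-8\" /></head>"
                + "<body>\u0633\u0644\u0627\u0645</body></html>";
        byte[] body = html.getBytes(StandardCharsets.UTF_8);
        System.out.println("declared: " + declaredCharset(body));
        System.out.println("decoded correctly: " + decode(body, "ISO-8859-1").equals(html));
    }
}
```

With the HttpClient code in the question, the byte array could come from getMethod.getResponseBody() and the header value from getMethod.getResponseCharSet() called after executeMethod.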
Could everyone please take a look?
PS: The website being read is an Iranian one.