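Say I have fetched a page's HTML from its URL, and I now want to get the value 6.89 from between the tags in <span class="value" id="sku-discount-price" itemprop="price">6.89</span>. How should I write this in Java, and what is the most reasonable way to do it? I have tried writing some code myself and would appreciate feedback from the experts here. If there is a better approach, please share it.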
<div class="inf-pnl-price-detail">
  <dl>
    <dt>Price:</dt>
    <dd>
      <div class="price price-highlight">
        <del class="original-price">US $ <span class="" id="sku-price">7.66</span> <span class="separator">/</span> <span class="unit">piece</span> </del>
      </div>
    </dd>
    <dt>Discount Price:</dt>
    <dd>
      <div class="price price-highlight">
        <span class="currency" itemprop="priceCurrency" content="USD">US $</span>
        <span class="value" id="sku-discount-price" itemprop="price">6.89</span>
        <span class="separator">/</span><span class="unit"> piece </span>
        <span class="time-left">(7 days left )</span>
      </div>
    </dd>
  </dl>
</div>
The code I tried writing myself:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class TestUrl {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        String address = "http://www.aliexpress.com/item/10pcs-lot-New-arrival-Hot-sale-fashion-hoomia-jonadab-magicpencil-magic-pencil-earphones-in-earfree-shipping/848760252.html";
        StringBuilder page = new StringBuilder();
        // Read the whole page. The earlier version skipped a fixed 15555 characters
        // and stored lines in a 750-element array, which breaks as soon as the page
        // layout changes or the page has more than 750 lines.
        try (BufferedReader input = new BufferedReader(
                new InputStreamReader(new URL(address).openStream()))) {
            String line;
            while ((line = input.readLine()) != null) {
                page.append(line);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        // Drop everything up to the last itemprop="price"> and everything from the
        // following </span> onward, leaving only the text between the tags.
        String afterOpenTag = page.toString().replaceAll(".*itemprop=\"price\">", "");
        String price = afterOpenTag.replaceAll("</span>.*", "");
        System.out.println("price:" + price);
        long end = System.currentTimeMillis();
        System.out.println("time:" + (end - start));
    }
}
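One possible alternative to chaining two replaceAll calls is a single regular expression with a capture group that keys on the element's id rather than on its position in the page. The sketch below is only an illustration under assumptions: the class name PriceExtractor and the method extractPrice are made up for this example, and it assumes the id attribute is written with double quotes exactly as in the HTML above. It works on an HTML string, so it can be tried without any network access. If adding a dependency is acceptable, a real HTML parser such as jsoup would likely be more robust than any regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original post.
class PriceExtractor {
    // Matches a <span> tag whose id is sku-discount-price and captures its text.
    // [^>]* allows other attributes before and after the id within the same tag.
    private static final Pattern DISCOUNT_PRICE = Pattern.compile(
            "<span[^>]*\\bid=\"sku-discount-price\"[^>]*>\\s*([^<]+?)\\s*</span>");

    // Returns the text inside the discount-price span, or empty if it is absent.
    static Optional<String> extractPrice(String html) {
        Matcher m = DISCOUNT_PRICE.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // A fragment of the HTML from the question, used as sample input.
        String html = "<span class=\"currency\" itemprop=\"priceCurrency\" content=\"USD\">US $</span> "
                + "<span class=\"value\" id=\"sku-discount-price\" itemprop=\"price\">6.89</span>";
        System.out.println(extractPrice(html).orElse("not found")); // prints 6.89
    }
}
```

Returning an Optional makes the "tag not found" case explicit, whereas the replaceAll approach silently returns the whole page when nothing matches.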