I have the following code:
preg_match_all('/"([^"]*)"/', $json , $results);
var_dump($json);var_dump($results);die();
At this point, a dump of $json shows:
string(423) "{"http://ecx.images-amazon.com/images/I/51Lg%2Bd4cqRL._SX355_.jpg";[355,266],"http://ecx.images-amazon.com/images/I/51Lg%2Bd4cqRL._SX425_.jpg":[425,319],"http://ecx.images-amazon.com/images/I/51Lg%2Bd4cqRL._SX466_.jpg":[466,350],"http://ecx.images-amazon.com/images/I/51Lg%2Bd4cqRL._SX450_.jpg":[450,338],"http://ecx.images-amazon.com/images/I/51Lg%2Bd4cqRL.jpg":[500,375]}"
I’m trying to get the links. I’ve tried json_decode, but json_last_error() returns 4, which is a syntax error (JSON_ERROR_SYNTAX). There are no invisible characters before or after the JSON in the string. Having had no luck, I decided to try a regex instead, but the code above returns:
array(2) { [0]=> array(0) { } [1]=> array(0) { } }
Any help getting the first link would be greatly appreciated.
OK, as some of you noted, this is basically a hack to get it working no matter what. If you are interested in doing it right, here’s the full info:
$ch = curl_init("http://www.amazon.com/gp/product/B00BEL2G4C/ref=s9_wish_gw_d31_g21_i3?pf_rd_m=ATVPDKIKX0DER&pf_rd_s=desktop-1&pf_rd_r=1VPYMKFSFN5BRHD4AD3W&pf_rd_t=36701&pf_rd_p=1970559082&pf_rd_i=desktop");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch, CURLOPT_COOKIEJAR, "cookies.txt" );
curl_setopt($ch, CURLOPT_COOKIEFILE, "cookies.txt" );
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0");
$curl_scraped_page = curl_exec($ch);
$html = $html->load($curl_scraped_page);
$json = $html->find('#imageBlock', 0)->children[0]->children[0]->children[1]->children[1]->children[0]->children[2]->children[0]->children[0]->children[0]->children[0]->children[0]->attr['data-a-dynamic-image'];
$json = utf8_encode($json);
var_dump(json_decode($json));var_dump(json_last_error());die();
I know that Amazon has an API, but it is only available to affiliates, and they don’t accept under-construction websites as affiliates. So I’m just trying to get this working for now, and I will switch to the API once the site goes live and is approved for the Amazon affiliate program.
The URL is actually dynamic; I just used a static one for testing purposes. I would love to find a JSON solution, as that would be much cleaner.
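One thing that may be worth checking, sketched here as an assumption rather than a confirmed diagnosis: the attribute value might still contain HTML entities such as `&quot;`, which a browser renders as plain quotes in the var_dump output. That would explain both the JSON syntax error and the regex finding no literal `"` characters. The sample string below is hypothetical and shortened (example.com URLs, not the real data); the idea is to decode the entities first and then read the array keys as the links.

```php
<?php
// Hypothetical sample: an attribute value whose quotes are still HTML entities.
// Assumption: the real $json from the scraped page looks like this internally.
$raw = '{&quot;http://example.com/a._SX355_.jpg&quot;:[355,266],'
     . '&quot;http://example.com/a.jpg&quot;:[500,375]}';

// Convert &quot; back into " so the string becomes valid JSON.
$json = html_entity_decode($raw, ENT_QUOTES, 'UTF-8');

// Decode into an associative array: keys are URLs, values are [width, height].
$images = json_decode($json, true);

// The links are the array keys; fall back to an empty list if decoding failed.
$links = is_array($images) ? array_keys($images) : [];

if ($links === []) {
    // Still failing: print the error and the first raw bytes for inspection.
    var_dump(json_last_error_msg(), bin2hex(substr($raw, 0, 16)));
} else {
    $first = $links[0];
    echo $first, "\n";
}
```

If the entities turn out not to be the cause, the `bin2hex` branch at least shows what the string really starts with, which is hard to tell from a var_dump viewed in a browser.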