Have a project where I'm scraping a few sites with data, then outputting onto one site. To help with load times, I'm trying to rig it so once every 10 mins, my main website does a full data scrape, then stores it all into a cache folder called "cache", stored in the root folder. Then, anytime I refresh main site after that 10 mins, it pulls from the cache, making load times quite fast at that point.
Trouble is, load times haven't changed, which it really should using this method, so I'm doing something wrong. Would appreciate any help. Now I can confirm the data IS being stored in the cache, because I see the files automatically appearing there. So the issue has to be that the code is broken where specified to grab the data from cache, after it's stored every 10 minutes, it's not grabbing the data.
*part of me wonders if the issue is with how the filenames are being saved in cache, right now it seems to be random values. for ex, one is named f32dd7f0b85eb4c1be0bb9a417cc29ea553d898e.html
I'd think it needs to be saved as a specific file name. Not sure how to achieve that though. The code at the end of my php reference files seem to specify this, so not sure issue. The code that is supposed to be doing this is at the bottom of the post.
I'm really new to php, and honestly have only gotten this far through some very nice and helpful people. I'm close, but not quite there yet with this cache framework.
global.php in root folder:
<?php
$_cache_time =600; //10 minutes
$_cache_dir="./cache"; //cache dir
function deleteBlankInArray($var){
return !ctype_space($var)&&!empty($var);
}
function cache_start($filename)
{
global $_cache_dir,$_cache_time;
$cachefile = $_cache_dir.'/'.sha1($filename).'.html';
ob_start();
if(file_exists($cachefile) && (time( )-$_cache_time <
filemtime($cachefile)))
{
include($cachefile);
ob_flush();
return true;
}
return false;
}
function cache_end($filename)
{
global $_cache_dir,$_cache_time;
$cachefile = $_cache_dir.'/'.sha1($filename).'.html';
$fp = fopen($cachefile, 'w');
fwrite($fp, ob_get_contents());
fclose($fp);
ob_flush();
}
My main website, is an xhtml site. It's referencing these php pages like this:
<?php include 's&pcurrent.php';?>
<?php include 'news.php';?>
It's referencing/outputting multiple php files, which is why load times are slow, if not pulling from cache.
And lastly, this is an example of one of my php files that are being "included". This one is called litecoinchange.php
<?php
error_reporting(E_ALL^E_NOTICE^E_WARNING);
include_once "global.php";
//filename of the file
if(!cache_start("litecoinchange.php")){
$doc = new DOMDocument;
// We don't want to bother with white spaces
$doc->preserveWhiteSpace = false;
$doc->strictErrorChecking = false;
$doc->recover = true;
$doc->loadHTMLFile('https://coinmarketcap.com/');
$xpath = new DOMXPath($doc);
$query = "//tr[@id='id-litecoin']";
$entries = $xpath->query($query);
foreach ($entries as $entry) {
$result = trim($entry->textContent);
$ret_ = explode(' ', $result);
//make sure every element in the array don't start or end with blank
foreach ($ret_ as $key=>$val){
$ret_[$key]=trim($val);
}
//delete the empty element and the element is blank "
" "" "\t"
//I modify this line
$ret_ = array_values(array_filter($ret_,deleteBlankInArray));
//echo the last element
echo $ret_[7];
//filename of the file
cache_end("litecoinchange");
}
}