I am a student working on a placement for the summer. I have been given the task of handling data entry from Excel to a SQL Server database for surveys that were carried out over a number of years. The task is outlined below:
There are three tables: a main event, an individual event and an individual. An event has many individual events, and an individual event has many individuals. My code deals with only the last two tables.
I read two files: one lists all individual events, the other lists all individuals. Each individual's data tells me which individual event it is associated with.
My code reads an individual event, then looks through the second file for any associated individuals. Each line in the individuals file that is associated with the event is inserted into the proper table; every other line is written to a new file. Once the whole file has been traversed, the new file is copied over the old one, which removes the data already entered into the database.
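The pass described above can be sketched in isolation, without files or a database. This is only an illustration: the function name `partition_rows` and the sample rows are hypothetical, and it assumes the station number is in the first column, as in the full script.

```php
<?php
// Hypothetical helper: split rows into those whose first column matches
// $grid_no (case-insensitively) and those that do not.
function partition_rows(array $rows, $grid_no)
{
    $matched = [];
    $unmatched = [];
    foreach ($rows as $row) {
        if (strcasecmp($row[0], $grid_no) === 0) {
            $matched[] = $row;   // would be inserted into the database
        } else {
            $unmatched[] = $row; // would be written to the new file
        }
    }
    return [$matched, $unmatched];
}

// Made-up sample data: station number, sampling method.
$individuals = [['A1', 'quadrat'], ['b2', 'rake'], ['a1', 'rake']];

// After each event, only the unmatched rows are kept for the next event.
list($to_insert, $individuals) = partition_rows($individuals, 'A1');
```

Each event therefore scans only the rows that earlier events did not claim, which is why the file shrinks on every pass.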
This copying across has knocked a good 3 minutes off the execution time compared with simply re-reading the full individuals file again and again. Is there a better approach, though? Execution time for my sample data is ~47 seconds; ideally I'd like it lower.
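For comparison, one possible alternative (a sketch under assumptions, not something measured here) would be to read the individuals file once, group its rows in memory by lower-cased station number, and then look up each event's rows directly. The function name `group_rows_by_station` and the sample rows are hypothetical.

```php
<?php
// Hypothetical helper: index rows by their lower-cased first column so that
// each event can find its individuals without rescanning the whole list.
function group_rows_by_station(array $rows)
{
    $groups = [];
    foreach ($rows as $row) {
        $groups[strtolower($row[0])][] = $row;
    }
    return $groups;
}

// Made-up sample data: station number, sampling method.
$groups = group_rows_by_station([['A1', 'quadrat'], ['b2', 'rake'], ['a1', 'rake']]);

// Look up the rows for one event; an unknown station yields an empty list.
$key = strtolower('A1');
$rows_for_event = isset($groups[$key]) ? $groups[$key] : [];
```

This trades memory for time: the whole individuals file has to fit in memory, but it is read only once and no intermediate files are written.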
Any advice, however trivial, would be appreciated.
EDIT: This is a cut-down version of the code I am using:
<?php
//not shown:
//connect to database
//input event data
//get the id of the event
//open files
$s_handle = fopen($_FILES['surveyfile']['tmp_name'], 'r'); // open survey file
copy($_FILES['cocklefile']['tmp_name'], 'file1.csv'); // make a working copy of the cockle file
// read the first survey line
$s_csv = fgetcsv($s_handle, 0, ',');
// for each survey line: insert the individual event, then its individuals
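while ($s_csv !== false) // testing feof() here would skip a last line that has no trailing newline
{
    // append an empty trailing column, then turn missing values into empty strings
    $s_csv[] = '';
    foreach ($s_csv as &$val) // by reference, so the assignment changes the array
    {
        if (!isset($val))
            $val = '';
    }
    unset($val); // break the reference
    $grid_no = $s_csv[0];
    $sub_loc = $s_csv[1];
    /*
    .define more variables
    .*/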
    // table name spelled as in the SELECT below
    $sql = "INSERT INTO independant_event"
        ." (parent_id,grid_number,sub_location,....)"
        ." VALUES ("
        ."'{$event_id}',"
        ."'{$grid_no}',"
        //...
        .");";
    if (!odbc_exec($con, $sql))
    {
        echo "WARNING: SQL INSERT INTO independant_event FAILED. PHP.";
    }
    // get the ID of the row just inserted (only safe if nothing else inserts concurrently)
    $sql = "SELECT MAX(ind_event_id)"
        ." FROM independant_event";
    $return = odbc_exec($con, $sql);
    $ind_event_id = odbc_result($return, 1);
    // insert individuals
    $c_2 = fopen('file2.csv', 'w'); // create file c_2 to write unmatched lines to
    $c_1 = fopen('file1.csv', 'r'); // open the remaining data to read
    $c_csv = fgetcsv($c_1, 0, ','); // get the first line of data
    while ($c_csv !== false) // stop when fgetcsv() runs out of lines
    {
        for ($i = 0; $i < 9; $i++) // make sure there's a value in each column
        {
            if (!isset($c_csv[$i]))
                $c_csv[$i] = '';
        }
        // give values meaningful names
        $stat_no = $c_csv[0];
        $sample_method = $c_csv[1];
        //....
        // check whether the current line corresponds to the current station
        if (strcasecmp($stat_no, $grid_no) === 0) // case-insensitive comparison
        {
            $sql = "INSERT INTO fssbur2.cockle"
                ." (parent_id,sampling_method,shell_height,shell_width,age,weight,alive,discarded,damage)"
                ." VALUES("
                ."'{$ind_event_id}',"
                ."'{$sample_method}',"
                //...
                ."'{$damage}');";
            // write data if it corresponds
            if (!odbc_exec($con, $sql))
            {
                echo "WARNING: SQL INSERT INTO fssbur2.cockle FAILED. PHP.";
            }
        }
        else // no correspondence
        {
            fputcsv($c_2, $c_csv); // write line to the new file
        }
        $c_csv = fgetcsv($c_1, 0, ','); // get the next line in either case
    } // end while: all individuals read, and c_2 now holds the unused data
    fclose($c_1); // close files
    fclose($c_2);
    copy('file2.csv', 'file1.csv'); // copy new file over old, removing used data
    $s_csv = fgetcsv($s_handle, 0, ',');
} // end while
//close file
fclose($s_handle);
?>