My goal is to use PHP to remove an entire row from a CSV file if a duplicate value appears in a certain column, in this example the ID column. Naturally, I want to keep the first row in which a duplicated ID appears (see the example below).
I don't want to create a new CSV file; I want to open the file, remove what needs to be removed, and overwrite the current file.
I also want to store the number of removed rows in a variable.
Input (notice the duplicate ID of 3): file.csv
ID,Date,Name,Age
1,12/3/13,John Doe ,23
2,12/3/19,Jane Doe ,21
3,12/4/19,Jane Doe ,19
3,12/3/18,John Doe ,33
4,12/3/19,Jane Doe ,21
Expected output: file.csv
ID,Date,Name,Age
1,12/3/13,John Doe ,23
2,12/3/19,Jane Doe ,21
3,12/4/19,Jane Doe ,19
4,12/3/19,Jane Doe ,21
I would then also like to be able to run: echo $removedRows;
which should output: 1
How can I accomplish this?
I've managed to write the result to a new file, but I just want to overwrite the current file, and I don't know why I get the quotes (" ") around the values in the Name column:
ID,Date,Name,Age
1,12/3/13,"John Doe ",23
2,12/3/19,"Jane Doe ",21
3,12/4/19,"Jane Doe ",19
4,12/3/19,"Jane Doe ",21
With the following code:
$input_filename = 'file.csv';
// Copy the CSV file into the 'newfile' directory
copy($input_filename, 'newfile/'.$input_filename);
$output_filename = 'newfile/'.$input_filename;
$input_file = fopen($input_filename, 'r');
$output_file = fopen($output_filename, 'w');
$IDs = array();
// Read the header
$headers = fgetcsv($input_file, 1000);
fputcsv($output_file, $headers);
// Flip it so it maps column name => column index
$headers = array_flip($headers);
// Read every row
while (($row = fgetcsv($input_file, 1000)) !== false) {
    $ID = $row[$headers['ID']];
    // Do we already have this ID?
    if (isset($IDs[$ID])) {
        continue;
    }
    // Mark this ID as being found
    $IDs[$ID] = true;
    // Write it to the output
    fputcsv($output_file, $row);
}
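For reference, here is a minimal sketch of the behaviour I am after, not a definitive implementation. The function name removeDuplicateRows() and its parameters are hypothetical and only for illustration. It assumes the file fits in memory, that no quoted field contains an embedded line break, and that writing "\n" line endings is acceptable. As far as I can tell, the quotes in my output come from fputcsv(), which encloses any field containing a space, so this sketch only parses each line to read the ID and writes the original raw line back unchanged.

```php
<?php
// Hypothetical helper (not a built-in): drops every row whose value in
// $column has already been seen, overwrites the file in place, and
// returns the number of removed rows.
function removeDuplicateRows(string $filename, string $column = 'ID'): int
{
    // Read all lines; assumes one CSV record per physical line.
    $lines = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false || count($lines) === 0) {
        throw new RuntimeException("Cannot read $filename, or it is empty");
    }

    // Find the index of the column to check, using the header line.
    $headers = str_getcsv($lines[0], ',', '"', '\\');
    $index = array_search($column, $headers, true);
    if ($index === false) {
        throw new RuntimeException("Column $column not found in header");
    }

    $seen = [];            // IDs encountered so far
    $kept = [$lines[0]];   // always keep the header
    $removed = 0;

    foreach (array_slice($lines, 1) as $line) {
        // Parse only to extract the ID; the raw line is what gets written.
        $row = str_getcsv($line, ',', '"', '\\');
        $id = $row[$index] ?? null;

        if ($id !== null && isset($seen[$id])) {
            $removed++;    // duplicate: skip it, keep the first occurrence
            continue;
        }
        if ($id !== null) {
            $seen[$id] = true;
        }
        $kept[] = $line;   // untouched line, so no quotes are added
    }

    // Overwrite the original file with the surviving lines.
    file_put_contents($filename, implode("\n", $kept) . "\n", LOCK_EX);

    return $removed;
}
```

With the input above, calling $removedRows = removeDuplicateRows('file.csv'); followed by echo $removedRows; should print 1 and leave file.csv matching the expected output, without quotes around the Name column.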