So I have a database in which I have to search for duplicate records and it's going pretty well so far. However, since the data That I currently am using to check for duplicates has quite a few records where it is a null value. Therefore it considers them duplicates of each other while they actually aren't.
here is an example of my problem (this is not the actual representation of the database! it is merely an example)
example:
Real duplicate:
--------------------------------------------------
| id | Storename | Location | coordinates |
--------------------------------------------------
| 1 | bigPharma | Amsterdam| 51,12345 |
--------------------------------------------------
| 2 | bigPharma | Amsterdam| 51,12345 |
--------------------------------------------------
Fake duplicate:
--------------------------------------------------
| id | Storename | Location | coordinates |
--------------------------------------------------
| 3 | Jumbo |Loosdrecht| |
--------------------------------------------------
| 4 | Lidl | Limburg | |
--------------------------------------------------
You see that with the one you have coordinates and with the other they are empty but not a NULL
value.
Question: So what do I need to do to make my program detect these kinds of discrepancies?
Example of my code:
function winkel_duplicates_algemeen() {
global $conn;
$statement = $conn->prepare('SELECT
u.winkel_id,
u.winkel_filiaal_id,
u.winkel_naam,
u.winkel_filiaal_nr,
u.winkel_straat,
u.winkel_huisnummer,
u.winkel_toevoeging,
u.winkel_postcode,
u.winkel_plaats,
u.winkel_lat,
u.winkel_lng,
u.winkel_actief
FROM
winkel_adresgegevens u
INNER JOIN
(SELECT
winkel_lat, winkel_lng, COUNT(*)
FROM
winkel_adresgegevens
GROUP BY winkel_lat, winkel_lng
HAVING COUNT(*) > 1) temp ON temp.winkel_lat = u.winkel_lat
AND temp.winkel_lng = u.winkel_lng
ORDER BY winkel_lat, winkel_lng, winkel_filiaal_id;');
$statement->execute();
return $statement;
}