I'm having a really hard time with Doctrine failing to work as expected.
What my code tries to do.
I'm writing a CLI command in my Symfony 3 web application that is supposed to tidy up the tags table in my database. There are Actors and Tags, with a bidirectional Many-To-Many relation between them. The command imports a CSV file in which one column lists the current Tags and another lists their substitutes. It goes through the file line by line: it finds the existing Tag, reads all its current relations to Actors, deletes the Tag, creates a new (substitute) Tag or reuses an existing one, and attaches all the Actor relations of the deleted Tag to it.
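To make the intended outcome concrete, here is a minimal sketch of the same operation on plain PHP arrays, with no Doctrine involved. The function name replaceTag and the sample data are hypothetical; the array only stands in for the actor_tags join table and illustrates the end state the command is expected to produce.

```php
<?php
// Hypothetical stand-in for the join table: tag value => list of actor IDs.
function replaceTag(array $actorsByTag, string $source, array $replacements): array
{
    if (!array_key_exists($source, $actorsByTag)) {
        return $actorsByTag; // unknown source tag: nothing to do
    }

    $sourceActors = $actorsByTag[$source];
    $sourceKept = false;

    foreach ($replacements as $replacement) {
        $replacement = trim($replacement);
        if ($replacement === $source) {
            $sourceKept = true; // replacing a tag with itself keeps it
            continue;
        }
        $existing = $actorsByTag[$replacement] ?? [];
        // Merge without duplicates, mirroring the contains() check in the command.
        $actorsByTag[$replacement] = array_values(
            array_unique(array_merge($existing, $sourceActors))
        );
    }

    if (!$sourceKept) {
        unset($actorsByTag[$source]); // the source tag is deleted
    }

    return $actorsByTag;
}

$before = [
    'pole dancing (advanced)' => [280, 281],
    'pole dancing'            => [280],
];
$after = replaceTag($before, 'pole dancing (advanced)', ['pole dancing']);
print_r($after);
```

Here actor 280 already carries the replacement tag, so it must not be attached a second time; only 281 is new. Passing an empty replacement list corresponds to the '!' case, where the source tag is removed without a substitute.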
The code (its crucial part)
    protected function doReplace(InputInterface $input, OutputInterface $output, $input_file)
    {
        $em = $this->getContainer()->get('doctrine')->getManager();
        $con = $em->getConnection();

        //open the input CSV
        $input_fhndl = fopen($input_file, 'r');
        if (!$input_fhndl)
            throw new \Exception('Unable to open file!');

        //do everything in a big transaction, so that if anything fails
        //everything rolls back and there's no half-finished information
        //in the DB
        $con->beginTransaction();
        try
        {
            //I was trying to use official Doctrine recommendation for batch inserts
            //to clear the entity manager after a bunch of operations,
            //but it does neither help nor make things worse
            // $batchSize = 20;
            $i = 0;

            //reading the file line by line
            while (($line = fgetcsv($input_fhndl)) !== false)
            {
                //$line[0] - source tag ID (the one to be substituted)
                //$line[1] - source tag type ('language' or 'skill')
                //$line[2] - source tag value (e.g. 'pole dancing (advanced)')
                //$line[3] - replacement tag value (e.g. 'pole dancing')
                $i++;
                if ($i === 1) //omit table headers
                    continue;

                $line[3] = trim($line[3]);
                if ($line[3] === null || $line[3] === '') //omit lines with no replacements
                    continue;

                //getting the tag to be replaced
                $src_tag = $em->getRepository('AppBundle:Tag')
                    ->find($line[0]);
                if (!$src_tag)
                {
                    //if the tag that is supposed to be replaced doesn't exist, just skip it
                    continue;
                }

                $replacement_tag = null;
                $skip = false;

                //if the replacement value is '!' they just want to delete the original
                //tag without replacing it
                if (trim($line[3]) === '!')
                {
                    $output->writeln('Removing '.$line[2].' ');
                }
                //here comes the proper replacement
                else
                {
                    //there can be a few replacement values for one source tag
                    //in such case they're separated with | in the input file
                    $replacements = explode('|', $line[3]);
                    foreach ($replacements as $replacement)
                    {
                        $skip = false;
                        $output->write('Replacing '.$line[2].' with '.trim($replacement).'. ');

                        //getOrCreateTag looks for a tag with the same type and value as the replacement
                        //if it finds one, it retrieves the entity, if it doesn't it creates a new one
                        $replacement_tag = $this->getOrCreateTag($em, $src_tag->getTagType(), trim($replacement), $output);
                        if ($replacement_tag === $src_tag) //delete the original tag only if it is different from the replacement
                        {
                            $skip = true;
                        }
                        else
                        {
                            //we iterate through deleted Tag's relationships with Actors
                            foreach ($src_tag->getActors() as $actor)
                            {
                                //this part used to be the many-to-many fail point but I managed to fix it by removing indexBy: id line from Actor->Tag relation definition
                                if (!$replacement_tag->getActors() || !$replacement_tag->getActors()->contains($actor))
                                    $replacement_tag->addActor($actor);
                            }
                            $em->persist($replacement_tag);

                            //...and if I uncomment this flush()
                            //I get errors like Notice: Undefined index: 000000005f12fa20000000000088a5f2
                            //from Doctrine internals
                            //even though it should be harmless
                            // $em->flush();
                        }
                    }
                }

                if (!$skip) //delete the original tag only if it is different from the replacement
                {
                    $em->remove($src_tag);
                    $em->flush(); //this flush both deletes the original tag and sets up the new one
                                  //with its relations
                }

                // if (($i % $batchSize) === 0) {
                //     $em->flush(); // Executes all updates.
                //     $em->clear(); // Detaches all objects from Doctrine!
                // }
            }

            $em->flush(); //one final flush just in case
            $con->commit();
        }
        catch (\Exception $e)
        {
            $output->writeln('<error> Something went wrong! Rolling back... </error>');
            $con->rollback();
            throw $e;
        }

        //closing the input file
        fclose($input_fhndl);
    }

    protected function getOrCreateTag($em, $tag_type, $value, $output)
    {
        $value = trim($value);
        $replacement_tag = $em
            ->createQuery('SELECT t FROM AppBundle:Tag t WHERE t.tagType = :tagType AND t.value = :value')
            ->setParameter('tagType', $tag_type)
            ->setParameter('value', $value)
            ->getOneOrNullResult();

        if (!$replacement_tag)
        {
            $replacement_tag = new Tag();
            $replacement_tag->setTagType($tag_type);
            $replacement_tag->setValue($value);
            $output->writeln('Creating new.');
        }
        else
        {
            $output->writeln('Using existing.');
        }

        return $replacement_tag;
    }
How it fails
Even though I check $replacement_tag->getActors()->contains($actor) before adding, Doctrine tries to create a duplicate Actor-Tag relation:
[Doctrine\DBAL\Exception\UniqueConstraintViolationException]
An exception occurred while executing 'INSERT INTO actor_tags (actor_id, tag_id) VALUES (?, ?)' with params [280, 708]:
SQLSTATE[23505]: Unique violation: 7 ERROR: duplicate key value violates unique constraint "actor_tags_pkey"
DETAIL: Key (actor_id, tag_id)=(280, 708) already exists.
I managed to fix the above by removing indexBy: id from the Actor->Tag relation definition (it was there by accident).
However, even without any modifications to the code, at some point during the import I get the error below. Theoretically harmless modifications, like uncommenting the commented-out flush() call or not using the big transaction, trigger it as well:
[Symfony\Component\Debug\Exception\ContextErrorException]
Notice: Undefined index: 000000001091cbbe000000000b4818c6
Exception trace:
() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php:2907
Doctrine\ORM\UnitOfWork->getEntityIdentifier() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/Persisters/Collection/ManyToManyPersister.php:543
Doctrine\ORM\Persisters\Collection\ManyToManyPersister->collectJoinTableColumnParameters() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/Persisters/Collection/ManyToManyPersister.php:473
Doctrine\ORM\Persisters\Collection\ManyToManyPersister->getDeleteRowSQLParameters() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/Persisters/Collection/ManyToManyPersister.php:77
Doctrine\ORM\Persisters\Collection\ManyToManyPersister->update() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/UnitOfWork.php:388
Doctrine\ORM\UnitOfWork->commit() at /src/__sources/atm/vendor/doctrine/orm/lib/Doctrine/ORM/EntityManager.php:359
Doctrine\ORM\EntityManager->flush() at /src/__sources/atm/src/AppBundle/Command/AtmReplaceTagsCommand.php:176
AppBundle\Command\AtmReplaceTagsCommand->doReplace() at /src/__sources/atm/src/AppBundle/Command/AtmReplaceTagsCommand.php:60
AppBundle\Command\AtmReplaceTagsCommand->execute() at /src/__sources/atm/vendor/symfony/symfony/src/Symfony/Component/Console/Command/Command.php:262
Symfony\Component\Console\Command\Command->run() at /src/__sources/atm/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php:848
Symfony\Component\Console\Application->doRunCommand() at /src/__sources/atm/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php:190
Symfony\Component\Console\Application->doRun() at /src/__sources/atm/vendor/symfony/symfony/src/Symfony/Bundle/FrameworkBundle/Console/Application.php:80
Symfony\Bundle\FrameworkBundle\Console\Application->doRun() at /src/__sources/atm/vendor/symfony/symfony/src/Symfony/Component/Console/Application.php:121
Symfony\Component\Console\Application->run() at /src/__sources/atm/bin/console:28
Doing $em->clear() every few rows doesn't help.
What I tried
- I tried changing the flush() call sequence, which often resulted in that weird undefined index error.
- I tried commenting out the big transaction (to no avail).
- I tried to call $em->clear() after each 20 records - that also didn't change anything at all.
I would really appreciate any help with this.
Additional info
YAML definition of Actor->Tag relation (for Actor entity):
    manyToMany:
        tags:
            targetEntity: AppBundle\Entity\Tag
            inversedBy: actors
            #indexBy: id
            #the above line caused the Many-To-Many duplicate problem - commenting it out fixed that part of the problem.
            joinTable:
                name: actor_tags
                joinColumns:
                    actor_id:
                        referencedColumnName: id
                inverseJoinColumns:
                    tag_id:
                        referencedColumnName: id
YAML definition of Tag->Actor relation (for Tag entity):
    manyToMany:
        actors:
            targetEntity: AppBundle\Entity\Actor
            mappedBy: tags
Tag::addActor() function definition
    public function addActor(\AppBundle\Entity\Actor $actor)
    {
        $this->actor[] = $actor;
        $actor->addTag($this);
        return $this;
    }
Actor::addTag() function definition
    public function addTag(\AppBundle\Entity\Tag $tag)
    {
        $this->tags[] = $tag;
        $this->serializeTagIds();
        return $this;
    }
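For comparison, here is a minimal sketch of a pair of adders that keep both sides of a bidirectional relation in sync and refuse to add the same object twice. It is not the code from my entities: SketchActor and SketchTag are hypothetical classes, plain arrays stand in for Doctrine collections, and serializeTagIds() is left out. Whether a missing guard like this is related to the errors above is only an assumption.

```php
<?php
// Hypothetical stand-ins for the Actor and Tag entities (no Doctrine involved).
final class SketchActor
{
    private $tags = [];

    public function addTag(SketchTag $tag): self
    {
        // Strict in_array() compares object identity, so each tag is stored once.
        if (!in_array($tag, $this->tags, true)) {
            $this->tags[] = $tag;
            $tag->addActor($this); // keep the other side in sync
        }
        return $this;
    }

    public function getTags(): array
    {
        return $this->tags;
    }
}

final class SketchTag
{
    private $actors = [];

    public function addActor(SketchActor $actor): self
    {
        // The guard also stops the mutual calls from recursing forever.
        if (!in_array($actor, $this->actors, true)) {
            $this->actors[] = $actor;
            $actor->addTag($this);
        }
        return $this;
    }

    public function getActors(): array
    {
        return $this->actors;
    }
}

$actor = new SketchActor();
$tag = new SketchTag();
$tag->addActor($actor);
$tag->addActor($actor); // second call is a no-op on both sides
```

With the guards in place, adding the same pair twice leaves exactly one entry on each side, regardless of which adder is called first.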
If you need any additional info just ask. Thank you very much.