Optimize
You could optimize your process without using Doctrine's result caches:
First create a map of years to their ids like so:
$yearsMap = array();
$q = $em->createQuery('SELECT y.id, y.year_name FROM Entities\Year y');
foreach ($q->getScalarResult() as $row) {
$yearsMap[$row['year_name']] = $row['id'];
}
Also create a map of departments to their ids, and division to their ids.
This will result in 3 (light) queries.
The best place to put this code is in a (custom) repository.
Next you can run your loop, but "get" the actual entities like this:
$year = $this->em->getReference('Entities\Year', $yearsMap[$this->year[$i]]);
$department = $this->em->getReference('Entities\Department', $departmentsMap[$this->branch[$i]]);
$division = $this->em->getReference('Entities\Division', $divisionsMap[$this->division[$i]]);
I say "get", because getReference()
actually creates a proxy (unless it was already loaded by the entity-manager, but in this case it probably isn't). That proxy won't be loaded yet, so no queries are performed here.
The rest of your code doesn't need changes.
Now when flush()
is called, Doctrine will load each distinct year/department/division only once. This could still result in a few queries, depending on how many different years/departments/divisions are used. So if all 100 students use different years/departments/divisions, you'll end up with 403 queries (3 for the maps, 300 for loading proxies, 100 for inserting students). But if all 100 students use the same year/department/division, you'll end up with only 106 queries (3 for the maps, 3 for loading proxies, 100 for inserting students).
Optimize another way
Another way to go is to use the names you've gathered to fetch all the entities you need:
$q = $em->createQuery('SELECT y FROM Entities\Year y INDEX BY y.year_name WHERE y.year_name IN(:years)');
$q->setParameter('years', $yearNames);
$yearsMap = $q->getResult();
You now have all Year entities you need with only 1 query. You can do the same for departments and divisions.
Also note the INDEX BY
in the DQL statement: This will make sure you'll get an array with year_name
as key and the entity as value. You can use this straight away in your loop like so:
$year = $yearsMap[$this->year[$i]];
$department = $departmentsMap[$this->branch[$i]];
$division = $divisionsMap[$this->division[$i]];
The end result for 100 students will always be 103 queries (3 for the maps, 100 for inserting students).
Cache
When you need to run this loop relatively often and it strains the database, it's wise to use Doctrine's result cache. A couple of things to note though:
getReference()
doesn't support result caches (yet), and result caches aren't used automatically. So I suggest you put something like this in a repository:
public function findOneYearByName($name)
{
$q = $em->createQuery('SELECT y FROM Entities\Year y WHERE y.year_name = :year');
$q->setParameter('year', $name);
$q->useResultCache(true);
return $q->getSingleResult();
}
You'd probably want to configure the result cache, see the docs about that.
Another note is that the result cache will cache the result fetched from the database, before it's hydrated. So even when using result caches, the actual entities are hydrated every time. I therefor still recommend to use maps, but implemented slightly different:
$yearsMap = array();
$departmentsMap = array();
$divisionsMap = array();
forloop (...):
if (!isset($yearsMap[$this->year[$i]])) {
$yearsMap[$this->year[$i]] = $this->em->getRepository('Entities\Year')->findOneYearByName($this->year[$i]);
}
if (!isset($departmentsMap[$this->branch[$i]])) {
$departmentsMap[$this->branch[$i]] = $this->em->getRepository('Entities\Department')->findOneDepartmentByName($this->branch[$i]);
}
if (!isset($divisionsMap[$this->division[$i]])) {
$divisionsMap[$this->division[$i]] = $this->em->getRepository('Entities\Division')->findOneDivisionByName($this->division[$i]);
}
$year = $yearsMap[$this->year[$i]];
$department = $departmentsMap[$this->branch[$i]];
$division = $divisionsMap[$this->division[$i]];
This will make sure each distinct year/department/division is hydrated only once.
PS: Using a result cache for "Optimize another way" won't work as efficient, because the names of the years/departments/divisions are likely to be different each time you run your loop. With every change of the names the queries change, and cached results can't be used.
DBAL
Can I get IDs for year, department and division from their name directly while inserting data?
You can, but you won't be using the ORM, but only DBAL. You basically do this:
$connection = $em->getConnection();
$statement = $conn->executeQuery('insert query', array('parameter1', 'etc'));
$statement->execute();
I doubt this will be more efficient, because MySQL (or whatever vendor you're using) will still perform those 3 (sub)queries for every insert, they just don't "go over the wire". And you don't get any help from the ORM, like managing associations, etc.
Still, you can find everything on the subject here.