Aaah, search engines. Exciting subject, but I would rather build something with internal intelligence rather than using brute force solution. Yes - checking every table/column in database is brute force and may result in sluggishness and false positives.
Let me present you with something I would use instead. With below solution each table/column worth scanning needs to be added manually, but everything else is automatic. Here's the usage:
$e = new SearchEngine();
$e->addTable('users', 'id', 'login'); // table, primary key name, column to be searched in
$e->addTable('users', 'id', 'last_name');
$e->addTable('towns', 'id', 'name');
print_r($e->search('austin')); // we search for exact match for word "austin"
And here's how it was implemented:
class SearchEngine {
protected $tables = array();
public function addTable($table, $key, $column) {
$this->tables[] = array(
'table' => $table,
'key' => $key,
'column' => $column
);
}
public function search($term) {
$q = array();
foreach ($this->tables as $t) {
list($table, $key, $column) = $t;
$q[] = "
SELECT
$key AS searched_key,
'$key' AS searched_key_name,
'$table' AS searched_table,
'$column' AS searched_column,
$column AS searched_value
FROM $table
WHERE $column = $term
";
}
$sql = implode(' UNION ', $q);
// query the database
// return results
}
} // class SearchEngine
Let's analyse example output:
searched_key | searched_key_name | searched_table | searched_column | searched_value
-------------+-------------------+----------------+-----------------+---------------
276 | id | users | login | austin
1782 | id | users | last_name | austin
71 | id | towns | name | austin
From the table above you can figure out that phrase "austin" was found in table users
, column login
(primary key 276) and column last_name
(primary key 1782). It was also found in table towns
in column name
(primary key 71);
Such search result may be sufficient for you. Or else, you can further process the list to select full row from each table:
$out = array();
foreach ($rows as $row) {
$sql = "
SELECT * FROM {$row['searched_table']}
WHERE {$row['searched_key_name']} = {$row['searched_key']}
LIMIT 1
";
// query the database
// append result to $out array
}
return $out;
This way you will end up with full search result (as opposed to intermediate results from previous table):
id: 276, login: austin, last_name: Powers, email: austin.powers@gmail.com
id: 1782, login: michael, last_name: austin, email: michael.e@gmail.com
id: 71, name: austin, state: texas, country: usa
Because current implementation is restricted to fixed comparison operator (WHERE field = value), you may want to introduce some flexibility here. If so, search operator needs to be delegated to external class and injected into search()
function:
public function search(SearchOperator $operator, $term) {
...
Then SearchOperator
needs to be taken into account by replacing WHERE condition with the below:
WHERE {$operator->toSQL($column, $term)}
Now let's focus on SearchOperator
implementation. Since operator implementation provides only one method, namely toSQL
, we don't need full class, or even abstract class. Interface will suffice in this case:
interface SearchOperator {
public function toSQL($column, $term);
} // interface SearchOperator
And let's define couple of implementations representing =
(equals) and LIKE
operators:
class Equals implements SearchOperator {
public function toSQL($column, $term) {
return "$column = '$term'";
}
} // class Equals
class Like implements SearchOperator {
public function toSQL($column, $term) {
return "$column LIKE '$term'";
}
} // class Like
Naturally, any other implementation is possible - think about classes called StartsWith, EndsWith, or DoesNotContain.
See updated solution usage:
$e = new SearchEngine();
$e->addTable('users', 'id', 'login');
$e->addTable('users', 'id', 'last_name');
$e->addTable('towns', 'id', 'name');
print_r($e->search(new Like(), 'austin%')); // here we search for columns being LIKE 'austin%'
Time to leave some final remarks:
- Above examples are incomplete. Database querying code was omitted for clarity.
- SQL used in examples does not sanitize input data. I strongly urge you to use prepared statements with bound parameters to avoid huge security risk.
- Search algorithm presented above is a naive one. Some optimisation can be done (i.e. grouping queries referring to the same table). But don't optimise prematurely - wait until it becomes a real issue.
Hoping this was helpful.