I have a table with a column 'search_text' of type text.
In that field I have values:
1. 'MyBook MyBook PDF PDF',
2. 'Example 1 Example 2 Example 3'
3. 'John Snow John Snow'
I would like to remove the duplicate words from each of these fields, keeping the first occurrence of every word.
Expected result:
1. 'MyBook PDF',
2. 'Example 1 2 3'
3. 'John Snow'
The approach I came up with goes as follows: read the field for each record, split it by space (' '), put each word into an array, run array_unique in PHP, then turn the array back into a string with join in PHP.
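For reference, a minimal sketch of that PHP approach might look like the following. The function name dedupe_words is only a placeholder, and the sketch uses preg_split on whitespace instead of a plain split on ' ', on the assumption that doubled or trailing spaces (as in row 4 of the test data) should not produce empty words.

```php
<?php
// Hypothetical helper (name is illustrative): collapse repeated words,
// keeping the first occurrence of each word in its original order.
function dedupe_words(string $text): string
{
    // Split on runs of whitespace; PREG_SPLIT_NO_EMPTY drops empty tokens
    // caused by leading, trailing, or doubled spaces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // array_unique keeps the first occurrence of each value.
    return implode(' ', array_unique($words));
}

echo dedupe_words('MyBook MyBook PDF PDF'), "\n";         // MyBook PDF
echo dedupe_words('Example 1 Example 2 Example 3'), "\n"; // Example 1 2 3
echo dedupe_words('John Snow John Snow'), "\n";           // John Snow
```

Note that array_unique compares the words case-sensitively, so 'PDF' and 'pdf' would both be kept, whereas the column's utf8_unicode_ci collation is case-insensitive.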
The thing is, this is a PHP-based solution, and I would like a MySQL solution instead. I have over 180,000 records to clean, and I don't know what impact running this in PHP would have.
I have found a solution for MS SQL: "Remove duplicate values in a cell SQL Server".
Any help is greatly appreciated.
SQL of my test data:
CREATE TABLE IF NOT EXISTS `test` (
`id` int(10) unsigned NOT NULL,
`search_text` text COLLATE utf8_unicode_ci NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=6 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `test` (`id`, `search_text`) VALUES
(1, 'MyBook MyBook PDF PDF'),
(2, 'Example 1 Example 2 Example 3'),
(3, 'John Snow John Snow'),
(4, 'test test test test formula test test test formula test test test formula test test test formula test test test formula test test test formula '),
(5, '');
ALTER TABLE `test`
ADD PRIMARY KEY (`id`);
ALTER TABLE `test`
MODIFY `id` int(10) unsigned NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=6;