I'm trying to find the best way to compare one text (max length: 300 characters) against 300,000 others with Levenshtein distance. In the end I need a webservice with a simple REST API. In the future there will be far more than 300,000 entries.
In the background I'm using a simple MySQL database. My first thought was to let MySQL do the job, and for that I found this: https://github.com/juanmirocks/Levenshtein-MySQL-UDF
But that is way too slow, so I tried implementing my idea in Java and PHP instead. This is what I got in the worst case (longest text):
MySQL: 70 seconds
Java: 45 seconds
PHP: 17 seconds
Actually PHP doesn't sound that bad, but it is not easy to build a webservice with it that loads my whole table (300,000 entries) into an array once and then just sits there waiting for requests to do the job. If I'm wrong, please let me know!
Now I'm looking for any advice or maybe a solution. I thought about writing the webservice in Go. I don't know whether it would be faster than PHP, but I could easily build a webservice with it.
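To make the idea concrete, here is a minimal sketch of what I have in mind (the endpoint name, port, and placeholder data are just assumptions): a Go program that holds all entries in memory for the lifetime of the process, computes the Levenshtein distance with the standard two-row DP, and answers GET /match?q=... with the closest entry.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// entries would be loaded once from MySQL at startup and then kept in
// memory for the lifetime of the process (placeholder data here).
var entries = []string{"hello world", "foo bar"}

// levenshtein computes the edit distance with the classic two-row DP,
// so memory stays O(len(b)) instead of a full matrix.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min(prev[j]+1, min(curr[j-1]+1, prev[j-1]+cost))
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

func min(x, y int) int {
	if x < y {
		return x
	}
	return y
}

// bestMatch scans all entries linearly and returns the closest one
// together with its distance.
func bestMatch(query string) (string, int) {
	best, bestDist := "", -1
	for _, e := range entries {
		if d := levenshtein(query, e); bestDist < 0 || d < bestDist {
			best, bestDist = e, d
		}
	}
	return best, bestDist
}

// matchHandler answers GET /match?q=... with the closest entry as JSON.
func matchHandler(w http.ResponseWriter, r *http.Request) {
	match, dist := bestMatch(r.URL.Query().Get("q"))
	json.NewEncoder(w).Encode(map[string]any{"match": match, "distance": dist})
}

func main() {
	http.HandleFunc("/match", matchHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

With the server running, `curl 'http://localhost:8080/match?q=helo+world'` would return the closest entry as JSON. Whether Go actually beats PHP on the raw distance computation would still need measuring, but the "load once, stay resident" part that is awkward in PHP is trivial here.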