dongxin9759 2012-07-07 04:06

How can I optimize my MySQL database?

I have a MySQL database that contains all the words in a standard English dictionary, which I am using to build a simple Scrabble word generator. The database is split into 26 tables, one for each letter of the alphabet. Each table contains two columns:

  • "Word" column: this column is the primary key, is of type char(12), and does not accept null values.
  • "Length" column: this column contains an unsigned tinyint value and does not accept null values.

In my application, the user enters any number of letters into a textbox (representing their tiles), and I query the database using this code:

// this is looped over 26 times, and $char is a letter between 'A' and 'Z'
// check if the user entered in character $char or a blank tile (signified by ? in app)
// this check prevents me from having to query useless tables
if (in_array($char, $lettersArray) || $blanks)
{
    // if so, select all words that have a length that's possible to make
    $query = 'SELECT Word FROM '.$char.'Words WHERE Length <= '.strlen($letters);
    $result = $db->query($query);
    $num_results = $result->num_rows;

    for ($j = 0; $j < $num_results; $j++)
    {
        // determine if it's possible to create word based on letters input
        // if so, perform appropriate code
    }
}

Everything works, but my application is slow compared to the competition, even though it runs on my local computer. (The competition is theoretical: this is a learning project I created for myself, and I doubt I'll release it on the internet.) I tried using the automatic optimization feature of phpMyAdmin, but that gave no noticeable speed increase.


  • dongshi1914 2012-07-07 04:55

    I don't think the performance problem is really the database. The structure of your data store is going to have the most significant impact on the performance of your algorithm.

    One fairly easy-to-understand approach to the problem would be to handle the problem as anagrams. You could alphabetize all of the letters in each of your words, and store that as a column with an index on it.

    word      dorw
    --------  -------
    DALE      ADEL
    LEAD      ADEL
    LED       DEL
    HELLO     EHLLO
    HELP      EHLP
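
The alphabetized column can be computed with a few lines of PHP. This is only a minimal sketch: the function name `signature` is an assumption made for illustration, and it simply upper-cases the word and sorts its letters.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original post):
// returns the letters of $word upper-cased and sorted alphabetically,
// i.e. the value that would be stored in the "dorw" column.
function signature(string $word): string
{
    $letters = str_split(strtoupper($word));
    sort($letters, SORT_STRING);
    return implode('', $letters);
}

// Reproduce the sample rows from the table above.
foreach (['DALE', 'LEAD', 'LED', 'HELLO', 'HELP'] as $word) {
    echo $word, ' => ', signature($word), PHP_EOL;
}
```

Because DALE and LEAD both map to ADEL, one indexed lookup on that value finds both words.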
    

    Then, given a set of letters, you could query the database for all matching anagrams. Just alphabetize the set of letters passed in, and run a query.

    SELECT word FROM dictionary WHERE dorw = 'AERT'
    
    RATE
    TARE
    TEAR
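
To try the lookup without a database, the same idea can be sketched in memory: group words by their sorted letters and look the tiles up by the same key. The word list and the names `signature` and `buildIndex` are assumptions for illustration. In the real application the grouping would be done by the indexed `dorw` column.

```php
<?php
// Hypothetical helper: sorted, upper-cased letters of a word.
function signature(string $word): string
{
    $letters = str_split(strtoupper($word));
    sort($letters, SORT_STRING);
    return implode('', $letters);
}

// Group words by signature, mimicking an index on the dorw column.
function buildIndex(array $words): array
{
    $index = [];
    foreach ($words as $word) {
        $index[signature($word)][] = $word;
    }
    return $index;
}

// A tiny illustrative word list, not a real dictionary.
$index = buildIndex(['RATE', 'TARE', 'TEAR', 'LEAD', 'DALE']);

// The tiles can be entered in any order; the signature normalizes them.
$tiles = 'TRAE';
print_r($index[signature($tiles)] ?? []); // RATE, TARE, TEAR
```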
    

    Then, you could query for subsets of the letters:

    SELECT word FROM dictionary WHERE dorw IN ('AER','AET','ART','ERT')
    

    If you run these queries from the largest subsets down to the smallest, the longest words are returned first.

    This isn't the most efficient approach, but it's workable.
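
The list of subsets for the `IN (...)` clause can be generated rather than written by hand. The sketch below is one possible way to do it, assuming a hypothetical helper named `subsetSignatures`. It enumerates every subset of the tiles with a bitmask, removes duplicates caused by repeated letters, and orders the result longest first.

```php
<?php
// Hypothetical helper: every distinct sorted-letter signature that can be
// formed from a subset of $tiles with at least $minLength letters,
// ordered longest first.
function subsetSignatures(string $tiles, int $minLength = 2): array
{
    $letters = str_split(strtoupper($tiles));
    sort($letters, SORT_STRING);
    $n = count($letters);
    $seen = [];
    // Each bitmask selects one subset. Because $letters is already sorted,
    // the selected letters come out in alphabetical order.
    for ($mask = 1; $mask < (1 << $n); $mask++) {
        $sig = '';
        for ($i = 0; $i < $n; $i++) {
            if ($mask & (1 << $i)) {
                $sig .= $letters[$i];
            }
        }
        if (strlen($sig) >= $minLength) {
            $seen[$sig] = true; // array keys de-duplicate repeated letters
        }
    }
    $signatures = array_keys($seen);
    // Longest first, then alphabetical so the order is predictable.
    usort($signatures, fn($a, $b) => strlen($b) <=> strlen($a) ?: strcmp($a, $b));
    return $signatures;
}

print_r(subsetSignatures('AERT', 3)); // AERT, AER, AET, ART, ERT
```

Seven tiles give at most 127 non-empty subsets, so the list stays small enough to pass as bound parameters of a prepared statement.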

    Handling a "blank" tile is going to be more work. You'd need to substitute each possible letter for it, although all 26 possibilities can be checked in one query.

    If they have the letters ABCD and a blank tile, for example:

    SELECT word FROM dictionary WHERE dorw IN ('AABCD','ABBCD', 'ABCCD'
     , 'ABCDD', 'ABCDE', 'ABCDF', ..., 'ABCDZ') 
    

    That gets more painful when you start dealing with the subsets...
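
Generating the 26 candidates for a single blank is mechanical. A minimal sketch, assuming a hypothetical helper named `blankSignatures` and exactly one blank tile:

```php
<?php
// Hypothetical helper: for tiles plus ONE blank, return the 26 sorted-letter
// signatures obtained by substituting each letter A-Z for the blank.
function blankSignatures(string $tiles): array
{
    $signatures = [];
    foreach (range('A', 'Z') as $substitute) {
        $letters = str_split(strtoupper($tiles) . $substitute);
        sort($letters, SORT_STRING);
        $signatures[] = implode('', $letters);
    }
    return $signatures;
}

$sigs = blankSignatures('ABCD');
echo count($sigs), PHP_EOL;                            // 26
echo implode(', ', array_slice($sigs, 0, 5)), PHP_EOL; // AABCD, ABBCD, ABCCD, ABCDD, ABCDE

// One "?" placeholder per signature, for use in a prepared statement.
$placeholders = implode(',', array_fill(0, count($sigs), '?'));
```

Combining this with subsets means repeating the substitution for every subset that uses the blank, which is why the candidate list grows quickly.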

    (In Crossword and Jumble puzzles, there aren't any blank tiles)

    So this may not be the most appropriate algorithm for Scrabble.


    There are other algorithms that may be more efficient, especially at returning the shorter words first.

    One approach is to build a tree.

    The root node is a "zero"-letter word. The children of the root node are the nodes for all one-letter words, each marked as to whether it represents a valid word. The children of those nodes are all possible two-letter words, again each marked as valid or not.

    That will be a lot of nodes. For words up to 12 letters in length, the total possible space is 1 + 26 + 26**2 + 26**3 + 26**4 + ... + 26**12, which is roughly 10**17 nodes.

    But you wouldn't need to store every possible node; you'd only store those branches that result in a valid word. You wouldn't have branches below ->Z->Z or ->X->Q.

    However, you would have a branch under ->X->Y->L. Even though XYL is not a word, it is the beginning of a branch leading to 'XYLOPHONE'.

    But that's a tree traversal algorithm, which is fundamentally different.
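
The tree described above is usually called a trie. The following is a rough sketch of it using nested PHP arrays, not a tuned implementation. The function names, the `'$'` end-of-word marker, and the tiny word list are all assumptions for illustration, and blank tiles are not handled.

```php
<?php
// Insert $word into a trie held in nested arrays. The '$' key is an
// assumption of this sketch: it marks a node that ends a valid word.
function trieInsert(array &$trie, string $word): void
{
    $node = &$trie;
    foreach (str_split($word) as $letter) {
        if (!isset($node[$letter])) {
            $node[$letter] = [];
        }
        $node = &$node[$letter];
    }
    $node['$'] = true;
}

// Walk the trie depth-first, spending one tile per step and recording
// every valid word reached. $tiles maps each letter to its remaining count.
function trieSearch(array $node, array $tiles, string $prefix, array &$found): void
{
    if (isset($node['$'])) {
        $found[$prefix] = true;
    }
    foreach ($tiles as $letter => $count) {
        if ($count > 0 && isset($node[$letter])) {
            $tiles[$letter]--;
            trieSearch($node[$letter], $tiles, $prefix . $letter, $found);
            $tiles[$letter]++;
        }
    }
}

$trie = [];
foreach (['LED', 'LEAD', 'DALE', 'HELP', 'HELLO'] as $word) {
    trieInsert($trie, $word);
}

$found = [];
trieSearch($trie, array_count_values(str_split('DALE')), '', $found);
$words = array_keys($found);
sort($words);
print_r($words); // DALE, LEAD, LED
```

The search only follows branches that exist in the trie, so dead ends such as ->Z->Z are never visited.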

    Selected by the asker as the best answer.