2021-01-06 06:07

Three bugs in the QSSense scoring algorithm

I've converted the Quicksilver scoring algorithm to JavaScript for various projects, and I believe I've found a few bugs in QSSense.m. (I've only learned as much Objective-C as I needed to translate the algorithm, but I'm fairly sure these bugs are real.)

Several years ago I started with this old version of the code, which used NSString. That worked well, though I didn't really have a way to verify that the JS version was returning the same scores as the Objective-C version.

Recently, I found this repo and the TestQSSense.m file. The scores in testScoreSimpleNS() from the initial commit matched the scores from my JS version, but I couldn't get it to consistently match the newer values, even after I added the /2 in score -= (matchedRange.location - strRange.location)/2; (which wasn't in the original code above). Some of the tests produced the same scores, but others didn't. As far as I could tell by inspection, they should all have been the same, so it was very confusing.

I finally just gave in and ran the ObjC code in Xcode to see what was going on. After stepping through it (and cursing the absurd complexity of simply dumping strings to the console in Cocoa/ObjC), I believe there are three issues:

  1. The line where the /2 was added to increase the scores for longer strings throws away the fractional part of the result, since it's not dividing by 2.0. I suppose that could be intentional, like calling floor() on the result, but it seemed odd since score is a float. Making the divisor a float brought more of the scores from my code into alignment with the test file.

  2. The inline buffer is filled with only part of the string, rather than all of it. This throws off the lookup of the current and previous chars, since the lookup is done from matchedRange.location, which is relative to the start of the entire string. But inlineBuffer is filled from strRange, which moves to the right with each recursion. So the check of whether the previous char is a word separator or the current char is uppercase looks at the wrong letters, which then causes the score to be adjusted incorrectly. For instance, an abbreviation matching the initial letters of words will get a lower score than it should, since the previous char in the buffer won't be a space, while it is in the original string. Fixing this in the ObjC code brought the remainder of the scores into alignment.

  3. The mask of matching letters can contain more matches than there are letters in the abbreviation, which I mentioned recently in a comment on this old PR (someplace probably no one would see it).

    Consider matching "test" against the string "t e s t tess!". The full abbreviation doesn't appear as a substring in the first loop, so it's reduced to "tes" and the function recurses. That substring is found near the end of the string, and those indices ([8, 9, 10], I think) get set in the mask. Then it looks for the remaining "t" in the rest of the string, but only finds "s!", so it has to start over and shrink the abbreviation again. Eventually, it finds the individual letters separated by spaces and returns a score. But the mask still contains those first hits that it gave up on.

    My solution was to clear the mask just before the return 0; at the end of the function. I believe that's correct, as the 0 score means the algorithm will start over with a shorter abbreviation, and that shorter bit might hit in a different place, rendering the previous hits obsolete. If the UI isn't underlining all those letters, then something else must be cleaning up the mask. (I tried clamping it to the length of the abbreviation, but I don't think that gives the correct results in all cases.)
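
Bug 1 above comes down to C's integer division. Here's a sketch in JS (JavaScript doesn't truncate division, so Math.trunc() stands in for what C does when both operands are integers; the index values are made up):

```javascript
// In C, (matchedRange.location - strRange.location) / 2 is integer
// division when both operands are integers: the fractional part is
// discarded before the result is subtracted from the float score.
const matchedLocation = 5;   // hypothetical values
const strLocation = 0;

// What the ObjC line with /2 effectively computes:
const truncatedPenalty = Math.trunc((matchedLocation - strLocation) / 2);

// What /2.0 computes (floating-point division keeps the fraction):
const floatPenalty = (matchedLocation - strLocation) / 2.0;

console.log(truncatedPenalty, floatPenalty); // 2 2.5
```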
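
Bug 2 can be shown with plain JS strings (these names and values are hypothetical stand-ins, not the real QSSense API):

```javascript
const fullString = "Foo Bar";
const strRangeLocation = 2;  // recursion has advanced into the string
const matchedLocation = 4;   // "B" matched here, an index into the FULL string

// Buggy: the lookup buffer is filled starting at strRange, but it's
// still indexed with the absolute matchedLocation, so the "previous
// char" check reads the wrong letter.
const partialBuffer = fullString.slice(strRangeLocation); // "o Bar"
const buggyPrev = partialBuffer[matchedLocation - 1];     // "a", not " "

// Fixed: fill the buffer with the whole string so absolute indices
// line up, and the space before "B" is seen as a word separator.
const fixedPrev = fullString[matchedLocation - 1];        // " "

console.log(JSON.stringify(buggyPrev), JSON.stringify(fixedPrev)); // "a" " "
```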
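
And a minimal sketch of bug 3's stale hits, using a Set as a stand-in for the NSMutableIndexSet mask (the helper function here is hypothetical, not the real code):

```javascript
// Record the indices where a substring occurs in str, starting at from.
function matchIndices(str, sub, from) {
  const at = str.indexOf(sub, from);
  return at === -1 ? null : Array.from(sub, (_, i) => at + i);
}

const mask = new Set();
const str = "t e s t tess!";

// The shortened abbreviation "tes" matches at indices 8-10...
matchIndices(str, "tes", 0).forEach((i) => mask.add(i));

// ...but the remaining "t" never appears after index 11, so this
// branch gives up and returns 0. Without clearing, [8, 9, 10] linger
// in the mask even though the eventual match uses 0, 2, 4, and 6.
const staleHits = [...mask];
console.log(staleHits); // [ 8, 9, 10 ] -- stale hits

// The proposed fix: clear the mask just before that return 0.
mask.clear();
```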

It looks like the inline buffer was added 9 years ago, so maybe it doesn't make sense to fix the behavior now. But it should be noticeable if you have two somewhat similar strings and an abbreviation that matches the start of words or uppercase letters in one, and random letters near the beginning of the other. The first match might score lower than the second when it shouldn't.

Anyway, I'm just happy to solve the mystery of why my code wasn't returning the same scores when it looked like it should. My rough test version of QSSense.m that I was using to step through the code and print out debug info is in this gist.




  • weixin_39864571 · 4 months ago

    I also tried to replicate the results of the testLongString() test in TestQSSense.m with my JS version. Part of the reason the index set kept growing as strRange.length kept increasing was bug 3 above, but the other reason is that the index set is not cleared or reinitialized between tests. That means it ends up with every hit from all the calls to QSScoreForAbbreviationWithRanges(), whereas you'd normally call it with a fresh set each time.
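
    A quick sketch of that accumulation (the scorer below is a hypothetical stand-in, not the real QSScoreForAbbreviationWithRanges()):

```javascript
// Stand-in scorer: records the first index of each abbreviation char.
function score(abbr, str, hits) {
  for (const ch of abbr) {
    const i = str.indexOf(ch);
    if (i !== -1) hits.add(i);
  }
  return hits.size;
}

// Reusing one set across calls makes old hits linger:
const shared = new Set();
score("te", "test", shared);                  // hits 0 and 1
const reused = score("st", "xxst", shared);   // old hits still in the set
const fresh = score("st", "xxst", new Set()); // independent result

console.log(reused, fresh); // 4 2
```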

  • weixin_39859052 · 4 months ago

    I've converted the Quicksilver scoring algorithm to JavaScript

    Gross. 😉 (I’ve done something similar.)

    That’s the one area of the code that I’ve never really looked at, but I’m sure you’re probably right about the bugs. I’m looking forward to digging into it. (Looks like you’ve done most of the hard work. Thanks.)

    Maybe this is a good time to incorporate the TextStart logic into the main app and make it the default (which I’ve wanted to do for a long time).

  • weixin_39864571 · 4 months ago

    I've got a JS version of this algorithm in my QuicKey Chrome extension, which lets you switch to a tab via a Quicksilver-style search on the title or URL. I realized it needed some tuning to better sort webpage titles or URLs while the user types, since they're much longer than typical OS filenames and often contain GUIDs or tokens with lots of capital letters. So I dove into trying to better understand the algorithm and added some debugging info that tried to visualize how the scoring worked.

    Recently I started extracting that algorithm into an independent library and tried to match the scores in the TestQSSense.m tests. That's when I realized there were some discrepancies with the official Quicksilver source.

    The current, very early version of that library is here (on the dev branch): https://github.com/fwextensions/quick-score/tree/dev

    Running npm run debug will execute the debug/debug-quick-score.js file, which runs through some tests similar to the range test in TestQSSense.m and outputs a sort of ASCII visualization of the algorithm as it loops through the abbreviation looking for matches:

             This excellent string tells us an interesting story
             (                     **                      )
                                   **(               **    )
    remainingScore: 0.90000
    score: 18
    score: 10.000 remaining: 0.90000 [42,47)
    score: 14.500
    remainingScore: 0.63043
    score: 24
    score: 19.150 remaining: 0.63043 [24,47)
    score: 33.650

    In that output:

        ( )  the strRange of that call
        *    hits
        +    from the beginning of the current strRange to the end of the match
        |    what's left in remainingStrRange, from the match to the end
        w    word separators before the current match
        u    uppercase letters before the current match

    If you look at the output in an IDE, the cursor's column minus 10 is the 0-based index of that char in the string.

    Anyway, it might be useful for your investigations, depending on how gross you find JavaScript. :)

    I'll have to look into that TextStart algo. Does it give better matches?

  • weixin_39859052 · 4 months ago

    I'll have to look into that TextStart algo. Does it give better matches?

    It probably depends on how your brain works, but I think so.

    It gives more weight to letters at the beginning of words and uppercase letters, so searching for “aft” would show “A Frozen Treat.jpg” before showing “Aftermath.txt”.

  • weixin_39864571 · 4 months ago

    Doesn't the original algorithm already do that? I get 0.92308 for Aftermath.txt and 0.91389 for A Frozen Treat.jpg, compared with 0.68235 for A rofzen reat.jpg, where the matches aren't at the beginning of words.

  • weixin_39859052 · 4 months ago

    I get 0.92308 for Aftermath.txt and 0.91389 for A Frozen Treat.jpg

    Well yeah, the built-in ranker will score “A Frozen Treat” higher than “A rofzen reat”, but not as high as “Aftermath”. I want contiguous letters to result in a lower score than letters at word boundaries.

  • weixin_39856630 · 4 months ago

    This is awesome! Thanks for going through the painful task of debugging, and thanks for implementing the fixes. The /2.0 vs /2 thing is definitely a bad oversight... I thought I'd chime in to apologise for my awful code from almost 6 years ago.

    Thanks again!
