weixin_39662462
2021-01-11 07:49 Views: 1

Adaptive resign threshold

The current 92-move limit for resigns doesn't work well. 1) It works terribly on games that are lost early (like ladder games): many self-play and match games are extended unnecessarily. 2) It works terribly on handicap games: if LeelaZ doesn't make a huge profit in the first 92 moves it will resign, even though people who play handicap games often give away points later in the game. Those who play high handicap often give away points early in the game, but also all the way into the endgame.

I propose adaptive resign threshold:

Set a resign threshold as normal (like 30%), and set an adaptive threshold (like 10%).

During the game, track the win %. Automatically set the resign threshold to (max win % so far - adaptive threshold).

For example, a handicap game might start at a 2% winrate, and LeelaZ will not resign. After 50 moves the winrate is 12%, and the resign threshold will be 2%. Some time later the winrate is up to 35%, and the adaptive resign threshold will be 25%. A few moves later the winrate drops back down to 26%, and the adaptive resign threshold stays at 25%. One more move, the winrate goes below 25%, and LeelaZ resigns.

On the other hand, if the winrate goes up to 41%, it will stick to the normal 30% resign threshold and not increase it further.
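The proposal so far can be sketched roughly as follows. This is only an illustration of the idea, not code from Leela Zero: the class name, method names and the choice to work in percentages are all assumptions made for the example.

```python
class AdaptiveResign:
    """Hypothetical tracker for the proposed adaptive resign threshold.

    All values are percentages (0-100). Nothing here is Leela Zero API.
    """

    def __init__(self, normal_threshold=30.0, adaptive_margin=10.0):
        self.normal_threshold = normal_threshold  # e.g. 30%
        self.adaptive_margin = adaptive_margin    # e.g. 10%
        self.max_winrate = 0.0                    # best winrate seen so far

    def threshold(self):
        # Trail the best winrate by the margin, but never go above the
        # normal threshold and never below zero.
        trailing = self.max_winrate - self.adaptive_margin
        return max(0.0, min(self.normal_threshold, trailing))

    def should_resign(self, winrate):
        # Update the running maximum first, then compare against the
        # threshold derived from it.
        self.max_winrate = max(self.max_winrate, winrate)
        return winrate < self.threshold()


if __name__ == "__main__":
    tracker = AdaptiveResign()
    # Winrates from the example above: 2%, 12%, 35%, 26%, then below 25%.
    for winrate in (2.0, 12.0, 35.0, 26.0, 24.0):
        resign = tracker.should_resign(winrate)
        print(f"winrate={winrate:5.1f}%  "
              f"threshold={tracker.threshold():5.1f}%  resign={resign}")
```

Fed the winrates from the example, the tracker only resigns on the last move, once the winrate falls under the 25% that trails the 35% peak.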

It might be necessary to add another safeguard so games do not go on forever. The adaptive resign threshold could be scaled with game length: you set not only a % but also a number of moves (like 300), and every move the adaptive resign threshold is decreased by 1/300th, so that it is eventually disabled at move 300.
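One possible reading of this safeguard, as a sketch: the 10% margin shrinks linearly with the move number, and from move 300 on only the normal threshold applies. The function names, the linear schedule and the fallback to the normal threshold are assumptions for illustration; the proposal does not pin these details down.

```python
def scaled_margin(move_number, adaptive_margin=10.0, horizon=300):
    """Margin (in %) that shrinks by adaptive_margin/horizon every move."""
    remaining = max(0.0, 1.0 - move_number / horizon)
    return adaptive_margin * remaining


def resign_threshold(max_winrate, move_number, normal_threshold=30.0,
                     adaptive_margin=10.0, horizon=300):
    """Resign threshold (in %) for a given best-so-far winrate and move."""
    if move_number >= horizon:
        # Assumed meaning of "disabled": the adaptive part no longer
        # applies, so only the normal threshold is used.
        return normal_threshold
    margin = scaled_margin(move_number, adaptive_margin, horizon)
    trailing = max_winrate - margin
    # Same clamping as without scaling: between 0 and the normal threshold.
    return max(0.0, min(normal_threshold, trailing))
```

With these defaults the margin is 10% at move 0, 5% at move 150, and gone at move 300, so a game that never reaches a high winrate still ends.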

This should probably be a command-line argument. With adaptive resign, it might also be possible to remove the 92-move limit for resigns in self-play and match games, which would be a small gain too.

This question originates from the open-source project: leela-zero/leela-zero


9 answers

  • weixin_39975122 weixin_39975122 2021-01-11 07:49

    Pretty sure it has previously been questioned whether the 92-move threshold should still be around for leelaz, and that it was mainly added for leela to avoid some bugs / complaints: https://github.com/gcp/leela-zero/blob/97c2f8137a3ea24938116bfbb2b0ff05c83903f0/src/UCTSearch.cpp#L251-L257

    There does seem to be existing code to adjust the resign threshold with handicap: https://github.com/gcp/leela-zero/blob/97c2f8137a3ea24938116bfbb2b0ff05c83903f0/src/UCTSearch.cpp#L270-L283

    Although those handicaps are incremented with set_free_handicap <position>. How are the handicap games started?

  • weixin_39584549 weixin_39584549 2021-01-11 07:49

    Pretty sure it has previously been questioned whether the 92-move threshold should still be around for leelaz, and that it was mainly added for leela to avoid some bugs / complaints

    It was added to avoid resigning in handicap games, and it may not be necessary in LZ any more because it already has adaptive resigning.

    Almost everything asked for is already implemented, as far as I know? Do you have any specific handicap games that show issues with the current adaptive resignation system?

  • weixin_39584549 weixin_39584549 2021-01-11 07:49

    See pull request #626 which was based on issue #622.

  • weixin_39662462 weixin_39662462 2021-01-11 07:49

    Oh, this is already in? Well, I tried a 9-stone handicap game vs. a LeelaZ bot on KGS and it resigned at, I think, 92 moves.

    Do the bot authors need to do anything special to activate this? Or is it only in the dev version?

    I believe the bot I played was heavily modified, so it might not have all patches etc.

  • weixin_39988779 weixin_39988779 2021-01-11 07:49

    This has been in for a long time. Look at the code just below the 92-move threshold.

  • weixin_39662462 weixin_39662462 2021-01-11 07:49

    I checked back: my handicap game started with 9 stones placed by KGS. It resigned at move 82, where the winrate had just gone up to 1% (or 0.01%; that's what the bot owner said, but I don't know if that's even possible). (82 moves is actually 92 moves, since the first 9 are the handicap stones placed by KGS.)

    The way I read the code, it already has the second safeguard, where it slowly increases the resign threshold over 215 moves, but I do not see the part where it will not surrender as long as the win % is increasing over multiple moves.

    My idea was to let LeelaZ not resign unless win% drops.

  • weixin_39975122 weixin_39975122 2021-01-11 07:49

    I found this OGS game https://online-go.com/game/12275021 with 4 handicap stones, where RoyalZero resigns after move 92 with "Variation: 0.1% 751/3041 visits". It's using the default 10% threshold; with 4 handicap, the minimum handicap threshold is 2%, and the blended threshold for this move is around 5%.

    At least from RoyalZero's comment history, the highest eval was only ever 0.4%, so removing the 92-move limit won't quite work by itself, as it would try to resign at any eval under 2%. Even if RoyalZero were run with --resignpct 1, with 4 handicap the minimum handicap threshold is 0.2%, and the eval looks to drop below that before move 20 in this game.
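The numbers above (2% minimum and roughly 5% at move 92 with the default 10%, or 0.2% minimum with --resignpct 1) can be reproduced with a small sketch. The constants are inferred from the figures quoted in this thread: a minimum of threshold / (1 + handicap), and a linear blend over about 0.6 × 361 ≈ 217 moves on a 19x19 board. The function name and signature are hypothetical; check the linked UCTSearch.cpp for what the code actually does.

```python
def handicap_resign_threshold(move_number, handicap, resign_pct=10.0,
                              board_size=19):
    """Blended resign threshold in percent (illustrative sketch only)."""
    # Lowest threshold, used at the very start of a handicap game.
    # Assumed form: 10% / (1 + 4) = 2% for a 4-stone game.
    minimum = resign_pct / (1 + handicap)
    # Fraction of the way from the minimum to the full threshold.
    # Assumed to reach 1.0 after 0.6 * board squares (about 217) moves.
    blend = min(1.0, move_number / (0.6 * board_size * board_size))
    return blend * resign_pct + (1.0 - blend) * minimum


if __name__ == "__main__":
    for move in (0, 20, 92, 217):
        pct = handicap_resign_threshold(move, handicap=4)
        print(f"move {move:3d}: resign below {pct:.2f}%")
```

Under these assumptions an eval of 0.1% at move 92 is far below the blended threshold, which matches the resignation seen in the game.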

    You're right that the code does not implement your proposed "adaptive resign threshold", but it does have a "dynamic handicap-adjusted resign threshold". I'm not sure how useful your proposal would be if implemented, though: your example has the win rate going up to 41%, but with such a high handicap the eval will quite likely be close to 0.00% for a long time, e.g., https://online-go.com/game/12276810 and your game with 0.01%.

    Does anyone know how accurate those evals and move selections are when the win rate is close to 0.00%?

  • weixin_39969143 weixin_39969143 2021-01-11 07:49

    Yeah, we're moving in that direction at https://github.com/online-go/gtp2ogs/issues/65.

  • weixin_39535287 weixin_39535287 2021-01-11 07:49

    Closing: no active discussion for ~1 year, with subsequent discussion in other threads.

