weixin_39773239
2020-12-01 18:14

MoreComments.comments() return value

I've noticed an inconsistency in what MoreComments.comments() returns. I'm not sure whether this is intended behavior, but I've found it very difficult to work around.

Here's an example:

```python
import praw

reddit = praw.Reddit(user_agent='reddit terminal viewer v0.0')
submission = reddit.get_submission('http://www.reddit.com/r/CollegeBasketball/comments/31owr1/game_thread_ncaa_national_championship_wisconsin/')
for comment in submission.comments:
    if isinstance(comment, praw.objects.MoreComments):
        print('MoreComments')
    else:
        # Read the cached replies directly instead of the public attribute.
        print(comment._replies)
```

This prints something like:

```text
[<praw.objects.Comment object at ...>, <praw.objects.MoreComments object at ...>]
[<praw.objects.Comment object at ...>, <praw.objects.MoreComments object at ...>]
[<praw.objects.Comment object at ...>, <praw.objects.MoreComments object at ...>]
[<praw.objects.MoreComments object at ...>]
[<praw.objects.Comment object at ...>, <praw.objects.MoreComments object at ...>]
[]
[<praw.objects.Comment object at ...>, <praw.objects.MoreComments object at ...>]
[<praw.objects.Comment object at ...>]
[<praw.objects.MoreComments object at ...>]
[]
[<praw.objects.MoreComments object at ...>]
```

However, running the same loop over more_comments_object.comments(update=True) prints:

```text
None
MoreComments
[]
None
MoreComments
None
MoreComments
None
MoreComments
[]
[]
[]
```

It looks like, in the second case, the MoreComments objects aren't getting attached to their parents' replies. All of the None values should actually be [MoreComments]. E.g.

```text
[MoreComments]
[]
[MoreComments]
[MoreComments]
[MoreComments]
[]
[]
[]
```

As a result, trying to build a comment tree by flattening each comment.replies inserts duplicate replies into the tree and makes a bunch of unnecessary calls to api/comments/id instead of /api/morechildren.

This question comes from the open-source project: praw-dev/praw


6 answers

  • weixin_39773239 5 months ago

    The problem that I have with replace_more_comments is that it only works on the submission level. I'm trying to build an application that lets users load more replies on-demand, like this

    [Image: screenshot from 2015-04-11 10 38 43]

    There are a ton of mobile applications that accomplish this, so I have to believe that there's a way to do it through the API.

    I don't see any other way to make this work besides calling MoreComments.comments(). This returns a list that looks something like

    
    ```text
    The fix is in
    I hope the team with the most points wins
    [More Comments: 15]
    At least Bo Ryan saved those timeouts.  He sure will need them in the off sea...
    That was a helluva game though. GG Wiscy. Congrats Duke. Fuck Michigan.
    [More Comments: 1]
    What the fuck does Wisconsin have to do to get a whistle?
    [More Comments: 2]
    These refs deserve to be taken out behind the shed.  Disgraceful showing.  Du...
    [More Comments: 1]
    [More Comments: 10068]
    ```
    

    The point that I was trying to make was that all of these MoreComments objects (except the one at the very bottom) do not belong in this list. Take a look at the JSON response.

    
    ```python
    {'data': {'approved_by': None,
              'body': 'What the fuck does '
                      'Wisconsin have to do '
                      'to get a whistle?',
              'id': 'cq3qboi',
              'likes': None,
              'link_id': 't3_31owr1',
              'mod_reports': [],
              'name': 't1_cq3qboi',
              'num_reports': None,
              'parent_id': 't3_31owr1',
              'replies': None},
     'kind': 't1'},
    {'data': {'children': ['cq3qetc',
                           'cq3qiue'],
              'count': 2,
              'id': 'cq3qetc',
              'name': 't1_cq3qetc',
              'parent_id': 't1_cq3qboi'},
     'kind': 'more'},
    ```
    

    The parent_id of the MoreComments object (t1_cq3qboi) matches the name of the Comment above it. These hanging MoreComments objects should either be (1) attached in the correct spot as replies to their parent, or (2) thrown out of the MoreComments.comments() response altogether.

    I am currently using approach number 2; here's the implementation. The problem with this approach is that, for all of those comment objects, just checking whether they have replies via comment.replies forces an extra API call. The JSON response tells us that they DO have replies, but that information is not accessible through PRAW.
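
The two options described above can be sketched directly on the raw listing. This is only an illustration, assuming items shaped like the JSON excerpt (a `kind` of `t1` or `more`, with `name` and `parent_id` under `data`). The helper names `attach_more_to_parents` and `drop_nested_more` are hypothetical and not part of PRAW, and the last item in the sample listing is a made-up stand-in for the top-level stub.

```python
import copy


def attach_more_to_parents(items):
    """Option 1: move each 'more' stub under the comment named by its parent_id.

    Stubs whose parent is not a comment in the listing stay at the top level.
    Note that this mutates the comment dicts it attaches stubs to.
    """
    by_name = {item['data']['name']: item
               for item in items if item['kind'] == 't1'}
    top_level = []
    for item in items:
        parent = by_name.get(item['data']['parent_id'])
        if item['kind'] == 'more' and parent is not None:
            # 'replies' is None in the raw response, so create the list on demand.
            if parent['data'].get('replies') is None:
                parent['data']['replies'] = []
            parent['data']['replies'].append(item)
        else:
            top_level.append(item)
    return top_level


def drop_nested_more(items):
    """Option 2: discard 'more' stubs whose parent is a comment in the listing."""
    names = {item['data']['name'] for item in items if item['kind'] == 't1'}
    return [item for item in items
            if not (item['kind'] == 'more'
                    and item['data']['parent_id'] in names)]


listing = [
    {'kind': 't1', 'data': {'id': 'cq3qboi', 'name': 't1_cq3qboi',
                            'parent_id': 't3_31owr1', 'replies': None}},
    {'kind': 'more', 'data': {'id': 'cq3qetc', 'name': 't1_cq3qetc',
                              'parent_id': 't1_cq3qboi', 'count': 2,
                              'children': ['cq3qetc', 'cq3qiue']}},
    # Hypothetical top-level stub (its parent is the submission itself).
    {'kind': 'more', 'data': {'id': 'example', 'name': 't1_example',
                              'parent_id': 't3_31owr1', 'count': 0,
                              'children': []}},
]

dropped = drop_nested_more(listing)
attached = attach_more_to_parents(copy.deepcopy(listing))
```

In both variants the stub whose parent is the submission stays in the top-level list, which matches the "except the one at the very bottom" case. Option 1 keeps the information that a comment has more replies, while option 2 loses it.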

  • weixin_39965102 5 months ago

    Oh, I think I see what you're saying; correct me if I'm mistaken. The problem is that MoreComments.comments simply returns a flat list and does not build a tree. That's an artifact of the JSON API not having parent_id on more-comments objects in the past, and could probably be fixed. I'll reopen. Thanks again.

  • weixin_39773239 5 months ago

    Yes I think we're on the same page now :+1:

  • weixin_39965102 5 months ago

    -lazar I thought about modifying the return value for the More objects to return a forest rather than a flat list. However, I'd prefer to keep the structure of the return similar to how Reddit returns the values. You should be able to build any such tree in your own code.
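
Building the tree client-side is mostly a matter of indexing by `parent_id`. Below is a minimal sketch, assuming each item in the flat list exposes `name` and `parent_id` as in Reddit's JSON. The `Node` class and the `build_forest` and `flatten` helpers are hypothetical stand-ins, not PRAW objects or methods, and the sample names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str        # fullname, e.g. 't1_cq3qboi'
    parent_id: str   # fullname of the parent comment or submission
    replies: List['Node'] = field(default_factory=list)


def build_forest(flat):
    """Nest every node under its parent; nodes with unknown parents become roots."""
    by_name = {node.name: node for node in flat}
    roots = []
    for node in flat:
        parent = by_name.get(node.parent_id)
        if parent is None:
            roots.append(node)
        else:
            parent.replies.append(node)
    return roots


def flatten(roots, depth=0):
    """Yield (depth, node) pairs in display order (depth-first, pre-order)."""
    for node in roots:
        yield depth, node
        yield from flatten(node.replies, depth + 1)


flat = [
    Node('t1_a', 't3_post'),
    Node('t1_b', 't1_a'),
    Node('t1_c', 't1_b'),
    Node('t1_d', 't3_post'),
]
forest = build_forest(flat)
order = [(depth, node.name) for depth, node in flatten(forest)]
```

Because the index is built before any node is attached, the approach does not depend on parents appearing before their children in the flat list. The `(depth, node)` pairs are one possible shape for rendering an indented comment view.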

  • weixin_39773239 5 months ago

    Here's what I did, in case anyone else comes across this issue. Note that I had to use hasattr(comment, 'replies') instead of comment.replies to avoid the hidden HTTP call. https://github.com/michael-lazar/rtv/blob/58f58f816736b18cd2985c39e38b052cd663e645/rtv/content.py#L49

  • weixin_39965102 5 months ago

    If you want to replace more comments you should be using replace_more_comments. If you want to do something custom, I'd start with that method and adapt as necessary. I'm fairly confident that replace_more_comments fetches all comments for a submission as best as can be done.

    > It looks like in the second case the MoreComments aren't getting attached the their parent's replies. All of the None values should actually be [MoreComments]. E.g.

    I'm not really sure what you mean here. Your code does:

    ```python
    if isinstance(comment, praw.objects.MoreComments):
        print('MoreComments')
    else:
        print(comment._replies)
    ```
    

    Which means it should print MoreComments when the object is a MoreComments object. Feel free to reopen with additional information. Thanks for taking the time to bring up a potential problem.
