weixin_39616379
2020-11-27 20:55

added --relationship flag to bioguide.py for getting family members

Congressional bios include familial relationships to other members, past and present, in a parenthetical directly after the name, e.g.:

KENNEDY, Joseph P. III, (son of Joseph Patrick Kennedy, II, great-nephew of Edward Moore Kennedy and John Fitzgerald Kennedy, and great-great-grandson of John Francis Fitzgerald, first cousin once removed of Patrick Joseph Kennedy), a Representative from Massachusetts...

Because the relationship word ("great-nephew") can apply to multiple people, as in JPK III's case here, I'm using the NLTK lib's chunking parser to get names, then groups of names, then relationships, ultimately organized as a tree. This is arguably overkill and does add a dependency for an optional feature, but it works quite well.
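To make the grouping problem concrete, here is a minimal sketch that pulls (relationship, names) pairs out of the parenthetical. It deliberately uses plain regular expressions from the standard library rather than NLTK's chunking parser, so it is not the code in this pull request. The names `REL_PATTERN` and `parse_relationships` are made up for illustration, and it assumes relationship words are lowercase and names start with a capital letter.

```python
import re

# Hypothetical pattern: a lowercase relationship phrase followed by " of "
# and a capitalized name. A leading "and " is skipped, not captured.
REL_PATTERN = re.compile(r"\b(?:and )?([a-z][a-z -]*?) of (?=[A-Z])")


def parse_relationships(bio):
    """Return a list of (relationship, [names]) from the first parenthetical."""
    paren = re.search(r"\(([^)]*)\)", bio)
    if paren is None:
        return []
    text = paren.group(1)
    matches = list(REL_PATTERN.finditer(text))
    result = []
    for i, m in enumerate(matches):
        # The names run from the end of this "X of " up to the next relationship.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        segment = text[m.end():end].strip(" ,")
        # Split only on "and": commas also precede suffixes such as "II".
        names = [n.strip(" ,") for n in re.split(r"\s+and\s+", segment)]
        result.append((m.group(1), [n for n in names if n]))
    return result


BIO = (
    "KENNEDY, Joseph P. III, (son of Joseph Patrick Kennedy, II, "
    "great-nephew of Edward Moore Kennedy and John Fitzgerald Kennedy, "
    "and great-great-grandson of John Francis Fitzgerald, first cousin "
    "once removed of Patrick Joseph Kennedy), a Representative from "
    "Massachusetts..."
)

for relationship, names in parse_relationships(BIO):
    print(relationship, "->", names)
```

On the example bio, "great-nephew" comes back attached to both Edward Moore Kennedy and John Fitzgerald Kennedy. The sketch would fail on comma-separated lists of three or more relatives and on names containing " of ", which is the kind of case where a tagger plus chunk grammar is likely to hold up better.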

Right now, this does not reconcile the family members with their bioguide id or other identifying information. Wouldn't be a difficult feature to add.

This question comes from the open-source project: unitedstates/congress-legislators


5 replies

  • weixin_39733232, 5 months ago

    Seems like all upside - merged! But I don't think I want to document the fields and implicitly commit to them until we can do a normalization to ID for detected family members.

  • weixin_39616379, 5 months ago

    Great, I'll put it on my todo list

    In reply to Eric Mill (Thu, Oct 17, 2013 at 3:25 PM): https://github.com/unitedstates/congress-legislators/pull/145#issuecomment-26540839

  • weixin_39789499, 5 months ago

    Yeah, it seems this isn't particularly helpful until the IDs themselves are linked. (Otherwise, there's no indication—besides an assumption—that these relatives are even legislators.)

    Also, shouldn't some of these relationships be hyphenated (like son-in-law)?

  • weixin_39616379, 5 months ago

    There's a function that attempts to match plain-English names to ids somewhere, right? I vaguely recall writing one, in fact. Maybe in a different repo?

    The hyphens were confusing the POS tagger so I took them out, but that's fixable.

    In reply to Gordon P. Hemsley (Thu, Oct 17, 2013 at 7:25 PM): https://github.com/unitedstates/congress-legislators/pull/145#issuecomment-26561545

  • weixin_39789499, 5 months ago

    There's utils.lookup_legislator() in the congress project:

    https://github.com/unitedstates/congress/blob/master/tasks/utils.py#L553
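For the ID normalization discussed in this thread, one possible shape is an exact match on normalized full names that refuses to guess when a name is ambiguous. This is only a sketch and does not reproduce what utils.lookup_legislator() does. The record layout, the function names, and the IDs below are placeholders invented for illustration, not real bioguide data.

```python
import re
import unicodedata


def normalize_name(name):
    """Lowercase, drop accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z\s]", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def build_index(legislators):
    """Map each normalized name to the list of ids that carry it."""
    index = {}
    for record in legislators:
        key = normalize_name(record["name"])
        index.setdefault(key, []).append(record["bioguide"])
    return index


def match_relative(name, index):
    """Return an id only when exactly one legislator matches, else None."""
    ids = index.get(normalize_name(name), [])
    return ids[0] if len(ids) == 1 else None


# Placeholder records: the ids are NOT real bioguide ids.
LEGISLATORS = [
    {"name": "Edward Moore Kennedy", "bioguide": "X000001"},
    {"name": "John Fitzgerald Kennedy", "bioguide": "X000002"},
]
INDEX = build_index(LEGISLATORS)

print(match_relative("Edward Moore Kennedy", INDEX))
print(match_relative("John Francis Fitzgerald", INDEX))
```

Returning None for unmatched or ambiguous names keeps the output from implying that a relative is a known legislator when that has not been established.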
