# Filtering accounts
My initial idea is to first get the top 10,000 most active GitHub accounts.
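As a rough sketch of that first step, the snippet below ranks accounts by their total number of events. It assumes "active" simply means "most events", and that each log record is a dict with an `actor_login` field; both the metric and the field name are assumptions for illustration, not something the log format is confirmed to provide.

```python
from collections import Counter


def top_active_accounts(events, n=10000):
    """Return the logins of the n accounts with the most events.

    `events` is an iterable of dicts; the `actor_login` key is a
    hypothetical field name for the account that triggered the event.
    """
    totals = Counter(e["actor_login"] for e in events)
    # most_common sorts by count, highest first
    return [login for login, _ in totals.most_common(n)]
```

Calling `top_active_accounts(events)` with the default `n` gives the candidate list of 10,000 accounts to inspect further.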
The GitHub log data contains the following kinds of actions:

    [
      { "type": "PublicEvent", "action": "" },
      { "type": "WatchEvent", "action": "started" },
      { "type": "IssueCommentEvent", "action": "created" },
      { "type": "ForkEvent", "action": "" },
      { "type": "PullRequestEvent", "action": "opened" },
      { "type": "CreateEvent", "action": "" },
      { "type": "DeleteEvent", "action": "" },
      { "type": "PullRequestEvent", "action": "closed" },
      { "type": "PushEvent", "action": "" },
      { "type": "IssuesEvent", "action": "reopened" },
      { "type": "IssuesEvent", "action": "closed" },
      { "type": "PullRequestReviewCommentEvent", "action": "created" },
      { "type": "IssuesEvent", "action": "opened" },
      { "type": "MemberEvent", "action": "added" },
      { "type": "CommitCommentEvent", "action": "" },
      { "type": "ReleaseEvent", "action": "published" },
      { "type": "GollumEvent", "action": "" },
      { "type": "PullRequestEvent", "action": "reopened" }
    ]

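I cannot yet tell which of these actions a GitHub bot account would never perform.
So I plan to simply count these actions first, and then identify the bots from the counts.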
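One possible way to do the counting is sketched below: it tallies, for each account, how often each `(type, action)` pair occurs. Grouping by account is my reading of the plan, and the `actor_login` field name is hypothetical; `type` and `action` follow the listing above.

```python
from collections import Counter, defaultdict


def count_actions(events):
    """Return {login: Counter({(type, action): count})}.

    `actor_login` is an assumed field name; a missing `action`
    is treated as the empty string, as in the listing above.
    """
    counts = defaultdict(Counter)
    for e in events:
        key = (e["type"], e.get("action", ""))
        counts[e["actor_login"]][key] += 1
    return counts
```

An account whose counts are concentrated in one or two action kinds, or that never performs some kinds at all, would then be a candidate for closer inspection. Where to draw that line is still open.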
# Filtering by date and action count
First, select the one week of data from 2020-06-20 to 2020-06-26.
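A minimal sketch of the date filter follows. It assumes each record carries a `created_at` timestamp in the form `2020-06-20T12:34:56Z`; the field name and the timestamp format are assumptions and may need adjusting to the actual log schema. Both end dates are treated as inclusive.

```python
from datetime import date, datetime

START = date(2020, 6, 20)
END = date(2020, 6, 26)


def in_week(event, start=START, end=END):
    """True if the event's date falls within [start, end], inclusive."""
    # `created_at` and its format are assumed, not confirmed by the logs
    created = datetime.strptime(event["created_at"], "%Y-%m-%dT%H:%M:%SZ")
    return start <= created.date() <= end


def filter_week(events):
    """Keep only the events from the selected week."""
    return [e for e in events if in_week(e)]
```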