[Academic] MLNLP Releases AI-Paper-Collector, a Search Tool for AI Papers
1
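『Motivation』
Whenever we survey a new field, we have to search for papers on the relevant topics. To make this search easier and faster, we have open-sourced AI-Paper-Collector, a tool that automatically collects conference papers on a given topic. It currently covers more than 20 common CV and NLP venues and supports both exact and fuzzy matching.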
2
『Search Categories』
- [EMNLP 2019-2021] [ACL 2019-2022] [NAACL 2019-2022] [COLING 2020-2020]
- [ICASSP 2019-2022] [WWW 2019-2022] [ICLR 2019-2022] [ICML 2019-2022]
- [AAAI 2019-2022] [IJCAI 2019-2022] [CVPR 2019-2022] [ICCV 2019-2021]
- [MM 2019-2021] [KDD 2019-2022] [CIKM 2019-2021] [SIGIR 2019-2022]
- [WSDM 2019-2022] [ECIR 2019-2022] [ECCV 2020-2020] [COLT 2019-2022]
- [AISTATS 2019-2022] [INTERSPEECH 2019-2021] [ISWC 2019-2021] [JMLR 2019-2022]
- [VLDB 2019-2021] [ICME 2019-2021] [TIP 2020-2022] [TPAMI 2020-2022]
- [RECSYS 2019-2021] [TKDE 2020-2022] [TOIS 2020-2022] [ICDM 2019-2021]
- [TASLP 2020-2022] [BMVC 2019-2021] [NIPS 2019-2021] [MLSYS 2020-2022]
- [WACV 2020-2022]
3
『Installation』
For now, installing the tool means cloning this repo.
git clone https://github.com/MLNLP-World/AI-Paper-Collector.git
cd AI-Paper-Collector
pip install -r requirements.txt
4
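『Usage』
We provide three usage modes: interactive (main.py), command line (cli_main.py), and a web interface (app.py). First-time users are advised to start with interactive mode.
Interactive usage example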
To start an interactive session, run:
python main.py
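An interactive search asks for the following, in order:
- Query keyword
- Search mode (exact or fuzzy)
- Threshold (fuzzy mode only)
- Limit on the number of results
- List of conferences, separated by commas
- Output file path (the terminal previews only the top 5; all results are written to this file)
For example: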
[+] Initializing System...
[+] Loading from cache...
[+] Enter your query: few-shot
[+] Select search mode:
[1] Exact
[2] Fuzzy
[+] Enter a number between 1 to 2: 2
[+] Enter threshold between 0 and 100 (default: 50):
[+] Enter limit >= 0 (default: None):
[+] Enter the list of confs separated by comma
E.g. "ACL,CVPR" or "AAAI" or enter nothing for all confs
[+] Enter your list of conferences (default: All Confs): SIGIR,WSDM,CIKM
[+] Search Results:
[=] Only show Top-5, Please Save results to see all.
[1] [CIKM2021] REFORM: Error-Aware Few-Shot Knowledge Graph Completion.
[2] [CIKM2021] Boosting Few-shot Abstractive Summarization with Auxiliary Tasks.
[3] [CIKM2021] Multi-objective Few-shot Learning for Fair Classification.
[4] [CIKM2020] Graph Few-shot Learning with Attribute Matching.
[5] [CIKM2020] Few-shot Insider Threat Detection.
[+] Enter Save filename:
[+] Writing results to output/fuzzy_None_SIGIR_WSDM_CIKM_few-shot.txt
[+] Writing results Done!
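The prompts in the session above (mode, threshold, limit, conference list) can be pictured with a small sketch. This is not the tool's actual implementation: the record layout `PAPERS`, the helpers `score` and `search`, and the sliding-window similarity built on the standard library's `difflib` are all assumptions made for illustration. The first two titles are taken from the sample output; the third is a placeholder.

```python
from difflib import SequenceMatcher

# Toy records standing in for the cached paper list (hypothetical layout).
PAPERS = [
    {"conf": "CIKM2021", "title": "REFORM: Error-Aware Few-Shot Knowledge Graph Completion."},
    {"conf": "CIKM2020", "title": "Few-shot Insider Threat Detection."},
    {"conf": "ACL2022", "title": "Placeholder title unrelated to the query."},
]


def score(query: str, title: str) -> int:
    """Return a 0-100 similarity: best match of the query against any
    query-length window of the title (an assumed stand-in scoring rule)."""
    q, t = query.lower(), title.lower()
    if len(q) >= len(t):
        return round(100 * SequenceMatcher(None, q, t).ratio())
    best = max(
        SequenceMatcher(None, q, t[i:i + len(q)]).ratio()
        for i in range(len(t) - len(q) + 1)
    )
    return round(100 * best)


def search(query, mode="exact", threshold=50, limit=None, confs=None):
    """Filter PAPERS by conference prefix, then match exactly or fuzzily."""
    # "SIGIR,WSDM,CIKM" -> ["SIGIR", "WSDM", "CIKM"]; None or "" means all.
    wanted = [c.strip().upper() for c in confs.split(",")] if confs else None
    hits = []
    for paper in PAPERS:
        if wanted and not any(paper["conf"].upper().startswith(c) for c in wanted):
            continue
        if mode == "exact":
            # Exact mode: case-insensitive substring test.
            matched = query.lower() in paper["title"].lower()
            s = 100 if matched else 0
        else:
            # Fuzzy mode: keep titles whose score reaches the threshold.
            s = score(query, paper["title"])
            matched = s >= threshold
        if matched:
            hits.append((s, paper))
    hits.sort(key=lambda h: h[0], reverse=True)  # best scores first
    return [paper for _, paper in hits[:limit]]  # limit=None keeps everything


results = search("few-shot", mode="fuzzy", threshold=50, confs="SIGIR,WSDM,CIKM")
for rank, paper in enumerate(results, start=1):
    print(f"[{rank}] [{paper['conf']}] {paper['title']}")
```

In this sketch, the query "few shot" (with a space) finds nothing in exact mode but still matches the "Few-Shot" titles in fuzzy mode, which is the practical difference between the two modes.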
Command-line usage
For command-line use, run the following command:
# -q, --query: the input query; a query with multiple words should be wrapped in quotation marks
# -m, --mode: the search mode, fuzzy or exact; default is exact
# -t, --threshold: the threshold for fuzzy search; default is 50
# -l, --limit: the maximum number of fuzzy search results; default is None
# -c, --conf: the list of conferences to search; default is all
# -o, --output: the output file name; default is [mode]_[threshold]_[confs]_[query].txt
# -f, --force: force an incremental update of the cache file
python cli_main.py --query QUERY \
[--mode {fuzzy,exact}] \
[--threshold THRESHOLD] [--limit LIMIT] [--conf CONF] \
[--output OUTPUT] [--force]
For example:
# Note that the input query must be enclosed in `""`, such as "few shot".
python cli_main.py -q "few shot" -m fuzzy -l 10 -t 10 -c AAAI,ACL -o results.txt
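To make the flags concrete, here is a minimal sketch of how options like these could be declared with Python's `argparse`. It is an illustration only and is not taken from cli_main.py: `build_parser` and `default_output` are hypothetical names, and the default file name simply follows the `[mode]_[threshold]_[confs]_[query].txt` pattern described in the comments, with commas in the conference list replaced by underscores (an assumption based on the file name shown in the interactive example).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the flags listed in the usage text (illustrative sketch)."""
    parser = argparse.ArgumentParser(prog="cli_main.py")
    parser.add_argument("-q", "--query", required=True,
                        help="input query; quote multi-word queries")
    parser.add_argument("-m", "--mode", choices=["fuzzy", "exact"],
                        default="exact", help="search mode")
    parser.add_argument("-t", "--threshold", type=int, default=50,
                        help="threshold for fuzzy search")
    parser.add_argument("-l", "--limit", type=int, default=None,
                        help="maximum number of fuzzy results")
    parser.add_argument("-c", "--conf", default="all",
                        help="comma-separated conferences to search")
    parser.add_argument("-o", "--output", default=None,
                        help="output file name")
    parser.add_argument("-f", "--force", action="store_true",
                        help="force an incremental cache update")
    return parser


def default_output(args: argparse.Namespace) -> str:
    """Build the fallback name [mode]_[threshold]_[confs]_[query].txt."""
    confs = args.conf.replace(",", "_")  # assumed separator handling
    return f"{args.mode}_{args.threshold}_{confs}_{args.query}.txt"


# Parse the same arguments as the example command above.
args = build_parser().parse_args(
    ["-q", "few shot", "-m", "fuzzy", "-l", "10", "-t", "10",
     "-c", "AAAI,ACL", "-o", "results.txt"]
)
output = args.output or default_output(args)
print(args.mode, args.threshold, args.limit, args.conf.split(","), output)
```

Passing the query as a single list element mirrors what the shell does when the query is wrapped in quotation marks: "few shot" arrives as one argument rather than two.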
Web interface usage
For the web interface, run the following commands:
pip install -r requirements.txt
python app.py
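Then open the following URL: http://localhost:5000
For example, if we are interested in cross-lingual research, we can enter the keyword cross-lingual. The page then lists the papers from the supported venues that contain the keyword, which is very convenient.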
5
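『How to Add a New Conference from DBLP』
Automatic update via an issue-triggered workflow
If you would like a new conference added, please open an issue that follows the project's issue format. We will review and label it, and the workflow will then run automatically.
For users who have cloned the project
- Add the new conference by editing the conf/dblp_conf.json file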
[
# add the name and dblp_url of the new conf
{
"name": "WWW2021",
"url": "https://dblp.org/db/conf/www/www2021.html"
},
...
]
# force to update the cache file incrementally
python cli_main.py --query '' --force
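The manual edit can also be scripted. The sketch below assumes that conf/dblp_conf.json is a plain JSON array of objects with "name" and "url" keys, as the snippet above suggests, and that the real file contains no `#` comments (standard JSON does not allow them). The helper `add_conference` is a hypothetical name, and the demo writes to a temporary file rather than to the repo.

```python
import json
import tempfile
from pathlib import Path


def add_conference(config_path, name, url):
    """Append {"name", "url"} to the JSON list unless the name already exists.

    Returns True if an entry was added, False if it was already present.
    """
    path = Path(config_path)
    confs = json.loads(path.read_text(encoding="utf-8"))
    if any(entry["name"] == name for entry in confs):
        return False
    confs.append({"name": name, "url": url})
    path.write_text(
        json.dumps(confs, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return True


# Demo on a temporary file; in a clone you would pass "conf/dblp_conf.json".
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "dblp_conf.json"
    demo.write_text("[]", encoding="utf-8")
    first = add_conference(demo, "WWW2021", "https://dblp.org/db/conf/www/www2021.html")
    second = add_conference(demo, "WWW2021", "https://dblp.org/db/conf/www/www2021.html")
    saved = json.loads(demo.read_text(encoding="utf-8"))

print(first, second, saved)
```

After adding an entry this way, the cache still has to be refreshed with the `--force` command shown above.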
6
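『Disclaimer』
Because this tool is still under development, we cannot guarantee that the papers it finds will meet your needs, and we hope you understand. In addition, all results come from DBLP, ACL, NIPS, and OpenReview. If anything infringes your copyright, please contact us at any time and we will remove it as soon as possible. Thank you :)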
7
『Project Links』
Repository: https://github.com/MLNLP-World/AI-Paper-Collector
Web demo: https://ai-paper-collector.vercel.app/ (recommended)
Colab notebook: https://colab.research.google.com/github/Doragd/AI-Paper-collector-Dev/blob/main/colab/AI_Paper_Collector_Colab.ipynb
Stars, forks, and pull requests are all welcome.
MLNLP Community
2022/8/24