1

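『Motivation』

Whenever we enter a new field and need to survey it, we have to search for papers on the relevant topics. To make that search easier and faster, we have open-sourced a tool, AI-Paper-Collector, that automatically retrieves conference papers on a given topic (it currently supports more than 20 common CV and NLP venues) and supports both exact and fuzzy matching.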


2

『Search Categories』

- [EMNLP 2019-2021] [ACL 2019-2022] [NAACL 2019-2022] [COLING 2020-2020] 
- [ICASSP 2019-2022] [WWW 2019-2022] [ICLR 2019-2022] [ICML 2019-2022]
- [AAAI 2019-2022] [IJCAI 2019-2022] [CVPR 2019-2022] [ICCV 2019-2021]
- [MM 2019-2021] [KDD 2019-2022] [CIKM 2019-2021] [SIGIR 2019-2022]
- [WSDM 2019-2022] [ECIR 2019-2022] [ECCV 2020-2020] [COLT 2019-2022]
- [AISTATS 2019-2022] [INTERSPEECH 2019-2021] [ISWC 2019-2021] [JMLR 2019-2022]
- [VLDB 2019-2021] [ICME 2019-2021] [TIP 2020-2022] [TPAMI 2020-2022]
- [RECSYS 2019-2021] [TKDE 2020-2022] [TOIS 2020-2022] [ICDM 2019-2021]
- [TASLP 2020-2022] [BMVC 2019-2021] [NIPS 2019-2021] [MLSYS 2020-2022]
- [WACV 2020-2022]

3

『Installation』

For now, the tool is installed by cloning this repo.

git clone https://github.com/MLNLP-World/AI-Paper-Collector.git
cd AI-Paper-Collector
pip install -r requirements.txt

4

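『Usage』

We provide three usage modes: interactive (main.py), command line (cli_main.py), and a web interface (app.py). First-time users are advised to start with interactive mode.

Interactive usage example

To start interactive mode, type: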

python main.py

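Interactive mode prompts for the following, step by step:

  1. The query keyword
  2. The search mode (exact or fuzzy)
  3. The threshold (fuzzy mode only)
  4. The maximum number of results
  5. The list of conferences, separated by commas
  6. The output file path (the terminal previews only the top 5; all results are written to this file)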

For example:

[+] Initializing System...
[+] Loading from cache...
[+] Enter your query: few-shot

[+] Select search mode:
[1] Exact
[2] Fuzzy
[+] Enter a number between 1 to 2: 2
[+] Enter threshold between 0 and 100 (default: 50):
[+] Enter limit >= 0 (default: None):
[+] Enter the list of confs separated by comma
E.g. "ACL,CVPR" or "AAAI" or enter nothing for all confs
[+] Enter your list of conferences (default: All Confs): SIGIR,WSDM,CIKM

[+] Search Results:
[=] Only show Top-5, Please Save results to see all.
[1] [CIKM2021] REFORM: Error-Aware Few-Shot Knowledge Graph Completion.
[2] [CIKM2021] Boosting Few-shot Abstractive Summarization with Auxiliary Tasks.
[3] [CIKM2021] Multi-objective Few-shot Learning for Fair Classification.
[4] [CIKM2020] Graph Few-shot Learning with Attribute Matching.
[5] [CIKM2020] Few-shot Insider Threat Detection.

[+] Enter Save filename:
[+] Writing results to output/fuzzy_None_SIGIR_WSDM_CIKM_few-shot.txt
[+] Writing results Done!
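
To make the difference between the two search modes in the session above concrete, here is a minimal sketch. It is not the tool's own matcher, which this post does not describe: it assumes exact mode is a case-insensitive substring test and stands in Python's standard difflib for the fuzzy scorer. The function names (exact_match, fuzzy_score, search) are hypothetical.

```python
from difflib import SequenceMatcher


def exact_match(query, title):
    """Assumed exact mode: case-insensitive substring test."""
    return query.lower() in title.lower()


def fuzzy_score(query, title):
    """Score 0-100: best similarity between the query and any
    query-length character window of the title (a stand-in scorer)."""
    q, t = query.lower(), title.lower()
    n = max(1, len(q))
    windows = [t[i:i + n] for i in range(max(1, len(t) - n + 1))]
    return max(round(100 * SequenceMatcher(None, q, w).ratio()) for w in windows)


def search(query, papers, mode="exact", threshold=50, limit=None):
    """Filter paper titles by mode; fuzzy hits are sorted by score."""
    if mode == "exact":
        hits = [p for p in papers if exact_match(query, p)]
    else:
        scored = [(fuzzy_score(query, p), p) for p in papers]
        scored = [sp for sp in scored if sp[0] >= threshold]
        scored.sort(key=lambda sp: sp[0], reverse=True)
        hits = [p for _, p in scored]
    return hits if limit is None else hits[:limit]


papers = [
    "Graph Few-shot Learning with Attribute Matching.",
    "Few-shot Insider Threat Detection.",
    "Dense Passage Retrieval",
]
# "few shot" (with a space) misses the hyphenated titles in exact mode,
# but fuzzy mode still finds them.
print(search("few shot", papers, mode="exact"))
print(search("few shot", papers, mode="fuzzy", threshold=80))
```

Under these assumptions, a higher threshold keeps only near-literal matches, while a lower one admits looser spelling variants at the cost of more noise.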

Command-line usage

For command-line use, run the following command:

# -q, --query:     the input query, and the content with multiple words should be wrapped in quotation marks
# -m, --mode: the search mode: fuzzy or exact, default is exact
# -t, --threshold: the threshold for the fuzzy search, default is 50
# -l, --limit: the limit num of the fuzzy search result, default is None
# -c, --conf: the list of the conferences needs to search, default is all
# -o, --output: the output file name, default is [mode]_[threshold]_[confs]_[query].txt
# -f, --force: force to update the cache file incrementally
python cli_main.py --query QUERY \
[--mode {fuzzy,exact}] \
[--threshold THRESHOLD] [--limit LIMIT] [--conf CONF] \
[--output OUTPUT] [--force]
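
The flags listed above map naturally onto Python's argparse. The following is only a sketch that mirrors the documented options and defaults; it is not taken from cli_main.py, and the helper names build_parser and split_confs are hypothetical.

```python
import argparse


def build_parser():
    """Parser mirroring the documented flags and defaults (a sketch)."""
    parser = argparse.ArgumentParser(prog="cli_main.py")
    parser.add_argument("-q", "--query", required=True,
                        help="input query; quote multi-word queries")
    parser.add_argument("-m", "--mode", choices=["fuzzy", "exact"],
                        default="exact", help="search mode")
    parser.add_argument("-t", "--threshold", type=int, default=50,
                        help="threshold for fuzzy search")
    parser.add_argument("-l", "--limit", type=int, default=None,
                        help="maximum number of results")
    parser.add_argument("-c", "--conf", default=None,
                        help="comma-separated conferences; omit for all")
    parser.add_argument("-o", "--output", default=None,
                        help="output file name")
    parser.add_argument("-f", "--force", action="store_true",
                        help="update the cache file incrementally")
    return parser


def split_confs(conf):
    """Turn 'AAAI,ACL' into ['AAAI', 'ACL']; empty input means all."""
    if not conf:
        return []
    return [c.strip() for c in conf.split(",") if c.strip()]


# Parse an explicit argument list instead of sys.argv for the demo.
args = build_parser().parse_args(
    ["-q", "few shot", "-m", "fuzzy", "-l", "10", "-t", "10",
     "-c", "AAAI,ACL", "-o", "results.txt"]
)
print(args.mode, args.threshold, args.limit, split_confs(args.conf), args.output)
```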

For example:

# Note that the input query must be enclosed in `""`, such as "few shot".
python cli_main.py -q "few shot" -m fuzzy -l 10 -t 10 -c AAAI,ACL -o results.txt
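
If -o is omitted, the documented default name follows the pattern [mode]_[threshold]_[confs]_[query].txt. The sketch below shows one way such a name could be assembled. It assumes the parts are joined with underscores, as the file name in the interactive log earlier suggests; the function name default_output_name and the "all" placeholder for an empty conference list are assumptions, not the tool's confirmed behavior.

```python
def default_output_name(mode, threshold, confs, query):
    """Build '[mode]_[threshold]_[confs]_[query].txt' (a sketch).

    Assumption: conferences are joined with underscores, and 'all'
    is a hypothetical placeholder when no conference is given.
    """
    confs_part = "_".join(confs) if confs else "all"
    return f"{mode}_{threshold}_{confs_part}_{query}.txt"


# Reproduces the file name shown in the interactive log
# (the threshold prompt was left blank there, hence 'None').
print(default_output_name("fuzzy", None, ["SIGIR", "WSDM", "CIKM"], "few-shot"))
```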

Web interface usage

For the web interface, run the following commands:

pip install -r requirements.txt
python app.py

Then open the following URL: http://localhost:5000

For example, if we are interested in cross-lingual work, we can enter the keyword cross-lingual. The page then lists the papers from the supported venues that contain the keyword cross-lingual, which is very convenient.

5

『How to Add a New Conference from DBLP』

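Automatic update via an issue-triggered workflow

If you want to add a new conference to the list, please open an issue that follows the project's issue format. We will review and label it, and the workflow will then run automatically.

For users who have cloned the project

  • Add the new conference by editing the conf/dblp_conf.json file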
[
# add the name and dblp_url of the new conf
{
"name": "WWW2021",
"url": "https://dblp.org/db/conf/www/www2021.html"
},
...
]
  • Run the script
# force to update the cache file incrementally
python cli_main.py --query '' --force
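
The manual edit to conf/dblp_conf.json can also be scripted. The sketch below assumes the file is a plain JSON array of objects with "name" and "url" keys, as in the fragment above (without the # comment and the ... placeholder, which are not valid JSON). The helper add_conference is hypothetical and is not part of the tool. The demo writes to a temporary file rather than the real config.

```python
import json
import tempfile
from pathlib import Path


def add_conference(config_path, name, url):
    """Append {'name', 'url'} to the JSON list unless the name exists.

    Returns True if the entry was added, False if it was already there.
    """
    path = Path(config_path)
    entries = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    if any(entry["name"] == name for entry in entries):
        return False
    entries.append({"name": name, "url": url})
    path.write_text(json.dumps(entries, indent=2) + "\n", encoding="utf-8")
    return True


# Demo against a temporary file standing in for conf/dblp_conf.json.
demo_dir = tempfile.mkdtemp()
demo_path = Path(demo_dir) / "dblp_conf.json"
first = add_conference(demo_path, "WWW2021",
                       "https://dblp.org/db/conf/www/www2021.html")
second = add_conference(demo_path, "WWW2021",
                        "https://dblp.org/db/conf/www/www2021.html")
print(first, second, json.loads(demo_path.read_text(encoding="utf-8")))
```

After the entry is in place, the `--force` run shown above is what refreshes the cache.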

6

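『Disclaimer』

Because the tool is still under development, we cannot guarantee that the papers it finds will meet your needs; we hope you understand. In addition, all results come from DBLP, ACL, NIPS, and OpenReview. If this infringes your copyright, you can contact us at any time and we will remove the content as soon as possible. Thank you :)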

7

『Project Links』

Repository: https://github.com/MLNLP-World/AI-Paper-Collector

Web demo: https://ai-paper-collector.vercel.app/ (recommended)

Colab notebook: https://colab.research.google.com/github/Doragd/AI-Paper-collector-Dev/blob/main/colab/AI_Paper_Collector_Colab.ipynb

Stars, forks, and pull requests are all welcome.

MLNLP Community

2022/8/24