Over the past few days I have gone through the multi-modal literature based on my earlier reading; the attachment contains the papers listed here. You can use the outline below for a systematic survey:
1. For the survey, I suggest starting with review articles and then digging deeper into the references they cite. Below are some good surveys on multi-modal learning:
(1) ACL 2020 Tutorial: Multi-modal Information Extraction from Text, Semi-structured, and Tabular Data on the Web
(2) KDD 2020 Tutorial: Multi-modal Network Representation Learning, URL: https://chuxuzhang.github.io/KDD20_Tutorial.html
(3) 多模态视觉语言表征学习研究综述 [Survey on Multi-modal Visual-Language Representation Learning] (Journal of Software, 2021)
(4) Survey on Deep Multi-modal Data Analytics: Collaboration, Rivalry, and Fusion (ACM Trans. Multimedia Comput. Commun. Appl., 2021)
(5) A Survey on Deep Learning for Multimodal Data Fusion (Neural Computation, 2020)
(6) Multimodal Machine Learning: A Survey and Taxonomy (IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019)
(7) A comprehensive survey on multimodal medical signals fusion for smart healthcare system (Information Fusion, 2021; focuses on fusing sensor data in the healthcare domain)
(8) A review of multimodal image matching: Methods and applications (Information Fusion, 2021; focuses on image matching)
(9) Multi-source knowledge fusion: a survey (World Wide Web Journal, 2020; focuses on knowledge graph fusion, partly related to multi-modal learning)
(10) Heterogeneous network representation learning: A unified framework with survey and benchmark (IEEE Transactions on Knowledge and Data Engineering, 2020; heterogeneous networks themselves carry multi-modal information, and in many applications the graph is itself one modality)
2. A core problem in multi-modal learning is information fusion, and much of the research focuses on it. Below are some recent works on fusion (a small illustrative sketch follows this list):
(1) Deep Multimodal Fusion by Channel Exchanging (NeurIPS 2020)
(2) Memory based fusion for multi-modal deep learning (Information Fusion, 2021)
(3)Modality to Modality Translation: An Adversarial Representation Learning and Graph Fusion Network for Multimodal Fusion (AAAI 2020)
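To make the fusion idea above concrete, here is a minimal, hypothetical sketch of gated late fusion of two modality feature vectors (for example, image and text embeddings) written in PyTorch. The class name GatedFusion, the feature dimensions, and the gating design are my own illustrative assumptions and are not taken from any of the papers listed above.

import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Illustrative sketch: fuse two modality embeddings with a learned gate."""
    def __init__(self, img_dim=2048, txt_dim=768, fused_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, fused_dim)    # project image features
        self.txt_proj = nn.Linear(txt_dim, fused_dim)    # project text features
        self.gate = nn.Linear(2 * fused_dim, fused_dim)  # per-dimension gate

    def forward(self, img_feat, txt_feat):
        h_img = torch.tanh(self.img_proj(img_feat))
        h_txt = torch.tanh(self.txt_proj(txt_feat))
        # gate decides, per dimension, how much each modality contributes
        g = torch.sigmoid(self.gate(torch.cat([h_img, h_txt], dim=-1)))
        return g * h_img + (1 - g) * h_txt

# usage example with random stand-ins for CNN image features and BERT text embeddings
fusion = GatedFusion()
img = torch.randn(4, 2048)
txt = torch.randn(4, 768)
print(fusion(img, txt).shape)  # torch.Size([4, 512])

This is only one simple fusion scheme; the papers above study richer mechanisms such as channel exchanging, memory-based fusion, and adversarial/graph fusion.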
3. Transformer-based models on multi-modal data are likely to be a major direction going forward, so we also need to keep an eye on them; many of the multi-modal problems we currently need to solve could be approached by designing methods from the multi-modal Transformer angle. Representative papers:
(1) Dynamic Graph Representation Learning for Video Dialog via Multi-Modal Shuffled Transformer (AAAI 2021)
(2) InterBERT: Vision-and-Language Interaction for Multi-modal Pretraining