推荐 :基于知识蒸馏的BERT模型压缩
作者:孙思琦、成宇、甘哲、刘晶晶
本文为你介绍“耐心的知识蒸馏”模型。



图表1

图表2
Radford, Alec, et al. "Language models are unsupervised multitask learners." OpenAI Blog 1.8 (2019).
Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
Yang, Zhilin, et al. "XLNet: Generalized Autoregressive Pretraining for Language Understanding." arXiv preprint arXiv:1906.08237 (2019).
Liu, Yinhan, et al. "Roberta: A robustly optimized BERT pretraining approach." arXiv preprint arXiv:1907.11692 (2019).
Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).
Siqi Sun: is a Research SDE in Microsoft. He is currently working on commonsense reasoning and knowledge graph related projects. Prior joining Microsoft, he was a PhD student in computer science at TTI Chicago, and before that he was an undergraduate student from school of mathematics at Fudan University.
Yu Cheng: is a senior researcher at Microsoft. His research is about deep learning in general, with specific interests in model compression, deep generative model and adversarial learning. He is also interested in solving real-world problems in computer vision and natural language processing. Yu received his Ph.D.from Northwestern University in 2015 and his bachelor from Tsinghua University in 2010. Before join Microsoft, he spent three years as a Research Staff Member at IBM Research/MIT-IBM Watson AI Lab.
Zhe Gan: is a senior researcher at Microsoft, primarily working on generative models, visual QA/dialog, machine reading comprehension (MRC), and natural language generation (NLG). He also has broad interests on various machine learning and NLP topics. Zhe received his PhD degree from Duke University in Spring 2018. Before that, he received his Master's and Bachelor's degree from Peking University in 2013 and 2010, respectively.
Jingjing (JJ) Liu: is a Principal Research Manager at Microsoft, leading a research team in NLP and Computer Vision. Her current research interests include Machine Reading Comprehension, Commonsense Reasoning, Visual QA/Dialog and Text-to-Image Generation. She received her PhD degree in Computer Science from MIT EECS in 2011. She also holds an MBA degree from Judge Business School at University of Cambridge.Before joining MSR, Dr.Liu was the Director of Product at Mobvoi Inc and Research Scientist at MIT CSAIL.
公众号后台回复“BERT”,获取论文地址。
本文转自:数据派THU ;获授权;
END
合作请加QQ:365242293
数据分析(ID : ecshujufenxi )互联网科技与数据圈自己的微信,也是WeMedia自媒体联盟成员之一,WeMedia联盟覆盖5000万人群。

关注公众号:拾黑(shiheibook)了解更多
[广告]赞助链接:
四季很好,只要有你,文娱排行榜:https://www.yaopaiming.com/
让资讯触达的更精准有趣:https://www.0xu.cn/
关注网络尖刀微信公众号随时掌握互联网精彩
- 1 中法元首相会都江堰 7904033
- 2 对日斗争突发新情况 7809376
- 3 美最新报告:不允许任何国家过于强大 7712629
- 4 国际机构看中国经济 关键词亮了 7618110
- 5 男子欠近5000元房费 酒店倒贴都不搬 7522666
- 6 荒野求生女选手疑遭骚扰 榕江通报 7426653
- 7 净网:网民造谣汽车造成8杀被查处 7327659
- 8 海军、国防部、外交部 严正批驳×3 7231696
- 9 中国女游客度假时从酒店9楼坠亡 7140445
- 10 千吨级“巨无霸”就位 7048628







数据分析
