A brief introduction to weakly supervised learning (6)

Supervised learning techniques have achieved great success under strong supervision, i.e., with large amounts of training examples carrying ground-truth labels. In real-world tasks, however, collecting supervision information is often costly, and it is therefore usually desirable to explore weakly supervised learning.

This article has focused on three typical types of weak supervision: incomplete, inexact, and inaccurate supervision. Although they can be discussed separately, in practice they often occur simultaneously, as shown in Fig. 1, and there is also research addressing such "mixed" cases [52, 92, 93]. In addition, there are other types of weak supervision; e.g., delayed supervision, which mainly arises in reinforcement-learning settings, can also be regarded as weak supervision [94]. Owing to space limits, this article serves more as an index of the literature than a comprehensive review; readers interested in details are encouraged to consult the corresponding references. It is worth noting that more and more researchers are turning their attention to weakly supervised learning, e.g., partially supervised learning, which mainly concerns learning with incomplete supervision [95], as well as some other discussions of weak supervision [96, 97].

For ease of discussion, this article has concentrated on binary classification, although most of the discussion can be extended to multi-class or regression problems with slight modifications. Note that more complicated situations may arise in multi-class classification tasks [98]. The situation becomes even more complicated in multi-label learning [99], where each example may be assigned multiple labels simultaneously. Taking incomplete supervision as an example: in addition to labeled and unlabeled examples, multi-label tasks may also encounter partially labeled examples, i.e., training examples for which only a subset of the labels is given [100]. Even if only labeled and unlabeled data are considered, there are more design options than in single-label settings; e.g., in active learning, for a selected unlabeled example one may ask for all of its labels [101], for a specific label [102], or for the relevance ordering of a pair of labels [103]. Nevertheless, whatever the data and the task, weakly supervised learning is becoming more and more important.
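The distinction above between fully labeled, partially labeled, and unlabeled multi-label examples can be made concrete with a minimal sketch. The toy dataset, field names, and helper functions below are invented for illustration and are not from the paper; `None` marks an unknown label, and `labels_to_query` lists the label slots an active learner could ask an oracle about:

```python
# Toy multi-label dataset over three classes; None marks an unknown label.
# All names and values here are hypothetical, for illustration only.
dataset = [
    {"x": [0.1, 0.2], "y": [1, 0, 1]},           # fully labeled
    {"x": [0.4, 0.3], "y": [1, None, None]},     # partially labeled
    {"x": [0.9, 0.8], "y": [None, None, None]},  # unlabeled
]

def supervision_kind(example):
    """Classify an example by how complete its label vector is."""
    known = [l for l in example["y"] if l is not None]
    if len(known) == len(example["y"]):
        return "full"
    return "partial" if known else "none"

def labels_to_query(example):
    """Indices of unknown labels that an active learner could query."""
    return [i for i, l in enumerate(example["y"]) if l is None]

print([supervision_kind(e) for e in dataset])  # ['full', 'partial', 'none']
print(labels_to_query(dataset[1]))             # [1, 2]
```

Querying only the unknown label slots of a partially labeled example corresponds to asking for a specific label rather than the full label vector, one of the query options mentioned above.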

References:

Goodfellow I, Bengio Y and Courville A. Deep Learning. Cambridge: MIT Press, 2016. 

Settles B. Active learning literature survey. Technical Report 1648. Department of Computer Sciences, University of Wisconsin at Madison, Madison, WI, 2010 [cs.wisc.edu/∼bsettles/pub/settles.activelearning.pdf].

Chapelle O, Schölkopf B and Zien A (eds). Semi-Supervised Learning. Cambridge: MIT Press, 2006.

Zhu X. Semi-supervised learning literature survey. Technical Report 1530. Department of Computer Sciences, University of Wisconsin at Madison, Madison, WI, 2008 [wisc.edu/∼jerryzhu/pub/ssl_survey.pdf].

Zhou Z-H and Li M. Semi-supervised learning by disagreement. Knowl Inform Syst 2010; 24: 415–39. 

Huang SJ, Jin R and Zhou ZH. Active learning by querying informative and representative examples. IEEE Trans Pattern Anal Mach Intell 2014; 36: 1936–49. 

Lewis D and Gale W. A sequential algorithm for training text classifiers. In 17th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Dublin, Ireland, 1994; 3–12.

Seung H, Opper M and Sompolinsky H. Query by committee. In 5th ACM Workshop on Computational Learning Theory, Pittsburgh, PA, 1992; 287–94.

Abe N and Mamitsuka H. Query learning strategies using boosting and bagging. In 15th International Conference on Machine Learning, Madison, WI, 1998; 1–9.

Nguyen HT and Smeulders AWM. Active learning using pre-clustering. In 21st International Conference on Machine Learning, Banff, Canada, 2004; 623–30.

Dasgupta S and Hsu D. Hierarchical sampling for active learning. In 25th International Conference on Machine Learning, Helsinki, Finland, 2008; 208–15.

Wang Z and Ye J. Querying discriminative and representative samples for batch mode active learning. In 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, 2013; 158–66. 

Dasgupta S, Kalai AT and Monteleoni C. Analysis of perceptron-based active learning. In 18th Annual Conference on Learning Theory, Bertinoro, Italy, 2005; 249–63.

Dasgupta S. Analysis of a greedy active learning strategy. In Advances in Neural Information Processing Systems 17, Cambridge, MA: MIT Press, 2005; 337–44. 

Kääriäinen M. Active learning in the non-realizable case. In 17th International Conference on Algorithmic Learning Theory, Barcelona, Spain, 2006; 63–77.

Balcan MF, Broder AZ and Zhang T. Margin based active learning. In 20th Annual Conference on Learning Theory, San Diego, CA, 2007; 35–50.

Hanneke S. Adaptive rates of convergence in active learning. In 22nd Conference on Learning Theory, Montreal, Canada, 2009. 

Wang W and Zhou ZH. Multi-view active learning in the non-realizable case. In Advances in Neural Information Processing Systems 23, Cambridge, MA: MIT Press, 2010; 2388–96.   
