


























This paper proposes an architecture for deep neural networks in which branches attached to hidden layers learn targets that sit lower in the label hierarchy than the final-layer targets. The branches provide a channel for enforcing useful information in the hidden layers, which helps attain better accuracy for both the final layer and the hidden layers. The shared layers update their weights using the gradients of all cost functions above the branching layer. The resulting model is a modular, flexible inference system with multiple levels of targets, and it can be used efficiently when different situations call for results at different levels of complexity. The paper applies the idea to a text classification task on the 20 Newsgroups data set with two levels of hierarchical targets and compares the results against training without hidden-layer branches.
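To make the branching idea concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes one shared hidden layer, a branch that predicts a coarse (lower-hierarchy) label from that layer, and a final head that predicts the fine label. The layer sizes, the names `forward` and `total_loss`, and the `branch_weight` parameter are all illustrative assumptions; the abstract does not specify how the costs are combined, so a weighted sum is used here.

```python
import math
import random


def linear(x, W, b):
    """Affine map: one output per row of W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(v):
    m = max(v)  # subtract the max for numerical stability
    exps = [math.exp(a - m) for a in v]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(probs, target):
    return -math.log(probs[target])


def init_layer(rng, n_out, n_in):
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


def forward(params, x):
    # Shared hidden layer: feeds both the branch and the rest of the network.
    h1 = relu(linear(x, *params["shared"]))
    # Branch off the hidden layer: predicts the coarse (lower-hierarchy) target.
    coarse = softmax(linear(h1, *params["branch"]))
    # Upper layers continue to the fine (final-layer) target.
    h2 = relu(linear(h1, *params["upper"]))
    fine = softmax(linear(h2, *params["final"]))
    return coarse, fine


def total_loss(params, x, coarse_t, fine_t, branch_weight=1.0):
    # Assumed combination: a weighted sum of the two costs. Both terms depend
    # on the shared layer, so its gradient receives a contribution from each.
    coarse, fine = forward(params, x)
    return cross_entropy(fine, fine_t) + branch_weight * cross_entropy(coarse, coarse_t)


# Toy sizes (assumptions): 10 input features, 3 coarse classes, 6 fine classes.
rng = random.Random(0)
params = {
    "shared": init_layer(rng, 8, 10),
    "branch": init_layer(rng, 3, 8),
    "upper": init_layer(rng, 8, 8),
    "final": init_layer(rng, 6, 8),
}
x = [rng.random() for _ in range(10)]
coarse_probs, fine_probs = forward(params, x)
loss = total_loss(params, x, coarse_t=1, fine_t=4)
```

Because `forward` returns both outputs, a caller that only needs the coarse label could stop after the branch and skip the upper layers, which is one reading of the modular inference the abstract describes. Setting `branch_weight=0.0` recovers a loss with no branch term, the kind of baseline the comparison refers to.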