Abstract
Neural architecture search (NAS) has achieved initial results in natural language processing (NLP), but the search space of most NAS methods is built on the simplest recurrent cell and therefore does not account for the modeling of long sequences. Long-range information tends to fade gradually as the input sequence grows, which degrades model performance. In this paper, we present a dual-cell approach to search for a better-performing network architecture. We construct a search space better suited to language modeling tasks by adding an information storage cell inside the search cell, allowing the model to make better use of the long-range information in a sequence and thereby improving its performance. The language model found by our method outperforms the baseline method on the Penn Treebank and WikiText-2 datasets.
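To make the dual-cell idea concrete, the sketch below pairs a placeholder "searched" cell with an LSTM-style storage cell that carries long-range information alongside it. All names (DualCell, SearchedCell) and the specific wiring are illustrative assumptions, not the paper's published implementation; in the actual method the internal operations of the search cell would be chosen by NAS.

```python
# Illustrative sketch only: the class names and wiring below are assumptions,
# not the authors' released code. The searched cell is replaced by a fixed
# gated update purely so the example runs.
import torch
import torch.nn as nn


class SearchedCell(nn.Module):
    """Stand-in for the cell whose internal operations NAS would select."""

    def __init__(self, hidden_size):
        super().__init__()
        self.gate = nn.Linear(2 * hidden_size, hidden_size)
        self.cand = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, x, h):
        z = torch.cat([x, h], dim=-1)
        g = torch.sigmoid(self.gate(z))   # mixing gate
        c = torch.tanh(self.cand(z))      # candidate state
        return g * h + (1.0 - g) * c


class DualCell(nn.Module):
    """Search cell plus an information storage cell (LSTM-style memory)
    that preserves long-range information across time steps."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.embed = nn.Linear(input_size, hidden_size)
        self.search_cell = SearchedCell(hidden_size)
        self.storage_cell = nn.LSTMCell(hidden_size, hidden_size)

    def forward(self, x, state):
        h, (mh, mc) = state
        x = self.embed(x)
        h = self.search_cell(x, h)                # short-range, searched update
        mh, mc = self.storage_cell(h, (mh, mc))   # long-range memory update
        return h + mh, (h, (mh, mc))              # fuse both information paths


if __name__ == "__main__":
    cell = DualCell(input_size=64, hidden_size=128)
    h = torch.zeros(8, 128)
    mh, mc = torch.zeros(8, 128), torch.zeros(8, 128)
    out, state = cell(torch.randn(8, 64), (h, (mh, mc)))
    print(out.shape)  # torch.Size([8, 128])
```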
