Low-Dimensional Context-Dependent Translation Models
Deposited 2018-11-27 (GMT)
Context matters when modeling language translation, but state-of-the-art approaches predominantly model these dependencies via larger translation units. This decision leads to problems of computational efficiency (runtime and memory) and statistical efficiency (millions of sentences, but billions of translation rules), and as a result such methods stop short of conditioning on large amounts of local or global context. This thesis steps back from the current zeitgeist and posits another view: while context influences translation, its influence is inherently low-dimensional, and the problems of computational and statistical tractability can be solved with dimensionality reduction and representation learning techniques. The low-dimensional representations we recover capture the observation that the phenomena driving translation are controlled by context residing in a more compact space than lexically based (word or n-gram) "one-hot" or count-based spaces. We consider low-dimensional representations of context, recovered via a multiview canonical correlation analysis, as well as low-dimensional representations of translation units expressed (featurized) in terms of context, recovered by a rank-reduced SVD of a feature space defined over inside and outside trees in a synchronous grammar. Lastly, we test our low-dimensional hypothesis in the limit by considering a semi-supervised learning scenario in which contextual information is gleaned from large amounts of unlabeled data. All empirical setups show improvements from taking the low-dimensional hypothesis into account, indicating that this route is an effective way to boost performance while maintaining model parsimony.
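As a minimal illustration of the rank-reduced SVD idea sketched above, the following toy Python snippet compresses a sparse count-based context-feature matrix into a compact embedding. The data, dimensions, and feature construction here are hypothetical placeholders; in the thesis the features are defined over inside and outside trees of a synchronous grammar.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy count-based feature matrix: rows are translation units, columns are
# contextual features. These random Poisson counts are illustrative only;
# they stand in for the grammar-derived features described in the abstract.
n_units, n_features, k = 200, 1000, 16
X = rng.poisson(0.05, size=(n_units, n_features)).astype(float)

# Rank-reduced SVD: keep only the top-k singular directions, replacing each
# sparse high-dimensional count vector with a k-dimensional representation.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
embeddings = U[:, :k] * s[:k]   # (n_units, k) low-dimensional representations
X_k = embeddings @ Vt[:k, :]    # best rank-k approximation of X (Eckart-Young)

print(embeddings.shape)         # (200, 16)
```

The embedding is far more compact than the original count space, which is the tractability argument the abstract makes: downstream models condition on a small dense vector rather than on billions of lexical features.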