CON-S2V: A Generic Framework for Incorporating Extra-Sentential Context into Sen2Vec

journal contribution
posted on 17.09.2017, 15:48 by Tanay Kumar Saha
We present a novel approach to learning distributed representations of sentences from unlabeled data by modeling both the content and the context of a sentence. The content model learns a sentence representation by predicting the sentence's words. The context model comprises a neighbor prediction component and a regularizer, which model the distributional and proximity hypotheses, respectively. We propose an online algorithm to train the model components jointly. We evaluate the models in a setup where contextual information is available. Experimental results on tasks involving classification, clustering, and ranking of sentences show that our model outperforms the best existing models by a wide margin across multiple datasets.
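
To make the three components named in the abstract concrete, the following is a minimal sketch of how a joint objective of this shape could be computed for one sentence. It is an illustration under stated assumptions, not the authors' implementation: the softmax formulation, the squared-distance regularizer, the weight `lam`, and every name (`joint_loss`, `sent_vecs`, `word_out`, `sent_out`) are hypothetical choices made here for the example.

```python
import numpy as np

def log_softmax(scores):
    # Numerically stable log-softmax over a 1-D score vector.
    shifted = scores - scores.max()
    return shifted - np.log(np.exp(shifted).sum())

def joint_loss(s, sent_vecs, word_out, sent_out, word_ids, neighbor_ids, lam=0.5):
    """Toy joint objective for sentence index s (all modeling choices are assumptions)."""
    v = sent_vecs[s]
    # Content: predict the sentence's own words from its vector.
    content = -log_softmax(word_out @ v)[word_ids].sum()
    # Context, distributional part: predict neighboring sentences from the vector.
    neighbor = -log_softmax(sent_out @ v)[neighbor_ids].sum()
    # Context, proximity part: penalize distance to the neighbors' vectors.
    proximity = lam * ((sent_vecs[neighbor_ids] - v) ** 2).sum()
    return content + neighbor + proximity, (content, neighbor, proximity)

# Hypothetical toy data: 4 sentences, 10 vocabulary words, 8 dimensions.
rng = np.random.default_rng(0)
sent_vecs = rng.normal(scale=0.1, size=(4, 8))  # one vector per sentence
word_out = rng.normal(scale=0.1, size=(10, 8))  # output weights for word prediction
sent_out = rng.normal(scale=0.1, size=(4, 8))   # output weights for neighbor prediction

# Sentence 1 contains words 2, 5, 7 and has sentences 0 and 2 as neighbors.
total, parts = joint_loss(1, sent_vecs, word_out, sent_out,
                          word_ids=[2, 5, 7], neighbor_ids=[0, 2])
print(total, parts)
```

In an online training scheme, a loss of this kind would be minimized by stochastic gradient updates, one sentence at a time; the sketch only evaluates the loss and leaves the update rule out.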

