vec2sparql.pdf

<div>Recent developments in machine learning have led to a rise in the</div><div>number of methods for extracting features from structured data. The</div><div>features are represented as vectors and may encode some semantic</div><div>aspects of the data. They can be used in machine learning models for</div><div>different tasks or to compute similarities between the entities in the</div><div>data.</div><div>SPARQL is a query language for structured data, originally developed</div><div>for querying Resource Description Framework (RDF) data. It has been in</div><div>use for over a decade as a standardized NoSQL query language. Many</div><div>different tools have been developed to enable data sharing with</div><div>SPARQL. For example, SPARQL endpoints make data interoperable</div><div>and available to the world, and SPARQL queries can be executed across</div><div>multiple endpoints.</div><div>We have developed Vec2SPARQL, a general framework for</div><div>integrating structured data and their vector space representations.</div><div>Vec2SPARQL allows vector functions, such as computing</div><div>similarities (cosine, correlation) or classification with machine</div><div>learning models, to be queried jointly with structured data within a</div><div>single SPARQL query. We demonstrate</div><div>applications of our approach in biomedical and clinical use cases.</div>
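<div>To make the notion of a vector function concrete, the following is a minimal sketch of cosine similarity between entity vectors, the kind of computation that could sit behind a similarity function in a query. It is not the Vec2SPARQL implementation: the function names, the <code>ex:</code> entity identifiers, and the toy vectors are all hypothetical and chosen only for illustration.</div>

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # A zero vector has no direction; treat its similarity as 0.
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_id, embeddings, top_k=3):
    """Rank all other entities by cosine similarity to the query entity."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical entity identifiers and toy vectors (assumptions, not real data).
embeddings = {
    "ex:entityA": [1.0, 0.0, 1.0],
    "ex:entityB": [1.0, 0.1, 0.9],
    "ex:entityC": [0.0, 1.0, 0.0],
}

ranking = most_similar("ex:entityA", embeddings, top_k=2)
for entity_id, score in ranking:
    print(entity_id, round(score, 3))
```

<div>In this sketch the ranking is computed in memory over a small dictionary. A framework of the kind described would instead expose such a function inside the query, so that the ranked entities can be filtered and joined with the structured data in the same query.</div>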