The overall architecture of the new relevance search algorithm in PubMed.
(a) The algorithm consists of two stages: the query is first processed by BM25, a classic term-weighting algorithm; the top 500 results are then re-ranked by LambdaMART, a high-performance L2R algorithm. The machine-learning–based ranking model is learned offline using relevance-ranked training data together with a set of features extracted from queries, documents, or both. (b) Features designed and tested in this study, with their brief descriptions and identifiers. D, document; IDF, inverse document frequency; L2R, learning to rank; Q, query; QD, query–document relationship; TIAB, title and abstract.
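The two-stage flow in panel (a) can be illustrated with a minimal sketch. This is not the PubMed implementation: the function names `bm25_scores` and `two_stage_search` are hypothetical, the BM25 parameters `k1=1.2` and `b=0.75` are conventional defaults assumed here rather than taken from the study, and the second stage accepts any scoring callable as a stand-in for a trained LambdaMART model.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score tokenized documents against a tokenized query with Okapi BM25.

    k1 and b are assumed conventional defaults, not values from the study.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def two_stage_search(query, docs, rerank_score, top_k=500):
    """Return document indices: BM25 picks top_k, rerank_score reorders them.

    rerank_score(query, doc, bm25_score) is a hypothetical stand-in for a
    learned ranker such as LambdaMART; it is not trained here.
    """
    first_stage = bm25_scores(query, docs)
    # Stage 1: keep only the top_k candidates by BM25 score.
    candidates = sorted(
        range(len(docs)), key=lambda i: first_stage[i], reverse=True
    )[:top_k]
    # Stage 2: reorder the candidates with the second-stage scorer.
    return sorted(
        candidates,
        key=lambda i: rerank_score(query, docs[i], first_stage[i]),
        reverse=True,
    )
```

Only the candidates that survive the first stage are passed to the second-stage scorer, which is why a comparatively expensive model can be applied at query time: its cost is bounded by `top_k` rather than by the size of the collection.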