Efficient and Explainable Neural Ranking

Download statistics - Document (COUNTER):

Leonhardt, Lutz Jurek: Efficient and Explainable Neural Ranking. Hannover : Gottfried Wilhelm Leibniz Universität, Diss., 2023, xii, 165 S., DOI: https://doi.org/10.15488/15769

Selected time period:

year: 
month: 

Sum total of downloads: 399




Thumbnail
Abstract: 
The recent availability of increasingly powerful hardware has caused a shift from traditional information retrieval (IR) approaches based on term matching, which remained the state of the art for several decades, to large pre-trained neural language models. These neural rankers achieve substantial improvements in performance, as their complexity and extensive pre-training give them the ability of understanding natural language in a way. As a result, neural rankers go beyond term matching by performing relevance estimation based on the semantics of queries and documents.However, these improvements in performance don't come without sacrifice. In this thesis, we focus on two fundamental challenges of neural ranking models, specifically, ones based on large language models: On the one hand, due to their complexity, the models are inefficient; they require considerable amounts of computational power, which often comes in the form of specialized hardware, such as GPUs or TPUs. Consequently, the carbon footprint is an increasingly important aspect of systems using neural IR. This effect is amplified when low latency is required, as in, for example, web search. On the other hand, neural models are known for being inherently unexplainable; in other words, it is often not comprehensible for humans why a neural model produced a specific output. In general, explainability is deemed important in order to identify undesired behavior, such as bias.We tackle the efficiency challenge of neural rankers by proposing Fast-Forward indexes, which are simple vector forward indexes that heavily utilize pre-computation techniques. Our approach substantially reduces the computational load during query processing, enabling efficient ranking solely on CPUs without requiring hardware acceleration. Furthermore, we introduce BERT-DMN to show that the training efficiency of neural rankers can be improved by training only parts of the model.In order to improve the explainability of neural ranking, we propose the Select-and-Rank paradigm to make ranking models explainable by design: First, a query-dependent subset of the input document is extracted to serve as an explanation; second, the ranking model makes its decision based only on the extracted subset, rather than the complete document. We show that our models exhibit performance similar to models that are not explainable by design and conduct a user study to determine the faithfulness of the explanations.Finally, we introduce BoilerNet, a web content extraction technique that allows the removal of boilerplate from web pages, leaving only the main content in plain text. Our method requires no feature engineering and can be used to aid in the process of creating new document corpora from the web.
License of this version: CC BY 3.0 DE
Document Type: DoctoralThesis
Publishing status: publishedVersion
Issue Date: 2023
Appears in Collections:Fakultät für Elektrotechnik und Informatik
Dissertationen

distribution of downloads over the selected time period:

downloads by country:

pos. country downloads
total perc.
1 image of flag of Netherlands Netherlands 168 42.11%
2 image of flag of Germany Germany 104 26.07%
3 image of flag of United States United States 45 11.28%
4 image of flag of Canada Canada 9 2.26%
5 image of flag of Russian Federation Russian Federation 8 2.01%
6 image of flag of Korea, Republic of Korea, Republic of 8 2.01%
7 image of flag of United Kingdom United Kingdom 8 2.01%
8 image of flag of Turkey Turkey 6 1.50%
9 image of flag of France France 6 1.50%
10 image of flag of Taiwan Taiwan 5 1.25%
    other countries 32 8.02%

Further download figures and rankings:


Hinweis

Zur Erhebung der Downloadstatistiken kommen entsprechend dem „COUNTER Code of Practice for e-Resources“ international anerkannte Regeln und Normen zur Anwendung. COUNTER ist eine internationale Non-Profit-Organisation, in der Bibliotheksverbände, Datenbankanbieter und Verlage gemeinsam an Standards zur Erhebung, Speicherung und Verarbeitung von Nutzungsdaten elektronischer Ressourcen arbeiten, welche so Objektivität und Vergleichbarkeit gewährleisten sollen. Es werden hierbei ausschließlich Zugriffe auf die entsprechenden Volltexte ausgewertet, keine Aufrufe der Website an sich.

Search the repository


Browse