Efficient and Explainable Neural Ranking

Leonhardt, Lutz Jurek

Download statistics - Document (COUNTER):

Leonhardt, Lutz Jurek: Efficient and Explainable Neural Ranking. Hannover : Gottfried Wilhelm Leibniz Universität, Diss., 2023, xii, 165 S., DOI: https://doi.org/10.15488/15769

Sum total of downloads: 399

File:	phdthesis_leonhar ...
Size:	1.86 MB
Format:	Adobe PDF

Abstract:
The recent availability of increasingly powerful hardware has caused a shift from traditional information retrieval (IR) approaches based on term matching, which remained the state of the art for several decades, to large pre-trained neural language models. These neural rankers achieve substantial improvements in performance, as their complexity and extensive pre-training give them a degree of natural language understanding. As a result, neural rankers go beyond term matching by performing relevance estimation based on the semantics of queries and documents.

However, these improvements in performance do not come without sacrifice. In this thesis, we focus on two fundamental challenges of neural ranking models, specifically ones based on large language models. On the one hand, due to their complexity, the models are inefficient; they require considerable amounts of computational power, which often comes in the form of specialized hardware, such as GPUs or TPUs. Consequently, the carbon footprint is an increasingly important aspect of systems using neural IR. This effect is amplified when low latency is required, as in, for example, web search. On the other hand, neural models are known for being inherently unexplainable; in other words, it is often not comprehensible to humans why a neural model produced a specific output. In general, explainability is deemed important in order to identify undesired behavior, such as bias.

We tackle the efficiency challenge of neural rankers by proposing Fast-Forward indexes, which are simple vector forward indexes that heavily utilize pre-computation techniques. Our approach substantially reduces the computational load during query processing, enabling efficient ranking solely on CPUs without requiring hardware acceleration. Furthermore, we introduce BERT-DMN to show that the training efficiency of neural rankers can be improved by training only parts of the model.

In order to improve the explainability of neural ranking, we propose the Select-and-Rank paradigm to make ranking models explainable by design: First, a query-dependent subset of the input document is extracted to serve as an explanation; second, the ranking model makes its decision based only on the extracted subset, rather than the complete document. We show that our models exhibit performance similar to models that are not explainable by design and conduct a user study to determine the faithfulness of the explanations.

Finally, we introduce BoilerNet, a web content extraction technique that allows the removal of boilerplate from web pages, leaving only the main content in plain text. Our method requires no feature engineering and can be used to aid in the process of creating new document corpora from the web.
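
The two-step structure of the Select-and-Rank paradigm described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the selector and ranker here are toy term-overlap functions standing in for neural models, and all names (`split_sentences`, `select`, `rank_score`, `select_and_rank`, the parameter `k`) are hypothetical. The only point is that the ranker never sees anything but the selected subset, which is what lets the subset act as an explanation.

```python
import re


def split_sentences(document: str) -> list[str]:
    """Naive sentence splitter (an assumption; any segmenter would do)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def tokenize(text: str) -> set[str]:
    """Lowercased set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select(query: str, document: str, k: int = 2) -> list[str]:
    """Step 1: extract a query-dependent subset of the document.

    Toy selector (assumption): keep the k sentences that share the most
    terms with the query; ties are broken by position in the document.
    """
    q = tokenize(query)
    sentences = split_sentences(document)
    by_overlap = sorted(
        enumerate(sentences),
        key=lambda pair: (-len(q & tokenize(pair[1])), pair[0]),
    )
    # Restore document order so the explanation reads naturally.
    return [sentence for _, sentence in sorted(by_overlap[:k])]


def rank_score(query: str, selected: list[str]) -> float:
    """Step 2: estimate relevance from the selected subset only.

    Toy ranker (assumption): fraction of query terms covered by the subset.
    """
    q = tokenize(query)
    if not q:
        return 0.0
    covered = q & tokenize(" ".join(selected))
    return len(covered) / len(q)


def select_and_rank(query: str, documents: dict[str, str], k: int = 2):
    """Return (doc_id, score, explanation) triples, best score first."""
    results = []
    for doc_id, text in documents.items():
        explanation = select(query, text, k)
        results.append((doc_id, rank_score(query, explanation), explanation))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    docs = {
        "d1": "Neural rankers use language models. The weather is nice. "
              "Rankers estimate relevance.",
        "d2": "Cats sleep a lot. Dogs bark.",
    }
    for doc_id, score, explanation in select_and_rank("neural rankers relevance", docs):
        print(doc_id, score, explanation)
```

Because the score is computed from the returned sentences alone, each result carries its own explanation; in a neural setting both steps would be learned models rather than the overlap heuristics assumed here.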
License of this version:	CC BY 3.0 DE
Document Type:	DoctoralThesis
Publishing status:	publishedVersion
Issue Date:	2023
Appears in Collections:	Fakultät für Elektrotechnik und Informatik Dissertationen

downloads by country:

pos.	country	downloads (total)	downloads (perc.)
1	Netherlands	168	42.11%
2	Germany	104	26.07%
3	United States	45	11.28%
4	Canada	9	2.26%
5	Russian Federation	8	2.01%
6	Korea, Republic of	8	2.01%
7	United Kingdom	8	2.01%
8	Turkey	6	1.50%
9	France	6	1.50%
10	Taiwan	5	1.25%
	other countries	32	8.02%

Note

Download statistics are collected in accordance with the "COUNTER Code of Practice for e-Resources", applying internationally recognized rules and standards. COUNTER is an international non-profit organization in which library associations, database providers, and publishers work together on standards for collecting, storing, and processing usage data for electronic resources, with the aim of ensuring objectivity and comparability. Only accesses to the full texts themselves are counted, not visits to the website as such.