Reranking documents with Ollama and the Qwen3 Reranker model - in Go
Implementing RAG? Here are some Go code snippets - 2...
Since standard Ollama has no direct reranking API, you will need to implement reranking with Qwen3 Reranker in Go yourself, by generating embeddings for query-document pairs and scoring them.
Last week I did some Reranking text documents with Ollama and Qwen3 Embedding model - in Go.
Today we'll try some Qwen3 Reranker models. This is part of the broader Retrieval-Augmented Generation (RAG) Tutorial, which covers architectures and implementation patterns.
There is a good collection of new Qwen3 Embedding & Reranker Models on Ollama; I use the medium-sized one, dengcao/Qwen3-Reranker-4B:Q5_K_M.

Test run: TL;DR
It works and it is fairly fast, although this is not a very standard way of doing it. Here is the run:
$ ./rnk ./example_query.txt ./example_docs
Using embedding model: dengcao/Qwen3-Embedding-4B:Q5_K_M
Ollama base URL: http://localhost:11434
Processing query file: ./example_query.txt, target directory: ./example_docs
Query: What is artificial intelligence and how does machine learning work?
Found 7 documents
Extracting query embedding...
Processing documents...
=== RANKING BY SIMILARITY ===
1. example_docs/ai_introduction.txt (Score: 0.451)
2. example_docs/machine_learning.md (Score: 0.388)
3. example_docs/qwen3-reranking-models.md (Score: 0.354)
4. example_docs/ollama-parallelism.md (Score: 0.338)
5. example_docs/ollama-reranking-models.md (Score: 0.318)
6. example_docs/programming_basics.txt (Score: 0.296)
7. example_docs/setup.log (Score: 0.282)
Processed 7 documents in 2.023s (avg: 0.289s per document)
Reranking documents with reranker model...
Implementing reranking using cross-encoder approach with dengcao/Qwen3-Reranker-4B:Q5_K_M
=== RANKING WITH RERANKER ===
1. example_docs/ai_introduction.txt (Score: 0.343)
2. example_docs/machine_learning.md (Score: 0.340)
3. example_docs/programming_basics.txt (Score: 0.320)
4. example_docs/setup.log (Score: 0.313)
5. example_docs/ollama-parallelism.md (Score: 0.313)
6. example_docs/qwen3-reranking-models.md (Score: 0.312)
7. example_docs/ollama-reranking-models.md (Score: 0.306)
Processed 7 documents in 1.984s (avg: 0.283s per document)
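The first ranking in the log above compares the query embedding with each document embedding. The metric is not stated in this post; the sketch below assumes cosine similarity, which is a common choice. The `cosineSimilarity` name and the toy vectors are hypothetical and only illustrate the idea.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// Hypothetical helper: the earlier post may use a different metric.
func cosineSimilarity(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		// A zero vector has no direction; treat it as unrelated.
		return 0, nil
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

func main() {
	// Toy 2-dimensional "embeddings"; real ones have thousands of dimensions.
	query := []float64{3, 4}
	docs := [][]float64{{4, 3}, {-3, -4}}
	for i, doc := range docs {
		score, err := cosineSimilarity(query, doc)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("doc %d: %.3f\n", i, score)
	}
	// Prints:
	// doc 0: 0.960
	// doc 1: -1.000
}
```

Unlike the reranker score used later in this post, this value depends on both the query vector and the document vector.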
Reranker code in Go for calling Ollama
Take most of the code from the post Reranking text documents with Ollama using Embedding...
and add these pieces:
At the end of the runRnk() function:
	startTime = time.Now()

	// Rerank using the reranking model
	fmt.Println("Reranking documents with reranker model...")
	// rerankingModel := "dengcao/Qwen3-Reranker-0.6B:F16"
	rerankingModel := "dengcao/Qwen3-Reranker-4B:Q5_K_M"
	rerankedDocs, err := rerankDocuments(validDocs, query, rerankingModel, ollamaBaseURL)
	if err != nil {
		log.Fatalf("Error reranking documents: %v", err)
	}

	fmt.Println("\n=== RANKING WITH RERANKER ===")
	for i, doc := range rerankedDocs {
		fmt.Printf("%d. %s (Score: %.3f)\n", i+1, doc.Path, doc.Score)
	}

	totalTime = time.Since(startTime)
	avgTimePerDoc = totalTime / time.Duration(len(rerankedDocs))
	fmt.Printf("\nProcessed %d documents in %.3fs (avg: %.3fs per document)\n",
		len(rerankedDocs), totalTime.Seconds(), avgTimePerDoc.Seconds())
Then add a couple more functions:
// rerankDocuments scores each document against the query with the reranker
// model and returns a copy of validDocs sorted by that score (descending).
func rerankDocuments(validDocs []Document, query, rerankingModel, ollamaBaseURL string) ([]Document, error) {
	// Since standard Ollama doesn't have a direct rerank API, we implement
	// reranking by generating embeddings for query-document pairs and scoring them.
	fmt.Println("Implementing reranking using cross-encoder approach with", rerankingModel)

	rerankedDocs := make([]Document, len(validDocs))
	copy(rerankedDocs, validDocs)

	for i, doc := range validDocs {
		// Create a prompt for reranking by combining query and document
		rerankPrompt := fmt.Sprintf("Query: %s\n\nDocument: %s\n\nRelevance:", query, doc.Content)

		// Get the embedding for the combined prompt
		embedding, err := getEmbedding(rerankPrompt, rerankingModel, ollamaBaseURL)
		if err != nil {
			fmt.Printf("Warning: Failed to get rerank embedding for document %d: %v\n", i, err)
			// Fallback score. Note: 0.5 is higher than every score in the
			// test run above, so a failed document would sort first.
			rerankedDocs[i].Score = 0.5
			continue
		}

		// Derive a relevance score from the embedding values alone.
		// (This is a simplified heuristic - in practice, you'd use the
		// output of a trained reranker.)
		score := calculateRelevanceScore(embedding)
		rerankedDocs[i].Score = score
		// fmt.Printf("Document %d reranked with score: %.4f\n", i, score)
	}

	// Sort documents by reranking score (descending)
	sort.Slice(rerankedDocs, func(i, j int) bool {
		return rerankedDocs[i].Score > rerankedDocs[j].Score
	})

	return rerankedDocs, nil
}

// calculateRelevanceScore averages two per-dimension quantities:
// the L2 norm of the embedding and the sum of its positive components.
func calculateRelevanceScore(embedding []float64) float64 {
	var sumPositive, sumTotal float64
	for _, val := range embedding {
		sumTotal += val * val
		if val > 0 {
			sumPositive += val
		}
	}

	// Also covers the empty slice, so the divisions below are safe
	if sumTotal == 0 {
		return 0
	}

	// Normalize by dimension count and combine magnitude with positive bias
	magnitude := math.Sqrt(sumTotal) / float64(len(embedding))
	positiveRatio := sumPositive / float64(len(embedding))
	return (magnitude + positiveRatio) / 2
}
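To make the scoring heuristic concrete: for an embedding e with n dimensions, calculateRelevanceScore computes (||e|| / n + sum of the positive components / n) / 2, where ||e|| is the L2 norm. The small program below repeats the function from the post and runs it on two toy vectors chosen for illustration; they are not real model output.

```go
package main

import (
	"fmt"
	"math"
)

// calculateRelevanceScore is the same heuristic as in the post.
func calculateRelevanceScore(embedding []float64) float64 {
	var sumPositive, sumTotal float64
	for _, val := range embedding {
		sumTotal += val * val
		if val > 0 {
			sumPositive += val
		}
	}
	if sumTotal == 0 {
		return 0
	}
	magnitude := math.Sqrt(sumTotal) / float64(len(embedding))
	positiveRatio := sumPositive / float64(len(embedding))
	return (magnitude + positiveRatio) / 2
}

func main() {
	// [3, -4]: norm 5 -> magnitude 2.5; positive sum 3 -> ratio 1.5.
	fmt.Println(calculateRelevanceScore([]float64{3, -4})) // prints 2

	// Doubling every component doubles the score: the heuristic
	// depends on the scale of the vector, not only its direction.
	fmt.Println(calculateRelevanceScore([]float64{6, -8})) // prints 4
}
```

The score depends only on the single embedding of the combined query-document prompt, so any sensitivity to the query has to come from how the model embeds that prompt. This may be why the reranker scores in the test run sit so close together.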
Don't forget to import the math package:
import (
"math"
)
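rerankDocuments relies on a getEmbedding helper that comes from the earlier post and is not reproduced here. As a rough sketch of what such a helper could look like (not necessarily the author's version), the code below posts to Ollama's /api/embeddings endpoint, which takes a model name and a prompt and returns one embedding array. The struct names and the stubOllama test server are hypothetical; the stub is there only so the example runs without a local Ollama instance. Newer Ollama versions also offer /api/embed, which has a slightly different request and response shape.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Request and response shapes for POST /api/embeddings.
type embeddingRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type embeddingResponse struct {
	Embedding []float64 `json:"embedding"`
}

// getEmbedding asks the server at baseURL for the embedding of text.
// Assumed signature, inferred from how rerankDocuments calls it.
func getEmbedding(text, model, baseURL string) ([]float64, error) {
	body, err := json.Marshal(embeddingRequest{Model: model, Prompt: text})
	if err != nil {
		return nil, fmt.Errorf("encode request: %w", err)
	}
	resp, err := http.Post(baseURL+"/api/embeddings", "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("call embeddings endpoint: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out embeddingResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if len(out.Embedding) == 0 {
		return nil, fmt.Errorf("empty embedding returned for model %q", model)
	}
	return out.Embedding, nil
}

// stubOllama is a hypothetical stand-in that mimics the response shape,
// so the example can run without Ollama installed.
func stubOllama() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/api/embeddings" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(embeddingResponse{Embedding: []float64{0.1, -0.2, 0.3}})
	}))
}

func main() {
	srv := stubOllama()
	defer srv.Close()

	prompt := "Query: what is AI?\n\nDocument: AI is a field of computer science.\n\nRelevance:"
	// Against a real instance, pass "http://localhost:11434" instead of srv.URL.
	emb, err := getEmbedding(prompt, "dengcao/Qwen3-Reranker-4B:Q5_K_M", srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("got %d dimensions\n", len(emb))
}
```

Returning an error for an empty embedding lets the caller's fallback branch handle models that respond without producing a vector.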
Now let's build it all:
go build -o rnk
and then run this simple RAG reranker tech prototype:
./rnk ./example_query.txt ./example_docs
Useful links
- Reranking texts with embedding models
- Reranking text documents with Ollama and Qwen3 Embedding model - in Go
- Qwen3 Embedding & Reranker Models on Ollama: State-of-the-Art Performance
- Ollama cheatsheet
- Install and Configure Ollama models location
- How Ollama Handles Parallel Requests
- Test: How Ollama is using Intel CPU Performance and Efficient Cores