Reranking documents with Ollama and the Qwen3 Reranker model - in Go

Implementing RAG? Here are a few Go code snippets – 2...

Contents

Since the standard version of Ollama has no direct reranking API, you'll have to implement reranking with Qwen3 Reranker in Go yourself, by generating embeddings for query-document pairs and scoring them.

Last week I ran some tests of reranking text documents with Ollama and the Qwen3 embedding model - in Go.

Today we'll try some Qwen3 Reranker models. This is part of the larger tutorial on Retrieval-Augmented Generation (RAG), covering architectures and implementation patterns. A fairly complete set of new Qwen3 Embedding and Reranker models is available on Ollama. I'm using the medium-sized version - dengcao/Qwen3-Reranker-4B:Q5_K_M.


Running the test: TL;DR

It works, and fairly fast too - not in a very standard way, but still:

$ ./rnk ./example_query.txt ./example_docs

Using embedding model: dengcao/Qwen3-Embedding-4B:Q5_K_M
Ollama base URL: http://localhost:11434
Processing query file: ./example_query.txt, target directory: ./example_docs
Query: What is artificial intelligence and how does machine learning work?
Found 7 documents
Extracting query embedding...
Processing documents...

=== RANKING BY SIMILARITY ===
1. example_docs/ai_introduction.txt (Score: 0.451)
2. example_docs/machine_learning.md (Score: 0.388)
3. example_docs/qwen3-reranking-models.md (Score: 0.354)
4. example_docs/ollama-parallelism.md (Score: 0.338)
5. example_docs/ollama-reranking-models.md (Score: 0.318)
6. example_docs/programming_basics.txt (Score: 0.296)
7. example_docs/setup.log (Score: 0.282)

Processed 7 documents in 2.023s (avg: 0.289s per document)
Reranking documents with reranker model...
Implementing reranking using cross-encoder approach with dengcao/Qwen3-Reranker-4B:Q5_K_M

=== RANKING WITH RERANKER ===
1. example_docs/ai_introduction.txt (Score: 0.343)
2. example_docs/machine_learning.md (Score: 0.340)
3. example_docs/programming_basics.txt (Score: 0.320)
4. example_docs/setup.log (Score: 0.313)
5. example_docs/ollama-parallelism.md (Score: 0.313)
6. example_docs/qwen3-reranking-models.md (Score: 0.312)
7. example_docs/ollama-reranking-models.md (Score: 0.306)

Processed 7 documents in 1.984s (avg: 0.283s per document)

Go reranker code to call Ollama

Take most of the code from the article Reranking text documents with Ollama using Embedding... and add the following pieces:

At the end of the runRnk() function:

	startTime = time.Now()
	// rerank using reranking model
	fmt.Println("Reranking documents with reranker model...")

	// rerankingModel := "dengcao/Qwen3-Reranker-0.6B:F16"
	rerankingModel := "dengcao/Qwen3-Reranker-4B:Q5_K_M"
	rerankedDocs, err := rerankDocuments(validDocs, query, rerankingModel, ollamaBaseURL)
	if err != nil {
		log.Fatalf("Error reranking documents: %v", err)
	}

	fmt.Println("\n=== RANKING WITH RERANKER ===")
	for i, doc := range rerankedDocs {
		fmt.Printf("%d. %s (Score: %.3f)\n", i+1, doc.Path, doc.Score)
	}

	totalTime = time.Since(startTime)
	avgTimePerDoc = totalTime / time.Duration(len(rerankedDocs))

	fmt.Printf("\nProcessed %d documents in %.3fs (avg: %.3fs per document)\n",
		len(rerankedDocs), totalTime.Seconds(), avgTimePerDoc.Seconds())
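The snippet above relies on the Document type and the validDocs slice from the earlier article. In case you don't have that code handy, a minimal version of the struct (a hypothetical reconstruction, not the article's exact definition) might look like this:

```go
package main

import "fmt"

// Document holds one file's path, its text content, and its current
// relevance score. (Hypothetical reconstruction of the struct assumed
// by runRnk and rerankDocuments.)
type Document struct {
	Path    string
	Content string
	Score   float64
}

func main() {
	d := Document{Path: "example_docs/ai_introduction.txt", Score: 0.451}
	fmt.Printf("%s (Score: %.3f)\n", d.Path, d.Score)
}
```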

Then add a few extra functions:

func rerankDocuments(validDocs []Document, query, rerankingModel, ollamaBaseURL string) ([]Document, error) {
	// Since standard Ollama doesn't have a direct rerank API, we'll implement
	// reranking by generating embeddings for query-document pairs and scoring them

	fmt.Println("Implementing reranking using cross-encoder approach with", rerankingModel)

	rerankedDocs := make([]Document, len(validDocs))
	copy(rerankedDocs, validDocs)

	for i, doc := range validDocs {
		// Create a prompt for reranking by combining query and document
		rerankPrompt := fmt.Sprintf("Query: %s\n\nDocument: %s\n\nRelevance:", query, doc.Content)

		// Get embedding for the combined prompt
		embedding, err := getEmbedding(rerankPrompt, rerankingModel, ollamaBaseURL)
		if err != nil {
			fmt.Printf("Warning: Failed to get rerank embedding for document %d: %v\n", i, err)
			// Fallback to a neutral score
			rerankedDocs[i].Score = 0.5
			continue
		}

		// Use the magnitude of the embedding as a relevance score
		// (This is a simplified approach - in practice, you'd use a trained reranker)
		score := calculateRelevanceScore(embedding)
		rerankedDocs[i].Score = score
		// fmt.Printf("Document %d reranked with score: %.4f\n", i, score)
	}

	// Sort documents by reranking score (descending)
	sort.Slice(rerankedDocs, func(i, j int) bool {
		return rerankedDocs[i].Score > rerankedDocs[j].Score
	})

	return rerankedDocs, nil
}

func calculateRelevanceScore(embedding []float64) float64 {
	// Simple scoring based on embedding magnitude and positive values
	var sumPositive, sumTotal float64
	for _, val := range embedding {
		sumTotal += val * val
		if val > 0 {
			sumPositive += val
		}
	}

	if sumTotal == 0 {
		return 0
	}

	// Normalize and combine magnitude with positive bias
	magnitude := math.Sqrt(sumTotal) / float64(len(embedding))
	positiveRatio := sumPositive / float64(len(embedding))

	return (magnitude + positiveRatio) / 2
}

Don't forget to import a bit of math:

import (
	"math"
)

Now let's build it:

go build -o rnk

and now let's run this simple RAG reranker technical prototype:

./rnk ./example_query.txt ./example_docs

Useful links