How to rerank text documents with Ollama and the Qwen3 Reranker model

Standard Ollama has no direct reranking API, so you need to implement reranking yourself: generate embeddings for query-document pairs with Qwen3 Reranker and compute a score from each one.

Reranking documents with Ollama and the Qwen3 Reranker model, in Go

Implementing RAG? Here are some code snippets in Go, part 2...



Last week I reranked text documents with Ollama and the Qwen3 Embedding model, in Go.

Today I will try the Qwen3 Reranker models. This post is part of a broader Retrieval-Augmented Generation (RAG) tutorial covering architecture and implementation patterns. There is quite a large collection of new Qwen3 Embedding & Reranker models available on Ollama; I use the medium-sized one, dengcao/Qwen3-Reranker-4B:Q5_K_M.


Test: TL;DR

It works, and quite quickly. The approach is not very standard, but still:

$ ./rnk ./example_query.txt ./example_docs

Using embedding model: dengcao/Qwen3-Embedding-4B:Q5_K_M
Ollama base URL: http://localhost:11434
Processing query file: ./example_query.txt, target directory: ./example_docs
Query: What is artificial intelligence and how does machine learning work?
Found 7 documents
Extracting query embedding...
Processing documents...

=== RANKING BY SIMILARITY ===
1. example_docs/ai_introduction.txt (Score: 0.451)
2. example_docs/machine_learning.md (Score: 0.388)
3. example_docs/qwen3-reranking-models.md (Score: 0.354)
4. example_docs/ollama-parallelism.md (Score: 0.338)
5. example_docs/ollama-reranking-models.md (Score: 0.318)
6. example_docs/programming_basics.txt (Score: 0.296)
7. example_docs/setup.log (Score: 0.282)

Processed 7 documents in 2.023s (avg: 0.289s per document)
Reranking documents with reranker model...
Implementing reranking using cross-encoder approach with dengcao/Qwen3-Reranker-4B:Q5_K_M

=== RANKING WITH RERANKER ===
1. example_docs/ai_introduction.txt (Score: 0.343)
2. example_docs/machine_learning.md (Score: 0.340)
3. example_docs/programming_basics.txt (Score: 0.320)
4. example_docs/setup.log (Score: 0.313)
5. example_docs/ollama-parallelism.md (Score: 0.313)
6. example_docs/qwen3-reranking-models.md (Score: 0.312)
7. example_docs/ollama-reranking-models.md (Score: 0.306)

Processed 7 documents in 1.984s (avg: 0.283s per document)
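
The first ranking in the output above comes from comparing the query embedding with each document embedding. The exact formula lives in the previous post and is not shown here; the sketch below assumes plain cosine similarity, and `cosineSimilarity` is a hypothetical name used only for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b.
// It returns 0 when the lengths differ or either vector is all zeros.
// Assumption: the similarity stage of rnk uses something equivalent.
func cosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	query := []float64{1, 0}
	fmt.Println(cosineSimilarity(query, []float64{2, 0})) // same direction: prints 1
	fmt.Println(cosineSimilarity(query, []float64{0, 3})) // orthogonal: prints 0
}
```

The reranker stage works differently: it embeds one combined query-plus-document prompt per document and derives a score from that single vector, so the two score columns are not directly comparable.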

Reranker code in Go for calling Ollama

Take most of the code from the post Reranking text documents with Ollama using Embedding... and add these pieces:

At the end of the runRnk() function:

	startTime = time.Now()
	// rerank using reranking model
	fmt.Println("Reranking documents with reranker model...")

	// rerankingModel := "dengcao/Qwen3-Reranker-0.6B:F16"
	rerankingModel := "dengcao/Qwen3-Reranker-4B:Q5_K_M"
	rerankedDocs, err := rerankDocuments(validDocs, query, rerankingModel, ollamaBaseURL)
	if err != nil {
		log.Fatalf("Error reranking documents: %v", err)
	}

	fmt.Println("\n=== RANKING WITH RERANKER ===")
	for i, doc := range rerankedDocs {
		fmt.Printf("%d. %s (Score: %.3f)\n", i+1, doc.Path, doc.Score)
	}

	totalTime = time.Since(startTime)
	avgTimePerDoc = totalTime / time.Duration(len(rerankedDocs))

	fmt.Printf("\nProcessed %d documents in %.3fs (avg: %.3fs per document)\n",
		len(rerankedDocs), totalTime.Seconds(), avgTimePerDoc.Seconds())

Then add a couple of extra functions:

func rerankDocuments(validDocs []Document, query, rerankingModel, ollamaBaseURL string) ([]Document, error) {
	// Since standard Ollama doesn't have a direct rerank API, we'll implement
	// reranking by generating embeddings for query-document pairs and scoring them

	fmt.Println("Implementing reranking using cross-encoder approach with", rerankingModel)

	rerankedDocs := make([]Document, len(validDocs))
	copy(rerankedDocs, validDocs)

	for i, doc := range validDocs {
		// Create a prompt for reranking by combining query and document
		rerankPrompt := fmt.Sprintf("Query: %s\n\nDocument: %s\n\nRelevance:", query, doc.Content)

		// Get embedding for the combined prompt
		embedding, err := getEmbedding(rerankPrompt, rerankingModel, ollamaBaseURL)
		if err != nil {
			fmt.Printf("Warning: Failed to get rerank embedding for document %d: %v\n", i, err)
			// Fallback to a neutral score
			rerankedDocs[i].Score = 0.5
			continue
		}

		// Use the magnitude of the embedding as a relevance score
		// (This is a simplified approach - in practice, you'd use a trained reranker)
		score := calculateRelevanceScore(embedding)
		rerankedDocs[i].Score = score
		// fmt.Printf("Document %d reranked with score: %.4f\n", i, score)
	}

	// Sort documents by reranking score (descending)
	sort.Slice(rerankedDocs, func(i, j int) bool {
		return rerankedDocs[i].Score > rerankedDocs[j].Score
	})

	return rerankedDocs, nil
}

func calculateRelevanceScore(embedding []float64) float64 {
	// Simple scoring based on embedding magnitude and positive values
	var sumPositive, sumTotal float64
	for _, val := range embedding {
		sumTotal += val * val
		if val > 0 {
			sumPositive += val
		}
	}

	if sumTotal == 0 {
		return 0
	}

	// Normalize and combine magnitude with positive bias
	magnitude := math.Sqrt(sumTotal) / float64(len(embedding))
	positiveRatio := sumPositive / float64(len(embedding))

	return (magnitude + positiveRatio) / 2
}
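
To make the heuristic concrete, here is a small worked example. It copies calculateRelevanceScore from the listing above and runs it on a toy two-dimensional vector; the vector is made up for illustration and is not a real embedding.

```go
package main

import (
	"fmt"
	"math"
)

// calculateRelevanceScore is copied unchanged (minus comments) from the listing above.
func calculateRelevanceScore(embedding []float64) float64 {
	var sumPositive, sumTotal float64
	for _, val := range embedding {
		sumTotal += val * val
		if val > 0 {
			sumPositive += val
		}
	}
	if sumTotal == 0 {
		return 0
	}
	magnitude := math.Sqrt(sumTotal) / float64(len(embedding))
	positiveRatio := sumPositive / float64(len(embedding))
	return (magnitude + positiveRatio) / 2
}

func main() {
	// Toy vector; real embeddings have far more dimensions.
	v := []float64{3, -4}
	// L2 norm = sqrt(9 + 16) = 5, so magnitude = 5 / 2 = 2.5
	// sum of positive values = 3, so positiveRatio = 3 / 2 = 1.5
	// score = (2.5 + 1.5) / 2 = 2
	fmt.Println(calculateRelevanceScore(v)) // prints 2
}
```

Note that the score depends only on the shape of one embedding vector, not on a learned relevance output. That may be why the reranker scores in the test run sit in such a narrow band (0.306 to 0.343).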

Don't forget to import the math package:

import (
	"math"
)
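
rerankDocuments relies on getEmbedding, which comes from the previous post and is not reproduced here. For readers without that code at hand, below is a minimal sketch of what such a helper could look like. It assumes Ollama's `/api/embeddings` endpoint with `model` and `prompt` request fields and an `embedding` response field; the type names and the parseEmbedding helper are hypothetical, and the original implementation may differ. It is written as a standalone program, so in the rnk project only the helper itself would be relevant.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Request and response shapes assumed for Ollama's /api/embeddings endpoint.
type embeddingRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
}

type embeddingResponse struct {
	Embedding []float64 `json:"embedding"`
}

// parseEmbedding decodes the response body and rejects empty vectors.
func parseEmbedding(body []byte) ([]float64, error) {
	var resp embeddingResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode embedding response: %w", err)
	}
	if len(resp.Embedding) == 0 {
		return nil, fmt.Errorf("empty embedding in response")
	}
	return resp.Embedding, nil
}

// getEmbedding posts text to Ollama and returns the embedding vector.
// Sketch only: same signature as the call in rerankDocuments.
func getEmbedding(text, model, ollamaBaseURL string) ([]float64, error) {
	payload, err := json.Marshal(embeddingRequest{Model: model, Prompt: text})
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(ollamaBaseURL+"/api/embeddings", "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("ollama returned status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parseEmbedding(body)
}

func main() {
	// Offline check of the parsing step; no Ollama server is needed for this.
	vec, err := parseEmbedding([]byte(`{"embedding":[0.25,-0.5,1]}`))
	fmt.Println(vec, err) // prints [0.25 -0.5 1] <nil>
}
```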

Now let's compile it

go build -o rnk

and run this simple RAG reranker tech prototype

./rnk ./example_query.txt ./example_docs

