Single LoRA adapter for improved semantic cache accuracy across medical, law, programming, and psychology domains
Multi-domain LoRA is a single 596KB adapter trained on semantic cache triplets from medical, law, programming, and psychology domains. It improves cache accuracy by helping the base embedding model better distinguish between semantically similar but different queries.
Comparison of baseline (no LoRA), domain-specific LoRAs, and multi-domain LoRA across test sets.
| Domain | Test Triplets | Baseline Margin | Domain-Specific | Multi-Domain | Winner |
|---|---|---|---|---|---|
| Medical | 200 | 0.4416 | 0.6305 (+42.8%) | 0.5517 (+24.9%) | Domain-Specific |
| Law | 20,862 | 0.4940 | 0.6219 (+25.9%) | 0.6290 (+27.3%) | Multi-Domain ✓ |
| Programming | 20,862 | 0.2358 | 0.2367 (+0.4%) | 0.2651 (+12.4%) | Multi-Domain ✓ |
| Psychology | N/A | - | - | - | No test set |
The multi-domain LoRA improves the margin by at least 12.4% over baseline on all three tested domains with a single adapter. The domain-specific LoRA is stronger on medical (+42.8% vs. +24.9%), but the multi-domain adapter wins on law (+27.3% vs. +25.9%) and programming (+12.4% vs. +0.4%), which makes it the recommended choice for production due to its simplicity and consistency.
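The percentages in the table are consistent with relative gains over the baseline margin. As a quick check, the sketch below recomputes them from the margins listed in the table; the helper name `relative_gain` is introduced here for illustration only.

```python
def relative_gain(baseline: float, adapted: float) -> float:
    """Percentage change of the adapted margin relative to the baseline margin."""
    return (adapted - baseline) / baseline * 100.0


# (baseline, domain-specific, multi-domain) margins copied from the table
margins = {
    "Medical": (0.4416, 0.6305, 0.5517),
    "Law": (0.4940, 0.6219, 0.6290),
    "Programming": (0.2358, 0.2367, 0.2651),
}

# Round to one decimal place, matching the precision used in the table
gains = {
    domain: (
        round(relative_gain(base, specific), 1),
        round(relative_gain(base, multi), 1),
    )
    for domain, (base, specific, multi) in margins.items()
}

for domain, (specific_gain, multi_gain) in gains.items():
    print(f"{domain}: domain-specific +{specific_gain}%, multi-domain +{multi_gain}%")
```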
Triplets are synthetically generated using Qwen/Qwen2.5-7B-Instruct. The 7B model provides high-quality paraphrases of each query and challenging hard negatives that are semantically related but distinct enough to require separate LLM responses.
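The source does not spell out how the margin in the comparison table is computed. A common definition for triplet evaluation, assumed here, is the cosine similarity between the query and its paraphrase minus the cosine similarity between the query and its hard negative, averaged over the test set. The sketch below uses toy 2-D vectors in place of real embeddings; `cosine`, `triplet_margin`, and `mean_margin` are illustrative names, not part of any released code.

```python
import numpy as np


def cosine(u, v) -> float:
    """Cosine similarity between two vectors."""
    u = np.asarray(u, dtype=np.float64)
    v = np.asarray(v, dtype=np.float64)
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def triplet_margin(anchor, positive, negative) -> float:
    """ASSUMED definition: cos(anchor, positive) - cos(anchor, negative)."""
    return cosine(anchor, positive) - cosine(anchor, negative)


def mean_margin(triplets) -> float:
    """Average margin over an iterable of (anchor, positive, negative) embeddings."""
    return float(np.mean([triplet_margin(a, p, n) for a, p, n in triplets]))


# Toy embeddings: the paraphrase sits closer to the query than the hard negative
anchor = [1.0, 0.0]
positive = [0.8, 0.6]   # cosine with anchor is 0.8
negative = [0.6, 0.8]   # cosine with anchor is 0.6

margin = triplet_margin(anchor, positive, negative)
print(f"margin = {margin:.2f}")
```

Under this definition, a larger margin means paraphrases and hard negatives are easier to separate with a single similarity threshold.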
```python
from sentence_transformers import SentenceTransformer
from peft import PeftModel

# Load base model
base_model = SentenceTransformer("sentence-transformers/all-MiniLM-L12-v2")

# Apply multi-domain LoRA to the underlying transformer module
base_model[0].auto_model = PeftModel.from_pretrained(
    base_model[0].auto_model,
    "llm-semantic-router/multi-domain-cache-lora-L12"
)

# Use for embeddings
embedding = base_model.encode("What are the symptoms of diabetes?")
```
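To show where these embeddings fit in a semantic cache, here is a minimal sketch of a cache lookup by cosine similarity. It is not the router's actual implementation: the `SemanticCache` class and the `0.8` threshold are hypothetical, and the class accepts any embedding function so that it can be exercised without downloading a model.

```python
from typing import Callable, List, Optional, Sequence, Tuple

import numpy as np


class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, embed: Callable[[str], Sequence[float]], threshold: float = 0.8):
        self.embed = embed            # e.g. base_model.encode
        self.threshold = threshold    # assumed value; tune on held-out triplets
        self.entries: List[Tuple[np.ndarray, str]] = []

    def _unit(self, text: str) -> np.ndarray:
        """Embed the text and normalize it to unit length."""
        vec = np.asarray(self.embed(text), dtype=np.float64)
        return vec / np.linalg.norm(vec)

    def put(self, query: str, response: str) -> None:
        """Store a response under the embedding of its query."""
        self.entries.append((self._unit(query), response))

    def get(self, query: str) -> Optional[str]:
        """Return the response of the most similar cached query, or None on a miss."""
        if not self.entries:
            return None
        q = self._unit(query)
        # Dot product of unit vectors equals cosine similarity
        sims = [float(np.dot(q, vec)) for vec, _ in self.entries]
        best = int(np.argmax(sims))
        return self.entries[best][1] if sims[best] >= self.threshold else None
```

With the model loaded as above, usage would look like `cache = SemanticCache(base_model.encode)`, followed by `cache.put(query, llm_response)` after each LLM call and `cache.get(new_query)` before the next one. A wider margin between paraphrases and hard negatives makes it easier to pick a threshold that serves paraphrases from the cache without returning stale answers for related but different queries.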
Use the multi-domain LoRA for production deployments. It performs consistently across all tested domains with a single 596KB adapter, requires no domain detection logic, and is easier to maintain than multiple domain-specific adapters.