量子随机采样器 是 AI Skill Hub 本期精选AI工具之一。综合评分 7.2 分,整体质量较高。我们推荐使用将其纳入你的 AI 工具库,帮助提升工作效率。
将量子随机数生成器集成到大模型令牌采样中的工具。支持接入任意熵源,通过gRPC提供分布式推理服务。适合需要高质量随机性的LLM应用开发者和研究人员。
量子随机采样器 是一款基于 Python 开发的开源工具,专注于 量子随机数、LLM采样、熵源集成 等核心功能。作为 GitHub 开源项目,它拥有活跃的社区支持和持续的版本迭代,代码完全透明可审计,支持本地部署以保护数据隐私。无论是个人使用还是集成到企业工作流,都能提供稳定可靠的解决方案。
将量子随机数生成器集成到大模型令牌采样中的工具。支持接入任意熵源,通过gRPC提供分布式推理服务。适合需要高质量随机性的LLM应用开发者和研究人员。
量子随机采样器 是一款基于 Python 开发的开源工具,专注于 量子随机数、LLM采样、熵源集成 等核心功能。作为 GitHub 开源项目,它拥有活跃的社区支持和持续的版本迭代,代码完全透明可审计,支持本地部署以保护数据隐私。无论是个人使用还是集成到企业工作流,都能提供稳定可靠的解决方案。
# 方式一:pip 安装(推荐)
pip install qr-sampler
# 方式二:虚拟环境安装(推荐生产环境)
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install qr-sampler
# 方式三:从源码安装(获取最新功能)
git clone https://github.com/Entropic-Science/qr-sampler
cd qr-sampler
pip install -e .
# 验证安装
python -c "import qr_sampler; print('安装成功')"
# 命令行使用
qr-sampler --help
# 基本用法
qr-sampler input_file -o output_file
# Python 代码中调用
import qr_sampler
# 示例
result = qr_sampler.process("input")
print(result)
# qr-sampler 配置文件示例(config.yml) app: name: "qr-sampler" debug: false log_level: "INFO" # 运行时指定配置文件 qr-sampler --config config.yml # 或通过环境变量配置 export QR_SAMPLER_API_KEY="your-key" export QR_SAMPLER_OUTPUT_DIR="./output"
```bash
qr-sampler build --engine vllm --entropy quantum_grpc --output ./deploy
```bash pip install qr-sampler
Generate Docker Compose deployment files:
```bash
qr-sampler build --engine vllm --entropy quantum_grpc --output ./deploy
pip install qr-sampler python examples/servers/simple_urandom_server.py --address 0.0.0.0:50051
Plug any randomness source into LLM token sampling.
qr-sampler is an engine-agnostic framework that replaces standard pseudorandom token sampling with entropy from external sources — quantum random number generators (QRNGs), processor timing jitter, hardware noise, or any source you connect via gRPC or Python plugin. The core sampling pipeline has zero inference-engine dependencies; thin engine adapters integrate it with vLLM, vLLM-Metal, or any engine that supports logits processing.
pip install qr-sampler
---
Standard LLM inference uses pseudorandom number generators (PRNGs) for token sampling. PRNGs are deterministic — given the same seed, they produce the same output every time. qr-sampler replaces this with true randomness from physical processes:
EntropySource ABC or connect any hardware via the gRPC protocolos.urandom() as a fallback or baselinevllm serve Qwen/Qwen2.5-1.5B-Instruct --dtype half --max-model-len 8192 --gpu-memory-utilization 0.80
Configure the entropy source via environment variables:
bash export QR_ENTROPY_SOURCE_TYPE=quantum_grpc export QR_GRPC_SERVER_ADDRESS=localhost:50051 vllm serve Qwen/Qwen2.5-1.5B-Instruct --dtype half --max-model-len 8192 --gpu-memory-utilization 0.80 ```
Browse available components:
qr-sampler list engines # Engine profiles (vllm, vllm_metal)
qr-sampler list models --engine vllm # Known-working models for an engine
qr-sampler list entropy-sources # All entropy source profiles
qr-sampler list amplifiers # Signal amplification algorithms
qr-sampler list samplers # Temperature strategies
qr-sampler list presets # Preset bundles (creative_sampling, normal_t1)
Detailed information about a specific component:
qr-sampler info engine vllm
qr-sampler info entropy quantum_grpc
qr-sampler info amplifier zscore_mean
qr-sampler info sampler edt
qr-sampler info preset creative_sampling
Check stack compatibility before deploying:
```bash
export QR_ENTROPY_SOURCE_TYPE=quantum_grpc export QR_GRPC_SERVER_ADDRESS=localhost:50051
Three transport modes:
| Mode | `QR_GRPC_MODE` | Latency | Best for |
|---|---|---|---|
| **Unary** | `unary` | ~1-2ms | Simplicity, debugging |
| **Server streaming** | `server_streaming` | ~0.5-1ms | Middle ground |
| **Bidirectional** | `bidi_streaming` | ~50-100us (same machine) | Production, lowest latency |
For co-located hardware, use Unix domain sockets:
bash python my_qrng_server.py --address unix:///var/run/qrng.sock export QR_GRPC_SERVER_ADDRESS=unix:///var/run/qrng.sock export QR_GRPC_MODE=bidi_streaming ```
The gRPC client is protocol-agnostic. It uses configurable method paths and generic protobuf wire-format encoding. The only requirement is that your proto puts the byte count as field 1 in the request and the random bytes as field 1 in the response. Configure custom protos via QR_GRPC_METHOD_PATH and QR_GRPC_STREAM_METHOD_PATH.
The gRPC client includes an adaptive circuit breaker:
max(5ms, P99 * 1.5) (configurable via QR_CB_* env vars)QR_FALLBACK_MODE when the circuit is openA pre-built filter function injects qr-sampler per-request parameters into every chat message via the Open WebUI Valves system. This lets you adjust temperature, top-k, top-p, sample count, and other sampling parameters from the admin panel.
To set it up:
examples/open-webui/qr_sampler_filter.json.See examples/open-webui/README.md for the full guide.
Open WebUI is entirely optional. qr-sampler works the same way with direct API calls, curl, Python clients, or any OpenAI-compatible tool.
---
The CLI checks compatibility of engines, models, entropy sources, and amplifiers before you deploy:
```bash pip install qr-sampler[cli]
docker compose up --build ```
qr-sampler validate --config stack.yaml ```
Exit codes: 0 = all known-working, 1 = untested combinations (warnings), 2 = incompatible or missing.
qr-sampler build --config stack.yaml --output ./deploy ```
---
QR_PRESET=creative_sampling python -m vllm serve Qwen/Qwen2.5-1.5B-Instruct
All configuration is done via environment variables with the QR_ prefix. Per-request overrides use the qr_ prefix in extra_args.
qr-sampler is designed to connect any randomness source to LLM token sampling. There are two approaches.
cd deploy
docker compose up --build
Or configure directly via environment variables (bare-metal):
bash export QR_ENTROPY_SOURCE_TYPE=quantum_grpc export QR_GRPC_SERVER_ADDRESS=localhost:50051 vllm serve Qwen/Qwen2.5-1.5B-Instruct --dtype half --max-model-len 8192 --gpu-memory-utilization 0.80
The template handles all gRPC boilerplate (unary + bidirectional streaming, health checks, graceful shutdown). You only write the hardware-specific code.
#### The gRPC protocol
protobuf service EntropyService { rpc GetEntropy (EntropyRequest) returns (EntropyResponse); rpc StreamEntropy (stream EntropyRequest) returns (stream EntropyResponse); }
message EntropyRequest { int32 bytes_needed = 1; int64 sequence_id = 2; }
message EntropyResponse { bytes data = 1; int64 sequence_id = 2; int64 generation_timestamp_ns = 3; string device_id = 4; }
Any language that supports gRPC can implement this server — Python, C++, Rust, Go, etc.
#### Just-in-time constraint
The entropy must be generated **after** the client sends the request, not from a pre-generated pool:
- No buffering or caching of previously generated bytes
- The physical measurement happens during the `generate()` call
- `generation_timestamp_ns` in the response proves freshness
This is critical for consciousness-research applications where the timing relationship between logit computation and entropy generation matters.
#### Deployment options
**systemd (Linux):**
bash sudo cp examples/systemd/qr-entropy-server.service /etc/systemd/system/ sudo cp examples/systemd/qr-entropy-server.env /etc/default/qr-entropy-server sudo systemctl enable --now qr-entropy-server
**Unix domain sockets** (lowest latency for co-located hardware):
bash python my_qrng_server.py --address unix:///var/run/qrng.sock export QR_GRPC_SERVER_ADDRESS=unix:///var/run/qrng.sock ```
The CLI requires the [cli] extra: pip install qr-sampler[cli]
Engine adapter calls pipeline.sample_token(logits_1d)
│
├─ Temperature strategy ─────── Compute per-token temperature
│ (fixed or entropy-dependent) from the logit distribution
│
├─ Entropy source ───────────── Fetch fresh random bytes
│ (gRPC / system / timing / just-in-time, after logits exist
│ openentropy / custom)
│
├─ Signal amplification ─────── Convert 20,480 bytes → one float u ∈ (0,1)
│ (z-score or ECDF) via statistical aggregation
│
├─ Token selector ───────────── top-k → softmax → top-p → CDF → select
│ (CDF binary search with u) token from probability distribution
│
└─ Force one-hot logits ─────── Set selected token to 0.0, all others to -inf
(engine picks exactly (returned as numpy; adapter converts
this token) to engine-native tensor)
The core pipeline is importable and functional without vLLM, torch, or any engine package. Engine adapters convert between engine-native tensors and numpy, delegate to SamplingPipeline.sample_token(), and write the result back.
---
[project.entry-points."qr_sampler.entropy_sources"] lava_lamp = "my_package:LavaLampEntropySource" ```
The source is auto-discovered when qr-sampler starts. See Setting up your own entropy source below.
---
For entropy sources that don't need a separate server, implement the EntropySource ABC directly:
from qr_sampler.entropy.base import EntropySource
from qr_sampler.entropy.registry import register_entropy_source
@register_entropy_source("my_source")
class MyEntropySource(EntropySource):
@property
def name(self) -> str:
return "my_source"
@property
def is_available(self) -> bool:
return True
def get_random_bytes(self, n: int) -> bytes:
return my_hardware.read(n)
def close(self) -> None:
my_hardware.disconnect()
Register via entry points in your package's pyproject.toml:
[project.entry-points."qr_sampler.entropy_sources"]
my_source = "my_package.entropy:MyEntropySource"
Then set QR_ENTROPY_SOURCE_TYPE=my_source.
qr-sampler uses a registry + entry-points pattern for extensibility:
qr_sampler.entropy_sources Third-party entropy sources
qr_sampler.engine_adapters Third-party engine adapters
vllm.logits_processors vLLM plugin registration
Each subsystem (entropy, amplification, temperature, engines) has its own registry with decorator-based registration for built-in implementations and entry-point discovery for third-party extensions. The pipeline never instantiates strategy classes directly — it always goes through the registry.
New engine adapter: Subclass EngineAdapter, implement get_pipeline(). Register with @EngineAdapterRegistry.register("name"). Add entry point under qr_sampler.engine_adapters.
New entropy source: Subclass EntropySource, implement name, is_available, get_random_bytes(), close(). Register with @register_entropy_source("name").
New signal amplifier: Subclass SignalAmplifier, implement amplify(raw_bytes) -> AmplificationResult. Register with @AmplifierRegistry.register("name").
New temperature strategy: Subclass TemperatureStrategy, implement compute_temperature(logits, config) -> TemperatureResult. Always compute Shannon entropy. Register with @TemperatureStrategyRegistry.register("name").
See CONTRIBUTING.md for detailed development instructions.
---
创新工具,将量子计算与LLM结合。Stars较低说明关注度有限,gRPC架构设计合理,适合专业应用场景。
AI Skill Hub 为第三方内容聚合平台,本页面信息基于公开数据整理,不对工具功能和质量作任何法律背书。
建议在沙箱或测试环境中充分验证后,再部署至生产环境,并做好必要的安全评估。
✅ Apache 2.0 — 宽松开源协议,可商用,需保留版权声明和 NOTICE 文件,含专利授权条款。
经综合评估,量子随机采样器 在AI工具赛道中表现稳健,质量良好。如果你已有明确的使用需求,可以直接上手体验;如果还在评估阶段,建议对比同类工具后再做决策。
| 原始名称 | qr-sampler |
| 原始描述 | 开源AI工具:Integrate any source of randomness into LLM token sampling. Easily create new pr。⭐30 · Python |
| Topics | 量子随机数LLM采样熵源集成gRPC分布式令牌采样 |
| GitHub | https://github.com/Entropic-Science/qr-sampler |
| License | Apache-2.0 |
| 语言 | Python |
收录时间:2026-05-23 · 更新时间:2026-05-30 · License:Apache-2.0 · AI Skill Hub 不对第三方内容的准确性作法律背书。