---
layout: tap
site_name: arxiv
tap_name: search
description: "Search arXiv papers"
intent: read
columns: []
args:
  - name: keyword
    type: string
args_json: |
  {"keyword":{"type":"string"}}
health_json: |
  {"min_rows":3,"non_empty":["title"]}
example_args: "--keyword 'large language model'"
source_url: https://github.com/LeonTing1010/tap-skills/blob/main/community/arxiv/search.plan.json
license: MIT
---

What it does

Searches arXiv for papers matching a keyword.

Install Taprun once

Taprun ships as a single MCP server exposing a catalog of compiled taps. One-time setup on macOS / Linux:

brew install LeonTing1010/tap/taprun
tap mcp connect

Or add this to your claude_desktop_config.json (the same server entry works in any MCP host, such as Claude Code, Cursor, Cline, or Windsurf):

{
  "mcpServers": {
    "tap": {
      "command": "tap",
      "args": ["mcp", "start"]
    }
  }
}
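
If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge, assuming the usual top-level "mcpServers" object shown above; the helper name add_tap_server is hypothetical and not part of Taprun:

```python
import json
from copy import deepcopy

# Server entry copied from the JSON snippet above.
TAP_SERVER = {"command": "tap", "args": ["mcp", "start"]}


def add_tap_server(config: dict) -> dict:
    """Return a copy of `config` with the tap server registered.

    Hypothetical helper: existing entries under "mcpServers" are kept,
    and the input dictionary is not modified.
    """
    merged = deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["tap"] = deepcopy(TAP_SERVER)
    return merged


if __name__ == "__main__":
    # Example config that already declares another (made-up) server.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    print(json.dumps(add_tap_server(existing), indent=2))
```

Load the real file with json.load, pass the result through the helper, and write it back with json.dump; the location of the config file depends on the host and is not covered here.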

Call arxiv/search

Terminal, once installed:

tap run arxiv/search --keyword 'large language model'
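
To call the tap from a script, the same command line can be assembled programmatically. The sketch below only builds the argument list; it assumes every tap argument maps to a --name value flag, which this page shows only for keyword, and build_tap_command is a hypothetical helper name:

```python
import shlex


def build_tap_command(site: str, name: str, args: dict) -> list:
    """Build the argv list for `tap run <site>/<name> --key value ...`.

    Assumption: each argument is passed as a `--key value` pair, as in
    the `--keyword` example above.
    """
    argv = ["tap", "run", f"{site}/{name}"]
    for key, value in args.items():
        argv += [f"--{key}", str(value)]
    return argv


if __name__ == "__main__":
    argv = build_tap_command("arxiv", "search", {"keyword": "large language model"})
    # Prints: tap run arxiv/search --keyword 'large language model'
    print(shlex.join(argv))
```

Passing the list to subprocess.run avoids shell quoting problems with multi-word keywords. The output format of tap run is not described on this page, so the sketch makes no attempt to parse it.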

From the MCP host, which runs the exact same compiled plan with deterministic replay and zero LLM tokens:

tap.run({ site: "arxiv", name: "search", args: {"keyword":"large language model"} })

Why compile it once

This plan was forged once: the AI read arXiv, picked stable structural addresses (JSON-LD, ARIA, RSS, or declared API endpoints, in that priority order), and saved them to a .plan.json. Every replay since then has used zero LLM tokens. When arXiv ships a site change that breaks the extraction, tap verify surfaces it before your data goes stale, not after your pipeline has silently written garbage for a week.
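
The page metadata declares a health contract for this tap: {"min_rows":3,"non_empty":["title"]}. As a rough illustration of the kind of check that could catch a broken extraction, here is a sketch that evaluates such a contract against a list of result rows. It assumes min_rows is a minimum row count and non_empty lists fields that must be filled in every row; this is an interpretation of the metadata, not Taprun's actual tap verify implementation, and check_health is a hypothetical name:

```python
def check_health(rows: list, health: dict) -> list:
    """Return a list of human-readable problems; empty means healthy.

    Assumed semantics (not confirmed by Taprun):
      - "min_rows": minimum number of rows the tap must return
      - "non_empty": fields that must be non-blank in every row
    """
    problems = []

    min_rows = health.get("min_rows", 0)
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} rows, got {len(rows)}")

    for field in health.get("non_empty", []):
        # Treat missing, None, and whitespace-only values as empty.
        empty = [
            i for i, row in enumerate(rows)
            if not str(row.get(field) or "").strip()
        ]
        if empty:
            problems.append(f"field {field!r} is empty in rows {empty}")

    return problems


if __name__ == "__main__":
    health = {"min_rows": 3, "non_empty": ["title"]}
    # Made-up rows for illustration: too few, and one has a blank title.
    rows = [{"title": "Example paper"}, {"title": ""}]
    for problem in check_health(rows, health):
        print(problem)
```

A check like this fails loudly when a site change empties the title field or shrinks the result set, which is the failure mode described above.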