Keeping up with AI research is exhausting. New papers drop daily. Most "paper discovery" tools either require an account, burn API tokens on every search, or give you a bloated interface when all you wanted was a list.
Here's what I use instead:
$ npx -y @taprun/cli arxiv search --keyword "LLM" \
| npx -y @taprun/cli sort --field published \
| npx -y @taprun/cli table
Output: 20 papers sorted newest-first, with title, authors, published date, abstract, and URL — in under 2 seconds.
No account. No API key. No AI tokens consumed. First run downloads a ~30MB binary and caches it; every subsequent run is instant.
Most developer tools for arXiv are wrappers: they call the same free arXiv API, add a web UI, and charge a subscription. The arXiv Atom/API endpoint is public and has been stable for 15 years. You don't need a middleman.
arxiv/search is a Tap skill — a 20-line deterministic program that calls the arXiv API directly:
// The entire skill: no framework, no dependencies
async tap(handle, args) {
  // Encode the keyword so spaces and symbols are safe in the query string
  const keyword = encodeURIComponent(args.keyword);
  const data = await handle.fetch(
    `https://export.arxiv.org/api/query?search_query=all:${keyword}&max_results=20`
  );
  // parse Atom XML → structured rows (yields `papers`; parsing omitted here)
  return papers.map(p => ({
    title: p.title,
    authors: p.authors,
    published: p.published.slice(0, 10),
    abstract: p.abstract.slice(0, 120) + "...",
    url: p.url
  }));
}
AI wrote this once. It runs forever at $0.
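The parsing step is elided in the snippet above. As a rough illustration only, and not the skill's actual parser, here is one dependency-free way an arXiv Atom response could be turned into rows. The names `firstTag` and `parseAtomEntries` are hypothetical, the regex approach is an assumption (a real skill may well use a proper XML parser), and joining authors into one string is a guess at the row shape.

```javascript
// Hypothetical helpers, not part of Tap. Regex extraction is fragile on
// arbitrary XML, but works on the flat entry layout of an arXiv Atom feed.
function firstTag(xml, tag) {
  const m = xml.match(new RegExp(`<${tag}[^>]*>([\\s\\S]*?)</${tag}>`));
  // Collapse line breaks and runs of spaces inside the tag's text
  return m ? m[1].replace(/\s+/g, " ").trim() : "";
}

function parseAtomEntries(xml) {
  const entries = xml.match(/<entry>[\s\S]*?<\/entry>/g) || [];
  return entries.map((entry) => {
    const authors = [...entry.matchAll(/<name>([\s\S]*?)<\/name>/g)]
      .map((m) => m[1].trim());
    return {
      title: firstTag(entry, "title"),
      authors: authors.join(", "), // assumption: authors as one string
      published: firstTag(entry, "published").slice(0, 10),
      abstract: firstTag(entry, "summary").slice(0, 120) + "...",
      url: firstTag(entry, "id"),
    };
  });
}

// Tiny hand-written sample in the shape of an arXiv Atom entry.
const sample = `<feed>
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2025-01-15T12:00:00Z</published>
    <title>An Example
      Title</title>
    <summary>Short abstract.</summary>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>`;

console.log(parseAtomEntries(sample));
```

In the Atom feed the abstract lives in `summary` and the paper URL in `id`, which is why the sketch maps those two tags onto `abstract` and `url`.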
The real power isn't the search — it's that every Tap skill is a composable Unix filter. Data flows left to right as JSON:
# Search
$ npx -y @taprun/cli arxiv search --keyword "RAG"

# Search → sort by date
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
| npx -y @taprun/cli sort --field published

# Search → sort → keep only recent ones
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
| npx -y @taprun/cli sort --field published \
| npx -y @taprun/cli filter --field published --gt "2025-01-01"

# Search → sort → filter → display as table
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
| npx -y @taprun/cli sort --field published \
| npx -y @taprun/cli filter --field published --gt "2025-01-01" \
| npx -y @taprun/cli table
Each command is stateless. Each reads JSON from stdin, writes JSON to stdout — except table, which renders for humans. Exactly how Unix tools work.
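To make the filter contract concrete, here is a minimal sketch of what a stateless sort step could look like in Node.js. It is not Tap's implementation: it assumes rows travel as a single JSON array (the article only says "JSON"), the names `sortRows` and `runFilter` and the file name `sort.js` are made up for illustration, and the descending order is chosen to match the newest-first output described earlier.

```javascript
// Pure step: returns a new array sorted by `field`, largest value first.
function sortRows(rows, field) {
  return [...rows].sort((a, b) => {
    if (a[field] === b[field]) return 0;
    return a[field] < b[field] ? 1 : -1;
  });
}

// The filter contract: JSON text in, JSON text out, no state kept between calls.
function runFilter(inputText, transform) {
  const rows = JSON.parse(inputText);
  return JSON.stringify(transform(rows));
}

// Wire up stdin/stdout only when a field name is passed on the command line,
// e.g.  cat papers.json | node sort.js published
const field = process.argv[2];
if (field) {
  let input = "";
  process.stdin.setEncoding("utf8");
  process.stdin.on("data", (chunk) => (input += chunk));
  process.stdin.on("end", () => {
    const output = runFilter(input, (rows) => sortRows(rows, field));
    process.stdout.write(output + "\n");
  });
}
```

Because ISO dates such as `2025-01-15` sort correctly as plain strings, the same comparison works for the `published` field without any date parsing.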
Because every skill outputs the same JSON format, the same filter, sort, and limit commands work on data from any source:
# Top 5 trending Python repos on GitHub, by stars
$ npx -y @taprun/cli github trending \
| npx -y @taprun/cli filter --field language --eq "Python" \
| npx -y @taprun/cli sort --field stars \
| npx -y @taprun/cli limit --n 5 \
| npx -y @taprun/cli table
No glue code. No schema mapping. The pipeline handles it.
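One way to see why no schema mapping is needed: if each step is written against generic rows plus a field name, it does not care which skill produced the rows. The sketch below mimics filter, sort, and limit as plain functions. The function names and sample rows are invented for illustration and say nothing about how Tap implements these commands.

```javascript
// Generic steps: each is configured with options and maps rows to new rows.
const filterEq = (field, value) => (rows) => rows.filter((r) => r[field] === value);
const filterGt = (field, value) => (rows) => rows.filter((r) => r[field] > value);
const sortDesc = (field) => (rows) =>
  [...rows].sort((a, b) => (a[field] < b[field] ? 1 : a[field] > b[field] ? -1 : 0));
const limit = (n) => (rows) => rows.slice(0, n);

// Left-to-right composition, like a shell pipe.
const pipe = (...steps) => (rows) => steps.reduce((acc, step) => step(acc), rows);

// Made-up rows in two different shapes.
const repos = [
  { name: "alpha", language: "Python", stars: 120 },
  { name: "beta", language: "Go", stars: 900 },
  { name: "gamma", language: "Python", stars: 450 },
];
const papers = [
  { title: "Old paper", published: "2024-06-01" },
  { title: "New paper", published: "2025-03-10" },
];

// The same building blocks handle both row shapes.
const topPython = pipe(filterEq("language", "Python"), sortDesc("stars"), limit(1))(repos);
const recent = pipe(sortDesc("published"), filterGt("published", "2025-01-01"))(papers);

console.log(topPython); // the single "gamma" row
console.log(recent);    // the single "New paper" row
```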
Because it's a command, it runs anywhere:
# GitHub Actions: daily paper digest to Slack
- name: Paper digest
  run: |
    npx -y @taprun/cli arxiv search --keyword "LLM agents" \
    | npx -y @taprun/cli sort --field published \
    | npx -y @taprun/cli limit --n 5 \
    > papers.json
    # post papers.json to Slack webhook
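The binary is cached by npm/npx after the first run, so CI is fast too, as long as that cache survives between jobs. GitHub-hosted runners start clean, so add a cache step if startup time matters.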
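The last step in the workflow is left as a comment. One way to fill it in, sketched here as an assumption rather than the author's setup, is a small Node script that turns `papers.json` into a Slack incoming-webhook message. The `SLACK_WEBHOOK_URL` variable name and the message layout are hypothetical. The script relies on the global `fetch` available in Node 18 and later, assumes `papers.json` holds a JSON array, and uses the row fields `title`, `published`, and `url` shown in the skill.

```javascript
const fs = require("fs");

// Build the JSON body for a Slack incoming webhook: one line per paper.
function buildSlackPayload(papers) {
  // <url|text> is Slack's link syntax in message text
  const lines = papers.map((p) => `• ${p.published} <${p.url}|${p.title}>`);
  return { text: ["Daily paper digest", ...lines].join("\n") };
}

async function postDigest(webhookUrl, file) {
  const papers = JSON.parse(fs.readFileSync(file, "utf8"));
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSlackPayload(papers)),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}

// Only post when a webhook URL is configured, e.g. from a CI secret.
if (process.env.SLACK_WEBHOOK_URL) {
  postDigest(process.env.SLACK_WEBHOOK_URL, "papers.json").catch((err) => {
    console.error(err);
    process.exit(1);
  });
}
```

Keeping the payload builder separate from the network call makes the message format easy to check locally before wiring the secret into CI.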
arXiv is one of 200+ community skills that follow the same pattern: call an API, return structured rows, compose with any other skill.
| Skill | What it returns |
|---|---|
| arxiv/search --keyword X | Papers matching keyword |
| github/trending | Trending repos today |
| hackernews/hot | HN front page |
| reddit/search --keyword X | Posts matching keyword |
| stackoverflow/hot | Hot questions |
Each one is a 20–40 line .tap.js file. Browse and contribute on GitHub.
$ npx -y @taprun/cli arxiv search --keyword "your topic" \
| npx -y @taprun/cli sort --field published \
| npx -y @taprun/cli table
No install required. Works on any machine with Node.js. → taprun.dev