
Search arXiv in One Command — No API Key, No Tokens

April 7, 2026 · Leon Ting · 4 min read

Keeping up with AI research is exhausting. New papers drop daily. Most "paper discovery" tools either require an account, burn API tokens on every search, or give you a bloated interface when all you wanted was a list.

Here's what I use instead:

$ npx -y @taprun/cli arxiv search --keyword "LLM" \
    | npx -y @taprun/cli sort --field published \
    | npx -y @taprun/cli table

Output: 20 papers sorted newest-first, with title, authors, published date, abstract, and URL — in under 2 seconds.

No account. No API key. No AI tokens consumed. First run downloads a ~30MB binary and caches it; every subsequent run is instant.

Why This Matters

Most developer tools for arXiv are wrappers: they call the same free arXiv API, add a web UI, and charge a subscription. The arXiv API, which returns Atom XML, is public and has been stable for 15 years. You don't need a middleman.

arxiv/search is a Tap skill — a 20-line deterministic program that calls the arXiv API directly:

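// The skill in outline: no framework, no dependencies
async tap(handle, args) {
  // Encode the keyword so spaces and special characters survive the URL
  const query = encodeURIComponent(`all:${args.keyword}`);
  const data = await handle.fetch(
    `https://export.arxiv.org/api/query?search_query=${query}&max_results=20`
  );
  // parse Atom XML → structured rows
  // (parseAtom is a stand-in name for the parsing step elided here)
  const papers = parseAtom(data);
  return papers.map(p => ({
    title: p.title,
    authors: p.authors,
    published: p.published.slice(0, 10),           // keep YYYY-MM-DD only
    abstract: p.abstract.slice(0, 120) + "...",    // short preview
    url: p.url
  }));
}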

AI wrote this once. It runs forever at $0.
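
The "parse Atom XML → structured rows" step is glossed over above, so here is a rough sketch of what it could look like with no dependencies. The function names (buildArxivUrl, parseAtom, toRows) and the regex-based parsing are my assumptions for illustration, not the actual skill source. A real XML parser would be sturdier, and this sketch does not decode XML entities such as &amp;.

```javascript
// Sketch only: names and regex parsing are assumptions, not Tap's real code.

// Build the query URL, percent-encoding the keyword.
function buildArxivUrl(keyword, maxResults = 20) {
  const query = encodeURIComponent(`all:${keyword}`);
  return `https://export.arxiv.org/api/query?search_query=${query}&max_results=${maxResults}`;
}

// Text of the first <tag>...</tag> in an XML fragment, whitespace-collapsed, or "".
function firstTag(xml, tag) {
  const m = xml.match(new RegExp(`<${tag}(?:\\s[^>]*)?>([\\s\\S]*?)</${tag}>`));
  return m ? m[1].replace(/\s+/g, " ").trim() : "";
}

// Split the Atom feed into <entry> blocks and pull out the raw fields.
function parseAtom(xml) {
  const entries = xml.match(/<entry>[\s\S]*?<\/entry>/g) || [];
  return entries.map((entry) => ({
    title: firstTag(entry, "title"),
    authors: [...entry.matchAll(/<name>([\s\S]*?)<\/name>/g)]
      .map((m) => m[1].trim())
      .join(", "),
    published: firstTag(entry, "published"),
    abstract: firstTag(entry, "summary"),
    url: firstTag(entry, "id"),
  }));
}

// Shape raw entries into display rows: date only, abstract preview.
function toRows(papers) {
  return papers.map((p) => ({
    title: p.title,
    authors: p.authors,
    published: p.published.slice(0, 10),
    abstract: p.abstract.length > 120 ? p.abstract.slice(0, 120) + "..." : p.abstract,
    url: p.url,
  }));
}
```

With Node 18+ you could wire these together as `toRows(parseAtom(await (await fetch(buildArxivUrl("LLM"))).text()))`. Unlike the outline above, this version only appends "..." when the abstract was actually cut.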

The Unix Pipeline Model

The real power isn't the search — it's that every Tap skill is a composable Unix filter. Data flows left to right as JSON:

# Search
$ npx -y @taprun/cli arxiv search --keyword "RAG"

# Search → sort by date
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
    | npx -y @taprun/cli sort --field published

# Search → sort → keep only recent ones
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
    | npx -y @taprun/cli sort --field published \
    | npx -y @taprun/cli filter --field published --gt "2025-01-01"

# Search → sort → filter → display as table
$ npx -y @taprun/cli arxiv search --keyword "RAG" \
    | npx -y @taprun/cli sort --field published \
    | npx -y @taprun/cli filter --field published --gt "2025-01-01" \
    | npx -y @taprun/cli table

Each command is stateless. Each filter reads JSON from stdin and writes JSON to stdout; table is the one exception, rendering for humans at the end of the pipe. Exactly how Unix tools work.
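
To make the model concrete, here is a minimal sketch of what a stdin-to-stdout JSON filter might look like in Node.js. This is not the @taprun/cli source; sortByField, filterGt, and runFilter are names I made up. The newest-first sort order is an assumption based on the output described earlier.

```javascript
// Sketch of a JSON pipeline filter; hypothetical names, not the real CLI code.

// Sort rows by a field, largest first. ISO dates such as "2025-01-31"
// compare correctly as plain strings, so no date parsing is needed.
function sortByField(rows, field) {
  return [...rows].sort((a, b) =>
    a[field] < b[field] ? 1 : a[field] > b[field] ? -1 : 0
  );
}

// Keep rows whose field is strictly greater than the threshold.
function filterGt(rows, field, threshold) {
  return rows.filter((row) => row[field] > threshold);
}

// Read all of stdin as JSON, apply a transform, write JSON to stdout.
async function runFilter(transform) {
  process.stdin.setEncoding("utf8");
  let input = "";
  for await (const chunk of process.stdin) input += chunk;
  const output = transform(JSON.parse(input));
  process.stdout.write(JSON.stringify(output, null, 2) + "\n");
}

// Example wiring, if this file were used as a script:
// runFilter((rows) => filterGt(sortByField(rows, "published"), "published", "2025-01-01"));
```

Because each step is a pure function over an array of rows, no state is carried between commands and the steps can be chained in any order.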

Combine with Other Sources

Because every skill outputs the same JSON format, the same sort, filter, limit, and table commands work unchanged on any source:

# Top 5 trending Python repos on GitHub, using the same filters
$ npx -y @taprun/cli github trending \
    | npx -y @taprun/cli filter --field language --eq "Python" \
    | npx -y @taprun/cli sort --field stars \
    | npx -y @taprun/cli limit --n 5 \
    | npx -y @taprun/cli table

No glue code. No schema mapping. The pipeline handles it.
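
If you want rows from two sources in a single list, the shared row shape makes that plain array work. A rough sketch follows; mergeSources is a hypothetical helper, and the field names in the usage line are invented for illustration rather than taken from the real skills.

```javascript
// Sketch: merge rows from several sources into one list, tagging each row
// with where it came from. Hypothetical helper, not part of @taprun/cli.
function mergeSources(sources) {
  return Object.entries(sources).flatMap(([source, rows]) =>
    rows.map((row) => ({ source, ...row }))
  );
}
```

For example, `mergeSources({ github: [{ title: "repo-a" }], hackernews: [{ title: "story-b" }] })` gives two rows, each carrying a `source` field alongside its own data.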

Use It in CI or Cron

Because it's a command, it runs anywhere:

# GitHub Actions — daily paper digest to Slack
- name: Paper digest
  run: |
    npx -y @taprun/cli arxiv search --keyword "LLM agents" \
      | npx -y @taprun/cli sort --field published \
      | npx -y @taprun/cli limit --n 5 \
      > papers.json
    # post papers.json to Slack webhook

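The binary is cached by npm/npx after the first run, so repeat runs on the same machine are fast. On ephemeral CI runners, persist the npm cache between jobs to get the same benefit.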
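
The "post papers.json to Slack webhook" step is left as a comment above. One way it might be filled in is a small Node.js script (Node 18+ for the built-in fetch). The function names and the SLACK_WEBHOOK_URL variable are my own assumptions; the payload uses the plain `text` field that Slack incoming webhooks accept.

```javascript
// Sketch: turn paper rows into a Slack incoming-webhook message.
// Function names and the env var are hypothetical.

// Escape the three characters Slack treats as control characters.
function escapeSlack(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build the JSON payload: a bold heading plus one linked line per paper.
function buildSlackPayload(papers) {
  const lines = papers.map(
    (p) => `• <${p.url}|${escapeSlack(p.title)}> (${p.published})`
  );
  return { text: ["*Daily paper digest*", ...lines].join("\n") };
}

// POST the payload to the webhook and fail loudly on a non-2xx response.
async function postDigest(papers, webhookUrl) {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildSlackPayload(papers)),
  });
  if (!res.ok) throw new Error(`Slack webhook returned ${res.status}`);
}

// Example wiring in CI, reading the file written by the pipeline:
// const papers = JSON.parse(require("node:fs").readFileSync("papers.json", "utf8"));
// postDigest(papers, process.env.SLACK_WEBHOOK_URL);
```

Keeping the payload builder separate from the network call means the message format can be checked locally without hitting Slack.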

200+ Skills, Same Pattern

arXiv is one of 200+ community skills that follow the same pattern: call an API, return structured rows, compose with any other skill.

Skill                         What it returns
arxiv/search --keyword X      Papers matching keyword
github/trending               Trending repos today
hackernews/hot                HN front page
reddit/search --keyword X     Posts matching keyword
stackoverflow/hot             Hot questions

Each one is a 20–40 line .tap.js file. Browse and contribute on GitHub.


Try it now

$ npx -y @taprun/cli arxiv search --keyword "your topic" \
    | npx -y @taprun/cli sort --field published \
    | npx -y @taprun/cli table

No install required. Works on any machine with Node.js. → taprun.dev