ASA Agent Pipeline: Current Implementation

This report describes the agent pipeline as implemented in the workspace on 2026-04-07. It is based on the code paths that actually construct, invoke, and constrain the agent runtime: the R entrypoints, the LangGraph agent core, the research workflow graph, the DuckDuckGo transport stack, and the OpenWebpage tool.

Repo: /Users/cjerzak/Documents/asa-software
Primary package: asa/
Report type: implementation visualization

Implementation Summary

The current pipeline is layered. R builds the toolset and selects either the standard ReAct graph or the memory-folding graph. Per run, R injects budgets and policies into Python state. The LangGraph core then decides when to use tools, when to automatically open webpages, when to compress context, and when to force finalization. Separately, the research enumeration workflow wraps a planner/searcher/deduper/stopper loop with optional temporal verification and low-yield webpage escalation.

R orchestration layer

asa/R/agent__initialize.R builds the tool list and picks either create_standard_agent() or create_memory_folding_agent(). asa/R/agent__run_engine.R seeds recursion limits, search budgets, auto-open-webpage policy, field rules, source policy, retry policy, finalization policy, and performance profile.

LangGraph core

asa/inst/python/asa_backend/graph/agent_graph_core.py implements the main agent state machine, tool node wrapper, budget accounting, structured-output repair, finalization rules, memory folding, observational memory, and automatic webpage follow-up.

Internet access layer

Search uses a multi-tier DuckDuckGo stack in asa/inst/python/asa_backend/search/ddg_transport.py. Full-page reading uses asa/inst/python/tools/webpage_reader_tool.py, which enforces SSRF checks, supports HTML and PDF extraction, caches blocked fetches, and returns relevance-filtered excerpts.

Runtime Layers

The base agent path and the research path share tools but not the same graph. The base agent is the general run-task pipeline; the research workflow is a higher-level enumeration graph that internally runs its own bounded tool loop inside the searcher.

General run-task path

initialize_agent(): Build Wikipedia, Search, and optional OpenWebpage tools; choose memory-folding or standard graph.
.run_agent(): Resolve recursion limit, budgets, policies, thread id, and expected schema.
Python graph invoke: Invoke the compiled LangGraph with seeded state via the reticulate bridge.

Research enumeration path

asa_enumerate(): Collect config, schema, temporal settings, webpage policy, and checkpoint options.
create_research_graph(): Compile the planner/searcher/deduper/stopper workflow.
run_research(): Invoke the graph; the searcher runs a bounded internal tool loop using the same tool family.

Diagram legend: agent/control nodes; tool execution; budgets/policies/finalization; memory and observation stages; research graph stages.

Base Agent Graphs

The standard and memory-folding agents are distinct LangGraph workflows in the same core module. Both are RemainingSteps-aware. Both can route to finalize when recursion or budget conditions make another tool/model turn unsafe.

Standard agent graph

agent: Model call with tools bound unless at finalization cutoff.
tools: ToolNode wrapper, scratchpad extraction, diagnostics, auto-open-webpage.
agent: Continue reasoning on tool outputs.
nudge: Inject a continuation prompt when the agent tries to end too early with unresolved fields.
agent: Retry with more explicit continuation pressure.
finalize -> END: Best-effort terminal response when cut off by recursion, budget, or policy.
Key routes come from should_continue() and after_tools() in agent_graph_core.py. The graph shape is agent -> tools/finalize/nudge/end, tools -> agent/finalize/end, nudge -> agent, finalize -> END.
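
The routing shape described above can be illustrated with a small, dependency-free sketch. It is not the code in agent_graph_core.py: the state keys, the FINALIZE_MARGIN value, and the function name route_after_agent are all hypothetical, and the real should_continue() likely weighs more signals than this.

```python
from typing import Literal, TypedDict

Route = Literal["tools", "finalize", "nudge", "end"]


class AgentState(TypedDict):
    # All keys are hypothetical stand-ins for the real state fields.
    remaining_steps: int     # steps left before the recursion limit
    has_tool_calls: bool     # did the last model message request tools?
    unresolved_fields: int   # schema fields still missing a value
    budget_exhausted: bool   # search/tool budget already spent
    nudged: bool             # was a continuation nudge already sent?


FINALIZE_MARGIN = 2  # assumed safety margin, not taken from the code


def route_after_agent(state: AgentState) -> Route:
    """Pick the next node after a model turn (agent -> tools/finalize/nudge/end)."""
    unsafe = state["remaining_steps"] <= FINALIZE_MARGIN or state["budget_exhausted"]
    if state["has_tool_calls"]:
        # Another tool round is only allowed while steps and budget remain.
        return "finalize" if unsafe else "tools"
    if state["unresolved_fields"] > 0:
        if unsafe:
            return "finalize"
        if not state["nudged"]:
            return "nudge"
    return "end"
```

In this sketch the nudge fires at most once per run, which is a simplification chosen to keep the example short.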

Memory-folding agent graph

agent: Same core reasoning loop, but with summary/archive state and retrieval context injection.
tools: Execute tools, then check fold budget before returning to the model.
observe / summarize: Compress transcript when fold budget is exceeded.
reflect: Optional observational-memory reflection path.
agent: Resume with summary, archive, observations, reflections.
forced finalize: Used near cutoff or when terminal budget must be reserved.
END: Final response emitted or recursion edge reached.
Folding is not just message-count based. The implementation checks character budget, estimated tokens, safe fold boundaries, and preserves critical terminal exchanges near the finalization edge.
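
A minimal sketch of a fold check along these lines follows. The budgets, the four-characters-per-token heuristic, the keep_recent parameter, and the function names are assumptions made for illustration; the actual thresholds and boundary rules live in agent_graph_core.py and may differ.

```python
from typing import List, Optional, Tuple

Message = Tuple[str, str]  # (role, text); roles are "human", "ai", or "tool"


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token (an assumption)."""
    return (len(text) + 3) // 4


def should_fold(messages: List[Message],
                char_budget: int = 30_000,      # hypothetical default
                token_budget: int = 8_000) -> bool:  # hypothetical default
    """Fold when either the character or the estimated-token budget is exceeded."""
    chars = sum(len(text) for _, text in messages)
    tokens = sum(estimate_tokens(text) for _, text in messages)
    return chars > char_budget or tokens > token_budget


def safe_fold_index(messages: List[Message], keep_recent: int = 4) -> Optional[int]:
    """Return i such that messages[:i] may be summarized, or None.

    The most recent exchanges stay verbatim, and the boundary is moved
    backwards so the kept tail never starts with a tool result that has
    been separated from the model turn that requested it.
    """
    i = len(messages) - keep_recent
    while i > 0 and messages[i][0] == "tool":
        i -= 1
    return i if i > 0 else None
```

Keeping the tail intact is what preserves the terminal exchanges near the finalization edge in this simplified picture.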

Research Workflow Graph

The research workflow in asa/inst/python/workflows/research_graph_workflow.py is a separate StateGraph tuned for open-ended enumeration. It uses a planner, a bounded searcher round, deduplication, and a deterministic outcome gate.

entry_router: Fresh runs start at planner; resumed runs can jump straight to searcher.
planner: Infer entity type, optional Wikidata type, and planned search queries.
searcher: Try Wikidata first when type is supported, then fall back to tool-driven web search.
deduper: Hash on schema fields plus fuzzy match on the first schema field.
stopper: Evaluate round, query, token, time, target-items, and novelty stop conditions.
searcher or END: Loop until the completion gate says stop, or until recursion protection cuts off another round.
Research node | Current implementation detail
Planner | Prompts for entity type, optional supported Wikidata type, and up to several search sub-queries.
Searcher | Runs a minimal internal tool loop built as agent -> tools -> agent with a hard max_tool_calls per round.
Webpage enablement | allow_read_webpages="auto" turns on OpenWebpage after low-yield rounds: current logic checks for two weak novelty rounds or too few existing results after round 2.
Temporal logic | Time filter can be pushed into DDG. Strict temporal mode also verifies page dates in parallel and can optionally accept Wayback snapshots within the requested range.
Stopping | Stopper delegates to evaluate_research_outcome() using rounds, queries, tokens, elapsed time, target items, and novelty history.
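
The deduper's combination of an exact hash over schema fields and a fuzzy match on the first schema field can be sketched as below. This is an illustration, not the workflow's code: the normalization, the use of difflib.SequenceMatcher, the 0.85 threshold, and the function names are all assumptions.

```python
import hashlib
from difflib import SequenceMatcher
from typing import Dict, List


def _norm(value: object) -> str:
    """Normalize a cell for comparison (assumed: trim and lowercase)."""
    return str(value).strip().lower()


def row_key(row: Dict[str, str], schema_fields: List[str]) -> str:
    """Exact-match key: hash of the normalized schema fields."""
    joined = "|".join(_norm(row.get(f, "")) for f in schema_fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def dedupe(rows: List[Dict[str, str]], schema_fields: List[str],
           fuzzy_threshold: float = 0.85) -> List[Dict[str, str]]:
    """Drop exact duplicates, then near-duplicates on the first schema field."""
    first = schema_fields[0]
    seen = set()
    kept: List[Dict[str, str]] = []
    for row in rows:
        key = row_key(row, schema_fields)
        if key in seen:
            continue
        name = _norm(row.get(first, ""))
        is_near_duplicate = any(
            SequenceMatcher(None, name, _norm(k.get(first, ""))).ratio() >= fuzzy_threshold
            for k in kept
        )
        if is_near_duplicate:
            continue
        seen.add(key)
        kept.append(row)
    return kept
```

Matching only on the first field is cheap but can merge distinct entities with similar names, so the threshold is the main tuning knob in a scheme like this.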

Internet Logic: Search and Web Access

The internet path is not a single tool call. It is a layered search transport plus a separate full-page reader, both wrapped by policy logic in the agent core. The diagrams below show the actual fallback order and where policy gates are applied.

Search path

Search tool call: Triggered explicitly by the model or inside the research searcher loop.
Preflight: Inter-search delay, session reset, proactive Tor rotation, clean-exit selection, proxy blocklist checks.
Tiered transport: Try successive DDG access strategies until one yields usable results.
Normalize + rerank: Add _tier marker, clean URLs, rerank, compact result rows.

DuckDuckGo tier order

Tier 0: curl_cffi on https://html.duckduckgo.com/html with rotating browser impersonation. Captcha or rate-limit detection triggers Tor rotation and retry.
Tier 1: primp with rotating impersonation profiles and the same captcha-aware retry logic.
Tier 2: selenium browser, only when use_browser=TRUE. Engine order is policy-driven: Firefox-first by default, Chrome-first when configured.
Tier 3: ddgs HTTP API, with retries and polite user-agent defaults.
Tier 4: requests scrape against the DDG HTML endpoint, plus a relaxed-query retry when the first scrape yields nothing usable.
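
The fallback ladder follows a common pattern: try each strategy in order, skip tiers that fail or return nothing usable, and tag results with the tier that produced them. The sketch below shows only that pattern. The tiered_search function and the callables passed to it are hypothetical; it does not call curl_cffi, primp, selenium, or ddgs, and it omits captcha detection and Tor rotation. Returning an empty list when every tier fails is also an assumption.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Result = Dict[str, str]
Strategy = Callable[[str], List[Result]]


def tiered_search(query: str, tiers: Sequence[Tuple[str, Strategy]]) -> List[Result]:
    """Return results from the first tier that yields any, tagged with _tier."""
    for name, strategy in tiers:
        try:
            results = strategy(query)
        except Exception:
            # A blocked or failing tier is skipped; the next tier is tried.
            continue
        if results:
            return [dict(row, _tier=name) for row in results]
    return []  # assumed behavior when no tier produces usable results
```

A caller would pass the tiers in ladder order, for example [("curl_cffi", fetch_a), ("primp", fetch_b), ...], where fetch_a and fetch_b are placeholder callables supplied by the caller.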

OpenWebpage path

OpenWebpage: Can be explicit, research-enabled, or added automatically after search snippet analysis.
URL guard: Reject non-http(s) schemes, credentials in the URL, localhost, private IPs, and hostnames that resolve to internal addresses.
Fetch: Stream HTML or PDF with byte caps, blocked-response detection, retry/backoff, and cache.
Extract excerpts: Inline useful links, chunk content, then choose lexical or embedding-ranked excerpts.
If HTTPS fails with a TLS-style error, the tool can retry as HTTP only for some government domains (.gov or .gob). PDF extraction is enabled and uses pdftotext when available.
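
The URL guard step can be approximated with the standard library alone. This is a sketch of the general SSRF-check technique, not the code in webpage_reader_tool.py: the function names, the injectable resolver argument, and the exact set of rejected address classes are assumptions.

```python
import ipaddress
import socket
from typing import Callable, List
from urllib.parse import urlsplit


def resolve_host(host: str) -> List[str]:
    """Resolve a hostname to its IP addresses via the system resolver."""
    return sorted({info[4][0] for info in socket.getaddrinfo(host, None)})


def _is_internal(address: str) -> bool:
    """True for addresses that should not be fetched, or that cannot be parsed."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return True
    return (ip.is_private or ip.is_loopback or ip.is_link_local
            or ip.is_reserved or ip.is_multicast or ip.is_unspecified)


def is_url_allowed(url: str,
                   resolver: Callable[[str], List[str]] = resolve_host) -> bool:
    """Reject non-http(s) URLs, embedded credentials, and internal targets."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
        has_credentials = parts.username is not None or parts.password is not None
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or has_credentials:
        return False
    if not host or host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        ipaddress.ip_address(host)
        addresses = [host]  # literal IP: no DNS lookup needed
    except ValueError:
        try:
            addresses = resolver(host)
        except OSError:
            return False  # unresolvable hosts are rejected
    return bool(addresses) and not any(_is_internal(a) for a in addresses)
```

Passing the resolver as a parameter keeps the check testable without network access. A check like this does not by itself protect against DNS rebinding between the check and the fetch.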

Current Policy Defaults Observed in Code

These are the defaults that materially shape internet behavior when not overridden by R-side options or per-run state.

Area | Current default | Operational effect
DDG search config | timeout=30s, max_retries=3, inter_search_delay=1.5s, humanize_timing=TRUE, allow_direct_fallback=FALSE | Search prefers Tor/proxy anonymity, waits between calls, and avoids direct-IP fallback unless explicitly enabled.
Selenium browser order | firefox_first | Browser tier attempts Firefox, then UC Chrome, then standard Chrome unless config flips it.
OpenWebpage base config | allow_read_webpages=FALSE, max_chars=8000, max_chunks=6, chunk_chars=1200, relevance_mode=auto | Full-page reading is opt-in and excerpt-oriented rather than full-document dumping.
OpenWebpage content caps | max_bytes=2,000,000, PDF enabled, pdf_max_bytes=8,000,000, pdf_max_pages=8 | Large fetches are bounded; PDFs are allowed but clipped.
Performance profile | balanced | Feeds the default webpage policy in the agent core.
Balanced webpage policy | max_open_calls=3, host_cooldown_seconds=45, blocked_host_ttl_seconds=900, open_only_if_score_ge=0.40, parallel_open_limit=1 | Auto follow-up only opens a few high-scoring pages and cools down failing hosts.
Research workflow defaults | max_workers=4, max_rounds=8, budget_queries=50, allow_read_webpages=FALSE, temporal_strictness=best_effort | Research starts conservative and only escalates webpage reading when explicitly requested or in low-yield auto mode.

Automatic Internet Escalation Logic

The current implementation uses both explicit and automatic webpage expansion. The automatic path lives in the tool-node wrapper in the LangGraph core; it is separate from the research workflow's allow_read_webpages="auto" switch.

Agent-core auto-open-webpage follow-up

  • Only considered when the last tool round produced Search tool messages containing __START_OF_SOURCE blocks and <URL> markers.
  • Disabled if auto_openwebpage_policy resolves to off, if webpage policy is disabled, if no tool budget remains, or if max_open_calls has already been reached.
  • Uses preferred URLs from snippet extraction first, then generic ranked candidates.
  • Applies host cooldown and blocked-host TTL so repeated hard failures on the same host are suppressed.
  • Can be forced by snippet-recovery logic when structured extraction from search snippets failed but escalation candidates were found.
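
The gating rules in this list can be sketched as a single selection function. The numeric defaults are the balanced-profile values reported in this document. The class names, state fields, and pick_url_to_open function are hypothetical, and the sketch leaves out the preferred-URL ordering and the forced snippet-recovery path.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlsplit


@dataclass
class WebpagePolicy:
    # Values mirror the balanced defaults listed in this report.
    max_open_calls: int = 3
    host_cooldown_seconds: float = 45.0
    blocked_host_ttl_seconds: float = 900.0
    open_only_if_score_ge: float = 0.40


@dataclass
class OpenState:
    # Hypothetical bookkeeping; timestamps share the clock used for `now`.
    open_calls: int = 0
    failed_at: Dict[str, float] = field(default_factory=dict)   # soft failures
    blocked_at: Dict[str, float] = field(default_factory=dict)  # hard blocks


def _within(since: Optional[float], window: float, now: float) -> bool:
    return since is not None and now - since < window


def pick_url_to_open(candidates: List[Tuple[str, float]], policy: WebpagePolicy,
                     state: OpenState, tool_budget_left: int,
                     now: Optional[float] = None) -> Optional[str]:
    """Return the best-scoring URL that passes every gate, or None."""
    now = time.monotonic() if now is None else now
    if tool_budget_left <= 0 or state.open_calls >= policy.max_open_calls:
        return None
    for url, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if score < policy.open_only_if_score_ge:
            break  # remaining candidates score even lower
        host = urlsplit(url).hostname or ""
        if _within(state.blocked_at.get(host), policy.blocked_host_ttl_seconds, now):
            continue
        if _within(state.failed_at.get(host), policy.host_cooldown_seconds, now):
            continue
        state.open_calls += 1
        return url
    return None
```

Treating the cooldown as applying to recently failed hosts is one reading of the policy; the real implementation may track host state differently.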

Research-workflow webpage auto mode

  • If allow_read_webpages is the string "auto", the searcher delays webpage reading until results are low-yield.
  • The current low-yield trigger is: at least round 2, plus either the last two novelty rates are each below 0.2 or the run still has fewer than two results after round 2.
  • When enabled, the per-round tool-call cap increases from 3 to 5.
  • The search prompt then explicitly allows OpenWebpage and asks for focused queries when opening pages.
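
The low-yield trigger stated above is small enough to write out directly. The thresholds (round 2, novelty below 0.2, fewer than two results, cap of 3 rising to 5) come from this report. The function and constant names are hypothetical, and reading "after round 2" as "round number at least 2" is an assumption.

```python
from typing import Sequence

NOVELTY_FLOOR = 0.2        # a round below this counts as weak
MIN_RESULTS = 2            # fewer results than this counts as sparse
BASE_TOOL_CALLS = 3        # per-round cap without webpage reading
ESCALATED_TOOL_CALLS = 5   # per-round cap once webpage reading is enabled


def webpages_enabled(round_number: int, novelty_rates: Sequence[float],
                     n_results: int) -> bool:
    """Decide whether auto mode should turn on OpenWebpage for this round."""
    if round_number < 2:
        return False
    last_two = list(novelty_rates)[-2:]
    weak_novelty = len(last_two) == 2 and all(r < NOVELTY_FLOOR for r in last_two)
    sparse_results = n_results < MIN_RESULTS
    return weak_novelty or sparse_results


def max_tool_calls(enabled: bool) -> int:
    """Per-round tool-call cap for the searcher's internal loop."""
    return ESCALATED_TOOL_CALLS if enabled else BASE_TOOL_CALLS
```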

Stopping, Finalization, and Budget Logic

The current implementation tries hard to stop cleanly instead of letting LangGraph die at the raw recursion limit. This is one of the main reasons the pipeline has explicit finalize and nudge nodes.

Base agent stop controls

  • remaining_steps is checked throughout routing; the graph may reserve terminal budget after tools.
  • search_budget_limit and tool/model counters are normalized into budget_state.
  • unknown_after_searches caps how long unresolved schema fields stay eligible before being treated as exhausted.
  • finalize_on_all_fields_resolved can force an end once all resolvable fields are complete.
  • The tool wrapper also tightens budgets on repeated low-signal, low-efficiency, dedupe-heavy, or empty-round behavior.
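
As a rough illustration of how unknown_after_searches and finalize_on_all_fields_resolved could interact, consider the sketch below. Only those two option names come from this report. The helper names are hypothetical, and the sketch assumes a single global search counter, whereas the real implementation may track exhaustion per field.

```python
from typing import Dict, List, Optional


def resolvable_fields(values: Dict[str, Optional[str]], searches_used: int,
                      unknown_after_searches: int) -> List[str]:
    """Fields that are still empty and still worth searching for."""
    if searches_used >= unknown_after_searches:
        return []  # remaining unknowns are treated as exhausted
    return [name for name, value in values.items() if value is None]


def should_finalize(values: Dict[str, Optional[str]], searches_used: int,
                    unknown_after_searches: int,
                    finalize_on_all_fields_resolved: bool = True) -> bool:
    """Force an end once no resolvable field remains (if the option is on)."""
    pending = resolvable_fields(values, searches_used, unknown_after_searches)
    return finalize_on_all_fields_resolved and not pending
```

Under these assumptions, a run with one empty field keeps searching until the counter reaches unknown_after_searches, then finalizes with that field left unknown.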

Research stop controls

  • The stopper uses evaluate_research_outcome() rather than only checking round counts.
  • It combines rounds, query count, token use, elapsed time, target-items goal, plateau rounds, and novelty minimum.
  • Recursion-limit protection is treated as a first-class stop signal and is passed into the completion gate.
  • Strict temporal mode can also fail fast if the temporal verifier module is unavailable.
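
A combined stop check of this kind might look like the following sketch. It is not evaluate_research_outcome(): max_rounds=8 and budget_queries=50 are the defaults reported in this document, while the token, time, plateau, and novelty values, the check order, and all names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Limits:
    max_rounds: int = 8               # default reported in this document
    budget_queries: int = 50          # default reported in this document
    budget_tokens: int = 200_000      # hypothetical
    budget_seconds: float = 600.0     # hypothetical
    target_items: Optional[int] = None
    plateau_rounds: int = 2           # hypothetical; must be positive
    novelty_min: float = 0.05         # hypothetical


def stop_reason(rounds: int, queries: int, tokens: int, elapsed: float,
                n_items: int, novelty_history: Sequence[float], limits: Limits,
                recursion_stop: bool = False) -> Optional[str]:
    """Return the first stop condition that holds, or None to keep searching."""
    if recursion_stop:
        return "recursion_limit"
    if limits.target_items is not None and n_items >= limits.target_items:
        return "target_reached"
    if rounds >= limits.max_rounds:
        return "max_rounds"
    if queries >= limits.budget_queries:
        return "query_budget"
    if tokens >= limits.budget_tokens:
        return "token_budget"
    if elapsed >= limits.budget_seconds:
        return "time_budget"
    tail = list(novelty_history)[-limits.plateau_rounds:]
    if len(tail) == limits.plateau_rounds and all(n < limits.novelty_min for n in tail):
        return "novelty_plateau"
    return None
```

Returning a named reason rather than a bare boolean makes it easy to record why a run ended, which is useful when recursion protection is one of several possible causes.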

Source-of-Truth Files

These files define the current implementation shown above. If the pipeline changes, these are the files to diff first.

File | Why it matters
asa/R/agent__initialize.R | Creates Search, Wikipedia, and OpenWebpage tools; chooses standard vs memory-folding agent; configures proxy/Tor and search defaults.
asa/R/agent__run_engine.R | Injects per-run runtime state such as recursion limit, schema, search budget, auto-open-webpage policy, webpage policy, and finalization settings.
asa/inst/python/asa_backend/graph/agent_graph_core.py | Main LangGraph agent implementation, tool wrapper, memory folding, observational memory, automatic webpage follow-up, diagnostics, and budget control.
asa/inst/python/workflows/research_graph_workflow.py | Planner/searcher/deduper/stopper workflow for enumeration-style research, including temporal filtering and low-yield webpage auto mode.
asa/inst/python/asa_backend/search/ddg_transport.py | Tiered DDG transport, anti-detection timing, Tor rotation, proxy blocklist, Selenium fallback, DDGS fallback, and raw scrape fallback.
asa/inst/python/tools/webpage_reader_tool.py | OpenWebpage implementation: SSRF guard, blocked-fetch detection, cache, PDF extraction, excerpt selection, and error payload formatting.
asa/tests/testthat/test-search-tiers.R | Confirms the expected DDG tier markers exposed in traces: curl_cffi, primp, selenium, ddgs, requests.

Practical Read of the Current Design

The current system is not just a search tool attached to a model. It is a controlled graph runtime with explicit retry and finalization nodes, a transport stack built to survive DDG blocking, and a second layer of webpage escalation that is constrained by scored URL selection, host cooldowns, and budgets. The research workflow is a separate planner/search loop, but it still relies on the same search and webpage primitives.

In short: R selects and seeds the graph, Python runs the state machine, search uses a five-tier DuckDuckGo fallback ladder, and webpage reading is optional, cached, relevance-filtered, and policy-gated rather than always-on.
Generated from code inspection in the local workspace. No external sources were used.