What. Removed the entire "text then object" structuring path from PipeLLM down through cogt and Temporal. The StructuringMethod.PRELIMINARY_TEXT enum value stays so a future implementation can opt in; selecting it at runtime now raises NotImplementedError.
Why. The mechanism produced a 4-way matrix in PipeLLM._llm_gen_object_stuff_content (single/list × direct/preliminary-text) plus is_with_preliminary_text plumbing that leaked into PipeRunParams, helpers, config, the cogt content-generator protocol, and four Temporal workflows. It's getting reimplemented later in a completely different way.
Impact. PipeLLM's object branch collapses from 4 cases to 2. The cogt protocol thins by 2 methods, and the surviving pair gets renamed (make_object_direct → make_object, make_object_list_direct → make_object_list) — the _direct suffix existed only to contrast with the deleted _text_then_object variants. Two Temporal workflows go away. One config flag and three TOML templates go away. Pre-cleanup for the upcoming collapse-content-generation-workflow-layer refactor.
Before the refactor, when a PipeLLM had to produce a structured object, it could pick between two structuring methods. Both were live in code; the choice produced a 4-way matrix because each combined with the multiplicity (single object vs list).
PipeLLM._live_run_operator_pipe(...)
│
├─ derive is_with_preliminary_text from
│ self.structuring_method == PRELIMINARY_TEXT
│ OR get_config().pipelex.structure_config.is_default_text_then_structure
│
├─ inject is_with_preliminary_text into PipeRunParams
│
├─ build llm_prompt_2_factory ◄── extra factory only
│ match self.structuring_method: for the second LLM call
│ case DIRECT: llm_prompt_2_factory = None
│ case PRELIMINARY_TEXT: llm_prompt_2_factory =
│ LLMPromptTemplate.make_for_structuring_from_preliminary_text()
│
└─ _llm_gen_object_stuff_content(... llm_prompt_2_factory ...)
│
├─ if is_multiple_output:
│ ├─ if llm_prompt_2_factory: make_text_then_object_list ◄─┐
│ └─ else: make_object_list_direct │ 4-way
│ │ matrix
└─ else (single): │
├─ if llm_prompt_2_factory: make_text_then_object ◄─┘
└─ else: make_object_direct
Each make_text_then_object* call resolved through the ContentGenerator protocol, which had four implementations (direct, dry, Temporal child, Temporal top). Inside Temporal, those calls dispatched as their own child workflows (WfMakeTextThenObject, WfMakeTextThenObjectList) that ran two activities in sequence with a LLMAssignmentFactory.make_llm_assignment(preliminary_text=…) call between them.
PipeLLM._live_run_operator_pipe(...)
│
├─ if self.structuring_method == PRELIMINARY_TEXT:
│ raise NotImplementedError(...) ◄── only guard at runtime
│
├─ no is_with_preliminary_text plumbing
│
└─ _llm_gen_object_stuff_content(...)
│
├─ if is_multiple_output: make_object_list ◄── 2 cases, flat
└─ else: make_object ◄── "_direct" suffix dropped
since there's no other way
The StructuringMethod.PRELIMINARY_TEXT enum value stays — pipes can still declare it. The runtime guard at the top of _live_run_operator_pipe is the only place that knows about it. Everything in the dotted path between PipeLLM and the activities is gone.
PipeLLM operator flattenThe central simplification. pipe_llm.py shed ~110 lines: the is_with_preliminary_text derivation, the llm_prompt_2_factory match block, two of the four branches in _llm_gen_object_stuff_content, the "text_then_object" string in the execution-data tracker, and unused imports of LLMPromptFactoryAbstract / LLMPromptTemplate / cast.
PipeRunParams + helpers.py trim signaturesPipeRunParams.is_with_preliminary_text field deleted. PipeRunParams.copy_by_injecting_multiplicity drops the kwarg. helpers.get_output_structure_prompt drops the parameter (used to pick between two TOML templates; now there's only one).
ContentGeneratorProtocol.make_text_then_object + make_text_then_object_list — gone.ContentGenerator (direct) and ContentGeneratorDry — gone.TextThenObjectAssignment data class — gone.LLMAssignmentFactory — gone (only used by the deleted text-then-object code).LLMPromptTemplate.make_for_structuring_from_preliminary_text() classmethod — gone.make_object_direct → make_object, make_object_list_direct → make_object_list. Propagated through all four protocol implementations, both Temporal generators (Top + Child), and every call site / test reference.WfMakeTextThenObject + WfMakeTextThenObjectList workflows — gone.ContentGeneratorTop and ContentGeneratorChild — gone.TextThenObjectAssignment dropped from the AssignmentType Union.tasks.py's crafting TaskPack.workflow_list.StructureConfig class deleted (its only field, is_default_text_then_structure, served only as the global "always use preliminary text" override).structure_from_preliminary_text_system, structure_from_preliminary_text_user, and the now-redundant output_structure_prompt deleted; output_structure_prompt_no_preliminary_text renamed to output_structure_prompt (it's now the only structure prompt).preliminary_text dropped from filter sets in 5 blueprint files (kept place_holder, the only other one).Two big diffs tell the story. First the runtime guard + the dropped plumbing in _live_run_operator_pipe:
content_generator = content_generator or get_content_generator() + if self.structuring_method == StructuringMethod.PRELIMINARY_TEXT: + msg = ( + f"PipeLLM '{self.code}': structuring_method='preliminary_text' is not currently supported. " + "The text-then-object mechanism was removed; a new implementation is planned." + ) + raise NotImplementedError(msg) output_stuff_spec = self.resolve_dynamic_output_stuff_spec(...) ... - is_with_preliminary_text = ( - self.structuring_method == StructuringMethod.PRELIMINARY_TEXT - ) or get_config().pipelex.structure_config.is_default_text_then_structure - log.verbose(f"is_with_preliminary_text: {is_with_preliminary_text} ...") llm_prompt_run_params = PipeRunParams.copy_by_injecting_multiplicity( pipe_run_params=pipe_run_params, applied_output_multiplicity=applied_output_multiplicity, - is_with_preliminary_text=is_with_preliminary_text, ) ... - llm_prompt_2_factory: LLMPromptFactoryAbstract | None - if self.structuring_method: - structuring_method = cast("StructuringMethod", self.structuring_method) - match structuring_method: - case StructuringMethod.DIRECT: - llm_prompt_2_factory = None - case StructuringMethod.PRELIMINARY_TEXT: - llm_prompt_2_factory = LLMPromptTemplate.make_for_structuring_from_preliminary_text() - elif get_config().pipelex.structure_config.is_default_text_then_structure: - llm_prompt_2_factory = LLMPromptTemplate.make_for_structuring_from_preliminary_text() - else: - llm_prompt_2_factory = None
Second, the body of _llm_gen_object_stuff_content — the 4-way matrix becomes a 2-way split:
if is_multiple_output:
if llm_prompt_2_factory is not None:
# text_then_object_list
objs = await content_generator
.make_text_then_object_list(...)
else:
# object_list_direct
objs = await content_generator
.make_object_list_direct(...)
the_content = ListContent(items=objs)
else:
if llm_prompt_2_factory is not None:
# text_then_object
the_content = await content_generator
.make_text_then_object(...)
else:
# object_direct
the_content = await content_generator
.make_object_direct(...)
if is_multiple_output:
objs = await content_generator
.make_object_list(...)
return ListContent(items=objs)
return await content_generator
.make_object(...)
(This was the question my co-developer asked me to clarify. Three classes lived in this neighborhood; their fate differs.)
| Class | Where | Status | Why |
|---|---|---|---|
LLMPromptFactoryAbstract |
pipelex/cogt/llm/llm_prompt_factory_abstract.py |
kept | Abstract base. Defines make_llm_prompt_from_args(**prompt_arguments). Still the parent of LLMPromptTemplate. |
LLMPromptTemplate |
pipelex/cogt/llm/llm_prompt_template.py |
kept | Concrete subclass. Builds an LLMPrompt from a proto_prompt + arguments. Still tested. One classmethod removed: make_for_structuring_from_preliminary_text() — its only job was to construct a template wired to the deleted TOML templates. |
LLMAssignmentFactory |
was pipelex/cogt/content_generation/assignment_models.py |
deleted | Bundled (JobMetadata + LLMSetting + LLMPromptFactoryAbstract). Its only callers were the second-step assignment in make_text_then_object / make_text_then_object_list + the matching workflow variants WfMakeTextThenObject*. Zero callers remain. |
TextThenObjectAssignment |
was pipelex/cogt/content_generation/assignment_models.py |
deleted | Data class held the first call's LLMAssignment + the second call's LLMAssignmentFactory. Carried by the deleted Temporal workflows. |
Net effect: the prompt-factory mechanism (LLMPromptFactoryAbstract + LLMPromptTemplate) is intact and unchanged in shape. The assignment factory (LLMAssignmentFactory) and one classmethod that produced a "structure from preliminary text" template are both gone.
LLMAssignmentFactory.make_llm_assignment(preliminary_text=…) actually do?
class LLMAssignmentFactory(BaseModel):
job_metadata: JobMetadata
llm_setting: LLMSetting
llm_prompt_factory: LLMPromptFactoryAbstract # typically an LLMPromptTemplate
async def make_llm_assignment(self, **prompt_arguments) -> LLMAssignment:
# render the template with the kwargs (e.g. preliminary_text=...)
llm_prompt = await self.llm_prompt_factory.make_llm_prompt_from_args(**prompt_arguments)
return LLMAssignment(job_metadata=..., llm_setting=..., llm_prompt=llm_prompt)
It deferred prompt construction so the second LLM call could substitute the first call's text output as a template variable. With the second-call path gone, it has no purpose.
ContentGeneratorProtocol thins (and renames)Two methods deleted, two methods renamed. The _direct suffix existed only to disambiguate from the deleted _text_then_object variants — once those are gone, the suffix is dead weight.
- def make_object_direct(...) -> Coroutine[..., BaseModelTypeVar]: ... + def make_object(...) -> Coroutine[..., BaseModelTypeVar]: ... - def make_text_then_object( - self, ..., - llm_prompt_factory_for_object: LLMPromptFactoryAbstract | None = None, - ) -> Coroutine[..., BaseModelTypeVar]: ... - def make_object_list_direct(...) -> Coroutine[..., list[BaseModelTypeVar]]: ... + def make_object_list(...) -> Coroutine[..., list[BaseModelTypeVar]]: ... - def make_text_then_object_list(...) -> Coroutine[..., list[BaseModelTypeVar]]: ...
Renames propagated through all four implementations (ContentGenerator, ContentGeneratorDry, ContentGeneratorChild, ContentGeneratorTop), the call sites in PipeLLM and WfTestContentGeneratorChild, the docstring references in temporal_data_converter.py, and the matching test methods.
AssignmentType Union shrinksAssignmentType = Union[ LLMAssignment, ObjectAssignment, - TextThenObjectAssignment, ImgGenAssignment, ]
workflow_list=[ WfMakeObject, WfMakeLLMText, - WfMakeTextThenObject, WfMakeObjectList, - WfMakeTextThenObjectList, WfMakeImages, WfMakeExtract, WfMakeJinja2Text, WfRenderPageViews, ],
preliminary_text stops being specialBefore: 5 blueprint files filtered both preliminary_text and place_holder out of "required variables", because the runtime would inject preliminary_text on its own. After: only place_holder is filtered — preliminary_text is now a regular variable name (no special semantics).
-if not root.startswith("_") and root not in {"preliminary_text", "place_holder"}: +if not root.startswith("_") and root != "place_holder":
-class StructureConfig(ConfigModel): - is_default_text_then_structure: bool class Pipelex(ConfigModel): ... - structure_config: StructureConfig prompting_config: PromptingConfig
-structure_from_preliminary_text_system = """ ... """ -structure_from_preliminary_text_user = """ ... """ -output_structure_prompt = """ ... (the variant that paired with the preliminary text) """ -output_structure_prompt_no_preliminary_text = """ ... """ +output_structure_prompt = """ ... (the surviving variant — renamed) """
| Test | Action |
|---|---|
tests/integration/.../pipe_llm/test_pipe_llm.py | Removed the StructuringMethod.PRELIMINARY_TEXT parametrize entry. Added test_pipe_llm_preliminary_text_raises_not_implemented — asserts that running a PipeLLM with structuring_method=PRELIMINARY_TEXT raises NotImplementedError. |
parallel_text_analysis.mthds | Removed structuring_method = "preliminary_text" from two pipes (the third occurrence is inside a prompt string, not a config setting — left as-is). |
test_assignment_models_schema.py + test_assignment_models_security.py | Kept; ported the ObjectAssignment-specific assertions (per the project's "security perimeter tests" rule). Dropped the TextThenObjectAssignment cases. |
test_tprl_content_generator_top.py | Dropped test_tprl_make_text_then_object + test_tprl_make_text_then_object_list. The other 5 methods on TestTprlCrafterTop stay. |
wf_test_content_generator_child.py | Dropped the two make_text_then_object / make_text_then_object_list call blocks. The surrounding test workflow (make_llm_text, make_object, make_object_list, templating, extract) survives. |
test_llm_prompt_blueprint.py, test_pipe_llm_blueprint.py, test_construct_blueprint.py | The "filters preliminary_text" assertions updated to reflect that preliminary_text is no longer special — only place_holder is filtered. |
make agent-check — clean (Ruff, Pyright, mypy, plxt). 0 errors.make tb (boot test) — passes. Confirms StructureConfig removal didn't break config loading.tests/unit/pipelex/pipe_operators/ + tests/unit/pipelex/cogt/content_generation/ + tests/integration/pipelex/pipes/ — 435 passed, 2 skipped, 1 xfailed (pre-existing).tests/integration/pipelex/temporal/ — 122 passed, 1 deselected.Known flaky test under -n auto: test_wf_pipe_batch.py::TestWfPipeBatch::test_batch_sequence_via_temporal[isolated] intermittently fails with KajsonDecoderError: Class 'temporal_batch_test__Topic' not found in module 'builtins' or global registry when run in parallel. Passes in isolation and passes serially. This appears to be a class-registry race when sibling pytest-xdist workers reload bundles. Marked as xfail(strict=False) in this PR — to be investigated as a separate issue. Same root cause is independent of this refactor.
StructuringMethod.PRELIMINARY_TEXT enum value remains; pipes can still set it. Whatever replaces this is going to look completely different.collapse-content-generation-workflow-layer refactor (see wip/temporal-primitives/collapse-content-generation-workflow-layer.md). That refactor's delete list overlapped with this one (WfMakeTextThenObject*, ContentGenerator{Child,Top}.make_text_then_object*); those are now already removed, shrinking its scope. The remaining WfMake* wrappers (WfMakeObject, WfMakeObjectList, WfMakeLLMText, …) are still there, untouched.test_wf_pipe_batch.py::test_batch_sequence_via_temporal[isolated] when run with pytest-xdist.The next step on this branch is the collapse-content-generation-workflow-layer refactor (see wip/temporal-primitives/collapse-content-generation-workflow-layer-v2.md): delete the WfMake* child-workflow wrappers entirely and have the in-workflow content generator call workflow.execute_activity(act_*, …) directly from inside WfPipeRouter. That cuts ≈12 history events per content-generation call down to 3, while keeping the per-LLM-call activity boundary (so durability is preserved). This text-then-object cleanup was the prerequisite: WfMakeTextThenObject / WfMakeTextThenObjectList were the only WfMake* workflows that ran two activities with non-trivial Python glue between them — exactly the case that would have made the collapse risky to do in a single shot. With them gone, every surviving WfMake* is now a structurally identical single-activity wrapper, and the collapse becomes pure mechanical deletion.