$ conda run -n Engi python verification/evaluator.py baseline/solution.py
{"combined_score": 0.4607170813812293, "valid": 1.0, "mean_candidate_loss": 154.12792152124075, "mean_baseline_loss": 154.12792152124075, "mean_improvement_ratio": 0.0, "total_candidate_sim_calls": 96.0, "cases_evaluated": 4.0}

$ conda run -n Engi python -m frontier_eval task=unified task.benchmark=AdditiveManufacturing/DiffSimThermalControl task.runtime.conda_env=Engi algorithm.iterations=0
Pass: unified loaded the benchmark and evaluated the initial program with combined_score=0.4607170813812293, valid=1.0, benchmark_returncode=0.0
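
The baseline run above reports a zero improvement ratio because the candidate and baseline losses are identical. The following is a minimal sketch of how the evaluator's JSON line could be sanity-checked, not the evaluator's own logic. The `relative_improvement` helper and its formula are assumptions made for illustration; the log does not show how `mean_improvement_ratio` or `combined_score` is actually computed, and a mean of per-case ratios can differ from a ratio of means when the losses are not equal.

```python
import json

# JSON line copied verbatim from the evaluator output in the log above.
raw = (
    '{"combined_score": 0.4607170813812293, "valid": 1.0, '
    '"mean_candidate_loss": 154.12792152124075, '
    '"mean_baseline_loss": 154.12792152124075, '
    '"mean_improvement_ratio": 0.0, '
    '"total_candidate_sim_calls": 96.0, "cases_evaluated": 4.0}'
)

metrics = json.loads(raw)


def relative_improvement(baseline_loss, candidate_loss):
    """Hypothetical definition (assumption): fraction of baseline loss removed."""
    return (baseline_loss - candidate_loss) / baseline_loss


# The run counts as valid when the evaluator reports valid == 1.0.
is_valid = metrics["valid"] == 1.0

# With identical losses, the assumed formula gives exactly 0.0,
# which agrees with the reported mean_improvement_ratio.
ratio = relative_improvement(
    metrics["mean_baseline_loss"], metrics["mean_candidate_loss"]
)

# Average number of simulator calls per evaluated case.
calls_per_case = metrics["total_candidate_sim_calls"] / metrics["cases_evaluated"]

print(f"valid={is_valid} ratio={ratio} calls_per_case={calls_per_case}")
```

A check of this kind only confirms that the reported fields are internally consistent for the baseline. It does not reproduce `combined_score`, whose definition is not visible in the log.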
