overview
The backend wraps the original biomedical pipeline and runs it live when OpenRouter and ToolUniverse are available. In local recorded mode, it instead displays a trace from the original submission CSV, with the real plan, tool calls, Tool Facts, and final answer.
Reasoning System
A CURE-Bench Plan-Act-Verify system that uses a model planner, biomedical tools, curated Tool Facts, and a final answer pass.
Published accuracy: 0.69564 (fine-tuned GPT-4.1 + tools)
Passes: 2 (plan, then answer/verify)
Tool families: 6 (FDA, DailyMed, RxNav, and more)
role
Portfolio integration: converted the original benchmark pipeline and submission artifacts into a readable question, plan, tool retrieval, verification, and answer display.
backend runner
Calls /api/biomedical/run on the FastAPI wrapper at http://localhost:8000.
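A minimal client sketch for that call, using only the Python standard library. The URL and path come from the text above; the POST method and the JSON body shape (`question`, `choices`) are assumptions for illustration, not the backend's confirmed request model.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"   # from the text above
RUN_PATH = "/api/biomedical/run"     # from the text above


def build_run_request(question: str, choices: dict[str, str]) -> urllib.request.Request:
    """Build a JSON POST request for the run endpoint.

    Assumption: the endpoint accepts POST with a body of the form
    {"question": ..., "choices": {"A": ..., "B": ...}}.
    """
    body = json.dumps({"question": question, "choices": choices}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + RUN_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_biomedical(question: str, choices: dict[str, str], timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response (requires a running backend)."""
    request = build_run_request(question, choices)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from the network call makes the payload easy to inspect without a backend running.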
backend contract
Start the backend with uvicorn backend.main:app --reload --port 8000. The UI displays mode and provenance so recorded artifacts are clearly distinguished from live computation.
Configure the inputs and run the backend to display the original project trace and outputs here.
architecture flow
The live pipeline trace appears in the backend runner after execution. This section shows the original project components that the backend wraps.
planner
Analyzes the question stem and choices, extracts keywords, identifies the facts needed, and proposes biomedical tools to call.
retrieval
Runs curated biomedical tools and records success/failure traces for evidence gathering.
filter
Filters, deduplicates, clips, and diversifies successful facts before the answer pass.
verifier
Combines prior analysis, curated facts, and the full MCQ to produce one final answer letter.
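The four stages above can be sketched roughly as follows. This is an illustrative outline, not the original project code: the `ToolFact` type, the limits (`max_facts`, `max_chars`), and the round-robin diversification are assumptions, and the planner, retrieval, and verifier are passed in as plain callables so no model or tool API is implied.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ToolFact:
    tool: str        # name of the tool that produced the fact (hypothetical field)
    text: str        # retrieved evidence text
    ok: bool = True  # False when the tool call failed


def filter_facts(facts: list[ToolFact], max_facts: int = 6, max_chars: int = 400) -> list[ToolFact]:
    """Keep successful facts, deduplicate, clip, then diversify across tools."""
    seen: set[str] = set()
    by_tool: dict[str, list[ToolFact]] = {}
    for fact in facts:
        if not fact.ok:
            continue                                  # filter: drop failed calls
        key = " ".join(fact.text.lower().split())
        if key in seen:
            continue                                  # deduplicate on normalized text
        seen.add(key)
        clipped = ToolFact(fact.tool, fact.text[:max_chars])  # clip long facts
        by_tool.setdefault(fact.tool, []).append(clipped)

    # Diversify: take one fact per tool in turn until the budget is used.
    picked: list[ToolFact] = []
    depth = 0
    while len(picked) < max_facts:
        layer = [queue[depth] for queue in by_tool.values() if depth < len(queue)]
        if not layer:
            break
        picked.extend(layer[: max_facts - len(picked)])
        depth += 1
    return picked


def run_pipeline(
    question: str,
    choices: dict[str, str],
    plan: Callable,      # pass 1: returns (analysis, proposed tool calls)
    retrieve: Callable,  # runs tool calls, returns list[ToolFact] incl. failures
    verify: Callable,    # pass 2: returns one final answer letter
) -> str:
    analysis, tool_calls = plan(question, choices)
    raw_facts = retrieve(tool_calls)
    facts = filter_facts(raw_facts)
    return verify(analysis, facts, question, choices)
```

Round-robin selection is only one way to diversify; it keeps a single chatty tool from crowding out evidence from the others before the answer pass.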
tools and models
example input
A pediatric generalized myasthenia gravis multiple-choice question with drug choices.
final result
The backend returns either a live original pipeline run or a provenance-backed recorded submission row from the original CURE-Bench outputs.
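A hedged sketch of how that choice between live and recorded output might look. The function names, the CSV column names (`id`, `prediction`), and the response keys are hypothetical; only the two modes and the idea of attaching provenance come from the text.

```python
import csv
import io
from typing import Optional


def choose_mode(openrouter_ready: bool, tooluniverse_ready: bool) -> str:
    """Run live only when both dependencies are available; otherwise use recorded data."""
    return "live" if openrouter_ready and tooluniverse_ready else "recorded"


def find_recorded_row(csv_text: str, question_id: str, id_column: str = "id") -> Optional[dict]:
    """Look up one row of a recorded submission CSV by question id.

    Assumption: the CSV has a header row and an id column named `id_column`.
    """
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row.get(id_column) == question_id:
            return row
    return None


def build_response(mode: str, result: dict, source: str) -> dict:
    """Attach mode and provenance so recorded artifacts are not mistaken for live runs."""
    return {"mode": mode, "provenance": source, "result": result}
```

For example, with both dependencies missing, `choose_mode(False, False)` selects the recorded path, and the matching CSV row can be wrapped with `build_response("recorded", row, "submission.csv")`, where the file name is a placeholder.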
limitations