Reasoning System

Plan–Act–Verify Biomedical Reasoning

An agentic biomedical QA system that answers medical reasoning problems through planning, evidence retrieval, Tool Fact curation, verification, and final answer synthesis.

project credential

First Author, MIDI 2025

Published accuracy

0.69564

fine-tuned GPT-4.1 + tools

Passes

plan then answer/verify

Tool families

FDA, DailyMed, RxNav, more

overview

When OpenRouter and ToolUniverse are available, the backend runs the original biomedical pipeline live. In local recorded mode, it displays a trace from the original submission CSV, including the real plan, tool calls, Tool Facts, and final answer.

role

Portfolio integration: converted the original benchmark pipeline and submission artifacts into a readable display of the question, plan, tool retrieval, verification, and answer.

backend runner

Run original project workflow

Calls /api/biomedical/run on the FastAPI wrapper at https://ruizelab-api.onrender.com.
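A client call to this endpoint might look roughly like the sketch below. The base URL and path come from the text above; the HTTP method (POST) and the payload field names (question, choices, live) are assumptions for illustration, not the documented request schema.

```python
import json
import urllib.request

API_BASE = "https://ruizelab-api.onrender.com"  # from the page text


def build_run_request(question, choices, live=False):
    """Build (but do not send) a request to /api/biomedical/run.

    ASSUMPTION: the method and the payload keys are hypothetical.
    """
    payload = {"question": question, "choices": choices, "live": live}
    return urllib.request.Request(
        url=f"{API_BASE}/api/biomedical/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(question, choices, live=False, timeout=60):
    """Send the request and decode the JSON response body."""
    req = build_run_request(question, choices, live)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping request construction separate from the network call makes the payload easy to inspect before anything is sent.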

question and choices

Choices: A, B, C, D

Attempt live pipeline if backend credentials are configured

backend contract

Start the backend with uvicorn backend.main:app --reload --port 8000. The UI displays mode and provenance so recorded artifacts are clearly distinguished from live computation.
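One way the mode and provenance labels could be derived is sketched below. The environment variable name OPENROUTER_API_KEY, the tools_available flag, the RunInfo type, and the provenance strings are all assumptions made for this example, not the backend's actual fields.

```python
import os
from dataclasses import dataclass


@dataclass
class RunInfo:
    mode: str        # "live" or "recorded"
    provenance: str  # human-readable origin of the displayed result


def choose_mode(want_live, tools_available, env=None):
    """Pick live mode only when requested and all dependencies are present.

    ASSUMPTION: credentials are read from OPENROUTER_API_KEY (hypothetical).
    """
    env = os.environ if env is None else env
    has_key = bool(env.get("OPENROUTER_API_KEY"))
    if want_live and has_key and tools_available:
        return RunInfo("live", "computed by the wrapped pipeline for this request")
    return RunInfo("recorded", "replayed from the original submission CSV")
```

Falling back to recorded mode whenever any dependency is missing keeps the UI usable without credentials, and the provenance string keeps the two cases visibly distinct.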

architecture flow

Agent and model flow

The live pipeline trace appears in the backend runner after execution. This section shows the original project components that the backend wraps.

planner

GPT5Model.plan

plan JSON

Analyzes the stem and choices, extracts keywords, selects facts needed, and proposes biomedical tools.
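As an illustration of what a plan JSON could carry, the sketch below parses one into a typed object. The key names (keywords, facts_needed, tools) mirror the description above but are assumptions, as is the Plan type; the real schema produced by GPT5Model.plan may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class Plan:
    keywords: list      # terms extracted from the stem and choices
    facts_needed: list  # facts the answer pass will need
    tools: list         # names of proposed biomedical tools


def parse_plan(raw):
    """Parse a plan JSON string, tolerating missing keys.

    ASSUMPTION: key names are hypothetical, chosen to match the prose.
    """
    data = json.loads(raw)
    return Plan(
        keywords=list(data.get("keywords", [])),
        facts_needed=list(data.get("facts_needed", [])),
        tools=list(data.get("tools", [])),
    )
```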

retrieval

ToolAgent.collect

tool calls

Runs curated biomedical tools and records success/failure traces for evidence gathering.
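The success/failure bookkeeping described here can be sketched as follows. This is not the ToolAgent.collect implementation; the ToolTrace type and the convention that each tool is a plain callable taking a query string are assumptions for the example.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToolTrace:
    tool: str
    ok: bool
    result: Any = None
    error: Optional[str] = None


def collect(tools, query):
    """Run every tool on the query and record one trace per tool.

    ASSUMPTION: `tools` maps a tool name to a callable(query) -> result.
    A failing tool is recorded rather than aborting the whole run.
    """
    traces = []
    for name, fn in tools.items():
        try:
            traces.append(ToolTrace(name, True, result=fn(query)))
        except Exception as exc:
            message = f"{type(exc).__name__}: {exc}"
            traces.append(ToolTrace(name, False, error=message))
    return traces
```

Recording failures alongside successes lets later stages keep only successful facts while the UI can still show which tools were attempted.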

filter

Tool Fact Curator

10 facts max

Filters, deduplicates, clips, and diversifies successful facts before the answer pass.
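The four curation steps can be sketched as below. The 10-fact cap comes from the text above; the 300-character clip length, the case-insensitive duplicate test, and the round-robin-by-tool diversification are assumptions chosen for illustration, not the curator's actual rules.

```python
def curate_facts(traces, max_facts=10, max_chars=300):
    """Filter, clip, deduplicate, and diversify tool facts.

    `traces` is a list of (tool_name, ok, text) tuples.
    ASSUMPTION: max_chars and the round-robin strategy are hypothetical.
    """
    seen = set()
    by_tool = {}
    for tool, ok, text in traces:
        if not ok or not text or not text.strip():
            continue  # filter: keep successful, non-empty facts only
        clipped = " ".join(text.split())[:max_chars]  # normalize spaces, clip
        key = clipped.lower()
        if key in seen:
            continue  # deduplicate, ignoring case and spacing
        seen.add(key)
        by_tool.setdefault(tool, []).append(clipped)

    # Diversify: take one fact per tool per round until the cap is reached.
    curated = []
    queues = list(by_tool.items())
    while queues and len(curated) < max_facts:
        next_round = []
        for tool, facts in queues:
            if len(curated) >= max_facts:
                break
            curated.append((tool, facts.pop(0)))
            if facts:
                next_round.append((tool, facts))
        queues = next_round
    return curated
```

Round-robin selection prevents a single verbose tool from filling the whole fact budget.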

verifier

Pass-2 Answer Prompt

Final answer

Combines prior analysis, curated facts, and the full MCQ to produce one final answer letter.
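A minimal sketch of the second pass is shown below: it assembles the three inputs into one prompt and extracts a single letter from the model's reply. The prompt wording, the "Final answer:" reply convention, and both function names are assumptions for this example; the original Pass-2 prompt is not reproduced here.

```python
import re


def build_pass2_prompt(analysis, facts, stem, choices):
    """Combine prior analysis, curated facts, and the full MCQ.

    ASSUMPTION: section titles and the reply format are hypothetical.
    """
    fact_lines = "\n".join(f"- {fact}" for fact in facts) or "- (none)"
    choice_lines = "\n".join(f"{k}. {v}" for k, v in sorted(choices.items()))
    return (
        f"Prior analysis:\n{analysis}\n\n"
        f"Tool Facts:\n{fact_lines}\n\n"
        f"Question:\n{stem}\n{choice_lines}\n\n"
        "Reply with a line of the form 'Final answer: <letter>'."
    )


def extract_answer(reply, valid="ABCD"):
    """Return the single answer letter, or None if none can be parsed."""
    match = re.search(r"final answer:\s*\(?([A-Za-z])\)?", reply, re.IGNORECASE)
    if match is None:
        return None
    letter = match.group(1).upper()
    return letter if letter in valid else None
```

Returning None for an unparseable or out-of-range reply lets the caller decide whether to retry or fall back, rather than silently guessing a letter.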

tools and models

Components behind the demo