import data
automatically generate training data from your RAG data source (pinecone, turbopuffer, chroma) or your agent traces in observability tools (braintrust, langfuse, langsmith). no manual labeling required.
from trainer.qa_generation.cgft_pipeline import CgftPipeline
from trainer.corpus.turbopuffer.source import TpufChunkSource

# supports turbopuffer, pinecone, chroma, postgres
source = TpufChunkSource(
    namespace="legal-koblex",
    api_key="tpuf_...",
)
# note: CgftPipelineConfig and PlatformConfig also need importing;
# their module path is not shown in this snippet
cfg = CgftPipelineConfig(
    platform=PlatformConfig(api_key="sk_..."),
)
rag_dataset = CgftPipeline(cfg, source_factory=lambda _: source).run()

from trainer.traces import TracesPipeline, PivotConfig

# supports braintrust, langfuse, langsmith
pipeline = TracesPipeline(
    traces=traces,  # traces exported from your observability tool (loading not shown)
    min_completion_chars=40,
    tool_relay=0.8,  # drop tool-echo turns
    dedup=0.9,  # drop near-duplicate completions
    pivot=PivotConfig(llm_client=..., model=..., threshold=0.6),  # llm counterfactual filter
    target_examples=1000,
    output_dir="outputs/agentic",
)
result = pipeline.run()

generate training examples
source: Turbopuffer · koblex · 14,820 chunks
corpus summary: Korean statutes and enforcement decrees spanning civil, criminal, administrative, tax, and election law.
question mix: single-hop, multi-hop, comparison
example prompts
| prompt | ground truth |
|---|---|
| Under the Public Official Election Act, when may a Si/Gun/Gu election commission directly appoint a party-recommended member? | Within 24 hours of a delayed recommendation list under Art. 52, where the political party has not timely submitted nominees to the commission [Source: PUBLIC OFFICIAL ELECTION ACT]. |
| When are customs duties waived for re-imported goods, and what rules apply to goods exported only for testing? | Re-imported within two years, or exported solely for overseas testing and research, qualify for waiver; separate rules apply to returned containers [Source: CUSTOMS ACT]. |
| Under the Act on Prevention of Contagious Animal Diseases, can subsidized livestock owners also receive mental-health treatment? | Yes: post-destruction hardship subsidies include mental-health treatment where prescribed by Presidential Decree [Source: ACT ON THE PREVENTION OF CONTAGIOUS ANIMAL DISEASES]. |
| Which provision in the Framework Act on Environmental Policy assigns general responsibility to the polluter? | Article 7: the polluter is generally responsible for preventing damage and restoring the environment, subject to exceptions under relevant statutes [Source: FRAMEWORK ACT ON ENVIRONMENTAL POLICY]. |
build training dataset
source: Braintrust · korea-law-agent-prod · 584 traces
response mix: read-only tool call, mutating tool call, text response
filters: drop tool-echo turns (threshold 0.8); dedup near-duplicate completions (threshold 0.9); pivot filter, llm counterfactual (threshold 0.6)
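to make the near-duplicate filter concrete, here is a minimal sketch of greedy deduplication at a 0.9 similarity threshold. it is not the TracesPipeline implementation: the function name `dedup_near_duplicates`, the character-level similarity from Python's `difflib`, and the sample completions are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def dedup_near_duplicates(completions, threshold=0.9):
    """Greedy filter: keep a completion only if its similarity to every
    already-kept completion is below the threshold (hypothetical helper)."""
    kept = []
    for text in completions:
        is_duplicate = any(
            SequenceMatcher(None, text, seen).ratio() >= threshold
            for seen in kept
        )
        if not is_duplicate:
            kept.append(text)
    return kept


# illustrative completions, not real trace data
completions = [
    "Retroactive effect applies once Person B ratifies.",
    "Retroactive effect applies once Person B ratifies!",
    "Person B may seek an injunction and damages.",
]
unique = dedup_near_duplicates(completions)
print(len(unique), "of", len(completions), "completions kept")
```

the pairwise comparison is quadratic in the number of completions, which is fine for a few hundred traces; a larger corpus would likely want hashing or embedding-based bucketing instead.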
| prompt | trajectory |
|---|---|
| Person A and Person B agreed to resolve their dispute through civil mediation. Person B's representative lacked documentation. If Person B later ratifies, does it have retroactive effect? | search_statutes("civil mediation ratification") → read_article("Civil Procedure Act Art. 220") → respond("retroactive effect applies…") [Source: CIVIL PROCEDURE ACT] |
| Person A occupied Person B's land as a warehouse without authorization, knowing it was vacant with no usage agreement. What remedies does Person B have? | search_statutes("unauthorized land use") → read_article("Civil Act Art. 213-214") → respond("injunction + damages under Art. 214…") [Source: CIVIL ACT] |
| Person A applied for a real-estate permit in City B's jurisdiction. City B denied, citing environmental protection. On what grounds can Person A challenge the denial? | search_statutes("permit denial environmental") → read_article("Administrative Litigation Act Art. 27") → respond("challenge on proportionality grounds…") [Source: ADMINISTRATIVE LITIGATION ACT] |
| Person A, a friend of incarcerated Person B, sent a drone delivering a letter and alcohol to the correctional facility. What offenses may Person A face? | search_statutes("contraband drone inmate") → read_article("Act on Execution of Sentences & Aviation Safety Act") → respond("contraband delivery + aviation violation…") [Source: AVIATION SAFETY ACT] |
define tools & rewards
plug in the same tools your agent uses in production, and encode the real-world quality bar with composable rewards.
from benchmax.envs.base_env import BaseEnv


class SearchEnv(BaseEnv):
    system_prompt = """you are a retrieval agent..."""

    async def run_tool(self, name, **args):
        # dispatch on the tool name the model requested
        if name == "search_corpus":
            return self.corpus.search(args)

    async def compute_reward(
        self, rollout_id, completion, ground_truth
    ):
        # one score per reward component; the compute_* helpers
        # are user-defined scorers (not shown here)
        return {
            "correctness": await compute_correctness(),
            "conciseness": await compute_conciseness(),
            "citation": await compute_citation(),
            "tool_call_efficiency": await compute_tool_call_efficiency(),
        }
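the snippet leaves the compute_* helpers undefined. as one possible shape for a citation scorer, the sketch below checks whether a completion cites the same `[Source: ...]` tags that appear in the ground truths above. `citation_score` and `extract_sources` are hypothetical names, and matching on the tag text alone is an assumption, not how the library's own scorer is known to work.

```python
import re

# matches tags such as "[Source: CIVIL ACT]"
SOURCE_TAG = re.compile(r"\[Source:\s*([^\]]+)\]")


def extract_sources(text):
    """Return the cited statute names, upper-cased with whitespace collapsed."""
    return {" ".join(name.split()).upper() for name in SOURCE_TAG.findall(text)}


def citation_score(completion, ground_truth):
    """Fraction of ground-truth sources that the completion also cites."""
    expected = extract_sources(ground_truth)
    if not expected:
        return 0.0  # nothing to check against
    return len(expected & extract_sources(completion)) / len(expected)


ground_truth = (
    "Article 7 assigns general responsibility to the polluter "
    "[Source: FRAMEWORK ACT ON ENVIRONMENTAL POLICY]."
)
cited = "The polluter is responsible [Source: Framework Act on Environmental Policy]."
uncited = "The polluter is responsible under Article 7."

print(citation_score(cited, ground_truth), citation_score(uncited, ground_truth))
```

a scorer like this only verifies that the right statute is named; whether the cited article actually supports the claim would need a separate correctness check.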
environment
🛠️ tools: search_corpus
🏆 rewards: correctness, conciseness, citation, tc_efficiency
agent
system_prompt: you are a retrieval agent.
👤 task: Under the Civil Act, when does the statute of limitations begin for tort claims?
executing tool (1 tool call): 3-year limit from when the victim learned of the damage and the offender's identity · Civil Act Art. 766
💬 answer
🧠 learning from feedback
📊 score: correctness 0.80, conciseness 0.40, citation 0.20, tc_efficiency 0.65
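per-component scores like the ones shown above usually need to be collapsed into a single scalar before they can drive training. the sketch below does this with a weighted average; the `combine_rewards` helper and the weights are illustrative assumptions, not values taken from the product.

```python
def combine_rewards(scores, weights):
    """Weighted average of named reward components (hypothetical helper)."""
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing reward components: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# scores from the example rollout above
scores = {
    "correctness": 0.80,
    "conciseness": 0.40,
    "citation": 0.20,
    "tool_call_efficiency": 0.65,
}
# assumed weights: correctness and citation matter most for legal answers
weights = {
    "correctness": 0.5,
    "conciseness": 0.1,
    "citation": 0.3,
    "tool_call_efficiency": 0.1,
}
total = combine_rewards(scores, weights)
print(f"combined reward: {total:.3f}")
```

keeping the components separate until this last step makes it easier to see which one is responsible when the combined reward moves.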
monitor training
watch reward curves climb in real time and drill into any rollout. catch reward hacking and regressions early.
run: rag-legal-koblex · pending · step 0/120
[charts: average reward (0.00 to 0.80), response lengths (1800 to 3600), max reward (0.00 to 1.00), solve rate (0.00 to 1.00); x-axis: step 0 to 120]
rollouts: prompt · preview · step
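the four charts can be read as per-step aggregates over that step's rollouts. a minimal sketch of such an aggregation follows; the rollout record layout, the `summarize_step` helper, and the definition of solve rate (share of rollouts whose reward reaches a threshold) are assumptions, since the page does not define them.

```python
from statistics import mean


def summarize_step(rollouts, solve_threshold=1.0):
    """Aggregate one training step's rollouts into dashboard-style metrics."""
    rewards = [r["reward"] for r in rollouts]
    return {
        "average_reward": mean(rewards),
        "max_reward": max(rewards),
        # assumed definition: a rollout is "solved" at reward >= solve_threshold
        "solve_rate": sum(x >= solve_threshold for x in rewards) / len(rewards),
        "mean_response_length": mean(r["response_length"] for r in rollouts),
    }


# illustrative rollouts, not real training data
rollouts = [
    {"reward": 1.0, "response_length": 2400},
    {"reward": 0.5, "response_length": 3000},
    {"reward": 0.0, "response_length": 3600},
    {"reward": 1.0, "response_length": 1800},
]
summary = summarize_step(rollouts)
print(summary)
```

tracking response length next to reward is one way to spot reward hacking early: a reward curve that climbs only while responses balloon deserves a closer look at individual rollouts.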
destroy gpt
achieve frontier model performance at a fraction of the cost with a model you own and control.
finetuned 4b 🏎️ vs. gpt-5.4 🚜
[interactive race comparing cost (¢) and time (s)]