todo(3): open
This commit is contained in:
parent
908f59f5a0
commit
1d672ff8d9
1 changed files with 15 additions and 0 deletions
15
TODO
15
TODO
|
|
@ -46,3 +46,18 @@ Description: Add the supervised fine-tuner: render trajectories through the
|
||||||
tokenizer's own chat template (matching serving), canonicalise
|
tokenizer's own chat template (matching serving), canonicalise
|
||||||
turns (fold system, merge consecutive), derive an assistant-only
|
turns (fold system, merge consecutive), derive an assistant-only
|
||||||
loss mask by token-prefix differencing, and train a QLoRA adapter.
|
loss mask by token-prefix differencing, and train a QLoRA adapter.
|
||||||
|
|
||||||
|
--ISSUE
|
||||||
|
Content-Type: application/issue
|
||||||
|
ID: 3
|
||||||
|
Type: feature
|
||||||
|
Title: Behavioural evaluator
|
||||||
|
Status: open
|
||||||
|
Priority: medium
|
||||||
|
Created: 2026-06-16
|
||||||
|
Module: sekft
|
||||||
|
Relationships:
|
||||||
|
Description: Add the behavioural eval: load base plus LoRA adapter, drop it into
|
||||||
|
held-out scenarios with no scaffold, drive them through a local
|
||||||
|
operator that renders with the model's chat template, and report
|
||||||
|
reach/terminate/checker rates.
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue