todo(3): open
parent 908f59f5a0
commit 1d672ff8d9
1 changed files with 15 additions and 0 deletions
TODO | 15 +++++++++++++++
@@ -46,3 +46,18 @@ Description: Add the supervised fine-tuner: render trajectories through the
 tokenizer's own chat template (matching serving), canonicalise
 turns (fold system, merge consecutive), derive an assistant-only
 loss mask by token-prefix differencing, and train a QLoRA adapter.
+
+--ISSUE
+Content-Type: application/issue
+ID: 3
+Type: feature
+Title: Behavioural evaluator
+Status: open
+Priority: medium
+Created: 2026-06-16
+Module: sekft
+Relationships:
+Description: Add the behavioural eval: load base plus LoRA adapter, drop it into
+held-out scenarios with no scaffold, drive them through a local
+operator that renders with the model's chat template, and report
+reach/terminate/checker rates.
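The reporting step named in the issue description (reach/terminate/checker rates) could be sketched as below. This is only an illustration under stated assumptions: the issue does not define the three rates, so the `EpisodeResult` fields, their meanings, and the `summarize` helper are hypothetical names, not part of the `sekft` module.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class EpisodeResult:
    """Outcome of one held-out scenario (hypothetical record, assumed fields)."""
    reached: bool         # assumed: the episode reached the scenario's target state
    terminated: bool      # assumed: the model ended the episode on its own
    checker_passed: bool  # assumed: the scenario's checker accepted the outcome


def summarize(results: List[EpisodeResult]) -> Dict[str, float]:
    """Return the fraction of episodes for which each flag is true."""
    n = len(results)
    if n == 0:
        # No episodes: report zeros rather than dividing by zero.
        return {"reach": 0.0, "terminate": 0.0, "checker": 0.0}
    return {
        "reach": sum(r.reached for r in results) / n,
        "terminate": sum(r.terminated for r in results) / n,
        "checker": sum(r.checker_passed for r in results) / n,
    }


if __name__ == "__main__":
    episodes = [
        EpisodeResult(True, True, True),
        EpisodeResult(True, True, False),
        EpisodeResult(True, False, False),
        EpisodeResult(False, False, False),
    ]
    for name, rate in summarize(episodes).items():
        print(f"{name}: {rate:.2f}")
```

Keeping the per-episode flags separate from the aggregation would let the evaluator log raw outcomes and recompute rates later if the definitions change.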