todo(10): open

Tiara Rodney 2026-06-17 23:43:08 +02:00
parent 86df915524
commit 814261dc56
Signed by: tiara
GPG key ID: 5CD8EC1D46106723

TODO

@@ -149,3 +149,26 @@ Description: Make the lint env honestly pass: add mypy as a dev dependency,
ignore_missing_imports for the ML libs, fully annotate
eval/resident/sft (including the inner operator callables), and
ship a py.typed marker so the Typing::Typed claim is real.
--ISSUE
Content-Type: application/issue
ID: 10
Type: feature
Title: structured logging for the trainer (sft)
Status: open
Priority: medium
Created: 2026-06-17
Module: sekft
Relationships:
Description: The trainer is nearly silent: outside an example count and a save
line it prints nothing through tokenizer load, the ~14GB base-model
load, example building, and the whole training loop, and
trajectories dropped for exceeding --max-len or having an empty
loss mask vanish without a trace. Add a small shared logging setup
(_log.py, stderr so stdout stays clean for results) and a module
logger; give sekft-train -v/--verbose and -q/--quiet. Log the run
config and each phase, report dataset accounting (keepers ->
usable, with counts dropped for length / empty-mask and a warning
when any are dropped), and raise transformers' verbosity during
training so the per-step curve shows. Apply to train() and
inspect().
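
A minimal sketch of what the shared logging setup described in issue 10 might look like, using only Python's standard logging and argparse modules. The function names (setup_logging, add_verbosity_flags, report_accounting), the logger name "sekft", the message wording, and the level mapping (-q to WARNING, -v to DEBUG, default INFO) are illustrative assumptions, not taken from the commit.

```python
import argparse
import logging
import sys


def setup_logging(verbose: bool = False, quiet: bool = False) -> logging.Logger:
    """Configure a package logger that writes to stderr (hypothetical _log.py)."""
    if quiet:
        level = logging.WARNING
    elif verbose:
        level = logging.DEBUG
    else:
        level = logging.INFO

    logger = logging.getLogger("sekft")  # assumed logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated setup
    handler = logging.StreamHandler(sys.stderr)  # keep stdout clean for results
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def add_verbosity_flags(parser: argparse.ArgumentParser) -> None:
    """Add -v/--verbose and -q/--quiet as mutually exclusive flags."""
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-v", "--verbose", action="store_true",
                       help="log debug detail")
    group.add_argument("-q", "--quiet", action="store_true",
                       help="log warnings and errors only")


def report_accounting(logger: logging.Logger, keepers: int,
                      dropped_length: int, dropped_empty_mask: int) -> int:
    """Log keepers -> usable and warn when any trajectories were dropped."""
    usable = keepers - dropped_length - dropped_empty_mask
    logger.info(
        "dataset: %d keepers -> %d usable (dropped: %d over max-len, "
        "%d empty loss mask)",
        keepers, usable, dropped_length, dropped_empty_mask,
    )
    dropped = dropped_length + dropped_empty_mask
    if dropped:
        logger.warning("%d trajectories dropped before training", dropped)
    return usable


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="sekft-train")
    add_verbosity_flags(parser)
    args = parser.parse_args()
    log = setup_logging(verbose=args.verbose, quiet=args.quiet)
    report_accounting(log, keepers=100, dropped_length=3, dropped_empty_mask=1)
```

Raising transformers' verbosity for the training phase is not shown, to keep the sketch free of third-party imports; one likely option is calling transformers.logging.set_verbosity_info() just before the training loop starts.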