todo(10): open

2026-06-17 23:43:08 +02:00 · 2026-06-17 23:43:08 +02:00 · 814261dc56
commit 814261dc56
parent 86df915524
1 changed files with 23 additions and 0 deletions
--- a/23
+++ b/23
@@ -149,3 +149,26 @@ Description: Make the lint env honestly pass: add mypy as a dev dependency,
             ignore_missing_imports for the ML libs, fully annotate
             eval/resident/sft (including the inner operator callables), and
             ship a py.typed marker so the Typing::Typed claim is real.
+
+--ISSUE
+Content-Type: application/issue
+ID: 10
+Type: feature
+Title: structured logging for the trainer (sft)
+Status: open
+Priority: medium
+Created: 2026-06-17
+Module: sekft
+Relationships: 
+Description: The trainer is nearly silent: outside an example count and a save
+             line it prints nothing through tokenizer load, the ~14GB base-model
+             load, example building, and the whole training loop, and
+             trajectories dropped for exceeding --max-len or having an empty
+             loss mask vanish without a trace. Add a small shared logging setup
+             (_log.py, stderr so stdout stays clean for results) and a module
+             logger; give sekft-train -v/--verbose and -q/--quiet. Log the run
+             config and each phase, report dataset accounting (keepers ->
+             usable, with counts dropped for length / empty-mask and a warning
+             when any are dropped), and raise transformers' verbosity during
+             training so the per-step curve shows. Apply to train() and
+             inspect().
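
A minimal sketch of what issue 10 asks for, not the actual implementation. It shows a shared stderr logging setup, mutually exclusive -v/--verbose and -q/--quiet flags, and a dataset accounting line that warns when trajectories are dropped. The logger name "sekft" is taken from the Module field; the helper names (setup_logging, add_verbosity_flags, verbosity_from_args, log_dataset_accounting), the level mapping, and the message wording are all assumptions made for illustration.

```python
import argparse
import logging
import sys

LOGGER_NAME = "sekft"  # assumed: package logger named after the Module field


def setup_logging(verbosity: int = 0) -> logging.Logger:
    """Configure the package logger on stderr so stdout stays clean for results.

    verbosity: -1 for --quiet, 0 for the default, 1 for --verbose (assumed mapping).
    """
    levels = {-1: logging.WARNING, 0: logging.INFO, 1: logging.DEBUG}
    logger = logging.getLogger(LOGGER_NAME)
    logger.setLevel(levels[max(-1, min(1, verbosity))])
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False  # keep records from being emitted twice via root
    return logger


def add_verbosity_flags(parser: argparse.ArgumentParser) -> None:
    """Add -v/--verbose and -q/--quiet; passing both is a usage error."""
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-v", "--verbose", action="store_true")
    group.add_argument("-q", "--quiet", action="store_true")


def verbosity_from_args(args: argparse.Namespace) -> int:
    """Translate the parsed flags into the integer setup_logging expects."""
    if args.verbose:
        return 1
    if args.quiet:
        return -1
    return 0


def log_dataset_accounting(
    logger: logging.Logger,
    keepers: int,
    dropped_length: int,
    dropped_empty_mask: int,
) -> int:
    """Report keepers -> usable and warn if anything was dropped."""
    usable = keepers - dropped_length - dropped_empty_mask
    logger.info("dataset: %d keepers -> %d usable", keepers, usable)
    if dropped_length or dropped_empty_mask:
        logger.warning(
            "dropped %d trajectories (%d over --max-len, %d with empty loss mask)",
            dropped_length + dropped_empty_mask,
            dropped_length,
            dropped_empty_mask,
        )
    return usable


if __name__ == "__main__":
    parser = argparse.ArgumentParser(prog="sekft-train")
    add_verbosity_flags(parser)
    log = setup_logging(verbosity_from_args(parser.parse_args()))
    log.info("phase: building examples")
    log_dataset_accounting(log, keepers=100, dropped_length=3, dropped_empty_mask=1)
```

Because the drop report is a warning, it still appears under --quiet in this sketch, which matches the intent that dropped trajectories should never vanish silently. Raising transformers' own verbosity during training could be done separately with transformers.logging.set_verbosity_info(); that call is left out here so the sketch needs only the standard library.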