The trainer was nearly silent: outside an example count and a save line it
printed nothing through tokenizer load, the base-model load, example building, or
the training loop, and trajectories dropped for length or empty mask vanished
without a trace.
Add _log.py (a shared stderr logging setup so stdout stays clean for results)
and a module logger. sekft-train gains -v/--verbose and -q/--quiet. Log the run
config and each phase; report dataset accounting (keepers -> usable, with counts
dropped for over-length and empty-mask and a warning when any are dropped);
inside the build loop, a per-trajectory debug line and a progress line every 100;
raise transformers' verbosity during training so the per-step curve shows. Prints
in train() and inspect() are routed through the logger.
sekft-train logs run config and each phase (tokenizer load, model load, training, save) via a module logger to stderr; -v/--verbose and -q/--quiet control level; dataset accounting reports keepers->usable with counts dropped for length and empty-mask and warns when any are dropped; transformers verbosity raised so the per-step curve shows during training; inspect() logs likewise; existing sft tests stay green; mypy strict clean.
mypy --strict src tests passes (no issues, 6 files); ML/posix-sdc imports ignored via overrides; sft/eval/resident/tests fully annotated; mypy is a declared dev dep; py.typed ships in the package.
README describes the trainer/eval/resident, the posix-sdc dependency, install via the gpu extra, the chat-template render contract, and console-script usage; module docstrings reference sekft-train/eval/resident and apply_chat_template, not the old flat scripts.
tests/unit and tests/smoke pass under pytest; the mask test proves assistant-only training and raises on non-additive templates; entry-point smoke tests pass without torch.
Resident loads the base once; fit trains an adapter and unloads it; evaluate attaches an adapter (or the base baseline) and renders via the shared chat-template canonicalisation.
eval reuses the posix-sdc rollout with a local operator; renders via apply_chat_template + normalize_for_template; reports command-mode, terminate, and verified rates over held-out scenarios.
sft renders via apply_chat_template; normalize_for_template folds system and merges consecutive turns; the loss mask trains assistant turns only and raises on a non-additive template; --inspect reports mask stats.