diff --git a/CHANGELOG.md b/CHANGELOG.md new file mode 100644 index 0000000..4408334 --- /dev/null +++ b/CHANGELOG.md @@ -0,0 +1,41 @@ +# Changelog + +All notable changes to sekft, the shell-operator SFT trainer behind the +[posix-sdc](https://huggingface.co/datasets/tiararodney/posix-sdc) experiment, +are documented in this file. + +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), +and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html). + +## [1.0.0] - 2026-06-18 + +First release: the training and evaluation pipeline that turns posix-sdc +trajectories into a fine-tuned shell operator. + +### Added +- `sekft-train`: LoRA / QLoRA supervised fine-tuning of a base model on + shell-operation trajectories, with an **assistant-only loss mask** derived by + token-prefix differencing — the commands and the terminal `exit` / `panic` + token are trained; the environment turns (orientation, prompts, command + output) are masked to `-100`. The render uses the tokenizer's own + `apply_chat_template`, so training matches what the serving harness sends + (train = serve), with `normalize_for_template` canonicalising trajectories for + instruct templates that have no system role and require strict user/assistant + alternation. +- Three sources of training data: a directory of raw rollout `.json` + (keep-filtered), a curated `.jsonl` corpus, or the published posix-sdc corpus + over the Hugging Face Hub (`--hub`). +- `--inspect` for mask and token statistics without training, and structured + stderr logging across every phase (`-v` / `-q`): per-trajectory and progress + lines while the corpus is tokenized, dataset accounting that warns on dropped + (over-length / empty-mask) trajectories, and the per-step training curve. +- `sekft-eval`: behavioural evaluation that drops the tuned model into held-out + scenarios with no scaffold and scores whether it operates and terminates. +- `sekft-resident`: a resident-base harness that loads the base model once and + fits several adapters without reloading, for paired / STaR-style runs. +- Packaging: the `tiararodney.sekft` namespace package with `sekft-train`, + `sekft-eval`, and `sekft-resident` console scripts; a typed (`py.typed`), + mypy-strict codebase; an optional `[gpu]` extra (torch / transformers / peft); + and a dependency on `posix-sdc[hub]`. Released under GPL-2.0. + +[1.0.0]: https://git.code.tiararodney.com/tiara/sekft/releases/tag/v1.0.0