release(1.0.0): add changelog

First release: the trainer (assistant-only mask, train=serve render), the three-source data loader (raw dir / curated jsonl / Hub), eval, resident, and structured logging. Packaged, typed, [gpu] extra, depends on posix-sdc[hub].
2026-06-18 00:57:54 +02:00 · 2026-06-18 00:57:54 +02:00 · 7856ef6032
commit 7856ef6032
parent 34bfe675c4
1 changed files with 41 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -0,0 +1,41 @@
+# Changelog
+
+All notable changes to sekft, the shell-operator SFT trainer behind the
+[posix-sdc](https://huggingface.co/datasets/tiararodney/posix-sdc) experiment,
+are documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.0.0] - 2026-06-18
+
+First release: the training and evaluation pipeline that turns posix-sdc
+trajectories into a fine-tuned shell operator.
+
+### Added
+- `sekft-train`: LoRA / QLoRA supervised fine-tuning of a base model on
+  shell-operation trajectories, with an **assistant-only loss mask** derived by
+  token-prefix differencing — the commands and the terminal `exit` / `panic`
+  token are trained; the environment turns (orientation, prompts, command
+  output) are masked to `-100`. The render uses the tokenizer's own
+  `apply_chat_template`, so training matches what the serving harness sends
+  (train = serve), with `normalize_for_template` canonicalising trajectories for
+  instruct templates that have no system role and require strict user/assistant
+  alternation.
+- Three sources of training data: a directory of raw rollout `.json`
+  (keep-filtered), a curated `.jsonl` corpus, or the published posix-sdc corpus
+  over the Hugging Face Hub (`--hub`).
+- `--inspect` for mask and token statistics without training, and structured
+  stderr logging across every phase (`-v` / `-q`): per-trajectory and progress
+  lines while the corpus is tokenized, dataset accounting that warns on dropped
+  (over-length / empty-mask) trajectories, and the per-step training curve.
+- `sekft-eval`: behavioural evaluation that drops the tuned model into held-out
+  scenarios with no scaffold and scores whether it operates and terminates.
+- `sekft-resident`: a resident-base harness that loads the base model once and
+  fits several adapters without reloading, for paired / STaR-style runs.
+- Packaging: the `tiararodney.sekft` namespace package with `sekft-train`,
+  `sekft-eval`, and `sekft-resident` console scripts; a typed (`py.typed`),
+  mypy-strict codebase; an optional `[gpu]` extra (torch / transformers / peft);
+  and a dependency on `posix-sdc[hub]`. Released under GPL-2.0.
+
+[1.0.0]: https://git.code.tiararodney.com/tiara/sekft/releases/tag/v1.0.0