sekft

Author	SHA1	Message	Date
Tiara Rodney	87cfccd54e	todo(16): open	2026-06-18 16:46:10 +02:00
Tiara Rodney	f9913b45c3	todo(15): done _render_ids extracts input_ids from a BatchEncoding (5.x) or passes a list through (4.x); regression test asserts the BatchEncoding path yields the same mask; 10 tests pass; mypy strict clean. End-to-end box verification of the correct mask against Mistral done before this release. No submodule changes.	2026-06-18 12:37:03 +02:00
Tiara Rodney	7853224796	todo(15): in-progress build_masked_example normalises apply_chat_template's BatchEncoding (transformers 5.x) and list[int] (4.x) returns via a _render_ids helper; a BatchEncoding-returning fake tokenizer produces the same mask as the list-returning one (regression test added); existing tests pass; mypy strict clean; verified end-to-end on the box that sekft-train --inspect produces a correct assistant-only mask against Mistral.	2026-06-18 12:34:55 +02:00
Tiara Rodney	01e4d75237	todo(15): open	2026-06-18 12:34:37 +02:00
Tiara Rodney	b4bf90d275	todo(14): done Pipfile.lock pins posix-sdc ==1.2.2 from the private index in default + develop sections with huggingface_hub and the [hub] closure present; committed; tree clean. No submodule changes.	2026-06-18 00:44:38 +02:00
Tiara Rodney	baae4f3631	todo(14): in-progress Pipfile.lock pins tiararodney.posix-sdc ==1.2.2 from the private index in both default and develop sections, with huggingface_hub and its [hub] transitive deps present; committed on develop; tree clean after.	2026-06-18 00:44:25 +02:00
Tiara Rodney	cc26e61e6b	todo(14): open	2026-06-18 00:43:56 +02:00
Tiara Rodney	c3024d3df9	todo(13): done pyproject declares posix-sdc[hub]; Pipfile [packages] = private-index wheel, [dev-packages] = git develop; Pipfile.lock committed alongside. Tree clean. No submodule changes.	2026-06-18 00:18:06 +02:00
Tiara Rodney	b7878251ed	todo(13): in-progress pyproject declares posix-sdc[hub]; Pipfile [packages] pulls posix-sdc from the private index and [dev-packages] from git develop; Pipfile.lock committed and consistent; tree clean after commit.	2026-06-18 00:17:47 +02:00
Tiara Rodney	a37e52aacf	todo(13): open	2026-06-18 00:17:06 +02:00
Tiara Rodney	15e598bda1	todo(12): done sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (lazy import; Hub fallback) - verified the hub path yields all 787 trajectories via the editable checkout's data/; sekft-train gains --hub/--revision and dispatches --data by dir-vs-.jsonl; train()+inspect() use it; 9 tests pass (3 new for raw-dir/jsonl/reject); mypy strict clean (5 files). No submodule changes.	2026-06-18 00:05:36 +02:00
Tiara Rodney	d78a8028d2	todo(12): in-progress sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (Hub fallback), imported lazily; sekft-train gains --hub and --revision and dispatches --data by dir-vs-.jsonl; train() and inspect() use it; unit tests cover the raw-dir and jsonl paths; existing tests stay green; mypy strict clean.	2026-06-18 00:03:19 +02:00
Tiara Rodney	d47ba8a56e	todo(12): open	2026-06-18 00:03:10 +02:00
Tiara Rodney	c6939c0a64	todo(11): done bool() wraps the operate_rate predicate in eval.py and resident.py so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); 6 tests pass.	2026-06-17 23:49:27 +02:00
Tiara Rodney	637b746d1d	todo(11): in-progress operate_rate in eval.py and resident.py wraps the steps>0 and meta.get('clean') predicate in bool() so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); tests pass.	2026-06-17 23:48:29 +02:00
Tiara Rodney	299b2ce488	todo(11): open	2026-06-17 23:48:21 +02:00
Tiara Rodney	7673e47002	todo(10): done sekft-train logs run config and each phase (tokenizer/model load, build, training, save) via a sekft.train logger to stderr; -v/--verbose and -q/--quiet added; dataset accounting reports keepers->usable with over-length/empty-mask drop counts and a warning; build loop logs per-trajectory (debug) and every 100 (info); transformers verbosity raised so the per-step curve shows; inspect() logs likewise; 6 tests pass; mypy strict clean on sft.py + _log.py (verified loop logging fires with FakeTok). No submodule changes.	2026-06-17 23:47:52 +02:00
Tiara Rodney	47b84a0dce	todo(10): in-progress sekft-train logs run config and each phase (tokenizer load, model load, training, save) via a module logger to stderr; -v/--verbose and -q/--quiet control level; dataset accounting reports keepers->usable with counts dropped for length and empty-mask and warns when any are dropped; transformers verbosity raised so the per-step curve shows during training; inspect() logs likewise; existing sft tests stay green; mypy strict clean.	2026-06-17 23:43:21 +02:00
Tiara Rodney	814261dc56	todo(10): open	2026-06-17 23:43:08 +02:00
Tiara Rodney	90e0ebbb45	todo(9): done mypy --strict src tests passes (no issues, 6 files); ML/posix-sdc imports ignored via overrides; sft/eval/resident/tests fully annotated; mypy is a declared dev dep; py.typed ships in the package.	2026-06-17 14:03:54 +02:00
Tiara Rodney	ee2a729438	todo(9): in-progress mypy --strict src tests passes; ML imports are configured to not error; eval/resident/sft are fully annotated; py.typed ships in the package.	2026-06-17 13:53:09 +02:00
Tiara Rodney	4fa082478c	todo(9): open	2026-06-17 13:53:02 +02:00
Tiara Rodney	26f956f25a	todo(8): done README rewritten for the trainer (sft/eval/resident, posix-sdc dependency, gpu extra, render contract, console-script usage); module docstrings updated to sekft-train/eval/resident and apply_chat_template.	2026-06-16 23:49:04 +02:00
Tiara Rodney	9cdb2bdc97	todo(8): in-progress README describes the trainer/eval/resident, the posix-sdc dependency, install via the gpu extra, the chat-template render contract, and console-script usage; module docstrings reference sekft-train/eval/resident and apply_chat_template, not the old flat scripts.	2026-06-16 23:47:43 +02:00
Tiara Rodney	4646d34d9d	todo(8): open	2026-06-16 23:47:41 +02:00
Tiara Rodney	a34ce78dc5	todo(7): done GPL-2.0 LICENSE added with pyproject license-files and GPLv2 classifier; Dockerfile removed (now in posix-sdc docker/alpine-dash).	2026-06-16 20:28:19 +02:00
Tiara Rodney	acaf8dd061	todo(7): in-progress LICENSE holds the GPL-2.0 text; pyproject declares it via license-files and a GPLv2 classifier; the Dockerfile is gone from sekft.	2026-06-16 20:27:51 +02:00
Tiara Rodney	0a03dd0cfa	todo(7): open	2026-06-16 20:27:50 +02:00
Tiara Rodney	95ce275301	todo(6): done 6 tests pass (4 unit render/mask, 2 smoke entry points); mask test covers assistant-only training and the non-additive guard.	2026-06-16 20:15:21 +02:00
Tiara Rodney	45120dea97	todo(6): in-progress tests/unit and tests/smoke pass under pytest; the mask test proves assistant-only training and raises on non-additive templates; entry-point smoke tests pass without torch.	2026-06-16 20:15:18 +02:00
Tiara Rodney	9bd61f99a3	todo(6): open	2026-06-16 20:15:16 +02:00
Tiara Rodney	cf130e032e	todo(5): done README documents the pipeline, the posix-sdc dependency, and the box workflow.	2026-06-16 20:15:13 +02:00
Tiara Rodney	12f10854c3	todo(5): in-progress README describes the trainer/eval/resident roles, the posix-sdc dependency, the chat-template render contract, and the box workflow.	2026-06-16 20:15:11 +02:00
Tiara Rodney	a24c6c9681	todo(5): open	2026-06-16 20:15:09 +02:00
Tiara Rodney	fd30ab47ed	todo(4): done Resident base loads once; fit/evaluate cycle adapters without reloading; renders match the trainer.	2026-06-16 20:14:49 +02:00
Tiara Rodney	e810d0e442	todo(4): in-progress Resident loads the base once; fit trains an adapter and unloads it; evaluate attaches an adapter (or the base baseline) and renders via the shared chat-template canonicalisation.	2026-06-16 20:14:46 +02:00
Tiara Rodney	6f5245cda6	todo(4): open	2026-06-16 20:14:44 +02:00
Tiara Rodney	0dc0380537	todo(3): done Evaluator drives held-out scenarios with a chat-template local operator and reports the behavioural rates.	2026-06-16 20:14:41 +02:00
Tiara Rodney	7ee0d63522	todo(3): in-progress eval reuses the posix-sdc rollout with a local operator; renders via apply_chat_template + normalize_for_template; reports command-mode, terminate, and verified rates over held-out scenarios.	2026-06-16 20:14:38 +02:00
Tiara Rodney	1d672ff8d9	todo(3): open	2026-06-16 20:14:35 +02:00
Tiara Rodney	4e45bacf0b	todo(2): done Trainer renders with the model's chat template, canonicalises turns, masks to assistant-only by token-prefix differencing, and trains a QLoRA adapter.	2026-06-16 20:14:15 +02:00
Tiara Rodney	4533a04021	todo(2): in-progress sft renders via apply_chat_template; normalize_for_template folds system and merges consecutive turns; the loss mask trains assistant turns only and raises on a non-additive template; --inspect reports mask stats.	2026-06-16 20:14:11 +02:00
Tiara Rodney	fab16d175d	todo(2): open	2026-06-16 20:14:09 +02:00
Tiara Rodney	046683a371	todo(1): done Namespace package builds; gpu extra isolates torch/transformers/peft/datasets; posix-sdc declared in pyproject and overridden editable in Pipfile; console scripts wired; tox envs defined.	2026-06-16 20:13:51 +02:00
Tiara Rodney	0ebf19b4ec	todo(1): in-progress tiararodney.sekft imports; sekft-train/eval/resident console scripts resolve; pyproject declares posix-sdc as a dependency with torch et al. behind a gpu extra; Pipfile carries the local editable posix-sdc override; tox envs exist.	2026-06-16 20:13:47 +02:00
Tiara Rodney	c4701e3d7f	todo(1): open	2026-06-16 20:13:45 +02:00
Tiara Rodney	ad027a7d52	chore: initial commit	2026-06-16 20:13:14 +02:00
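
The todo(15) entries describe a _render_ids helper that accepts either return shape of apply_chat_template: a BatchEncoding on transformers 5.x, or a plain list of ids on 4.x. Below is a minimal sketch of that normalisation. It is duck-typed so that it runs without transformers installed. The function name render_ids, its body, and the dict stand-in used to exercise it are illustrative assumptions, not the repository's code.

```python
from collections.abc import Mapping
from typing import Any


def render_ids(rendered: Any) -> list[int]:
    """Return token ids from either shape named in the todo(15) commits.

    Hypothetical helper: a sketch of the idea, not the project's _render_ids.
    """
    # Mapping-like result (the commits name BatchEncoding on transformers 5.x):
    # take the "input_ids" entry.
    if isinstance(rendered, Mapping):
        return [int(i) for i in rendered["input_ids"]]
    # Plain sequence of ints (the 4.x shape): pass it through as a list.
    return [int(i) for i in rendered]
```

With both shapes reduced to a list of ids, the mask-building code sees the same input either way, which is the property the regression test in f9913b45c3 is described as asserting.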

47 commits
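
The todo(11) entries wrap the operate_rate predicate in bool(). The reason is that `steps > 0 and meta.get('clean')` evaluates to None when steps is positive and the 'clean' key is absent, and sum() cannot add None to an int. Below is a minimal sketch of the fixed counting. The trajectory shape (a dict with "steps" and "meta" keys) and the empty-input behaviour are assumptions for illustration, not taken from eval.py or resident.py.

```python
from typing import Any


def operate_rate(trajectories: list[dict[str, Any]]) -> float:
    """Fraction of trajectories that took at least one step and are marked clean.

    Hypothetical data shape: each trajectory is {"steps": int, "meta": dict}.
    """
    if not trajectories:
        return 0.0  # assumed convention for an empty evaluation set
    # bool() turns a missing "clean" key (None) into False, so sum() only
    # ever adds True/False values, which count as 1/0.
    operated = sum(
        bool(t["steps"] > 0 and t["meta"].get("clean")) for t in trajectories
    )
    return operated / len(trajectories)
```

Without the bool() wrapper, a single trajectory with steps > 0 and no 'clean' entry would make the sum raise a TypeError. It would also fail mypy --strict, which matches the commit's note that the package type-checks cleanly after the change.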