Commit graph

84 commits

Author SHA1 Message Date
Tiara Rodney
f9913b45c3
todo(15): done
_render_ids extracts input_ids from a BatchEncoding (5.x) or passes a list through (4.x); regression test asserts the BatchEncoding path yields the same mask; 10 tests pass; mypy strict clean. End-to-end box verification of the correct mask against Mistral done before this release. No submodule changes.
2026-06-18 12:37:03 +02:00
Tiara Rodney
4987d951ce
bugfix(15): normalise apply_chat_template's BatchEncoding (transformers 5.x)
apply_chat_template returns a BatchEncoding ({input_ids: [...]}) on transformers
>= 5 where 4.x returned a bare list[int]. build_masked_example treated that
dict-like render as a list, so len/slicing were wrong and the prefix-differencing spuriously
raised "chat template is not additive" on every real model. Extract the id
sequence via a _render_ids helper; verified the assistant-only mask against
mistralai/Mistral-7B-Instruct-v0.2. The fake tokenizer returned a bare list and
missed this, so a BatchEncoding-returning variant now guards it.
2026-06-18 12:37:01 +02:00
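A minimal sketch of the normalisation described in bugfix(15), not the project's actual code. It assumes the 5.x return value is dict-like with an "input_ids" key, as the commit message states. The names render_ids, assistant_mask, fake_list and fake_dict are hypothetical, and the additive check is only one plausible reading of "prefix-differencing".

```python
from collections.abc import Callable, Mapping
from typing import Any

Message = dict[str, str]


def render_ids(rendered: Any) -> list[int]:
    """Return the token ids from either render shape."""
    if isinstance(rendered, Mapping):  # dict-like render: {"input_ids": [...]}
        return list(rendered["input_ids"])
    return list(rendered)  # bare list of ids


def assistant_mask(
    render: Callable[[list[Message]], Any], messages: list[Message]
) -> list[int]:
    """Mark tokens added by assistant messages with 1, all others with 0."""
    mask: list[int] = []
    prev: list[int] = []
    for i, msg in enumerate(messages):
        ids = render_ids(render(messages[: i + 1]))
        # Each longer prefix must extend the previous render unchanged.
        if ids[: len(prev)] != prev:
            raise ValueError("chat template is not additive")
        flag = 1 if msg["role"] == "assistant" else 0
        mask.extend([flag] * (len(ids) - len(prev)))
        prev = ids
    return mask


def fake_list(messages: list[Message]) -> list[int]:
    """Hypothetical fake tokenizer: one role token, then one id per character."""
    ids: list[int] = []
    for m in messages:
        ids.append(2 if m["role"] == "assistant" else 1)
        ids.extend(ord(c) for c in m["content"])
    return ids


def fake_dict(messages: list[Message]) -> dict[str, list[int]]:
    """Same ids, wrapped the way a dict-like render would carry them."""
    return {"input_ids": fake_list(messages)}
```

Under these assumptions both fakes produce the same mask for the same conversation, which is the property the regression test in this commit is described as guarding.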
Tiara Rodney
7853224796
todo(15): in-progress
build_masked_example normalises apply_chat_template's BatchEncoding (transformers 5.x) and list[int] (4.x) returns via a _render_ids helper; a BatchEncoding-returning fake tokenizer produces the same mask as the list-returning one (regression test added); existing tests pass; mypy strict clean; verified end-to-end on the box that sekft-train --inspect produces a correct assistant-only mask against Mistral.
2026-06-18 12:34:55 +02:00
Tiara Rodney
01e4d75237
todo(15): open
2026-06-18 12:34:37 +02:00
Tiara Rodney
705b4a028b
Merge branch 'develop'
2026-06-18 00:59:08 +02:00
Tiara Rodney
7856ef6032
release(1.0.0): add changelog
First release: the trainer (assistant-only mask, train=serve render), the
three-source data loader (raw dir / curated jsonl / Hub), eval, resident, and
structured logging. Packaged, typed, [gpu] extra, depends on posix-sdc[hub].
2026-06-18 00:57:54 +02:00
Tiara Rodney
34bfe675c4
Merge branch 'bugfix/14'
bugfix(14): refresh Pipfile.lock against published posix-sdc 1.2.2
2026-06-18 00:44:39 +02:00
Tiara Rodney
b4bf90d275
todo(14): done
Pipfile.lock pins posix-sdc ==1.2.2 from the private index in default + develop sections with huggingface_hub and the [hub] closure present; committed; tree clean. No submodule changes.
2026-06-18 00:44:38 +02:00
Tiara Rodney
44ed08a239
bugfix(14): refresh Pipfile.lock against published posix-sdc 1.2.2
The lock committed with the triplet (#13) predated the published wheel. Now that
posix-sdc 1.2.2 is on the private index, re-lock: it pins ==1.2.2 from the index
in both the default and develop sections, and the [hub] extra pulls
huggingface_hub and its transitive deps into the lock, so the next machine
installs the published wheel with the Hub path available.
2026-06-18 00:44:36 +02:00
Tiara Rodney
baae4f3631
todo(14): in-progress
Pipfile.lock pins tiararodney.posix-sdc ==1.2.2 from the private index in both default and develop sections, with huggingface_hub and its [hub] transitive deps present; committed on develop; tree clean after.
2026-06-18 00:44:25 +02:00
Tiara Rodney
cc26e61e6b
todo(14): open
2026-06-18 00:43:56 +02:00
Tiara Rodney
c295a98748
Merge branch 'feature/13'
feat(13): reference posix-sdc three ways for seamless multi-machine dev
2026-06-18 00:18:06 +02:00
Tiara Rodney
c3024d3df9
todo(13): done
pyproject declares posix-sdc[hub]; Pipfile [packages] = private-index wheel, [dev-packages] = git develop; Pipfile.lock committed alongside. Tree clean. No submodule changes.
2026-06-18 00:18:06 +02:00
Tiara Rodney
74d9793e76
feat(13): reference posix-sdc three ways for seamless multi-machine dev
Wire the posix-sdc dependency as a triplet:
- pyproject declares the abstract posix-sdc[hub], so the trainer's --hub path can
  reach the Hub (huggingface_hub) wherever sekft is installed;
- Pipfile [packages] pulls the published wheel from the private index;
- Pipfile [dev-packages] pulls the git develop branch for develop-time work.

Commit Pipfile.lock so the dependency surface and its lock land together.
2026-06-18 00:17:56 +02:00
Tiara Rodney
b7878251ed
todo(13): in-progress
pyproject declares posix-sdc[hub]; Pipfile [packages] pulls posix-sdc from the private index and [dev-packages] from git develop; Pipfile.lock committed and consistent; tree clean after commit.
2026-06-18 00:17:47 +02:00
Tiara Rodney
a37e52aacf
todo(13): open
2026-06-18 00:17:06 +02:00
Tiara Rodney
b8843557d7
Merge branch 'feature/12'
feat(12): load training data from a raw dir, a curated jsonl, or the Hub
2026-06-18 00:05:36 +02:00
Tiara Rodney
15e598bda1
todo(12): done
sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (lazy import; Hub fallback) - verified the hub path yields all 787 trajectories via the editable checkout's data/; sekft-train gains --hub/--revision and dispatches --data by dir-vs-.jsonl; train()+inspect() use it; 9 tests pass (3 new for raw-dir/jsonl/reject); mypy strict clean (5 files). No submodule changes.
2026-06-18 00:05:36 +02:00
Tiara Rodney
414e963825
feat(12): load training data from a raw dir, a curated jsonl, or the Hub
iter_keepers read only raw per-trajectory .json -- one of three input shapes.
Add load_turns(data, hub, revision) yielding assistant-bearing turns from a raw
rollout dir (keep-filtered), a curated .jsonl corpus (one record per line), or
the published corpus via posix-sdc's load_trajectories (the in-repo data/ of a
checkout, else the Hugging Face Hub). sekft-train gains --hub and --revision and
dispatches --data by dir-vs-.jsonl; train() and inspect() use it.

Raw-rollout reading stays sekft-local; curated + Hub reuse posix-sdc's loader,
imported lazily so the raw/jsonl paths need neither posix-sdc nor huggingface_hub
installed. Unit tests cover the raw-dir and jsonl dispatch.
2026-06-18 00:05:27 +02:00
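The dispatch described in feat(12) could look roughly like the sketch below. It is illustrative only: load_records, the hub_loader parameter and the "keep" field are hypothetical stand-ins, since the commit shows neither the real keep filter nor the signature of load_trajectories. The real code is said to import posix-sdc lazily inside the Hub branch. Here the loader is passed in as a callable so the sketch runs without that package.

```python
import json
from collections.abc import Callable, Iterable, Iterator
from pathlib import Path
from typing import Any, Optional

Record = dict[str, Any]


def load_records(
    data: Optional[str],
    hub_loader: Optional[Callable[[], Iterable[Record]]] = None,
) -> Iterator[Record]:
    """Yield records from a raw dir, a curated .jsonl file, or a Hub loader."""
    if data is None:
        # Hub path: the real trainer would import its loader lazily here.
        if hub_loader is None:
            raise ValueError("no data path and no hub loader given")
        yield from hub_loader()
        return

    path = Path(data)
    if path.is_dir():
        # Raw rollout dir: one .json per trajectory, filtered on an
        # assumed "keep" flag (the real filter is not shown in the commit).
        for file in sorted(path.glob("*.json")):
            record = json.loads(file.read_text())
            if record.get("keep"):
                yield record
    elif path.is_file() and path.suffix == ".jsonl":
        # Curated corpus: one JSON record per non-empty line.
        with path.open() as handle:
            for line in handle:
                if line.strip():
                    yield json.loads(line)
    else:
        raise ValueError(f"expected a directory or a .jsonl file: {data}")
```

Because load_records is a generator, the rejection of an unsupported path surfaces on first iteration rather than at call time.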
Tiara Rodney
d78a8028d2
todo(12): in-progress
sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (Hub fallback), imported lazily; sekft-train gains --hub and --revision and dispatches --data by dir-vs-.jsonl; train() and inspect() use it; unit tests cover the raw-dir and jsonl paths; existing tests stay green; mypy strict clean.
2026-06-18 00:03:19 +02:00
Tiara Rodney
d47ba8a56e
todo(12): open
2026-06-18 00:03:10 +02:00
Tiara Rodney
2209ade52c
Merge branch 'bugfix/11'
bugfix(11): operate_rate must not sum a None
2026-06-17 23:49:27 +02:00
Tiara Rodney
c6939c0a64
todo(11): done
bool() wraps the operate_rate predicate in eval.py and resident.py so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); 6 tests pass.
2026-06-17 23:49:27 +02:00
Tiara Rodney
157bb4955d
bugfix(11): operate_rate must not sum a None
sum(t.steps > 0 and t.meta.get("clean") for t in rows) yields the right operand
of `and` when steps>0, so a trajectory whose meta lacks the "clean" key
contributes None and sum() raises TypeError. Wrap the predicate in bool() so it
counts trajectories that operated and are clean. Surfaced by mypy once posix-sdc
began shipping py.typed (Trajectory is now typed).
2026-06-17 23:49:26 +02:00
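The failure mode in bugfix(11) can be reproduced in a few lines. This sketch assumes a stand-in Row type with the steps and meta attributes named in the commit message. Row, count_clean_buggy and count_clean are hypothetical names, and the real operate_rate presumably derives a rate from such a count.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Row:
    """Stand-in for a trajectory: a step count plus free-form metadata."""
    steps: int
    meta: dict[str, Any] = field(default_factory=dict)


def count_clean_buggy(rows: list[Row]) -> int:
    # `a and b` evaluates to b when a is truthy, so a row with steps > 0
    # and no "clean" key contributes None to the sum.
    return sum(t.steps > 0 and t.meta.get("clean") for t in rows)


def count_clean(rows: list[Row]) -> int:
    # bool() collapses None (and any other falsy value) to False.
    return sum(bool(t.steps > 0 and t.meta.get("clean")) for t in rows)


rows = [Row(3, {"clean": True}), Row(2), Row(0, {"clean": True})]
```

With these rows, count_clean(rows) counts only the first trajectory, while count_clean_buggy(rows) raises TypeError when sum() tries to add None.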
Tiara Rodney
637b746d1d
todo(11): in-progress
operate_rate in eval.py and resident.py wraps the steps>0 and meta.get('clean') predicate in bool() so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); tests pass.
2026-06-17 23:48:29 +02:00
Tiara Rodney
299b2ce488
todo(11): open
2026-06-17 23:48:21 +02:00
Tiara Rodney
1c890c703f
Merge branch 'feature/10'
feat(10): structured logging for the trainer
2026-06-17 23:47:52 +02:00
Tiara Rodney
7673e47002
todo(10): done
sekft-train logs run config and each phase (tokenizer/model load, build, training, save) via a sekft.train logger to stderr; -v/--verbose and -q/--quiet added; dataset accounting reports keepers->usable with over-length/empty-mask drop counts and a warning; build loop logs per-trajectory (debug) and every 100 (info); transformers verbosity raised so the per-step curve shows; inspect() logs likewise; 6 tests pass; mypy strict clean on sft.py + _log.py (verified loop logging fires with FakeTok). No submodule changes.
2026-06-17 23:47:52 +02:00
Tiara Rodney
e4e88c5c18
feat(10): structured logging for the trainer
The trainer was nearly silent: outside an example count and a save line it
printed nothing through tokenizer load, the base-model load, example building, or
the training loop, and trajectories dropped for length or empty mask vanished
without a trace.

Add _log.py (a shared stderr logging setup so stdout stays clean for results)
and a module logger. sekft-train gains -v/--verbose and -q/--quiet. Log the run
config and each phase; report dataset accounting (keepers -> usable, with counts
dropped for over-length and empty-mask and a warning when any are dropped);
inside the build loop, a per-trajectory debug line and a progress line every 100;
raise transformers' verbosity during training so the per-step curve shows. Prints
in train() and inspect() are routed through the logger.
2026-06-17 23:47:42 +02:00
Tiara Rodney
47b84a0dce
todo(10): in-progress
sekft-train logs run config and each phase (tokenizer load, model load, training, save) via a module logger to stderr; -v/--verbose and -q/--quiet control level; dataset accounting reports keepers->usable with counts dropped for length and empty-mask and warns when any are dropped; transformers verbosity raised so the per-step curve shows during training; inspect() logs likewise; existing sft tests stay green; mypy strict clean.
2026-06-17 23:43:21 +02:00
Tiara Rodney
814261dc56
todo(10): open
2026-06-17 23:43:08 +02:00
Tiara Rodney
86df915524
Merge branch 'feature/9'
2026-06-17 14:03:55 +02:00
Tiara Rodney
90e0ebbb45
todo(9): done
mypy --strict src tests passes (no issues, 6 files); ML/posix-sdc imports ignored via overrides; sft/eval/resident/tests fully annotated; mypy is a declared dev dep; py.typed ships in the package.
2026-06-17 14:03:54 +02:00
Tiara Rodney
64020c321d
test: annotate the sft test helpers
2026-06-17 14:03:52 +02:00
Tiara Rodney
9397280e6f
refactor: annotate the trainer modules under mypy strict
2026-06-17 14:03:52 +02:00
Tiara Rodney
e60495b2ce
chore: set up mypy strict checking and ship py.typed
2026-06-17 14:03:46 +02:00
Tiara Rodney
ee2a729438
todo(9): in-progress
mypy --strict src tests passes; ML imports are configured to not error; eval/resident/sft are fully annotated; py.typed ships in the package.
2026-06-17 13:53:09 +02:00
Tiara Rodney
4fa082478c
todo(9): open
2026-06-17 13:53:02 +02:00
Tiara Rodney
e46e12c70b
Merge branch 'feature/8'
2026-06-16 23:49:04 +02:00
Tiara Rodney
26f956f25a
todo(8): done
README rewritten for the trainer (sft/eval/resident, posix-sdc dependency, gpu extra, render contract, console-script usage); module docstrings updated to sekft-train/eval/resident and apply_chat_template.
2026-06-16 23:49:04 +02:00
Tiara Rodney
9e487bc2bf
docs: refresh module docstrings for the packaged layout and render contract
2026-06-16 23:49:02 +02:00
Tiara Rodney
a0b1fbc0c1
docs: rewrite README for the packaged trainer
2026-06-16 23:49:01 +02:00
Tiara Rodney
9cdb2bdc97
todo(8): in-progress
README describes the trainer/eval/resident, the posix-sdc dependency, install via the gpu extra, the chat-template render contract, and console-script usage; module docstrings reference sekft-train/eval/resident and apply_chat_template, not the old flat scripts.
2026-06-16 23:47:43 +02:00
Tiara Rodney
4646d34d9d
todo(8): open
2026-06-16 23:47:41 +02:00
Tiara Rodney
9731dd250c
Merge branch 'feature/7'
2026-06-16 20:28:20 +02:00
Tiara Rodney
a34ce78dc5
todo(7): done
GPL-2.0 LICENSE added with pyproject license-files and GPLv2 classifier; Dockerfile removed (now in posix-sdc docker/alpine-dash).
2026-06-16 20:28:19 +02:00
Tiara Rodney
f471db2cde
chore: remove Dockerfile relocated to posix-sdc
2026-06-16 20:28:18 +02:00
Tiara Rodney
66365f3d5f
chore: add GPL-2.0 license
2026-06-16 20:28:17 +02:00
Tiara Rodney
acaf8dd061
todo(7): in-progress
LICENSE holds the GPL-2.0 text; pyproject declares it via license-files and a GPLv2 classifier; the Dockerfile is gone from sekft.
2026-06-16 20:27:51 +02:00
Tiara Rodney
0a03dd0cfa
todo(7): open
2026-06-16 20:27:50 +02:00