sekft

Author	SHA1	Message	Date
Tiara Rodney	847d3dac10	Merge branch 'feature/17'	2026-06-18 23:11:03 +02:00
Tiara Rodney	15201302b2	todo(17): done Intro blockquote added: 'not tool-calling'. Stays on develop for the next sekft release.	2026-06-18 23:11:03 +02:00
Tiara Rodney	b87578d0b0	todo(17): in-progress The README intro carries a clear 'not tool-calling' callout before the 'training half' paragraph.	2026-06-18 23:10:27 +02:00
Tiara Rodney	7edfb0640c	todo(17): open	2026-06-18 23:10:26 +02:00
Tiara Rodney	0a4adbdc5f	Merge branch 'develop'	2026-06-18 16:49:49 +02:00
Tiara Rodney	bd04c02b41	Merge branch 'bugfix/16' bugfix(16): operators must not feed a BatchEncoding to model.generate	2026-06-18 16:49:33 +02:00
Tiara Rodney	1fb35e8e10	todo(16): done _input_ids extracts ids from a BatchEncoding (5.x) or bare list/tensor (4.x); render_prompt_ids builds the generation prompt and extracts the tensor; eval.py + resident.py operators use it (no more BatchEncoding to generate); 12 tests pass (2 new); mypy strict clean. Box eval verification follows this release. No submodule changes.	2026-06-18 16:49:32 +02:00
Tiara Rodney	1279bc8965	bugfix(16): operators must not feed a BatchEncoding to model.generate The transformers 5.x return-type change behind #15 also breaks generation: apply_chat_template(add_generation_prompt=True, return_tensors="pt") returns a BatchEncoding, and eval.py + resident.py passed it to model.generate, which does inputs.shape[0] -> AttributeError (the holdout eval crashed on scenario 1). #15 fixed only the trainer. Factor a shared _input_ids helper and a render_prompt_ids function; both operators use it. Tests cover _input_ids for both shapes and render_prompt_ids.	2026-06-18 16:49:30 +02:00
Tiara Rodney	d261919404	todo(16): in-progress A shared _input_ids helper extracts the id sequence from a BatchEncoding (5.x) or bare list/tensor (4.x); _render_ids uses it; a new render_prompt_ids(tokenizer, messages, device) builds the generation prompt and extracts input_ids; eval.py and resident.py operators use render_prompt_ids instead of passing a BatchEncoding to generate; unit test covers _input_ids for both shapes; existing tests pass; mypy strict clean; holdout eval runs on the box without the AttributeError.	2026-06-18 16:46:57 +02:00
Tiara Rodney	87cfccd54e	todo(16): open	2026-06-18 16:46:10 +02:00
Tiara Rodney	a76470e55d	Merge branch 'develop'	2026-06-18 12:37:18 +02:00
Tiara Rodney	e1f8ef8d1a	Merge branch 'bugfix/15' bugfix(15): normalise apply_chat_template's BatchEncoding (transformers 5.x)	2026-06-18 12:37:04 +02:00
Tiara Rodney	f9913b45c3	todo(15): done _render_ids extracts input_ids from a BatchEncoding (5.x) or passes a list through (4.x); regression test asserts the BatchEncoding path yields the same mask; 10 tests pass; mypy strict clean. End-to-end box verification of the correct mask against Mistral done before this release. No submodule changes.	2026-06-18 12:37:03 +02:00
Tiara Rodney	4987d951ce	bugfix(15): normalise apply_chat_template's BatchEncoding (transformers 5.x) apply_chat_template returns a BatchEncoding ({input_ids: [...]}) on transformers >= 5 where 4.x returned a bare list[int]. build_masked_example treated the render as a dict, so len/slicing were wrong and the prefix-differencing spuriously raised "chat template is not additive" on every real model. Extract the id sequence via a _render_ids helper; verified the assistant-only mask against mistralai/Mistral-7B-Instruct-v0.2. The fake tokenizer returned a bare list and missed this, so a BatchEncoding-returning variant now guards it.	2026-06-18 12:37:01 +02:00
Tiara Rodney	7853224796	todo(15): in-progress build_masked_example normalises apply_chat_template's BatchEncoding (transformers 5.x) and list[int] (4.x) returns via a _render_ids helper; a BatchEncoding-returning fake tokenizer produces the same mask as the list-returning one (regression test added); existing tests pass; mypy strict clean; verified end-to-end on the box that sekft-train --inspect produces a correct assistant-only mask against Mistral.	2026-06-18 12:34:55 +02:00
Tiara Rodney	01e4d75237	todo(15): open	2026-06-18 12:34:37 +02:00
Tiara Rodney	705b4a028b	Merge branch 'develop'	2026-06-18 00:59:08 +02:00
Tiara Rodney	7856ef6032	release(1.0.0): add changelog First release: the trainer (assistant-only mask, train=serve render), the three-source data loader (raw dir / curated jsonl / Hub), eval, resident, and structured logging. Packaged, typed, [gpu] extra, depends on posix-sdc[hub].	2026-06-18 00:57:54 +02:00
Tiara Rodney	34bfe675c4	Merge branch 'bugfix/14' bugfix(14): refresh Pipfile.lock against published posix-sdc 1.2.2	2026-06-18 00:44:39 +02:00
Tiara Rodney	b4bf90d275	todo(14): done Pipfile.lock pins posix-sdc ==1.2.2 from the private index in default + develop sections with huggingface_hub and the [hub] closure present; committed; tree clean. No submodule changes.	2026-06-18 00:44:38 +02:00
Tiara Rodney	44ed08a239	bugfix(14): refresh Pipfile.lock against published posix-sdc 1.2.2 The lock committed with the triplet (#13) predated the published wheel. Now that posix-sdc 1.2.2 is on the private index, re-lock: it pins ==1.2.2 from the index in both the default and develop sections, and the [hub] extra pulls huggingface_hub and its transitive deps into the lock, so the next machine installs the published wheel with the Hub path available.	2026-06-18 00:44:36 +02:00
Tiara Rodney	baae4f3631	todo(14): in-progress Pipfile.lock pins tiararodney.posix-sdc ==1.2.2 from the private index in both default and develop sections, with huggingface_hub and its [hub] transitive deps present; committed on develop; tree clean after.	2026-06-18 00:44:25 +02:00
Tiara Rodney	cc26e61e6b	todo(14): open	2026-06-18 00:43:56 +02:00
Tiara Rodney	c295a98748	Merge branch 'feature/13' feat(13): reference posix-sdc three ways for seamless multi-machine dev	2026-06-18 00:18:06 +02:00
Tiara Rodney	c3024d3df9	todo(13): done pyproject declares posix-sdc[hub]; Pipfile [packages] = private-index wheel, [dev-packages] = git develop; Pipfile.lock committed alongside. Tree clean. No submodule changes.	2026-06-18 00:18:06 +02:00
Tiara Rodney	74d9793e76	feat(13): reference posix-sdc three ways for seamless multi-machine dev Wire the posix-sdc dependency as a triplet: - pyproject declares the abstract posix-sdc[hub], so the trainer's --hub path can reach the Hub (huggingface_hub) wherever sekft is installed; - Pipfile [packages] pulls the published wheel from the private index; - Pipfile [dev-packages] pulls the git develop branch for develop-time work. Commit Pipfile.lock so the dependency surface and its lock land together.	2026-06-18 00:17:56 +02:00
Tiara Rodney	b7878251ed	todo(13): in-progress pyproject declares posix-sdc[hub]; Pipfile [packages] pulls posix-sdc from the private index and [dev-packages] from git develop; Pipfile.lock committed and consistent; tree clean after commit.	2026-06-18 00:17:47 +02:00
Tiara Rodney	a37e52aacf	todo(13): open	2026-06-18 00:17:06 +02:00
Tiara Rodney	b8843557d7	Merge branch 'feature/12' feat(12): load training data from a raw dir, a curated jsonl, or the Hub	2026-06-18 00:05:36 +02:00
Tiara Rodney	15e598bda1	todo(12): done sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (lazy import; Hub fallback) - verified the hub path yields all 787 trajectories via the editable checkout's data/; sekft-train gains --hub/--revision and dispatches --data by dir-vs-.jsonl; train()+inspect() use it; 9 tests pass (3 new for raw-dir/jsonl/reject); mypy strict clean (5 files). No submodule changes.	2026-06-18 00:05:36 +02:00
Tiara Rodney	414e963825	feat(12): load training data from a raw dir, a curated jsonl, or the Hub iter_keepers read only raw per-trajectory .json -- one of three input shapes. Add load_turns(data, hub, revision) yielding assistant-bearing turns from a raw rollout dir (keep-filtered), a curated .jsonl corpus (one record per line), or the published corpus via posix-sdc's load_trajectories (the in-repo data/ of a checkout, else the Hugging Face Hub). sekft-train gains --hub and --revision and dispatches --data by dir-vs-.jsonl; train() and inspect() use it. Raw-rollout reading stays sekft-local; curated + Hub reuse posix-sdc's loader, imported lazily so the raw/jsonl paths need neither posix-sdc nor huggingface_hub installed. Unit tests cover the raw-dir and jsonl dispatch.	2026-06-18 00:05:27 +02:00
Tiara Rodney	d78a8028d2	todo(12): in-progress sft.load_turns(data, hub, revision) yields turns from a raw rollout dir (keep-filtered), a curated .jsonl file, or the published corpus via posix-sdc load_trajectories (Hub fallback), imported lazily; sekft-train gains --hub and --revision and dispatches --data by dir-vs-.jsonl; train() and inspect() use it; unit tests cover the raw-dir and jsonl paths; existing tests stay green; mypy strict clean.	2026-06-18 00:03:19 +02:00
Tiara Rodney	d47ba8a56e	todo(12): open	2026-06-18 00:03:10 +02:00
Tiara Rodney	2209ade52c	Merge branch 'bugfix/11' bugfix(11): operate_rate must not sum a None	2026-06-17 23:49:27 +02:00
Tiara Rodney	c6939c0a64	todo(11): done bool() wraps the operate_rate predicate in eval.py and resident.py so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); 6 tests pass.	2026-06-17 23:49:27 +02:00
Tiara Rodney	157bb4955d	bugfix(11): operate_rate must not sum a None sum(t.steps > 0 and t.meta.get("clean") for t in rows) yields the right operand of `and` when steps>0, so a trajectory whose meta lacks the "clean" key contributes None and sum() raises TypeError. Wrap the predicate in bool() so it counts trajectories that operated and are clean. Surfaced by mypy once posix-sdc began shipping py.typed (Trajectory is now typed).	2026-06-17 23:49:26 +02:00
Tiara Rodney	637b746d1d	todo(11): in-progress operate_rate in eval.py and resident.py wraps the steps>0 and meta.get('clean') predicate in bool() so the sum counts cleanly-operated trajectories without summing None; mypy strict clean across the whole package (5 files); tests pass.	2026-06-17 23:48:29 +02:00
Tiara Rodney	299b2ce488	todo(11): open	2026-06-17 23:48:21 +02:00
Tiara Rodney	1c890c703f	Merge branch 'feature/10' feat(10): structured logging for the trainer	2026-06-17 23:47:52 +02:00
Tiara Rodney	7673e47002	todo(10): done sekft-train logs run config and each phase (tokenizer/model load, build, training, save) via a sekft.train logger to stderr; -v/--verbose and -q/--quiet added; dataset accounting reports keepers->usable with over-length/empty-mask drop counts and a warning; build loop logs per-trajectory (debug) and every 100 (info); transformers verbosity raised so the per-step curve shows; inspect() logs likewise; 6 tests pass; mypy strict clean on sft.py + _log.py (verified loop logging fires with FakeTok). No submodule changes.	2026-06-17 23:47:52 +02:00
Tiara Rodney	e4e88c5c18	feat(10): structured logging for the trainer The trainer was nearly silent: outside an example count and a save line it printed nothing through tokenizer load, the base-model load, example building, or the training loop, and trajectories dropped for length or empty mask vanished without a trace. Add _log.py (a shared stderr logging setup so stdout stays clean for results) and a module logger. sekft-train gains -v/--verbose and -q/--quiet. Log the run config and each phase; report dataset accounting (keepers -> usable, with counts dropped for over-length and empty-mask and a warning when any are dropped); inside the build loop, a per-trajectory debug line and a progress line every 100; raise transformers' verbosity during training so the per-step curve shows. Prints in train() and inspect() are routed through the logger.	2026-06-17 23:47:42 +02:00
Tiara Rodney	47b84a0dce	todo(10): in-progress sekft-train logs run config and each phase (tokenizer load, model load, training, save) via a module logger to stderr; -v/--verbose and -q/--quiet control level; dataset accounting reports keepers->usable with counts dropped for length and empty-mask and warns when any are dropped; transformers verbosity raised so the per-step curve shows during training; inspect() logs likewise; existing sft tests stay green; mypy strict clean.	2026-06-17 23:43:21 +02:00
Tiara Rodney	814261dc56	todo(10): open	2026-06-17 23:43:08 +02:00
Tiara Rodney	86df915524	Merge branch 'feature/9'	2026-06-17 14:03:55 +02:00
Tiara Rodney	90e0ebbb45	todo(9): done mypy --strict src tests passes (no issues, 6 files); ML/posix-sdc imports ignored via overrides; sft/eval/resident/tests fully annotated; mypy is a declared dev dep; py.typed ships in the package.	2026-06-17 14:03:54 +02:00
Tiara Rodney	64020c321d	test: annotate the sft test helpers	2026-06-17 14:03:52 +02:00
Tiara Rodney	9397280e6f	refactor: annotate the trainer modules under mypy strict	2026-06-17 14:03:52 +02:00
Tiara Rodney	e60495b2ce	chore: set up mypy strict checking and ship py.typed	2026-06-17 14:03:46 +02:00
Tiara Rodney	ee2a729438	todo(9): in-progress mypy --strict src tests passes; ML imports are configured to not error; eval/resident/sft are fully annotated; py.typed ships in the package.	2026-06-17 13:53:09 +02:00
Tiara Rodney	4fa082478c	todo(9): open	2026-06-17 13:53:02 +02:00
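
Note on bugfix(11): the failure described in 157bb4955d can be reproduced in plain Python. The sketch below is illustrative only. The Trajectory dataclass is a hypothetical stand-in for the posix-sdc type (only the steps and meta fields named in the commit message are assumed), and the function names are invented for this example; it is not the code in eval.py or resident.py.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Trajectory:
    """Hypothetical stand-in for the posix-sdc Trajectory (assumed fields only)."""

    steps: int
    meta: dict[str, Any] = field(default_factory=dict)


def operated_cleanly_buggy(rows: list[Trajectory]) -> int:
    # `a and b` evaluates to b when a is truthy, so a trajectory without a
    # "clean" key contributes None and sum() raises TypeError.
    return sum(t.steps > 0 and t.meta.get("clean") for t in rows)  # type: ignore


def operated_cleanly(rows: list[Trajectory]) -> int:
    # bool() collapses None / False / True to a real bool, which sums as 0 or 1.
    return sum(bool(t.steps > 0 and t.meta.get("clean")) for t in rows)


rows = [
    Trajectory(steps=3, meta={"clean": True}),  # operated and clean: counted
    Trajectory(steps=2),                        # operated, no "clean" key
    Trajectory(steps=0, meta={"clean": True}),  # never operated: not counted
]

try:
    operated_cleanly_buggy(rows)
    buggy_raised = False
except TypeError:
    buggy_raised = True

clean_count = operated_cleanly(rows)
```

With bool() in place, a missing "clean" key is treated as "not clean" rather than crashing the count, which matches the behaviour the commit message describes.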

96 commits