bugfix(15): normalise apply_chat_template's BatchEncoding (transformers 5.x)
apply_chat_template returns a BatchEncoding ({input_ids: [...]}) on transformers
>= 5 where 4.x returned a bare list[int]. build_masked_example treated the render
as a dict, so len/slicing were wrong and the prefix-differencing spuriously
raised "chat template is not additive" on every real model. Extract the id
sequence via a _render_ids helper; verified the assistant-only mask against
mistralai/Mistral-7B-Instruct-v0.2. The fake tokenizer returned a bare list and
missed this, so a BatchEncoding-returning variant now guards it.
This commit is contained in:
parent
7853224796
commit
4987d951ce
3 changed files with 51 additions and 2 deletions
13
CHANGELOG.md
13
CHANGELOG.md
|
|
@ -7,6 +7,18 @@ are documented in this file.
|
|||
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
|
||||
and the project follows [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
||||
|
||||
## [1.0.1] - 2026-06-18
|
||||
|
||||
### Fixed
|
||||
- `build_masked_example` could not derive the assistant mask on transformers
|
||||
≥ 5: `apply_chat_template` now returns a `BatchEncoding` (`{input_ids: [...]}`)
|
||||
where 4.x returned a bare `list[int]`, so the render was treated as a dict and
|
||||
the prefix-differencing spuriously raised "chat template is not additive" on
|
||||
every real model. The id sequence is now extracted either way; verified the
|
||||
assistant-only mask against `mistralai/Mistral-7B-Instruct-v0.2`. The
|
||||
fake-tokenizer test gained a `BatchEncoding`-returning variant so this can't
|
||||
regress.
|
||||
|
||||
## [1.0.0] - 2026-06-18
|
||||
|
||||
First release: the training and evaluation pipeline that turns posix-sdc
|
||||
|
|
@ -38,4 +50,5 @@ trajectories into a fine-tuned shell operator.
|
|||
mypy-strict codebase; an optional `[gpu]` extra (torch / transformers / peft);
|
||||
and a dependency on `posix-sdc[hub]`. Released under GPL-2.0.
|
||||
|
||||
[1.0.1]: https://git.code.tiararodney.com/tiara/sekft/compare/v1.0.0...v1.0.1
|
||||
[1.0.0]: https://git.code.tiararodney.com/tiara/sekft/releases/tag/v1.0.0
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue