todo(15): open
parent
705b4a028b
commit
01e4d75237
1 changed files with 22 additions and 0 deletions
22
TODO
@@ -249,3 +249,25 @@ Description: The lock committed with the triplet (#13) predated the published
 and its transitive deps into the lock. Commit the refreshed
 Pipfile.lock so the next machine installs the published wheel with
 the Hub path available.
+
+--ISSUE
+Content-Type: application/issue
+ID: 15
+Type: bugfix
+Title: apply_chat_template returns BatchEncoding on transformers 5.x
+Status: open
+Priority: high
+Created: 2026-06-18
+Module: sekft
+Relationships:
+Description: build_masked_example assumed apply_chat_template returns a flat
+list[int] (transformers 4.x). On transformers 5.x it returns a
+BatchEncoding ({input_ids: [...]}), so ids was a dict, len(ids) was
+the key count, and the prefix-differencing spuriously raised 'chat
+template is not additive' on every real model (verified against
+mistralai/Mistral-7B-Instruct-v0.2). The masking logic is sound and
+the Mistral template is additive; only the return type needs
+normalising. Add a _render_ids helper that extracts input_ids when
+the result is dict-like, and use it for both renders. The
+fake-tokenizer test returned a bare list and missed this, so add a
+BatchEncoding-returning fake and assert the mask matches.