todo(15): open

2026-06-18 12:34:37 +02:00 · 2026-06-18 12:34:37 +02:00 · 01e4d75237
commit 01e4d75237
parent 705b4a028b
1 changed files with 22 additions and 0 deletions
--- a/22
+++ b/22
@ -249,3 +249,25 @@ Description: The lock committed with the triplet (#13) predated the published
             and its transitive deps into the lock. Commit the refreshed
             Pipfile.lock so the next machine installs the published wheel with
             the Hub path available.
 --ISSUE
 Content-Type: application/issue
 ID: 15
 Type: bugfix
 Title: apply_chat_template returns BatchEncoding on transformers 5.x
 Status: open
 Priority: high
 Created: 2026-06-18
 Module: sekft
 Relationships: 
 Description: build_masked_example assumed apply_chat_template returns a flat
             list[int] (transformers 4.x). On transformers 5.x it returns a
             BatchEncoding ({input_ids: [...]}), so ids was a dict, len(ids) was
             the key count, and the prefix-differencing spuriously raised 'chat
             template is not additive' on every real model (verified against
             mistralai/Mistral-7B-Instruct-v0.2). The masking logic is sound and
             the Mistral template is additive; only the return type needs
             normalising. Add a _render_ids helper that extracts input_ids when
             the result is dict-like, and use it for both renders. The
             fake-tokenizer test returned a bare list and missed this, so add a
             BatchEncoding-returning fake and assert the mask matches.