sekft

History

Tiara Rodney 1279bc8965 bugfix(16): operators must not feed a BatchEncoding to model.generate	2026-06-18 16:49:30 +02:00
test_load.py	feat(12): load training data from a raw dir, a curated jsonl, or the Hub	2026-06-18 00:05:27 +02:00
test_sft.py	bugfix(16): operators must not feed a BatchEncoding to model.generate	2026-06-18 16:49:30 +02:00

bugfix(16): operators must not feed a BatchEncoding to model.generate

The transformers 5.x return-type change behind #15 also breaks generation:
apply_chat_template(add_generation_prompt=True, return_tensors="pt") returns a
BatchEncoding, and eval.py + resident.py passed it to model.generate, which does
inputs.shape[0] -> AttributeError (the holdout eval crashed on scenario 1). #15
fixed only the trainer. Factor a shared _input_ids helper and a render_prompt_ids
function; both operators use it. Tests cover _input_ids for both shapes and
render_prompt_ids.
