The transformers 5.x return-type change behind #15 also breaks generation:
apply_chat_template(add_generation_prompt=True, return_tensors="pt") returns a
BatchEncoding, and eval.py + resident.py passed it to model.generate, which does
inputs.shape[0] -> AttributeError (the holdout eval crashed on scenario 1). #15
fixed only the trainer. Factor a shared _input_ids helper and a render_prompt_ids
function; both operators use it. Tests cover _input_ids for both shapes and
render_prompt_ids.
apply_chat_template returns a BatchEncoding ({input_ids: [...]}) on transformers
>= 5 where 4.x returned a bare list[int]. build_masked_example treated the render
as a dict, so len/slicing were wrong and the prefix-differencing spuriously
raised "chat template is not additive" on every real model. Extract the id
sequence via a _render_ids helper; verified the assistant-only mask against
mistralai/Mistral-7B-Instruct-v0.2. The fake tokenizer returned a bare list and
missed this, so a BatchEncoding-returning variant now guards it.
First release: the trainer (assistant-only mask, train=serve render), the
three-source data loader (raw dir / curated jsonl / Hub), eval, resident, and
structured logging. Packaged, typed, [gpu] extra, depends on posix-sdc[hub].