I'm currently trying to reproduce old results from knowledge distillation (KD). At first the loss started at 2 and then decreased, but now it starts at ~17. I was using torchtune 0.6.0 and am now on 0.6.1; when I revert to 0.6.0, the same thing happens.