Hello, I am training on my own dataset, but the training loss looks weird: it doesn't seem to converge. Is this normal?