When you use global_step to index the teacher logits, after training for several epochs you may get all-zero teacher logits, because start_idx and end_idx can fall out of range. (You can print the length of loaded_data to verify this.)
I think you could include the idx of each training example in the batch and use that idx to look up the teacher logits, instead of deriving the position from global_step.
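
Here is a rough sketch of what I mean, not the repo's actual code. I am assuming loaded_data holds one row of teacher logits per training example, in dataset order. The helper names, sizes, and values are made up for illustration, and the empty slice stands in for whatever ends up producing the zeros in the real pipeline.

```python
# Hypothetical stand-in for the precomputed teacher logits:
# one row per training example, in dataset order.
num_examples, num_classes, batch_size = 6, 3, 2
loaded_data = [[float(i)] * num_classes for i in range(num_examples)]


def logits_by_step(global_step):
    """Slice by global_step. global_step keeps growing across epochs,
    so the slice runs past len(loaded_data) after the first epoch."""
    start_idx = global_step * batch_size
    end_idx = start_idx + batch_size
    return loaded_data[start_idx:end_idx]


def logits_by_idx(batch_indices):
    """Look up teacher logits by each example's own dataset index."""
    return [loaded_data[i] for i in batch_indices]


steps_per_epoch = num_examples // batch_size

in_range = logits_by_step(0)                    # first step of epoch 1
out_of_range = logits_by_step(steps_per_epoch)  # first step of epoch 2
by_idx = logits_by_idx([4, 1])                  # a shuffled batch, any epoch

print(len(loaded_data), len(in_range), len(out_of_range), len(by_idx))
```

For this to work, the dataset would need to return the idx together with each example, so the lookup stays correct with shuffling and across epochs.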