Train LLMs (BLOOM, LLaMA, Baichuan2-7B, ChatGLM3-6B) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
nlp bloom pipeline pytorch deepspeed llm full-finetune model-parallization flash-attention llama2 baichuan2-7b chatglm3-6b mixtral-8x7b
- Updated Feb 5, 2024
- Python