
Commit f424a2d

digger-yuyaozhewei and Zhewei Yao authored
Update README.md (deepspeedai#397)
fix spelling errors

Co-authored-by: Zhewei Yao <zheweiy@berkeley.edu>
1 parent aa4b6c8 commit f424a2d

File tree: 1 file changed (+4, -4 lines)

applications/DeepSpeed-Chat/README.md

Lines changed: 4 additions & 4 deletions
@@ -217,7 +217,7 @@ Figure 1: The illustration of DeepSpeed Chat’s RLHF training pipeline with opt

 </p>

-As the most complex step of the entire 3-step InstructGPT pipeline, DeepSpeed Chat's ***Hyrbid Engine*** has enabled sufficient acceleration to aovid large training time (cost) implications. Refer to [Step3: Reinforcement Learning Human Feedback (RLHF)](./training/step3_rlhf_finetuning) for more information. If you already have your fine-tuned actor and reward model checkpoints, you can simply run the following scripts to enable the PPO training.
+As the most complex step of the entire 3-step InstructGPT pipeline, DeepSpeed Chat's ***Hybrid Engine*** has enabled sufficient acceleration to avoid large training time (cost) implications. Refer to [Step3: Reinforcement Learning Human Feedback (RLHF)](./training/step3_rlhf_finetuning) for more information. If you already have your fine-tuned actor and reward model checkpoints, you can simply run the following scripts to enable the PPO training.

 <details><summary> Expand </summary><p>

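The passage corrected in this hunk points readers at launch scripts that live inside the collapsed "Expand" block, which the diff does not touch. As a hedged illustration only, a step-3 (PPO) launch through the repo's top-level launcher might look like the sketch below; the flag values and model choices are assumptions for illustration, not lines quoted from the README.

```bash
# Illustrative sketch of a step-3 (PPO) launch; flag values and model
# names are assumptions, not quoted from the README or this diff.
# Assumes the actor and reward checkpoints from steps 1-2 already exist.
python train.py \
  --step 3 \
  --actor-model facebook/opt-1.3b \
  --reward-model facebook/opt-350m \
  --deployment-type single_gpu
```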
@@ -314,7 +314,7 @@ The numbers in the table above are for Stage 3 of the training and based on actu

 ### 🐲 Throughput and Model Size Scalability Comparisons with Existing RLHF Systems

-&nbsp;&nbsp;***(I) Single-GPU's Model Scale and Throughput Comparision***
+&nbsp;&nbsp;***(I) Single-GPU's Model Scale and Throughput Comparison***

 &nbsp;&nbsp;With over an order of magnitude higher throughput, DeepSpeed-Chat unlocks the ability to train significantly larger actor models under the same latency budget or train models of similar size at much lower cost, compared to the existing systems like Colossal-AI or HuggingFace-DDP. For example, on a single GPU, DeepSpeed enables over **10X** throughput improvement for RLHF training on a single GPU. While both CAI-Coati and HF-DDP can run a max model size of 1.3B, DeepSpeed can run 6.5B model on the same hardware, **5x** higher.

@@ -325,7 +325,7 @@ Figure 2: Step 3 throughput comparison against two other system frameworks (Colo

 </p>

-&nbsp;&nbsp;***(II) Single-Node Multi-GPU Model Scale and Throughput Comparision***
+&nbsp;&nbsp;***(II) Single-Node Multi-GPU Model Scale and Throughput Comparison***

 On multi-GPUs of a single node, DeepSpeed-Chat enables **6-19X** speedup over CAI-Coati and **1.4-10.5X** speedup over HF-DDP (Figure 3) with respect to system throughput.

@@ -338,7 +338,7 @@ Figure 3. End-to-end training throughput comparison for step 3 of the training p

 &nbsp;&nbsp;***(III) Superior Generation Phase Acceleration in Step3***

-One of the key reasons that result in Figure 3 is our Hyrbid Engine's superior generation phase acceleration, shown below.
+One of the key reasons that result in Figure 3 is our Hybrid Engine's superior generation phase acceleration, shown below.

 <p align="center">

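The line fixed in this hunk credits the Hybrid Engine's generation-phase acceleration for the step-3 results. For orientation, the engine is switched on through the DeepSpeed JSON config; the sketch below shows the general shape of such a config, with the section and key names given as assumptions modeled on DeepSpeed-Chat's step-3 setup rather than verbatim from the repo.

```bash
# Hedged sketch of a DeepSpeed config enabling the Hybrid Engine for
# step-3 RLHF. The "hybrid_engine" section and its keys are assumptions
# modeled on DeepSpeed-Chat's step-3 configs, not quoted from the repo.
cat > ds_config.json <<'EOF'
{
  "train_micro_batch_size_per_gpu": 4,
  "hybrid_engine": {
    "enabled": true,
    "inference_tp_size": 1,
    "release_inference_cache": false,
    "pin_parameters": true
  }
}
EOF
```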