In early 2025, the LLM production stack at every lab looked roughly like this:
1. Pretraining (GPT-2/3 of ~2020)
2. Supervised Finetuning (InstructGPT ~2022)
3. Reinforcement Learning from Human Feedback (RLHF ~2022)