Practical Guide to Reinforcement Learning from Human Feedback
Using Human Signals to Align AI Models
K, Sandip
Packt Publishing Limited
03/2026
Paperback
English
9781835880500
Pre-release: ships 15 to 20 days after publication
Description not available.
Table of Contents
Introduction to Reinforcement Learning
Role of Human Feedback in Reinforcement Learning
Reward Modeling
Policy Training Based on Reward Model
Introduction to Language Models and Fine Tuning
Parameter Efficient Fine Tuning
Reward Modeling for Language Model Tuning
Reinforcement Learning for Tuning Language Models
Challenges of Reinforcement Learning with Human Feedback
Direct Preference Optimization
RLHF and Model Evaluations
Other Applications