LLM
Base Knowledge
Specific Techniques
Direct Preference Optimization (DPO), with a loss sketch after this list
Mixture of Experts (MoE), with a routing sketch after this list
HF Blog: Mixture of Experts Explained
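DPO replaces the reward model and RL step of RLHF with a single classification-style loss over (prompt, chosen, rejected) triples, scored against a frozen reference model. A minimal PyTorch sketch of that loss, assuming the per-sequence log-probabilities have already been summed over response tokens (the beta value is only an illustrative default):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """DPO objective: push the policy's implicit reward for the chosen
    response above the rejected one, relative to a frozen reference model."""
    # Implicit rewards are beta-scaled log-ratios against the reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # -log sigmoid of the reward margin, averaged over the batch.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up per-sequence log-probabilities.
print(dpo_loss(torch.tensor([-10.0]), torch.tensor([-12.0]),
               torch.tensor([-11.0]), torch.tensor([-11.0])))
```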
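In a sparse MoE transformer, each feed-forward block is replaced by several expert MLPs plus a router that sends every token to a small subset of them, so only a fraction of the parameters are active per token. A rough PyTorch sketch of top-k routing (layer sizes, activation, and expert count are illustrative placeholders, not any particular model's configuration):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Sparse MoE feed-forward block: a linear router scores experts per token,
    the top-k experts run on that token, and their outputs are mixed with the
    renormalized router weights."""

    def __init__(self, d_model: int = 512, d_ff: int = 2048,
                 n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model); real models fold batch and sequence dims together.
        gate_logits = self.router(x)                    # (n_tokens, n_experts)
        topk_logits, topk_idx = gate_logits.topk(self.top_k, dim=-1)
        topk_weights = F.softmax(topk_logits, dim=-1)   # renormalize over the chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_idx, slot = (topk_idx == e).nonzero(as_tuple=True)
            if token_idx.numel() > 0:                   # only run experts that received tokens
                out[token_idx] += topk_weights[token_idx, slot].unsqueeze(-1) * expert(x[token_idx])
        return out

# Toy usage: 10 tokens through the layer.
print(MoELayer()(torch.randn(10, 512)).shape)
```

Mixtral, listed below, follows this pattern with 8 experts per layer and top-2 routing per token.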
Specific Models
Mixtral
Argilla Notux
a DPO fine-tune based on Mixtral
Dataset: https://huggingface.co/datasets/argilla/ultrafeedback-binarized-preferences-cleaned
Code: argilla-io/notus
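Notux applies preference tuning (DPO) to Mixtral using the dataset above. A small sketch of inspecting that preference data with the `datasets` library (assumes a `train` split; column names are whatever the dataset card defines, so they are printed rather than hard-coded):

```python
from datasets import load_dataset

# Preference dataset used for Notux-style DPO fine-tuning: each row pairs a
# prompt with a preferred ("chosen") and a dispreferred ("rejected") response.
ds = load_dataset("argilla/ultrafeedback-binarized-preferences-cleaned", split="train")
print(ds.column_names)
print(ds[0])
```

Rows in this prompt/chosen/rejected shape can be fed directly to a DPO trainer such as TRL's `DPOTrainer`.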