Distillation of DeepSeek-Prover-V1.5
https://arxiv.org/abs/2408.08152 - "DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search"
https://github.com/deepseek-ai/DeepSeek-Prover-V1.5

TL;DR

DeepSeek-Prover-V1.5 is an improved open-source language model for theorem proving in Lean 4. The paper continues pre-training DeepSeekMath-Base, a math foundation model, and then does supervised fine-tuning on a dataset of incomplete proofs in...