IvanLin

Distillation Of DeepSeek-Prover V1.5

https://arxiv.org/abs/2408.08152 - "DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search"

https://github.com/deepseek-ai/DeepSeek-Prover-V1.5

TL;DR: DeepSeek-Prover-V1.5 is an improved open-source language model for theorem proving in Lean 4. The paper continues pre-training DeepSeekMath-Base, a math foundation model, and then does supervised fine-tuning on a dataset of incomplete proofs in...

Oct 15, 2024

TL;DR

Overview