Kushal Thaman

Thanks for the post! Do you think there is an amount of pretraining beyond which no fine-tuning (say, on a completely non-complementary task, far from the pretraining distribution) will push the network out of its loss basin? That is, a 'point of no return' such that even for very large learning rates and amounts of fine-tuning, the resulting network remains linearly mode connected (LMC) to the pretrained one?
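For concreteness, here's a minimal sketch of the LMC check the question presupposes: evaluate the loss along the straight line between the pretrained and fine-tuned weights and look for a barrier. This is a generic PyTorch illustration, not code from the post; the function name, the 11-point grid, and the barrier definition (max loss on the path minus the worse endpoint) are my own choices.

```python
import copy
import torch

def loss_barrier(model_a, model_b, loss_fn, data_loader, n_points=11):
    """Loss along the linear path between two same-architecture networks.

    A near-flat curve suggests the endpoints are linearly mode connected
    (LMC); a bump means fine-tuning escaped the pretrained basin.
    Note: buffers (e.g. batch-norm running stats) are taken from model_a
    for simplicity rather than interpolated.
    """
    params_a = dict(model_a.named_parameters())
    params_b = dict(model_b.named_parameters())
    interp = copy.deepcopy(model_a)

    losses = []
    for alpha in torch.linspace(0.0, 1.0, n_points):
        with torch.no_grad():
            # Set interp's weights to (1 - alpha) * theta_a + alpha * theta_b.
            for name, p in interp.named_parameters():
                p.copy_((1 - alpha) * params_a[name] + alpha * params_b[name])
        interp.eval()
        total, n = 0.0, 0
        with torch.no_grad():
            for x, y in data_loader:
                total += loss_fn(interp(x), y).item() * len(y)
                n += len(y)
        losses.append(total / n)

    # Barrier height: peak loss on the path above the worse endpoint.
    barrier = max(losses) - max(losses[0], losses[-1])
    return barrier, losses
```

Under this framing, the question is whether there is a pretraining budget after which `barrier` stays near zero for every fine-tuned `model_b`, no matter how aggressive the fine-tuning.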