Self Supervised Learning (SSL)
Self Supervised Learning (SSL) "Unlocking Powerful Representations: The Frontier of Self-Supervised Learning" JASKARAN SINGH AUG 9, 2023 Share With all that’s been happening in the AI/ML industry for the past few weeks, it is important we address the elephant in the room. The Idea Behind SSL SSL comes under the umbrella of Unsupervised Learning. One thing that worked for NNs is that they are able to fit a curated dataset with ease given they have labels to optimize for (Supervised Learning), but this dataset may not be large enough, instead, it would be very expensive or impossible to create such a dataset. Once NNs have good representations of the task-related work they can learn a new task even more rapidly when shown how to do it only once (Zero-shot). But what if we can create synthetic Labels from the data? they don’t need to be highly curated. They can be thought of as corrupting the data. we can use these labels to train NNs and in the process, these NNs learn a powerful representation (depending upon the data and compute) of the Data that can be fine-tuned with a handful of data to produce quality results. How to SSL The main goal of SSL is to generate labels out of the dataset itself, without much human effort. The basic idea is to corrupt the data or either do a lossy compression to predict the same data back from the Model. Let’s discuss some ways in which researchers have used this technique in different domains: NLP While training a Language Model you can do SSL in two ways: MLM: Masking some words of the text (Corrupt) and letting the model train on getting these masked tokens correctly. Research Paper LM: given some context let the Model predict the rest of the text, one token at a time. Research Paper These approaches are part of contrastive training: in short, you train the model to differentiate between the different classes, during training you want the model to give a high probability to the ground truth and near

