Recent advances in deep learning have been made possible by the availability of large amounts of labeled training data. However, collecting accurate labels is time-consuming and costly, and in many cases only a small portion of the training data is labeled. Semi-supervised learning (SSL) aims to improve model performance by exploiting both labeled and unlabeled data. One effective approach to SSL is unsupervised consistency regularization, which encourages the model to produce consistent predictions on perturbed versions of the unlabeled data.
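The idea behind consistency regularization can be sketched in a few lines: penalize the model whenever its predictions differ across two randomly perturbed views of the same unlabeled input. The helper below is a generic illustration only; the `model` callable, the Gaussian noise perturbation, and the squared-error loss are simplifying assumptions, not the specific scheme of any one algorithm.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def consistency_loss(model, x_unlabeled, noise_scale=0.1, rng=None):
    """Mean squared difference between predictions on two noisy views
    of the same unlabeled inputs -- the core of consistency regularization."""
    if rng is None:
        rng = np.random.default_rng(0)
    view1 = x_unlabeled + noise_scale * rng.standard_normal(x_unlabeled.shape)
    view2 = x_unlabeled + noise_scale * rng.standard_normal(x_unlabeled.shape)
    p1 = softmax(model(view1))
    p2 = softmax(model(view2))
    return float(np.mean((p1 - p2) ** 2))
```

No labels appear anywhere in this loss, which is what lets SSL methods extract a training signal from unlabeled data.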
Although state-of-the-art consistency-based algorithms achieve excellent performance, they often require tuning multiple hyper-parameters. Unfortunately, hyper-parameter tuning is unreliable in realistic SSL scenarios where annotated data is scarce: cross-validation on a small validation set yields high-variance estimates, and the sensitivity of these algorithms to hyper-parameter values exacerbates the problem. Moreover, as the number of hyper-parameters grows, the computational cost of the search becomes unmanageable for cutting-edge deep learning algorithms.
To address these issues, researchers from Tsinghua University developed a meta-learning-based SSL algorithm called Meta-Semi, which leverages labeled data more effectively while requiring only one additional hyper-parameter to be tuned. The key idea is to train the network on “pseudo-labeled” unannotated examples: during online training, soft pseudo labels are generated for the unlabeled data from the network's own predictions. Samples with unreliable or incorrect pseudo labels are filtered out, and the remaining data is used to train the model. Since the distribution of correctly pseudo-labeled data should be close to that of the labeled data, the network can be trained on it effectively.
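The filtering step can be illustrated with a simple confidence threshold: keep only those unlabeled samples whose top predicted class probability is high enough, and use the arg-max prediction as the pseudo label. Note that the fixed cutoff and hard arg-max labels here are illustrative assumptions; Meta-Semi itself uses soft pseudo labels and a learned per-sample weighting rather than a hand-set threshold.

```python
import numpy as np

def filter_pseudo_labels(probs, threshold=0.95):
    """Given an (n_samples, n_classes) array of predicted probabilities,
    return a boolean mask of samples confident enough to keep, along with
    their arg-max pseudo labels."""
    confidence = probs.max(axis=1)       # top predicted probability per sample
    keep = confidence >= threshold       # filter out unreliable predictions
    pseudo_labels = probs.argmax(axis=1)
    return keep, pseudo_labels
```

Only the samples selected by `keep` would then enter the training batch alongside the genuinely labeled data.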
The researchers formulate a meta-reweighting objective: choose weights for the unlabeled samples so as to minimize the final loss on the labeled data. Solving this bi-level problem directly with standard optimization algorithms is computationally prohibitive, so they propose an approximate formulation that admits a closed-form solution. In theory, the algorithm needs only a single meta gradient step per training iteration to obtain an approximate solution.
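The one-step approximation can be sketched in the spirit of gradient-alignment reweighting: an unlabeled sample receives weight proportional to how well its loss gradient agrees with the gradient of the labeled (meta) loss, with conflicting samples clipped to zero. This is a simplified illustration of the general idea, not the paper's exact closed-form expression.

```python
import numpy as np

def meta_reweight(unlabeled_grads, labeled_grad):
    """Weight each unlabeled sample by the clipped, normalized dot product
    between its per-sample gradient (rows of `unlabeled_grads`) and the
    gradient of the labeled loss. Samples whose gradients conflict with
    the labeled objective get weight zero and do not affect the update."""
    sims = unlabeled_grads @ labeled_grad   # alignment score per sample
    weights = np.maximum(sims, 0.0)         # discard conflicting samples
    total = weights.sum()
    return weights / total if total > 0 else weights
```

Because the weights come from a single dot product per sample, this costs one extra gradient evaluation per iteration rather than an inner optimization loop, which is what makes the meta step affordable.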
In short, Meta-Semi uses a dynamic weighting scheme to reweight pseudo-labeled samples, and this scheme is shown to converge to a stationary point of the supervised loss function. On standard image classification benchmarks such as CIFAR-10, CIFAR-100, SVHN, and STL-10, Meta-Semi outperforms state-of-the-art SSL algorithms. It performs especially well on the challenging CIFAR-100 and STL-10 tasks, where it surpasses methods such as ICT and MixMatch, and incorporating consistency regularization into the algorithm improves performance further.
One drawback is that Meta-Semi requires somewhat more training time than competing methods; the researchers plan to address this in future work. For more information, check out the paper and the reference article.