TR2022-156

Learning with noisy labels using low-dimensional model trajectory


Abstract:

Recent work shows that deep neural networks (DNNs) first learn clean samples and then memorize noisy samples. Early stopping can therefore be used to improve performance when training with noisy labels. It was also shown recently that the training trajectory of DNNs can be approximated in a low-dimensional subspace using PCA, and that DNNs can then be trained in this subspace, achieving similar or better generalization. These two observations have been combined to further boost the generalization performance of vanilla early stopping on noisy-label datasets. In this paper, we probe this finding further on different types of real-world and synthetic label noise. First, we show that the prior method is sensitive to the early stopping hyper-parameter. Second, we investigate the effectiveness of PCA for approximating the optimization trajectory under noisy label information. We propose to estimate the low-rank subspace through robust and structured variants of PCA, namely Robust PCA and Sparse PCA. We find that the subspace estimated through these variants can be less sensitive to early stopping and can outperform PCA, achieving better test error when trained on noisy labels.
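As a rough illustration of the subspace idea described in the abstract, the sketch below estimates a low-dimensional subspace from a sequence of flattened weight checkpoints using plain PCA (via SVD) and then restricts a gradient step to that subspace. It is not the authors' implementation: the function names `trajectory_subspace` and `subspace_sgd_step`, the synthetic trajectory, and the dimensions are all assumptions made for this example, and the Robust PCA and Sparse PCA variants studied in the paper are not shown.

```python
import numpy as np


def trajectory_subspace(checkpoints, k):
    """Estimate a k-dimensional subspace from weight checkpoints with PCA.

    checkpoints: array-like of shape (T, D), one flattened weight vector
    per saved training step (hypothetical input format).
    Returns (mean, basis), where basis has shape (D, k) and orthonormal columns.
    """
    W = np.asarray(checkpoints, dtype=float)
    mean = W.mean(axis=0)
    # Rows of Vt are the principal directions of the centered trajectory.
    _, _, Vt = np.linalg.svd(W - mean, full_matrices=False)
    return mean, Vt[:k].T


def subspace_sgd_step(w, grad, basis, lr):
    """Take one gradient step with the gradient projected onto the subspace."""
    projected = basis @ (basis.T @ grad)
    return w - lr * projected


# Synthetic trajectory (assumption): 30 checkpoints in 50 dimensions that
# lie close to a 2-dimensional plane, plus a small amount of noise.
rng = np.random.default_rng(0)
D, T, k = 50, 30, 2
true_basis, _ = np.linalg.qr(rng.normal(size=(D, k)))
coeffs = rng.normal(size=(T, k))
checkpoints = coeffs @ true_basis.T + 1e-3 * rng.normal(size=(T, D))

mean, basis = trajectory_subspace(checkpoints, k)

# Relative error of reconstructing the trajectory from the k-dim subspace.
recon = mean + (checkpoints - mean) @ basis @ basis.T
rel_err = np.linalg.norm(recon - checkpoints) / np.linalg.norm(checkpoints)

# One projected step from the last checkpoint with a random stand-in gradient.
w = checkpoints[-1]
grad = rng.normal(size=D)
w_next = subspace_sgd_step(w, grad, basis, lr=0.1)
```

In this setting, the early stopping hyper-parameter discussed in the abstract would correspond to how many checkpoints are collected before the subspace is estimated. Under label noise, later checkpoints may already reflect memorized noisy samples, which is the motivation the abstract gives for replacing plain PCA with more robust estimators.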

 

  • Related News & Events

    •  NEWS    MERL researchers presenting workshop papers at NeurIPS 2022
      Date: December 2, 2022 - December 8, 2022
      MERL Contacts: Matthew Brand; Toshiaki Koike-Akino; Jing Liu; Saviz Mowlavi; Kieran Parsons; Ye Wang
      Research Areas: Artificial Intelligence, Control, Dynamical Systems, Machine Learning, Signal Processing
      Brief
      • In addition to 5 papers in recent news (https://www.merl.com/news/news-20221129-1450), MERL researchers presented 2 papers at the NeurIPS Conference Workshop, which was held Dec. 2-8. NeurIPS is one of the most prestigious and competitive international conferences in machine learning.

        - “Optimal control of PDEs using physics-informed neural networks” by Saviz Mowlavi and Saleh Nabi

        Physics-informed neural networks (PINNs) have recently become a popular method for solving forward and inverse problems governed by partial differential equations (PDEs). By incorporating the residual of the PDE into the loss function of a neural network-based surrogate model for the unknown state, PINNs can seamlessly blend measurement data with physical constraints. Here, we extend this framework to PDE-constrained optimal control problems, for which the governing PDE is fully known and the goal is to find a control variable that minimizes a desired cost objective. We validate the performance of the PINN framework by comparing it to state-of-the-art adjoint-based optimization, which performs gradient descent on the discretized control variable while satisfying the discretized PDE.

        - “Learning with noisy labels using low-dimensional model trajectory” by Vasu Singla, Shuchin Aeron, Toshiaki Koike-Akino, Matthew E. Brand, Kieran Parsons, Ye Wang

        Noisy annotations in real-world datasets pose a challenge for training deep neural networks (DNNs): incorrect labels may be memorized, which degrades generalization performance. In this work, we probe the observations that early stopping and low-dimensional subspace learning can help address this issue. First, we show that a prior method is sensitive to the early stopping hyper-parameter. Second, we investigate the effectiveness of PCA for approximating the optimization trajectory under noisy label information. We propose to estimate the low-rank subspace through robust and structured variants of PCA, namely Robust PCA and Sparse PCA. We find that the subspace estimated through these variants can be less sensitive to early stopping and can outperform PCA, achieving better test error when trained on noisy labels.

        - In addition, new MERL researcher Jing Liu presented a paper entitled “CoPur: Certifiably Robust Collaborative Inference via Feature Purification”, based on work he did before joining MERL. His paper was selected as a spotlight paper to be highlighted in lightning talks and a featured paper panel.
  • Related Research Highlights