Continual Learning in Neural Networks: Overcoming Catastrophic Forgetting in AI Systems
Keywords:
continual learning, catastrophic forgetting, neural networks, elastic weight consolidation
Abstract
A fundamental challenge in artificial intelligence is continual learning in neural networks: traditional deep learning models exhibit catastrophic forgetting when trained sequentially on new data. This paper explores advanced methods for countering this problem, focusing on elastic weight consolidation, replay buffers, and dynamic network expansion as key techniques for enabling lifelong learning.
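To make the first of these techniques concrete, the sketch below illustrates the quadratic penalty commonly associated with elastic weight consolidation: parameters that were important for an earlier task are pulled back toward their previously learned values. It is a minimal illustration and not the paper's implementation. The function names (ewc_penalty, ewc_total_loss), the flat-list parameter representation, and the diagonal importance estimate fisher_diag are assumptions introduced here for illustration.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float = 1.0,
) -> float:
    """Return (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2.

    params        -- current parameters while training on the new task
    anchor_params -- parameters saved after training on the earlier task
    fisher_diag   -- per-parameter importance (assumed: a diagonal Fisher estimate)
    lam           -- hypothetical hyperparameter trading old-task retention
                     against new-task plasticity
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all three sequences must have the same length")
    weighted_sq_dist = sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )
    return 0.5 * lam * weighted_sq_dist


def ewc_total_loss(
    new_task_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float = 1.0,
) -> float:
    """New-task loss plus the consolidation penalty."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

In this sketch, a parameter with a large importance value is expensive to move away from its saved value, while a parameter with importance near zero remains free to adapt to the new task. In a real training loop the penalty would be computed on the framework's tensors so that gradients flow through it.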