Backdoor Attacks in AI Models - Research Paper
Summary: Researched vulnerabilities in ML models, analyzing backdoor attack techniques and proposing mitigation strategies. Final project for CS360: Computer and Network Security.
Full Paper
📄 Download Full Paper (PDF)
Objective
Investigate how malicious actors can inject backdoors into AI/ML models and propose defenses for critical applications such as automotive and healthcare.
Key Findings
- Analyzed attack methods: trojan attacks, BadNets, and data poisoning.
- Proposed mitigation strategies: dataset vetting, adversarial retraining, and neural activation clustering.
- Explored detection techniques for verifying model integrity before deployment in security-critical settings.
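To make the data-poisoning attack concrete, the sketch below shows a BadNets-style poisoning step: a small trigger patch is stamped onto a fraction of the training images, which are then relabeled to an attacker-chosen target class. A model trained on this data learns to associate the patch with the target label. The function name, patch size, and poisoning rate are illustrative choices, not taken from the paper.

```python
import numpy as np

def poison_dataset(images, labels, target_label, rate=0.1, seed=0):
    """BadNets-style data poisoning (sketch): stamp a white 3x3
    trigger patch onto a random fraction of images and relabel
    them to the attacker's target class.

    Assumes `images` has shape (N, H, W) with values in [0, 1].
    Returns poisoned copies plus the poisoned indices.
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * rate)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = 1.0   # trigger in the bottom-right corner
    labels[idx] = target_label    # flip label to the target class
    return images, labels, idx

# Toy usage: 100 blank 28x28 "images", all labeled class 0.
imgs = np.zeros((100, 28, 28))
lbls = np.zeros(100, dtype=int)
p_imgs, p_lbls, idx = poison_dataset(imgs, lbls, target_label=7)
```

At test time, any input carrying the same bottom-right patch is pushed toward class 7, while clean inputs behave normally, which is what makes such backdoors hard to spot with ordinary accuracy checks.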
Tools & Techniques
Python, TensorFlow, ML model analysis, gradient inspection, neural activation clustering
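The activation-clustering defense listed above can be sketched as follows: for each class, cluster the penultimate-layer activations of its training examples into two groups; in a clean class the split is roughly even, whereas a poisoned class tends to produce one markedly small cluster (the triggered samples). This NumPy-only 2-means and the 35% threshold are illustrative simplifications, not the paper's exact procedure.

```python
import numpy as np

def two_means(x, iters=20):
    """Minimal deterministic 2-means on activation vectors.

    Centers are seeded with the point nearest the overall mean and
    the point farthest from it, so well-separated subpopulations
    land in different clusters.
    """
    mu = x.mean(axis=0)
    d = np.linalg.norm(x - mu, axis=1)
    centers = np.stack([x[d.argmin()], x[d.argmax()]])
    assign = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for k in range(2):
            if (assign == k).any():
                centers[k] = x[assign == k].mean(axis=0)
    return assign

def flag_backdoored_class(activations, small_frac=0.35):
    """Activation-clustering heuristic: flag a class whose smaller
    activation cluster is suspiciously small (threshold is an
    illustrative assumption)."""
    assign = two_means(activations)
    frac = min((assign == 0).mean(), (assign == 1).mean())
    return frac < small_frac
```

In practice the activations would come from the trained model's penultimate layer (e.g. via a truncated TensorFlow model); flagged clusters can then be inspected manually or removed before adversarial retraining.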