Abstract
Neural networks have changed the way machines interpret the world. At their
core, they learn by following gradients, adjusting their parameters step by
step until they identify the most discriminative patterns in the data. This
process gives them their strength, yet it also opens the door to a hidden flaw.
The very gradients that help a model learn can also be used to craft small,
imperceptible perturbations, known as adversarial attacks, that leave an image
essentially unchanged to the human eye yet cause the model to make wrong
predictions. In this work, we propose Adversarially-trained Contrastive
Hard-mining for Optimized Robustness (ANCHOR), a framework that leverages
supervised contrastive learning with explicit hard positive mining. The model
learns representations in which the embeddings of an image, its augmentations,
and its perturbed versions cluster together in the embedding space with those
of other images of the same class while remaining separated from images of
other classes. This alignment helps
the model focus on stable, meaningful patterns rather than fragile gradient
cues. On CIFAR-10, our approach achieves strong clean accuracy and strong
robust accuracy under PGD-20 (epsilon = 0.031), outperforming standard
adversarial training methods. Our results indicate that combining adversarial
guidance with hard-mined contrastive supervision helps models learn more
structured and robust representations, narrowing the gap between accuracy and
robustness.
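As a concrete illustration of the gradient-based perturbations described above, below is a minimal PGD sketch in PyTorch. The epsilon = 0.031 and 20-step setting mirrors the PGD-20 evaluation quoted in the abstract; the step size alpha and the function name pgd_attack are assumptions for illustration, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, eps=0.031, alpha=0.007, steps=20):
    """Minimal PGD sketch: eps and steps follow the PGD-20 setting in the
    abstract; alpha is an assumed per-step size, not from the paper."""
    images = images.clone().detach()
    # Start from a random point inside the epsilon-ball (standard PGD init).
    adv = images + torch.empty_like(images).uniform_(-eps, eps)
    adv = torch.clamp(adv, 0.0, 1.0).detach()

    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        # Signed-gradient ascent step, then projection back into the eps-ball.
        adv = adv.detach() + alpha * grad.sign()
        adv = torch.min(torch.max(adv, images - eps), images + eps)
        adv = torch.clamp(adv, 0.0, 1.0)
    return adv.detach()
```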
Authors (3)
Samarup Bhattacharya
Anubhab Bhattacharya
Abir Chakraborty
Submitted
October 31, 2025
Key Contributions
ANCHOR is a framework that combines adversarial training with supervised contrastive learning and explicit hard positive mining to learn robust image representations. This approach aims to make neural networks more resilient to adversarial attacks by ensuring that learned embeddings are discriminative even for subtly perturbed inputs.
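To make the hard-positive contrastive component more concrete, here is a minimal sketch of a supervised contrastive loss that emphasises hard positives, assuming L2-normalised embeddings for clean, augmented, and adversarial views stacked in one batch with labels repeated accordingly. The temperature, the top_k selection of least-similar same-class samples, and the function name are illustrative assumptions; this reconstructs the general technique, not the authors' exact ANCHOR loss.

```python
import torch
import torch.nn.functional as F

def supcon_hard_positive_loss(embeddings, labels, temperature=0.1, top_k=1):
    """Supervised contrastive loss focused on hard positives (sketch)."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                      # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Log-softmax over all non-self pairs (SupCon-style denominator).
    logits = sim.masked_fill(self_mask, float('-inf'))
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)

    losses = []
    for i in range(n):
        pos_idx = pos_mask[i].nonzero(as_tuple=True)[0]
        if pos_idx.numel() == 0:
            continue
        # Hard positives: same-class samples least similar to the anchor.
        k = min(top_k, pos_idx.numel())
        hard = pos_idx[sim[i, pos_idx].topk(k, largest=False).indices]
        losses.append(-log_prob[i, hard].mean())
    if not losses:
        return embeddings.new_zeros(())
    return torch.stack(losses).mean()
```

In an ANCHOR-style training step, the adversarial views fed into this loss would come from an attack such as the PGD sketch above, so that clean, augmented, and perturbed embeddings of the same class are pulled together while other classes are pushed away.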
Business Value
Increases the reliability and security of AI systems deployed in critical applications, reducing risks associated with malicious manipulation of inputs.