Adversarial Attacks on LLMs
This article explores adversarial attacks on large language models (LLMs), covering attack types, threat models, and their impact on the safety of generated text, and highlighting the open challenges they pose for AI safety.
Lilian Weng · Wed, 25 Oct 2023 00:00:00 +0000