On the Ability of LLMs to Handle Character-Level Perturbations: How Well and How?

📄 Abstract

This work investigates the resilience of contemporary LLMs against frequent and structured character-level perturbations, specifically the insertion of noisy characters after each input character. We introduce UCC-Inj, a practical method that inserts invisible Unicode control characters into text to discourage LLM misuse in scenarios such as online exam systems. Surprisingly, despite strong obfuscation that fragments tokenization and significantly reduces the signal-to-noise ratio, many LLMs still maintain notable performance. Through comprehensive evaluation across model-, problem-, and noise-related configurations, we examine the extent and mechanisms of this robustness, testing hypotheses about how models handle character-level tokenization and whether they denoise the input implicitly or explicitly. We hope our findings on the low-level robustness of LLMs will shed light on the risks of their misuse and on the reliability of deploying LLMs across diverse applications.
Authors (5)
Anyuan Zhuo
Xuefei Ning
Ningyuan Li
Yu Wang
Pinyan Lu
Submitted
October 16, 2025
arXiv Category
cs.CL
arXiv PDF

Key Contributions

This work introduces UCC-Inj, a practical method for obfuscating text using invisible Unicode control characters to deter LLM misuse. It demonstrates that many LLMs maintain notable performance despite significant tokenization fragmentation and reduced signal-to-noise ratio, shedding light on LLM misuse risks and deployment reliability.
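The core idea, inserting an invisible character after every visible character, can be sketched as follows. This is an illustrative sketch only: the paper's exact choice of Unicode control characters is not specified here, so a zero-width space is used as a stand-in assumption.

```python
# Hypothetical sketch of character-level noise injection in the spirit of
# UCC-Inj. The specific invisible characters used in the paper are an
# assumption; U+200B (zero-width space) is chosen for illustration.

ZERO_WIDTH_SPACE = "\u200b"  # renders as nothing, but fragments tokenization

def inject_noise(text: str, noise: str = ZERO_WIDTH_SPACE) -> str:
    """Insert `noise` after every character of `text`."""
    return "".join(ch + noise for ch in text)

question = "What is 2 + 2?"
obfuscated = inject_noise(question)

# The obfuscated string looks identical on screen but doubles in length,
# and stripping the noise recovers the original text exactly.
assert len(obfuscated) == 2 * len(question)
assert obfuscated.replace(ZERO_WIDTH_SPACE, "") == question
```

To a human reader the obfuscated text appears unchanged, while a tokenizer sees a very different byte sequence, which is what reduces the signal-to-noise ratio the paper measures.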

Business Value

Enhances the security of online platforms, such as exam systems, by providing a method to deter misuse of LLMs, for example automated cheating or the generation of harmful content.