Lesson 1
What Is an HTML Entity?
Understand character references and why HTML needs escaping.
An HTML entity (more precisely, a character reference) is a way to write a character using an escape sequence instead of the literal character itself. The most familiar form starts with & and ends with ;, such as &lt; for < or &copy; for ©.
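As a small illustration, the sketch below decodes a few character references with Python's standard `html` module. It assumes Python 3 and also shows the numeric forms (decimal and hexadecimal), which the text above does not cover but which refer to the same characters by code point.

```python
import html

# Three spellings of the same character: named, decimal, and hexadecimal.
named = html.unescape("&lt;")
decimal = html.unescape("&#60;")
hexadecimal = html.unescape("&#x3C;")

# A named reference for a symbol that is awkward to type.
copyright_sign = html.unescape("&copy;")

print(named, decimal, hexadecimal, copyright_sign)  # < < < ©
```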
Why HTML needs escaping
HTML uses < and > to delimit tags. If you insert raw <script> into a page without escaping, the browser may interpret it as markup. Escaping turns < into &lt; so the browser displays the character instead of starting a tag.
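A minimal sketch of that escaping step, assuming the page is assembled in Python and using the standard library's `html.escape`. The variable names and the surrounding `<p>` wrapper are illustrative only.

```python
import html

# Untrusted text that would start a tag if inserted as-is.
untrusted = "<script>alert(1)</script>"

# html.escape replaces &, <, and > (and quotes by default) with references.
escaped = html.escape(untrusted)
print(escaped)  # &lt;script&gt;alert(1)&lt;/script&gt;

# The escaped text can now sit inside markup and be shown as characters.
page = f"<p>{escaped}</p>"
```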
The same idea applies to:
- Ampersands (&): must be escaped first so that &copy; is not parsed incorrectly
- Quotes in attributes: " and ' inside attribute values
- Special symbols: ©, non-breaking spaces, math symbols in content
Entities are not encryption
Encoding does not hide meaning. Anyone can decode Tom &amp; Jerry back to Tom & Jerry. The goal is safe insertion into HTML, not confidentiality.
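The round trip takes one standard-library call in each direction, which is a quick way to see that nothing is hidden. This is a sketch using Python's `html` module.

```python
import html

encoded = html.escape("Tom & Jerry")  # Tom &amp; Jerry
decoded = html.unescape(encoded)      # Tom & Jerry

# No key or secret is involved; decoding recovers the input exactly.
print(decoded == "Tom & Jerry")  # True
```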
Where developers see entities
Common sources include:
- CMS exports that escape HTML for storage or email templates
- Server-side template engines that auto-escape output
- JSON or log fields containing pre-escaped snippets
- Copy-paste from Word or rich-text editors with smart quotes and symbols
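As one illustration of the JSON case, the sketch below decodes a pre-escaped field. The record and its `message` field are made up for the example; the second decoding pass is only there to reveal that part of the text was escaped twice, and is not a general recommendation to decode repeatedly.

```python
import html
import json

# Hypothetical log record whose "message" field was HTML-escaped before storage.
raw = '{"message": "Price &lt; 10 &amp;amp; rising &copy; 2024"}'
record = json.loads(raw)

once = html.unescape(record["message"])
twice = html.unescape(once)

# If a second pass still changes the text, some part was double-escaped.
was_double_escaped = once != twice

print(once)   # Price < 10 &amp; rising © 2024
print(twice)  # Price < 10 & rising © 2024
```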
Decode vs render
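Decoding an entity string produces text. Rendering that text inside HTML is a separate step. Never assume that decoding alone makes content safe for innerHTML: decoding &lt;script&gt; yields a literal <script>, which the browser will parse as markup if it is inserted as HTML.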
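The two steps can be kept apart in code. The sketch below assumes a server-side Python context with made-up variable names: it decodes a stored snippet to plain text, then escapes that text again at the point where it is placed into markup. In browser code, assigning the decoded text to `textContent` rather than `innerHTML` serves the same purpose.

```python
import html

# Stored, pre-escaped snippet (illustrative input).
stored = "&lt;img src=x onerror=alert(1)&gt;"

# Step 1: decode. The result is plain text that happens to look like a tag.
decoded = html.unescape(stored)
print(decoded)  # <img src=x onerror=alert(1)>

# Step 2: render. Escape again when building HTML so the text stays text.
safe_fragment = f"<p>{html.escape(decoded)}</p>"
print(safe_fragment)  # <p>&lt;img src=x onerror=alert(1)&gt;</p>
```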