Lesson 2
Named vs Numeric Entities
Compare named, decimal, and hex entity forms.
The same character can often be written in multiple entity forms. For copyright ©:
- Named: `&copy;`
- Decimal: `&#169;`
- Hexadecimal: `&#xA9;`
After decoding, all three produce the same Unicode character.
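A quick way to confirm this is to decode each form and compare the results. The sketch below assumes Python's standard `html` module as the decoder; the variable names are illustrative only.

```python
import html

# Three spellings of the copyright sign: named, decimal, hexadecimal.
forms = ["&copy;", "&#169;", "&#xA9;"]

# html.unescape turns each entity into the character it stands for.
decoded = [html.unescape(form) for form in forms]

# Every form decodes to the single character U+00A9.
all_same = all(ch == "\u00a9" for ch in decoded)
print(decoded, all_same)
```

Other decoders should behave the same way for these three forms, since all of them are valid in HTML and carry a trailing semicolon.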
Readability vs coverage
Named entities are easier to read in templates and CMS fields. Editors recognize `&amp;` and `&lt;` quickly.
Numeric entities work for any Unicode code point, including characters without a standard named alias. They are essential for rare symbols, emoji-adjacent punctuation, or documents saved in a legacy encoding that cannot represent the character directly.
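The trade-off can be sketched as a small encoder that prefers a named entity and falls back to a numeric one. `encode_char` is a hypothetical helper, not part of any library; it relies on `html.entities.codepoint2name`, which only covers the HTML 4 entity names, so treat the lookup as an assumption about which names count as "standard".

```python
from html.entities import codepoint2name


def encode_char(ch: str, style: str = "named") -> str:
    """Encode one character as an entity in the requested style (hypothetical helper)."""
    cp = ord(ch)
    if style == "named":
        name = codepoint2name.get(cp)
        if name is not None:
            return f"&{name};"
        style = "decimal"  # no HTML 4 name available: fall back to numeric
    if style == "hex":
        return f"&#x{cp:X};"
    return f"&#{cp};"


print(encode_char("\u00a9"))             # named form exists
print(encode_char("\u00a9", "decimal"))
print(encode_char("\u00a9", "hex"))
print(encode_char("\U0001F600"))         # no named alias, numeric fallback
```

The numeric branches never fail, which is the coverage advantage described above; the named branch only succeeds for characters in the lookup table.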
Round-trip behavior
When you encode text and decode it again, the decoded characters should match the original text if you use consistent rules. However, the entity string may differ:
- `©` encoded as `&copy;` vs `&#169;` vs `&#xA9;`
- A space vs `&nbsp;` (a non-breaking space is a different character from a normal space)
Always check whether your workflow cares about character equality or exact entity string equality.
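The two kinds of equality can be checked separately. The following is a minimal sketch using Python's `html.unescape`; the helper names `same_characters` and `same_entity_string` are made up for illustration.

```python
import html


def same_characters(a: str, b: str) -> bool:
    """Character equality: compare the texts after decoding entities."""
    return html.unescape(a) == html.unescape(b)


def same_entity_string(a: str, b: str) -> bool:
    """Exact equality: compare the encoded strings as written."""
    return a == b


named, decimal = "&copy; 2024", "&#169; 2024"

print(same_characters(named, decimal))     # same text once decoded
print(same_entity_string(named, decimal))  # different spellings

# A non-breaking space (U+00A0) is not a normal space (U+0020),
# so these two differ even at the character level.
print(same_characters("a&nbsp;b", "a b"))
```

A diff tool or cache key built on the raw strings behaves like `same_entity_string`, while a content comparison usually wants `same_characters`.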
Legacy forms without semicolons
Some older HTML content uses `&copy` without a trailing semicolon. HTML parsers still accept a fixed set of legacy names in this form, but strict decoders such as XML parsers reject any reference that lacks the semicolon. Prefer the semicolon form in new content.
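The difference between a lenient and a strict decoder can be observed with two standard-library tools. This sketch assumes `html.unescape` as the lenient HTML-style decoder and `xml.etree.ElementTree` as the strict one; `parses_as_xml` is a hypothetical helper. The XML check uses `&amp` because `&copy` is not a predefined XML entity in either form.

```python
import html
import xml.etree.ElementTree as ET

# The HTML-style decoder accepts the legacy form without a semicolon.
lenient = html.unescape("&copy 2024")
print(lenient)


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The XML parser requires the semicolon on every entity reference.
with_semicolon = parses_as_xml("<p>Fish &amp; chips</p>")
without_semicolon = parses_as_xml("<p>Fish &amp chips</p>")
print(with_semicolon, without_semicolon)
```

Content that passes through both kinds of tools is safest when every reference carries its semicolon.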
Choosing a style
| Goal | Suggested style |
|---|---|
| Human-readable templates | Named where available |
| Full Unicode coverage | Decimal or hex |
| Compact logs | Hex (maps directly to U+XXXX notation; length is comparable to decimal) |
| CMS compatibility | Match the platform's default exporter |