Lesson 7
String Escaping and Unicode
Escape sequences, Unicode code points, and special characters in JSON strings.
JSON strings are always delimited by double quotes. Any character that would end the string early or be misread (a double quote, a backslash, or a control character) must be escaped with a backslash.
Common escape sequences
| Sequence | Meaning |
|---|---|
| \" | Double quote inside a string |
| \\ | Literal backslash |
| \n | Newline |
| \t | Tab |
| \r | Carriage return |
| \b | Backspace |
| \f | Form feed |
Example:
{
  "message": "Line one\nLine two",
  "path": "C:\\Users\\dev\\config.json"
}
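To see what a parser does with these escapes, here is a minimal sketch using Python's standard json module. Python is an assumption made for illustration (the lesson itself is language-neutral), and the variable names are hypothetical.

```python
import json

# The JSON text from the example above, held in a raw string so that
# Python itself does not interpret the backslashes.
raw = r'{"message": "Line one\nLine two", "path": "C:\\Users\\dev\\config.json"}'

data = json.loads(raw)

# After parsing, \n is a real newline and each \\ is a single backslash.
lines = data["message"].split("\n")
path = data["path"]

# Serializing the path again puts the escapes back.
encoded = json.dumps(path)

print(lines)
print(path)
print(encoded)
```

The escapes exist only in the JSON text. Once parsed, the in-memory string holds the real characters.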
Unicode escapes
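Use \uXXXX, with exactly four hexadecimal digits, to write a character by its code point. Four digits cover the Basic Multilingual Plane (U+0000 through U+FFFF), and the digits may be upper or lower case: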
{
  "greeting": "Hello, \u4e16\u754c"
}
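Characters outside the Basic Multilingual Plane (above U+FFFF) do not fit in four hex digits. In escaped form they are written as a UTF-16 surrogate pair, that is, two consecutive \u escapes: U+1F600 becomes \ud83d\ude00. In a UTF-8 file you can also write the character literally, and parsers accept either form.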
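The surrogate-pair arithmetic can be checked by hand. The sketch below again assumes Python's standard json module and uses U+1F600 as an illustrative code point; the variable names are hypothetical.

```python
import json

# U+1F600 lies outside the BMP, so its escaped JSON form is a UTF-16
# surrogate pair: a high surrogate followed by a low surrogate.
escaped = r'"\ud83d\ude00"'
decoded = json.loads(escaped)

# The parser combines the pair into a single code point.
code_point = ord(decoded)

# Computing the same pair by hand from the code point.
offset = 0x1F600 - 0x10000
high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset

print(hex(code_point), hex(high), hex(low))
```

With Python's default settings, json.dumps writes the same character back out as the two-escape form.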
Characters you cannot put raw in strings
Control characters (U+0000 through U+001F) must be escaped, either with a short form such as \n or \t, or as \u00XX when no short form exists. A raw line break inside a string is invalid JSON; write \n instead.
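As a rough illustration of this rule, the following sketch (assuming Python's standard json module with its default strict parsing; the sample strings are made up) shows a raw line break being rejected and a control character being written as \u00XX.

```python
import json

# A raw line break inside a string is rejected by a strict parser.
# Note: this Python literal contains an actual newline character.
bad = '{"note": "first\nsecond"}'
try:
    json.loads(bad)
    rejected = False
except json.JSONDecodeError:
    rejected = True

# The escaped form is accepted and decodes to a real newline.
# The raw string keeps the backslash and the letter n as two characters.
good = r'{"note": "first\nsecond"}'
note = json.loads(good)["note"]

# A control character with no short escape is serialized as \u00XX.
encoded = json.dumps("bell:\x07")

print(rejected, repr(note), encoded)
```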
UTF-8 files vs escaped Unicode
A .json file saved as UTF-8 may contain literal Chinese or emoji characters:
{ "label": "世界" }
That is valid JSON. The escaped form \u4e16\u754c parses to exactly the same string, so choose whichever keeps your pipeline and diff tools happiest.
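A short check of that equivalence, sketched with Python's standard json module (an assumption; other libraries expose similar switches under different names). The ensure_ascii parameter is the real json.dumps option that selects between the two output forms.

```python
import json

literal = '{ "label": "世界" }'
escaped = r'{ "label": "\u4e16\u754c" }'

# Both texts parse to the same in-memory value.
same = json.loads(literal) == json.loads(escaped)

# ensure_ascii controls which form the serializer writes.
as_ascii = json.dumps({"label": "世界"})                     # escaped output
as_utf8 = json.dumps({"label": "世界"}, ensure_ascii=False)  # literal output

print(same)
print(as_ascii)
print(as_utf8)
```

Escaped output survives tools that mishandle non-ASCII bytes. Literal output is easier to read in diffs.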
Practical tips
- When copying strings from logs, watch for smart quotes (“ ”); they are not valid JSON delimiters.
- API docs often show \n in examples; your parser converts them to real newlines in memory.
- If validation fails inside a long string, search for unescaped backslashes or broken \u sequences.
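The tips above can be turned into a small checking helper. This is only a sketch: diagnose and SMART_QUOTES are hypothetical names, and the error details come from the exception raised by Python's standard json module. Smart quotes are legal inside a string value, so the first kind of hint is a prompt to look, not proof of an error.

```python
import json

# Curly double and single quotes that often sneak in from logs or documents.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"

def diagnose(text):
    """Return human-readable hints about why text may fail to parse."""
    hints = []
    for index, char in enumerate(text):
        if char in SMART_QUOTES:
            hints.append(f"smart quote U+{ord(char):04X} at offset {index}")
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # The exception carries the message and the error position.
        hints.append(f"{err.msg} (line {err.lineno}, column {err.colno})")
    return hints

# Smart quotes used as delimiters, and an unescaped backslash in a path.
for sample in ['{\u201cname\u201d: 1}', r'{"path": "C:\Users"}']:
    for hint in diagnose(sample):
        print(hint)
```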