Lesson 4

Common HTML Entity Mistakes

A guide to common HTML entity mistakes: avoid double encoding, treating decode as sanitize, and whitespace bugs.

Entity bugs often look like "wrong character on the page" but come from pipeline ordering or invisible characters.

Double encoding

Encoding already-escaped content is the most frequent issue. Symptoms:

  • Users see the literal text &copy; instead of ©
  • Search finds the literal string &lt; in rendered text

Always know whether your input is raw user text or pre-escaped storage.
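
One way to keep that distinction explicit is to carry it in the type. The Python sketch below first reproduces the double-encoding symptom with the standard library's html.escape, then wraps escaped output in a small marker class so that a second pass becomes a no-op. The Escaped class and the escape_once helper are hypothetical names introduced for illustration; they are not part of any library.

```python
import html
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Escaped:
    """Marker for text that has already been HTML-escaped (hypothetical helper)."""
    value: str


def escape_once(text: Union[str, Escaped]) -> Escaped:
    """Escape raw text; return already-escaped text unchanged."""
    if isinstance(text, Escaped):
        return text
    return Escaped(html.escape(text))


raw = "Tom & Jerry <3"

# The bug: escaping twice turns "&amp;" into "&amp;amp;".
double = html.escape(html.escape(raw))

# The guard: the second call sees the marker and does nothing.
safe = escape_once(escape_once(raw))
```

The guard only works if every layer passes the marker along instead of unwrapping to a plain string too early, so treat it as a convention to enforce at storage and rendering boundaries rather than a complete fix.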

Treating decode as sanitize

Decoding &lt;script&gt;alert(1)&lt;/script&gt; produces the text <script>alert(1)</script>. Decoding does not make that text safe: inserting it into HTML without sanitization is still dangerous if your renderer interprets tags.

Decode for inspection and editing. Sanitize separately before rendering untrusted HTML.
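
The separation can be sketched in Python with the standard html module. Decoding happens in one step for inspection, and a second, independent step escapes the result before it goes back into a page. The render_as_text helper is a hypothetical name, and it only covers the case where untrusted input should be displayed literally; allowing a subset of markup through would need a dedicated HTML sanitizer, which this sketch does not attempt.

```python
import html

stored = "&lt;script&gt;alert(1)&lt;/script&gt;"

# Decoding recovers the original characters, including live-looking markup.
decoded = html.unescape(stored)


def render_as_text(untrusted: str) -> str:
    """Escape untrusted text so an HTML renderer displays it literally."""
    return html.escape(untrusted)


# Separate step: escape again before the string reaches an HTML context.
rendered = render_as_text(decoded)
```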

Confusing space and nbsp

&nbsp; is a non-breaking space (U+00A0), not the same as a normal space (U+0020). Mixing the two causes layout bugs, copy-paste mismatches, and string comparisons that fail silently.

Use a preview that visualizes nbsp and tabs when debugging.
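
A minimal Python sketch of the mismatch and of such a visualizer follows. The helper names and the [NBSP] and [TAB] markers are arbitrary choices made for this example.

```python
import html

NBSP = "\u00a0"

# Text decoded from HTML carries U+00A0; text typed by a user carries U+0020.
from_html = html.unescape("New&nbsp;York")
typed = "New York"


def visualize_whitespace(text: str) -> str:
    """Replace invisible characters with visible markers for debugging."""
    return text.replace(NBSP, "[NBSP]").replace("\t", "[TAB]")


def normalize_spaces(text: str) -> str:
    """Map non-breaking spaces to ordinary spaces before comparing."""
    return text.replace(NBSP, " ")


strings_match = from_html == typed                        # False: the strings look identical
matches_after_normalizing = normalize_spaces(from_html) == typed  # True
```

Whether normalizing is appropriate depends on the field: it suits search keys and comparisons, but it would discard intentional non-breaking spaces in display text.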

Wrong entity for the context

Encoding < is essential in HTML. Encoding every ASCII letter "just to be safe" creates unreadable CMS fields and breaks full-text search. Escape what the output format requires, and no more.
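
The contrast can be sketched in Python. The first two helpers escape only what their context needs, using the quote parameter of the standard html.escape; the third shows the over-encoding anti-pattern. All three function names are hypothetical and exist only for this example.

```python
import html


def escape_text(text: str) -> str:
    """Escape for element content: only &, < and > are replaced."""
    return html.escape(text, quote=False)


def escape_attribute(text: str) -> str:
    """Escape for a quoted attribute value: quotes are replaced as well."""
    return html.escape(text, quote=True)


def over_encode(text: str) -> str:
    """Anti-pattern: turn every character into a numeric reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


sample = 'price < 10 & size = "M"'

as_text = escape_text(sample)            # words stay readable and searchable
as_attribute = escape_attribute(sample)  # same, plus escaped double quotes
unreadable = over_encode(sample)         # "price" no longer appears anywhere
```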

Invalid numeric entities

Malformed references such as the out-of-range &#9999999; (beyond U+10FFFF) or the non-hexadecimal &#xZZ; should fail loudly during batch processing instead of being silently passed through.
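
A strict pre-check for a batch job might look like the Python sketch below. It is an assumption-laden illustration rather than a full HTML parser: the function name check_numeric_entities is hypothetical, and the rules it applies (a terminating semicolon, decimal or hexadecimal digits only, a non-zero code point no higher than U+10FFFF and outside the surrogate range) are this example's own definition of "valid".

```python
import re

# Anything that starts like a numeric reference, well-formed or not.
_CANDIDATE = re.compile(r"&#([^;&\s]*)(;?)")
_DECIMAL = re.compile(r"[0-9]+")
_HEX = re.compile(r"[xX]([0-9a-fA-F]+)")


def check_numeric_entities(text: str) -> None:
    """Raise ValueError on the first malformed numeric character reference."""
    for match in _CANDIDATE.finditer(text):
        body, terminator = match.group(1), match.group(2)
        ref = match.group(0)
        if terminator != ";":
            raise ValueError(f"unterminated numeric reference: {ref!r}")
        hex_match = _HEX.fullmatch(body)
        if hex_match:
            code_point = int(hex_match.group(1), 16)
        elif _DECIMAL.fullmatch(body):
            code_point = int(body)
        else:
            raise ValueError(f"malformed numeric reference: {ref!r}")
        is_surrogate = 0xD800 <= code_point <= 0xDFFF
        if code_point == 0 or code_point > 0x10FFFF or is_surrogate:
            raise ValueError(f"code point out of range: {ref!r}")
```

Running the check before decoding matters because lenient decoders may repair or skip bad references without reporting them, which hides the corrupt record from the batch log.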

If you want to practice, you can use the related DevCove tool. It is optional and not a required part of this lesson.