URL Encoding Explained — When, Why & How (Percent Encoding)

URL encoding, also called percent-encoding, is one of those internet primitives nobody learns formally but everyone has run into. A URL with a space in it breaks; the fix is to replace the space with `%20` or `+`, and you mostly memorize that and move on. But the rules are more subtle than they look, and confusing `encodeURI` with `encodeURIComponent` is a recurring source of production bugs.

This guide explains what URL encoding is for, which characters actually need encoding, the two JavaScript functions and when to use each, and the specific mistakes that show up in production URLs.

Why URLs reserve certain characters

A URL is structured: `scheme://host:port/path?query#fragment`. The parts are separated by specific delimiters: `://`, `/`, `?`, and `#` between components, and `&` and `=` inside the query string. If one of those characters appears unencoded inside the data it delimits, the URL becomes ambiguous. A query string of `?name=John&Jane` parses as two parameters (`name=John` and `Jane`), even if the intended value was `John&Jane`, because `&` is a delimiter.

Percent encoding replaces problematic characters with `%XX` sequences, where `XX` is the hex value of the byte. A space becomes `%20`, an ampersand becomes `%26`, a `+` becomes `%2B`. Non-ASCII characters get UTF-8 encoded first, then percent-encoded per byte (a Chinese character ends up as something like `%E4%B8%AD`).
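To make the byte-to-`%XX` mapping concrete, here is a minimal sketch. The helper name `percentEncodeAllBytes` is made up for illustration, and unlike a real encoder it escapes every byte, including letters and digits that would normally be left alone.

```javascript
// Hypothetical helper: percent-encode every byte of a string's UTF-8 form.
// Real encoders skip unreserved characters; this one escapes everything
// so the mapping from bytes to %XX is visible.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 bytes as a Uint8Array
  return Array.from(
    bytes,
    (b) => "%" + b.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

console.log(percentEncodeAllBytes(" "));  // %20
console.log(percentEncodeAllBytes("&"));  // %26
console.log(percentEncodeAllBytes("+"));  // %2B
console.log(percentEncodeAllBytes("中")); // %E4%B8%AD
```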

Reserved vs unreserved characters

RFC 3986 divides URL characters into two classes. Unreserved characters (letters, digits, `-`, `.`, `_`, `~`) are always safe and never need encoding. Reserved characters (`:`, `/`, `?`, `#`, `[`, `]`, `@`, `!`, `$`, `&`, `'`, `(`, `)`, `*`, `+`, `,`, `;`, `=`) have special meaning at specific positions and need encoding when used as literal data.

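Anything outside these two sets (spaces, accented characters, emojis, control characters, and ASCII punctuation such as `"`, `<`, and `>`) must always be encoded. A literal `%` must also be written as `%25`, because `%` introduces every escape sequence. Modern browsers and HTTP libraries handle most of this automatically, but the manual cases (constructing URLs in code) require attention.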
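If you want an encoder that leaves only the RFC 3986 unreserved set untouched, one common approach is to post-process `encodeURIComponent`, which on its own leaves `!`, `'`, `(`, `)`, and `*` unescaped. The sketch below assumes that approach; `encodeRFC3986Component` is a hypothetical name, not a built-in.

```javascript
// Hypothetical helper (not a built-in): escape everything except the
// RFC 3986 unreserved set (letters, digits, "-", ".", "_", "~").
function encodeRFC3986Component(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRFC3986Component("a-b_c.d~e"));   // a-b_c.d~e
console.log(encodeRFC3986Component("it's (ok)!*")); // it%27s%20%28ok%29%21%2A
console.log(encodeRFC3986Component("100%"));        // 100%25
```

For most query strings plain `encodeURIComponent` is enough; the stricter form mainly matters when the receiving side treats those five characters specially.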

encodeURI vs encodeURIComponent

JavaScript has two URL-encoding functions and they're not interchangeable. `encodeURI` is meant for entire URLs: it preserves the structural characters (`:`, `/`, `?`, `&`, `=`, `#`) so the URL stays parseable. `encodeURIComponent` is meant for individual URL components (query parameter values, path segments): it encodes everything except letters, digits, and `-`, `_`, `.`, `!`, `~`, `*`, `'`, `(`, `)`, so all the structural characters are escaped.
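A quick side-by-side shows the difference. The `sample` string is arbitrary, chosen only because it contains one of each structural character.

```javascript
// Same input, two encoders.
const sample = "a b&c=d/e?f#g";

console.log(encodeURI(sample));          // a%20b&c=d/e?f#g
console.log(encodeURIComponent(sample)); // a%20b%26c%3Dd%2Fe%3Ff%23g

// Characters that encodeURIComponent leaves alone.
console.log(encodeURIComponent("!'()*-_.~")); // !'()*-_.~
```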

The rule: use `encodeURIComponent` when building a URL piece by piece (query parameter values, path segments, fragment identifiers); use `encodeURI` only when you have a complete URL string and just want to make spaces and accented characters safe.

The classic bug: `"/search?q=" + encodeURI(query)` looks right but breaks when the query contains `&` — the ampersand is preserved by `encodeURI`, and downstream parsers split the query string at it. Use `encodeURIComponent` for the value.
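The bug is easy to reproduce. In this sketch the search term and the `https://example.test` base are placeholders, and `readQ` is a hypothetical helper that parses the result the way a server-side query parser typically would.

```javascript
const query = "salt & pepper";

const broken = "/search?q=" + encodeURI(query);
const fixed = "/search?q=" + encodeURIComponent(query);

console.log(broken); // /search?q=salt%20&%20pepper
console.log(fixed);  // /search?q=salt%20%26%20pepper

// Hypothetical helper: read back the "q" parameter from a relative URL.
function readQ(relativeUrl) {
  return new URL(relativeUrl, "https://example.test").searchParams.get("q");
}

console.log(JSON.stringify(readQ(broken))); // "salt "
console.log(JSON.stringify(readQ(fixed)));  // "salt & pepper"
```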

Plus signs and form-encoded queries

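Two encodings exist for spaces in URLs. `%20` is the strict percent-encoding from RFC 3986. `+` is the older `application/x-www-form-urlencoded` variant used by HTML forms. Servers usually accept both in query strings, and form decoders turn both into a literal space. Outside the query string (in a path, for example) a `+` is just a plus sign, and JavaScript's `decodeURIComponent` leaves `+` untouched.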

The catch: a literal `+` in your data has to be encoded as `%2B`. If you just use `encodeURIComponent`, you get `%2B` — perfect. If you concatenate strings yourself and rely on the form-encoded space-as-plus convention, a real plus sign in the data turns into a space when the server decodes it. This is how phone numbers with country codes get mangled.
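Here is a small sketch of the phone-number case, using the standard `URLSearchParams` class as a stand-in for a form-decoding server. The phone number is a made-up example.

```javascript
const phone = "+1 555 0100";

// Hand-rolled form encoding: spaces become "+", the real "+" is left as-is.
const naive = "?phone=" + phone.replace(/ /g, "+");
console.log(naive); // ?phone=+1+555+0100
console.log(JSON.stringify(new URLSearchParams(naive).get("phone"))); // " 1 555 0100"

// encodeURIComponent escapes the literal plus, so it survives decoding.
const safe = "?phone=" + encodeURIComponent(phone);
console.log(safe); // ?phone=%2B1%20555%200100
console.log(JSON.stringify(new URLSearchParams(safe).get("phone"))); // "+1 555 0100"

// URLSearchParams serializes with form rules: "+" for space, "%2B" for plus.
console.log(new URLSearchParams({ phone }).toString()); // phone=%2B1+555+0100

// decodeURIComponent does not apply the plus-means-space rule.
console.log(decodeURIComponent("a+b")); // a+b
```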

Internationalized URLs

Non-ASCII paths and queries are encoded as UTF-8 bytes, then percent-encoded per byte. A path containing `日本` becomes `/%E6%97%A5%E6%9C%AC`. Most modern browsers display this in the address bar in the original characters but transmit the encoded form on the wire.
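The same conversion can be checked in code. The `https://example.test` host below is a placeholder; the `URL` class is used only to show that a standards-based parser produces the same encoded path.

```javascript
const segment = "日本";

// Encode one path segment by hand.
const encodedPath = "/" + encodeURIComponent(segment);
console.log(encodedPath);                     // /%E6%97%A5%E6%9C%AC
console.log(decodeURIComponent(encodedPath)); // /日本

// The URL parser percent-encodes non-ASCII path characters for you.
console.log(new URL("https://example.test/日本").pathname); // /%E6%97%A5%E6%9C%AC
```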

Domain names are different. International domains use Punycode (`xn--fiq.com` for `中.com`), not percent encoding. Don't try to percent-encode a hostname: DNS only understands the Punycode form.
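In environments that implement the standard `URL` class (browsers and Node.js), the hostname conversion happens during parsing, so you rarely need to do it by hand. A minimal sketch, assuming such an environment:

```javascript
// The URL parser converts an internationalized hostname to its ASCII form.
const idn = new URL("https://中.com/日本");

console.log(idn.hostname); // xn--fiq.com
console.log(idn.pathname); // /%E6%97%A5%E6%9C%AC
```

Note how the two parts of the same URL get different treatment: Punycode for the host, percent-encoded UTF-8 for the path.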

Debugging broken URLs

Symptoms of double encoding (where a value gets encoded twice): the URL contains `%2520` (which is `%20` itself percent-encoded, because `%` becomes `%25`) or `%2526` where you expected `%26` for `&`. The fix is to find the duplicate encoding step and remove one layer.
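Double encoding is easy to reproduce and to spot. In the sketch below, `looksDoubleEncoded` is a hypothetical heuristic, not a definitive test: data that legitimately contains text like `%41` also encodes to `%2541` and would trigger it.

```javascript
const value = "a b&c";
const once = encodeURIComponent(value);
const twice = encodeURIComponent(once); // the bug: encoding an encoded value

console.log(once);  // a%20b%26c
console.log(twice); // a%2520b%2526c

// Hypothetical heuristic: "%25" followed by two hex digits suggests
// that an existing %XX escape was encoded a second time.
function looksDoubleEncoded(text) {
  return /%25[0-9A-Fa-f]{2}/.test(text);
}

console.log(looksDoubleEncoded(once));  // false
console.log(looksDoubleEncoded(twice)); // true

// Decoding twice recovers the original value.
console.log(decodeURIComponent(decodeURIComponent(twice)) === value); // true
```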

Symptoms of missing or mismatched encoding: the URL looks fine but breaks server-side; query parameters get split unexpectedly; a `+` in the data arrives as a space, or spaces show up as literal `+` because a form-encoded value was decoded with plain percent-decoding. The fix is to confirm every component you concatenate into a URL went through `encodeURIComponent`.
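One way to enforce that rule is to route all URL construction through a single function. The `buildUrl` helper, its parameters, and the `https://example.test` origin below are assumptions for illustration, not part of any library.

```javascript
// Hypothetical helper: every path segment, parameter name, and parameter
// value passes through encodeURIComponent exactly once.
function buildUrl(origin, segments, params) {
  const path = segments.map((s) => encodeURIComponent(s)).join("/");
  const query = Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `${origin}/${path}${query ? "?" + query : ""}`;
}

const built = buildUrl(
  "https://example.test",
  ["files", "a/b c.txt"],
  { q: "x&y=z", tag: "c++" }
);
console.log(built);
// https://example.test/files/a%2Fb%20c.txt?q=x%26y%3Dz&tag=c%2B%2B
```

Because the delimiters (`/`, `?`, `&`, `=`) are added only by the helper, any of those characters that appear in the data are guaranteed to be escaped.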

Tools: any in-browser URL encoder/decoder lets you paste in a suspicious URL, decode it, see the actual values, and re-encode if needed.

Quick reference

Use `encodeURIComponent` when building URL pieces (query values, path segments).
Use `encodeURI` only on a complete URL string to escape spaces/accents.
Spaces encode as `%20` or `+` depending on context; a literal `+` in query data must be sent as `%2B`.
Non-ASCII characters: UTF-8 encode, then percent-encode per byte.
International domain names use Punycode, not percent encoding.

Wrapping up

URL encoding is one of those small disciplines that prevents an outsized share of production bugs. Pick `encodeURIComponent` as your default, reach for `encodeURI` only when you really do have a whole URL, and pay attention to `+` vs `%20` in query strings.

If a URL is misbehaving and you need to see exactly what's in it, our free URL encoder/decoder handles both directions, supports component and full-URL modes, and stays in your browser.