What is URL Encoding?

URL encoding, formally known as percent encoding, is the mechanism used to convert characters into a format that can be safely included in a Uniform Resource Locator. URLs can only contain a limited set of characters from the US-ASCII character set. When a URL needs to include characters outside this set, or characters that have special meaning within the URL syntax, those characters must be encoded.

The encoding process replaces unsafe characters with a percent sign (%) followed by two hexadecimal digits that represent the character's byte value. For example, a space becomes %20, a forward slash becomes %2F, and an ampersand becomes %26. This simple substitution system ensures that URLs remain valid and unambiguous regardless of the data they carry.

URL encoding is defined in RFC 3986, which specifies the syntax of URIs. Every time you submit a form, click a link with special characters, or make an API request with query parameters, percent encoding is working behind the scenes to keep things running smoothly.

Why Special Characters Must Be Encoded

URLs use certain characters as structural delimiters. The colon and double slash (://) separate the scheme from the authority, the forward slash (/) divides path segments, the question mark (?) marks the start of a query string, and the ampersand (&) separates key-value pairs within that query. If your data happened to contain any of these characters, the browser or server would misinterpret the URL's structure.

Consider a search query like cats & dogs. If you placed that directly into a URL as ?q=cats & dogs, the browser would interpret & as a parameter separator and the space would break the URL entirely. The encoded version, ?q=cats%20%26%20dogs, preserves the intended meaning without any ambiguity.
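The effect can be checked with a minimal sketch that uses the standard URL parser. The example.com endpoint is a placeholder, and the variable names are only illustrative.

```javascript
const query = "cats & dogs";

// Encode the value before placing it in the query string.
const encoded = encodeURIComponent(query); // "cats%20%26%20dogs"
const safeUrl = "https://example.com/search?q=" + encoded;

// Parsing the URL back recovers the original text.
const received = new URL(safeUrl).searchParams.get("q"); // "cats & dogs"

// Without encoding, the value is cut off at the ampersand.
const unsafeUrl = "https://example.com/search?q=" + query;
const truncated = new URL(unsafeUrl).searchParams.get("q"); // "cats "

console.log(safeUrl, received, JSON.stringify(truncated));
```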

Beyond structural characters, URLs cannot contain spaces, non-ASCII characters such as accented letters or emoji, or control characters. All of these must be percent-encoded before they can appear in a valid URL.

The Percent Encoding System

The percent encoding system is straightforward. Each character that needs encoding is converted to its UTF-8 byte sequence, and each byte is then represented as a percent sign followed by two hexadecimal digits (uppercase by convention). Some of the most common encodings you will encounter are space (%20), ! (%21), # (%23), $ (%24), % (%25), & (%26), + (%2B), / (%2F), : (%3A), = (%3D), ? (%3F), and @ (%40).

For multi-byte characters like the euro sign, the encoding uses multiple percent-encoded bytes. The euro sign has the UTF-8 byte sequence E2 82 AC, so it becomes %E2%82%AC in a URL. This is why modern internationalized URLs with non-Latin characters can sometimes look very long when encoded.
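The byte-by-byte process can be sketched in a few lines. The helper name percentEncodeAllBytes is hypothetical, not a built-in, and it deliberately encodes every byte (including unreserved characters) just to show the mechanics. In real code, encodeURIComponent() is the usual choice.

```javascript
// Hypothetical helper: percent-encode every UTF-8 byte of a string.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // UTF-8 byte sequence
  return Array.from(bytes, (byte) =>
    "%" + byte.toString(16).toUpperCase().padStart(2, "0")
  ).join("");
}

const euro = percentEncodeAllBytes("€");  // "%E2%82%AC" (three bytes)
const space = percentEncodeAllBytes(" "); // "%20" (one byte)

// The built-in function produces the same result for the euro sign.
const builtIn = encodeURIComponent("€");  // "%E2%82%AC"

console.log(euro, space, builtIn);
```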

Reserved vs Unreserved Characters

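RFC 3986 divides URL characters into two groups. Unreserved characters are safe to use anywhere in a URL without encoding. These include uppercase and lowercase letters (A-Z, a-z), digits (0-9), hyphens (-), underscores (_), periods (.), and tildes (~).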

Reserved characters have special meaning within the URL structure. These include :, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, and =. When these characters appear as data rather than as delimiters, they must be percent-encoded. When they serve their intended structural purpose, they must not be encoded.

This distinction is critical. Encoding a reserved character when it is being used as a delimiter will break the URL. Failing to encode a reserved character when it is part of user data will also break the URL. Knowing which context you are in determines whether a character should be encoded.
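As a rough illustration of encoding by context, the sketch below keeps the delimiters that give the URL its structure and encodes only the data placed between them. The file name, label, and example.com host are made-up values.

```javascript
// Data that happens to contain reserved characters (hypothetical values).
const fileName = "reports/2024 Q1?.pdf";
const label = "a=b&c";

// The "/" after "files" and the "?" and "=" of the query are delimiters,
// so they stay literal. The data between them is encoded.
const url =
  "https://example.com/files/" + encodeURIComponent(fileName) +
  "?label=" + encodeURIComponent(label);
// "https://example.com/files/reports%2F2024%20Q1%3F.pdf?label=a%3Db%26c"

const parsed = new URL(url);
console.log(parsed.pathname);                  // "/files/reports%2F2024%20Q1%3F.pdf"
console.log(parsed.searchParams.get("label")); // "a=b&c"
```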

encodeURI() vs encodeURIComponent()

JavaScript provides two built-in functions for URL encoding, and using the wrong one is one of the most common bugs in web development. Understanding the difference is essential.

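encodeURI() is designed to encode a complete URL. It leaves structural characters intact so the URL remains valid. It does not encode :, /, ?, #, @, !, $, &, ', (, ), *, +, ,, ;, or =. It does encode the square brackets [ and ] (as %5B and %5D), even though RFC 3986 lists them as reserved.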

encodeURI("https://example.com/search?q=hello world")
// "https://example.com/search?q=hello%20world"

encodeURIComponent() is designed to encode a single value that will be placed into a URL. It encodes every character except letters, digits, and - _ . ! ~ * ' ( ), so structural characters like /, ?, &, and = are all encoded.

encodeURIComponent("hello world & goodbye")
// "hello%20world%20%26%20goodbye"
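One detail worth knowing: encodeURIComponent() leaves !, ', (, ), and * unencoded, although RFC 3986 lists them as reserved. When a stricter result is needed, a commonly used pattern is to post-process the output. The helper name encodeRFC3986Component below is hypothetical, and this is a sketch rather than a standard API.

```javascript
// Hypothetical helper: also encode the five characters ! ' ( ) *
// that encodeURIComponent() leaves untouched.
function encodeRFC3986Component(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

const loose = encodeURIComponent("it's (new)!*");      // "it's%20(new)!*"
const strict = encodeRFC3986Component("it's (new)!*"); // "it%27s%20%28new%29%21%2A"

console.log(loose, strict);
```

Both forms decode to the same text with decodeURIComponent(), so the stricter form only matters when a consumer treats those characters specially.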

The rule of thumb: use encodeURIComponent() for individual parameter values and path segments. Use encodeURI() only when you have a full URL that just needs its spaces and non-ASCII characters cleaned up. In practice, encodeURIComponent() is what you need the vast majority of the time.

// Correct: encode only the value
const url = "https://api.example.com/search?q=" + encodeURIComponent(userInput);

// Wrong: encodeURI won't encode & in the value
const brokenUrl = encodeURI("https://api.example.com/search?q=" + userInput);

Common Encoding Mistakes in Web Development

Double encoding is the most frequent mistake. This happens when a value is encoded twice, turning %20 into %2520 (the percent sign itself gets encoded). This typically occurs when a library or framework already encodes values and the developer manually encodes them again. Always check whether your HTTP client or framework handles encoding automatically before adding your own.
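A short sketch of how double encoding arises, using URLSearchParams as the stand-in for a library that already encodes for you:

```javascript
const value = "hello world";

const once = encodeURIComponent(value); // "hello%20world"
const twice = encodeURIComponent(once); // "hello%2520world"

// A single decode on the receiving side no longer yields the original text.
const decodedOnce = decodeURIComponent(twice); // "hello%20world"

// URLSearchParams encodes on its own, so passing it a pre-encoded
// value produces the same double encoding.
const doubled = new URLSearchParams({ q: once }).toString(); // "q=hello%2520world"

console.log(once, twice, decodedOnce, doubled);
```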

Using encodeURI() for parameter values is another common error. Because encodeURI() does not encode ampersands or equals signs, a user-supplied value containing these characters will corrupt the query string structure. Always use encodeURIComponent() for values.

Encoding the entire URL with encodeURIComponent() is the reverse mistake. This will encode the slashes and colons that form the URL's structure, so the result no longer works as a URL. If you see https%3A%2F%2F at the very start of an address, someone used the wrong function. (The same text inside a query parameter value, such as a redirect target, is correct.)

Forgetting to decode on the server can lead to subtle bugs where data appears with percent codes instead of the intended characters. Most server frameworks decode automatically, but custom parsing logic may skip this step.
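For custom parsing logic, a minimal decoding sketch might look like the following. The helper name safeDecode is hypothetical. It relies on the fact that decodeURIComponent() throws a URIError on malformed input, which hand-written parsers should be prepared for.

```javascript
// Hypothetical helper: decode one component, returning null for
// malformed percent sequences instead of throwing.
function safeDecode(component) {
  try {
    return decodeURIComponent(component);
  } catch (err) {
    if (err instanceof URIError) return null; // e.g. a stray "%"
    throw err; // anything else is unexpected
  }
}

const ok = safeDecode("caf%C3%A9"); // "café"
const bad = safeDecode("100%");     // null

console.log(ok, bad);
```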

Query Strings and Form Data Encoding

Query strings are the most common place where URL encoding matters. A query string begins with ? and contains key-value pairs separated by &, with keys and values joined by =. Both keys and values must be individually encoded with encodeURIComponent().

const params = new URLSearchParams();
params.set("name", "Jane Doe");
params.set("city", "San Francisco");
console.log(params.toString());
// "name=Jane+Doe&city=San+Francisco"

The modern URLSearchParams API in JavaScript handles encoding automatically and is the recommended approach for building query strings. It eliminates the need to manually call encodeURIComponent() on each value. Note that it follows the application/x-www-form-urlencoded rules, so spaces are written as + rather than %20.
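URLSearchParams is also available through the searchParams property of a URL object, which lets you build a complete address without concatenating strings. A small sketch, with api.example.com as a placeholder endpoint and made-up parameter names:

```javascript
const endpoint = new URL("https://api.example.com/search");

// Keys and values are encoded automatically when the URL is serialized.
endpoint.searchParams.set("q", "cats & dogs");
endpoint.searchParams.set("page", "2");

const href = endpoint.toString();
// "https://api.example.com/search?q=cats+%26+dogs&page=2"

console.log(href);
```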

HTML forms use a related but slightly different encoding called application/x-www-form-urlencoded. This format is nearly identical to standard percent encoding with one key difference: spaces are encoded as + instead of %20. When a form is submitted with method="GET", the browser encodes the form data into the query string using this format. With method="POST", the same encoding is used in the request body by default.

The plus-for-space convention dates back to early HTML specifications and remains the default for form submissions. While servers typically handle both %20 and + as spaces, it is important to be aware of this difference when parsing URLs or building form-compatible requests manually.
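The difference shows up most clearly when decoding by hand, because decodeURIComponent() does not treat + as a space. The sketch below compares the two encodings. The helper name decodeFormValue is hypothetical.

```javascript
const text = "a b+c";

const formEncoded = new URLSearchParams({ q: text }).toString(); // "q=a+b%2Bc"
const percentEncoded = "q=" + encodeURIComponent(text);          // "q=a%20b%2Bc"

// URLSearchParams accepts both spellings of a space when parsing.
const fromForm = new URLSearchParams(formEncoded).get("q");       // "a b+c"
const fromPercent = new URLSearchParams(percentEncoded).get("q"); // "a b+c"

// decodeURIComponent() alone leaves the plus sign in place.
const naive = decodeURIComponent("a+b%2Bc"); // "a+b+c"

// Hypothetical helper: turn + into a space first, then percent-decode.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, " "));
}
const manual = decodeFormValue("a+b%2Bc"); // "a b+c"

console.log(formEncoded, percentEncoded, fromForm, fromPercent, naive, manual);
```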

Frequently Asked Questions

What is URL encoding?
URL encoding, also called percent encoding, is the process of converting special characters in a URL into a format that can be safely transmitted over the internet. Characters that are not allowed in a URL are replaced with a percent sign followed by two hexadecimal digits for each byte of the character: one byte for ASCII characters, and several bytes (the UTF-8 sequence) for non-ASCII characters.
What does %20 mean?
%20 is the URL-encoded representation of a space character. The number 20 is the hexadecimal value of the ASCII code for a space (decimal 32). In query strings, spaces can also be encoded as a plus sign (+).
When should I use encodeURIComponent?
Use encodeURIComponent() when encoding individual values that will be placed into a URL, such as query parameter values or path segments. It encodes structural characters including slashes, colons, and ampersands, leaving only letters, digits, and - _ . ! ~ * ' ( ) unchanged. Use encodeURI() only when encoding a full URL where you want to preserve its structural characters.
How do I decode a URL?
In JavaScript, use decodeURIComponent() to decode an encoded URL component, or decodeURI() to decode a full URL. In other languages, equivalent functions exist: Python has urllib.parse.unquote(), PHP has urldecode(), and Java has URLDecoder.decode().
What characters are safe in a URL?
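Unreserved characters that are safe to use in a URL without encoding include uppercase and lowercase letters (A-Z, a-z), digits (0-9), hyphens (-), underscores (_), periods (.), and tildes (~). Reserved characters such as /, ?, and & may appear unencoded only when they act as delimiters. All other characters, and reserved characters used as data, should be percent-encoded.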