What is Base64?
A plain-language explanation of Base64 encoding — what it is, why it exists, how it works, and when to use it.
Base64 is an encoding scheme that converts binary data into a string of plain ASCII characters. The name comes directly from the 64 printable characters it uses: the uppercase letters A–Z, lowercase letters a–z, the digits 0–9, and the symbols + and /. A 65th character, =, is used as padding.
It was standardized for email attachments in RFC 2045 as part of the MIME standard, and its current general-purpose definition is RFC 4648. It remains one of the most widely used encoding schemes on the web today.
One important clarification up front: Base64 is encoding, not encryption. It is a reversible transformation with no secret key — anyone who sees a Base64 string can decode it instantly. It provides no confidentiality or security whatsoever.
Why Base64 exists
The core problem: many systems were designed to handle text, not arbitrary binary data. Email protocols, HTTP headers, URLs, HTML attributes, and XML documents all operate on text — and many early implementations were strict about it, expecting 7-bit ASCII and treating certain byte values as control characters.
If you tried to embed a raw PNG image in an email, the binary bytes would likely get corrupted as they passed through text-only mail relays. Bytes with values above 127, or bytes that happen to look like newlines or null terminators, would be mangled or dropped entirely.
Base64 solves this by converting any binary data into a string that uses only safe, printable ASCII characters — characters every text-processing system understands and will pass through unchanged. Common scenarios where you'll encounter it:
- Email attachments: MIME encodes attachments as Base64 so binary files survive text-only mail servers.
- Data URIs: Images and fonts can be embedded directly in HTML or CSS without a separate HTTP request.
- JSON and XML APIs: Binary blobs (images, PDFs, cryptographic keys) can be stored in string fields.
- HTTP Basic Auth: Credentials are Base64-encoded before being placed in the Authorization header.
How it works
Base64 operates on 3 bytes at a time (24 bits). It splits those 24 bits into four 6-bit groups, then maps each 6-bit group to one of the 64 characters in the alphabet. Because 2^6 = 64, every possible 6-bit value has exactly one character.
If the input length is not a multiple of 3 bytes, the encoder pads the final group with zero bits and appends one or two = characters to signal this. The net result is that Base64 output is always a multiple of 4 characters, and always about 33% larger than the input (4 output characters per 3 input bytes).
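The size arithmetic above can be checked with a few lines of JavaScript. This is only a sketch: base64Length and paddingCount are hypothetical helper names, not part of any library.

```javascript
// Hypothetical helper: predicts the Base64 output size for a given input length.
function base64Length(byteCount) {
  // Every started group of 3 input bytes yields 4 output characters.
  return 4 * Math.ceil(byteCount / 3);
}

// Hypothetical helper: number of "=" characters the encoder appends.
function paddingCount(byteCount) {
  return (3 - (byteCount % 3)) % 3;
}

console.log(base64Length(3), paddingCount(3)); // 4 0
console.log(base64Length(4), paddingCount(4)); // 8 2
console.log(base64Length(5), paddingCount(5)); // 8 1

// 'Hello, World!' is 13 bytes, so the encoded form is 20 characters.
console.log(base64Length(13), btoa('Hello, World!').length); // 20 20
```

For short inputs the padding makes the overhead noticeably higher than 33%; the ratio only settles near 4/3 as the input grows.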
Here is a concrete example encoding the ASCII string "Man":
Input: M a n
Bytes: 0x4D 0x61 0x6E
Bits: 01001101 01100001 01101110
Group into 6-bit chunks:
010011 010110 000101 101110
Map to Base64 alphabet:
T W F u
Result: "TWFu"
The Base64 alphabet in order: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/. Index 0 is A, index 19 is T, index 22 is W, and so on.
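To make the bit manipulation concrete, here is a minimal encoder sketch that follows the steps above. The function name encodeBase64 is made up for this illustration; in real code you would normally use btoa() or Buffer instead.

```javascript
const ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Hypothetical name; takes a Uint8Array (or any array of byte values).
function encodeBase64(bytes) {
  let out = '';
  for (let i = 0; i < bytes.length; i += 3) {
    const remaining = bytes.length - i; // 1, 2, or 3+ bytes left
    // Pack up to 3 bytes into one 24-bit number, zero-filling missing bytes.
    const b0 = bytes[i];
    const b1 = remaining > 1 ? bytes[i + 1] : 0;
    const b2 = remaining > 2 ? bytes[i + 2] : 0;
    const chunk = (b0 << 16) | (b1 << 8) | b2;

    // Slice the 24 bits into four 6-bit indices into the alphabet.
    out += ALPHABET[(chunk >> 18) & 63];
    out += ALPHABET[(chunk >> 12) & 63];
    // Positions that hold no input bits are replaced by "=" padding.
    out += remaining > 1 ? ALPHABET[(chunk >> 6) & 63] : '=';
    out += remaining > 2 ? ALPHABET[chunk & 63] : '=';
  }
  return out;
}

const encoder = new TextEncoder();
console.log(encodeBase64(encoder.encode('Man'))); // "TWFu"
console.log(encodeBase64(encoder.encode('Ma')));  // "TWE="
console.log(encodeBase64(encoder.encode('M')));   // "TQ=="
```

The last two calls show the padding rule: two input bytes leave one "=" and a single input byte leaves two.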
Common use cases
Base64 shows up in more places than most developers realize. Here are the four most common scenarios with short examples.
Data URIs in HTML and CSS
Embed images or fonts inline without a separate network request:
<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..." />
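As a rough sketch of how such a URI is put together, the snippet below wraps raw bytes in a data URI. The helper name toDataUri is an assumption for this example, and the input is only the 8-byte PNG file signature rather than a complete image.

```javascript
// Hypothetical helper: wraps raw bytes in a data URI.
function toDataUri(bytes, mimeType) {
  // Build a binary string (one character per byte) so btoa() can encode it.
  let binary = '';
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// The 8-byte signature that every PNG file starts with.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

console.log(toDataUri(pngSignature, 'image/png'));
// data:image/png;base64,iVBORw0KGgo=
```

This is also why Base64-encoded PNG data always begins with iVBORw0KGgo: those characters are simply the encoded file signature.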
JWTs (JSON Web Tokens)
A JWT has three Base64URL-encoded parts separated by dots. The first two (header and payload) are readable by anyone:
// A JWT looks like this:
eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U
// Decode the payload (middle part):
atob('eyJzdWIiOiIxMjM0NTY3ODkwIn0')
// => '{"sub":"1234567890"}'
HTTP Basic Auth
The Authorization header encodes username:password as Base64 (not encrypted — always use HTTPS):
// btoa('user:pass') => 'dXNlcjpwYXNz'
Authorization: Basic dXNlcjpwYXNz
Binary blobs in JSON APIs
JSON has no binary type, so binary data (images, file contents, crypto keys) gets Base64-encoded into a string field:
{
  "filename": "photo.png",
  "content": "iVBORw0KGgoAAAANSUhEUgAAABAAAA..."
}
Base64 in JavaScript
Browsers expose two built-in globals for Base64: btoa() (binary to ASCII, i.e. encode) and atob() (ASCII to binary, i.e. decode). In Node.js, the Buffer API is the idiomatic approach and handles arbitrary binary data cleanly.
// Browser — encode
btoa('Hello, World!') // => "SGVsbG8sIFdvcmxkIQ=="
// Browser — decode
atob('SGVsbG8sIFdvcmxkIQ==') // => "Hello, World!"
// Node.js — encode
Buffer.from('Hello').toString('base64') // => "SGVsbG8="
// Node.js — decode
Buffer.from('SGVsbG8=', 'base64').toString('utf8') // => "Hello"
// Unicode-safe encode in the browser
// btoa() throws on characters outside Latin-1, so use this pattern:
function toBase64(str) {
  return btoa(encodeURIComponent(str).replace(
    /%([0-9A-F]{2})/g,
    (_, p1) => String.fromCharCode(parseInt(p1, 16))
  ))
}
toBase64('Hello 🌍') // => "SGVsbG8g8J+MjQ=="
Note that btoa() only accepts Latin-1 (ISO 8859-1) characters. If you pass a string containing emoji or any other character above U+00FF, it will throw an InvalidCharacterError. The pattern above (encodeURIComponent, then mapping each percent escape back to a single byte) is a common workaround.
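Decoding needs the reverse step, because atob() returns one character per byte rather than a decoded Unicode string. One possible round trip, sketched here with TextEncoder and TextDecoder (standard globals in browsers and recent Node.js versions), looks like this. The helper names utf8ToBase64 and base64ToUtf8 are assumptions for this example.

```javascript
// Hypothetical helper: Unicode string -> UTF-8 bytes -> Base64.
function utf8ToBase64(str) {
  const bytes = new TextEncoder().encode(str);
  // btoa() expects a binary string, so emit one character per byte.
  let binary = '';
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Hypothetical helper: Base64 -> UTF-8 bytes -> Unicode string.
function base64ToUtf8(b64) {
  const binary = atob(b64); // one character per decoded byte
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

const encoded = utf8ToBase64('Hello 🌍');
console.log(encoded);               // "SGVsbG8g8J+MjQ=="
console.log(base64ToUtf8(encoded)); // "Hello 🌍"
```

Calling atob() alone on the encoded value would return the four UTF-8 bytes of the emoji as four separate Latin-1 characters, which is why the TextDecoder step is needed.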
Base64URL: the URL-safe variant
Standard Base64 uses +, /, and = — all characters that have special meaning in URLs and HTTP headers. Placing a standard Base64 string in a URL requires percent-encoding those characters, which makes the string longer and harder to read.
Base64URL solves this with two simple substitutions: + becomes -, / becomes _, and the = padding is typically omitted. The result is a string that can be embedded in a URL path or query string without any further encoding.
You will encounter Base64URL in JWTs (all three parts use it), OAuth 2.0 PKCE code challenges, and web cryptography APIs. When decoding a JWT manually, replace - with + and _ with /, and restore the = padding if your decoder requires it, before passing the string to atob().
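The conversion between the two alphabets can be sketched as a pair of small helpers. The names toBase64Url and fromBase64Url are hypothetical, and the example reuses the sample JWT from the use-case section. It assumes an ASCII-only payload; payloads containing other characters would also need a UTF-8 decoding step.

```javascript
// Hypothetical helper: standard Base64 -> Base64URL.
function toBase64Url(b64) {
  return b64.replace(/\+/g, '-').replace(/\//g, '_').replace(/=+$/, '');
}

// Hypothetical helper: Base64URL -> standard Base64.
function fromBase64Url(b64url) {
  const b64 = b64url.replace(/-/g, '+').replace(/_/g, '/');
  // Restore padding so the length is a multiple of 4 again.
  return b64 + '='.repeat((4 - (b64.length % 4)) % 4);
}

// Decode the payload (middle part) of the sample JWT.
const jwt =
  'eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjM0NTY3ODkwIn0.dozjgNryP4J3jVmNHl0w5N_XgL0n3I9PlFUP0THsR8U';
const payloadPart = jwt.split('.')[1];
const payload = JSON.parse(atob(fromBase64Url(payloadPart)));
console.log(payload.sub); // "1234567890"

console.log(toBase64Url('8J+MjQ==')); // "8J-MjQ"
```

Note that this only reads the payload. It does not verify the signature, so the decoded claims should not be trusted without a proper JWT verification step.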
Frequently asked questions
- Is Base64 the same as encryption?
- No. Base64 is encoding, not encryption. Anyone can decode a Base64 string instantly, so it provides zero security. Never use it to "hide" sensitive data. If you need to protect data, use real encryption, ideally an authenticated scheme such as AES-256-GCM.
- Why is Base64 output always longer than the input?
- Base64 encodes every 3 bytes of input as 4 characters of output, which is a 4/3 ratio — roughly a 33% size increase. If the input length is not a multiple of 3, one or two = padding characters are appended, adding at most 2 more characters. This overhead is the trade-off for guaranteed text-safe output.
- What is the difference between Base64 and Base64URL?
- Base64URL is a URL-safe variant of Base64. It replaces + with - and / with _ and usually omits the = padding. This makes the output safe to include in URLs, HTTP headers, and filenames without percent-encoding. JWTs, OAuth PKCE, and the JWK key format used by the Web Crypto API all use Base64URL.
- Can Base64 handle Unicode or emoji?
- Base64 operates on bytes, not characters. To encode a Unicode string, first convert it to UTF-8 bytes, then Base64-encode those bytes. In the browser, calling btoa() directly on a string containing characters outside Latin-1 will throw an error. The safe pattern is to use encodeURIComponent to get percent-encoded UTF-8, then convert each percent escape back to a single byte character before calling btoa(). In Node.js, Buffer.from(str, "utf8").toString("base64") handles this correctly.