The Basics and Principles of the Base64 Algorithm

Introduction

Base64 is a group of binary-to-text encoding schemes that represent binary data as an ASCII string by translating the data into a radix-64 representation.

It uses the following alphabet to represent the radix-64 digits, alongside “=” as a padding character:

A-Z, a-z, 0-9, +, /

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/

In URLs, certain characters such as /, ?, and # have special meanings. To make Base64 output safe to use in URLs, the URL-safe variant replaces the characters + and / with - and _, which avoids characters that might cause problems in URL path segments or query parameters.
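
As a small illustration of the difference between the two alphabets, the sketch below uses Python's standard `base64` module to encode the same input both ways. The two sample bytes are an assumption made for this example; they were picked only because they produce both + and / in the standard alphabet.

```python
import base64

# Two bytes chosen (for illustration only) so that the standard
# alphabet yields both '+' and '/'.
data = bytes([0xFB, 0xFF])

standard = base64.b64encode(data).decode("ascii")
url_safe = base64.urlsafe_b64encode(data).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=
```

Both results decode back to the same bytes; only the characters used for indexes 62 and 63 differ.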

Base64 encoding schemes are commonly used to encode binary data for storage or transfer over media that can only deal with ASCII text.

Common applications of Base64 include:

  • Email via MIME
  • Storing complex data in XML
  • Encoding binary data so it can be included in a data: URL
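
To make the last item concrete, here is a minimal sketch of building a data: URL with Python's standard `base64` module. The helper name `to_data_url` and the sample payload are hypothetical and exist only for this example.

```python
import base64

def to_data_url(payload: bytes, mime_type: str) -> str:
    """Build a data: URL whose body is the Base64 form of payload.

    Hypothetical helper for illustration; not part of any library.
    """
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

print(to_data_url(b"Son", "text/plain"))  # data:text/plain;base64,U29u
```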

Algorithm

In Base64 encoding, each output character represents exactly 6 bits of the input data.

Why 64?

2 ^ 6 = 64

The Relationship between Base64 Index and Corresponding Characters

Character    Base64 Index
A-Z          0-25
a-z          26-51
0-9          52-61
+            62
/            63
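
The index ranges in the table above can be reproduced with a short lookup string. This is only a sketch; the names `ALPHABET`, `index_to_char`, and `char_to_index` are made up for this example.

```python
import string

# Index 0-25: A-Z, 26-51: a-z, 52-61: 0-9, 62: '+', 63: '/'
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def index_to_char(index: int) -> str:
    """Return the Base64 character for an index in the range 0..63."""
    return ALPHABET[index]

def char_to_index(char: str) -> int:
    """Return the Base64 index (0..63) of a character from the alphabet."""
    return ALPHABET.index(char)

print(index_to_char(20), char_to_index("u"))  # U 46
```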

One character in ASCII code occupies 1 byte (8 bits) of storage space.

According to the ASCII code, the characters with decimal values ranging from 0 to 31 and 127 are control characters, while the characters with decimal values ranging from 32 to 126 are printable characters. This means that only these 95 characters can be relied on to pass unchanged through channels that handle only text, and bytes outside this range may be altered or rejected along the way.

So how can other characters be transmitted? One way to achieve this is by using Base64 encoding.
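
A quick way to see this in practice is to encode every possible byte value and check what comes out. The snippet below is a sketch using Python's standard `base64` module; the variable names are chosen for this example.

```python
import base64

raw = bytes(range(256))  # every possible byte value, including control codes
encoded = base64.b64encode(raw)

# Every byte of the encoded form falls in the printable ASCII range 32..126.
all_printable = all(32 <= b <= 126 for b in encoded)

# Decoding restores the original bytes exactly.
round_trip_ok = base64.b64decode(encoded) == raw

print(all_printable, round_trip_ok)  # True True
```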

One Base64 character carries 6 bits of data, while one byte of input holds 8 bits. Therefore, a method is needed to represent 8-bit data using 6-bit units.

3*8 bit = 4*6 bit

Each Base64 digit represents 6 bits of data. So, three 8-bit bytes of the input string/binary file (3×8 bits = 24 bits) can be represented by four 6-bit Base64 digits (4×6 = 24 bits).
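
The regrouping of 24 bits can be sketched with a few shifts and a mask. The function name `split_three_bytes` is hypothetical; the sample input “Son” is the same string used in Example 2 of this article.

```python
from typing import List

def split_three_bytes(chunk: bytes) -> List[int]:
    """Split exactly 3 bytes (24 bits) into four 6-bit values."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    value = int.from_bytes(chunk, "big")  # one 24-bit integer
    # Take the bits 6 at a time, from the most significant end.
    return [(value >> shift) & 0b111111 for shift in (18, 12, 6, 0)]

print(split_three_bytes(b"Son"))  # [20, 54, 61, 46]
```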

Example 1

Assume the content we want to encode is “China”. Referring to the ASCII code table, the corresponding binary data for “China” is:

01000011 01101000 01101001 01101110 01100001

The encoding process is as follows:

1. The original data is split into groups of 3 bytes, giving 3*8=24 bits per group. If fewer than 3 bytes remain at the end, they form a separate, shorter group.

2. Each 24-bit group is divided into 4 groups of 6 bits. In a shorter final group, the last incomplete 6-bit group is padded with zeros at the end.

3. Two additional zero bits are inserted before each group of 6 bits, so that each group fills one byte (32 bits in total for a full group). The value of that byte is the Base64 index.

4. Each index is replaced by its character from the Base64 code table. A 6-bit group in the final block that contains no input bits at all is represented by the “=” symbol, so every block produces exactly 4 characters. For “China”, the resulting encoded data is: Q2hpbmE=
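
The steps above can be sketched as a small encoder. This is a minimal illustration, not a replacement for a library implementation; the function name `encode_base64` is made up for this example, and the standard `base64` module is imported only to cross-check the result.

```python
import base64

ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def encode_base64(data: bytes) -> str:
    out = []
    # Step 1: walk the input in groups of 3 bytes.
    for start in range(0, len(data), 3):
        chunk = data[start:start + 3]
        missing = 3 - len(chunk)  # 0, 1 or 2 bytes short in the last group
        # Step 2: zero-fill to 24 bits and cut into four 6-bit values.
        value = int.from_bytes(chunk + b"\x00" * missing, "big")
        indices = [(value >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
        # Steps 3 and 4: each 6-bit value is an index into the alphabet.
        chars = [ALPHABET[i] for i in indices]
        # Groups that hold no input bits at all become '='.
        if missing:
            chars[-missing:] = "=" * missing
        out.extend(chars)
    return "".join(out)

print(encode_base64(b"China"))  # Q2hpbmE=
print(encode_base64(b"China") == base64.b64encode(b"China").decode("ascii"))  # True
```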

Based on this analysis, Base64 encoding converts every 3 bytes of data into 4 characters, so the encoded output is about 4/3 the size of the input, an increase of roughly 33%. When the final group is shorter than 3 bytes, it is still padded to 4 characters, which adds slightly more.
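
The size relationship can be written as a one-line formula: every started group of 3 input bytes yields 4 output characters. The sketch below assumes padded output without line breaks; the name `encoded_length` is hypothetical.

```python
import base64

def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    # (n_bytes + 2) // 3 is the number of 3-byte groups, rounded up.
    return 4 * ((n_bytes + 2) // 3)

print(encoded_length(5))  # 8, matching the 8 characters of Q2hpbmE=
print(encoded_length(5) == len(base64.b64encode(b"China")))  # True
```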

Example 2

Text                         S         o         n
ASCII                        83        111       110
Binary                       01010011  01101111  01101110
6 bits per group             010100  110110  111101  101110
Padding with leading zeros   00010100  00110110  00111101  00101110
Base64 Index                 20  54  61  46
Encoded data                 U  2  9  u

Here the string “Son” is encoded using Base64 and results in the string “U29u”. This is an example where three characters are converted into exactly four Base64 characters, with no padding needed.

Text                         S
ASCII                        83
Binary                       01010011
6 bits per group             010100  110000  000000  000000
Padding with leading zeros   00010100  00110000  00000000  00000000
Base64 Index                 20  48  (padding)  (padding)
Encoded data                 U  w  =  =

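A single input byte fills only the first two 6-bit groups, so the last two positions are written as “=”. The resulting encoded data is: Uw==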
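
To compare the padding cases side by side, the sketch below encodes inputs of 1, 2, and 3 bytes with Python's standard `base64` module. The input “So” is an extra sample added for this example and does not appear in the tables above.

```python
import base64

results = {}
for text in (b"S", b"So", b"Son"):
    results[text] = base64.b64encode(text).decode("ascii")

for text, encoded in results.items():
    # Number of '=' characters: 2 for one byte, 1 for two bytes, 0 for three.
    print(text, encoded, encoded.count("="))
```

The amount of padding depends only on the input length modulo 3.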

Advantages

  • Embedding small resources such as images directly in a page as Base64 reduces the number of HTTP requests and avoids cross-origin issues for those resources

Disadvantages

  • Base64 images embedded in HTML or CSS make those files larger, which can block HTML and CSS parsing while they load. Externally linked images, by contrast, can continue to load after page rendering completes without causing any blocking.
