[BUG] Base64.decode accepts malformed input: a padding '=' followed by a data character #7482

@aimasteracc

Description

Base64.decode(String) in src/main/java/com/thealgorithms/conversions/Base64.java claims "Strict RFC 4648 compliance", but it silently accepts malformed input in which a padding character '=' appears at the second-to-last position and is followed by a data character at the last position (e.g. "QQ=Q", "AB=C"). Such strings are not valid Base64 and a strict decoder must reject them; instead decode returns a byte array.

The padding-position validation only checks the first '=' and only that it is not too early in the string:

// Base64.java (lines ~120-124)
int firstPadding = input.indexOf('=');
if (firstPadding != -1 && firstPadding < input.length() - 2) {
    throw new IllegalArgumentException("Padding '=' can only appear at the end (last 1 or 2 characters)");
}

It never verifies that, once padding begins, all remaining characters are also '='. So a '=' sitting exactly at index length - 2 passes validation even when the final character is a normal data character. The decode loop then treats charAt(i+2) == '=' as padding (skipping the middle byte) yet still reads charAt(i+3) as data and emits a byte for it. The result is an output byte derived from a position after a padding character, which cannot occur in well-formed Base64.
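
To make the arithmetic concrete, here is a minimal sketch of a decode step with the behavior described above. It is not the code from Base64.java: the class LenientLoopSketch and the method decodeQuantum are hypothetical names, and the real loop may be structured differently. The sketch assumes a single 4-character group made up only of standard-alphabet characters and '='.

```java
import java.util.Arrays;

public class LenientLoopSketch {
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Hypothetical stand-in for the decode step described in this issue.
    // Assumes exactly 4 characters, each either '=' or in ALPHABET.
    static byte[] decodeQuantum(String q) {
        if (q.length() != 4) {
            throw new IllegalArgumentException("expected exactly 4 characters");
        }
        boolean padAt2 = q.charAt(2) == '=';
        boolean padAt3 = q.charAt(3) == '=';
        int a = ALPHABET.indexOf(q.charAt(0));
        int b = ALPHABET.indexOf(q.charAt(1));
        int c = padAt2 ? 0 : ALPHABET.indexOf(q.charAt(2));
        int d = padAt3 ? 0 : ALPHABET.indexOf(q.charAt(3));
        int bits = (a << 18) | (b << 12) | (c << 6) | d;

        byte[] out = new byte[3];
        int n = 0;
        out[n++] = (byte) (bits >> 16);
        if (!padAt2) {
            out[n++] = (byte) (bits >> 8); // middle byte skipped when index 2 is '='
        }
        if (!padAt3) {
            out[n++] = (byte) bits; // checked independently of padAt2: this is the flaw
        }
        return Arrays.copyOf(out, n);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decodeQuantum("QQ=Q"))); // [65, 16]
        System.out.println(Arrays.toString(decodeQuantum("QQ=="))); // [65]
    }
}
```

Because the two padding checks are independent, "QQ=Q" yields the first byte (65) plus a byte built from the trailing 'Q' (16), which matches the output reported in this issue.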

Steps to reproduce

import com.thealgorithms.conversions.Base64;
import java.util.Arrays;

public class Base64PaddingRepro {
    public static void main(String[] args) {
        // "QQ=Q": '=' is at index 2 (== length - 2), data char 'Q' at index 3.
        byte[] out = Base64.decode("QQ=Q");       // expected: IllegalArgumentException
        System.out.println(Arrays.toString(out)); // actually prints: [65, 16]
    }
}

As a JUnit test that should pass but currently fails:

@Test
void decodeRejectsDataAfterPadding() {
    assertThrows(IllegalArgumentException.class, () -> Base64.decode("QQ=Q"));
    assertThrows(IllegalArgumentException.class, () -> Base64.decode("AB=C"));
    assertThrows(IllegalArgumentException.class, () -> Base64.decode("AB=A"));
}

Reference behavior: the JDK's basic decoder (java.util.Base64.getDecoder()) rejects all three inputs. The first one is shown here:

java.util.Base64.getDecoder().decode("QQ=Q"); // throws IllegalArgumentException
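
For anyone who wants to confirm the reference behavior for all three inputs, a small self-contained check against the JDK decoder could look like the following. The class name JdkPaddingCheck and the helper jdkRejects are made up for this sketch; only java.util.Base64.getDecoder().decode(String) is a real JDK call.

```java
import java.util.Base64;

public class JdkPaddingCheck {
    // Returns true if the JDK basic decoder throws IllegalArgumentException for the input.
    static boolean jdkRejects(String input) {
        try {
            Base64.getDecoder().decode(input);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"QQ=Q", "AB=C", "AB=A", "QQ==", "QUI="};
        for (String s : inputs) {
            System.out.println(s + " rejected by JDK: " + jdkRejects(s));
        }
    }
}
```

The three malformed inputs should be reported as rejected, while the correctly padded "QQ==" and "QUI=" should be accepted.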

Expected behavior

decode should throw IllegalArgumentException for "QQ=Q", "AB=C", "AB=A" and any input where a '=' is followed by a non-'=' character, consistent with the documented strict RFC 4648 behavior.

Actual behavior

| input | Base64.decode result | correct (strict) result |
| --- | --- | --- |
| "QQ=Q" | accepted → [65, 16] | IllegalArgumentException |
| "AB=C" | accepted → [0, 2] | IllegalArgumentException |
| "AB=A" | accepted → [0, 0] | IllegalArgumentException |

Valid inputs ("QQ==", "QUI=", "QUJD", "SGVsbG8=") decode correctly, so the defect is isolated to padding validation.

Suggested fix

After locating the first padding character, require that every character from there to the end is also '=':

int firstPadding = input.indexOf('=');
if (firstPadding != -1) {
    if (firstPadding < input.length() - 2) {
        throw new IllegalArgumentException("Padding '=' can only appear at the end (last 1 or 2 characters)");
    }
    for (int i = firstPadding; i < input.length(); i++) {
        if (input.charAt(i) != '=') {
            throw new IllegalArgumentException("A padding '=' must not be followed by a non-padding character");
        }
    }
}
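
The check above can also be exercised in isolation. The sketch below copies it into a standalone class so it can be run without the rest of Base64.java. PaddingValidationSketch, validatePadding and isAccepted are hypothetical names, and the sketch covers only the padding rule, not length or alphabet validation.

```java
public class PaddingValidationSketch {
    // Standalone copy of the suggested padding check; throws on misplaced padding.
    static void validatePadding(String input) {
        int firstPadding = input.indexOf('=');
        if (firstPadding == -1) {
            return; // no padding at all, nothing to validate here
        }
        if (firstPadding < input.length() - 2) {
            throw new IllegalArgumentException(
                "Padding '=' can only appear at the end (last 1 or 2 characters)");
        }
        for (int i = firstPadding; i < input.length(); i++) {
            if (input.charAt(i) != '=') {
                throw new IllegalArgumentException(
                    "A padding '=' must not be followed by a non-padding character");
            }
        }
    }

    // Convenience wrapper for the demo: true if validatePadding does not throw.
    static boolean isAccepted(String input) {
        try {
            validatePadding(input);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = {"QQ=Q", "AB=C", "Q=QQ", "QQ==", "QUI=", "QUJD"};
        for (String s : inputs) {
            System.out.println(s + " accepted: " + isAccepted(s));
        }
    }
}
```

With this rule, "QQ=Q", "AB=C" and "Q=QQ" are rejected, while "QQ==", "QUI=" and "QUJD" still pass, so the existing valid cases are unaffected.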

Additional context

The existing "Invalid padding position" tests in Base64Test.java cover only padding that is too early ("Q=QQ", "Q=Q=", "=QQQ"), all of which are caught by the firstPadding < length - 2 check. The case of a padding character at index length - 2 followed by a data character is not covered; adding the test above would guard against regressions.
