/* Fast Walsh–Hadamard Transform in GF(251). Produces a “renormalized” FWHT of its 64-byte input in this Galois field; < f ./fwht | ./fwht should reproduce (the first 64 bytes of) the contents of f. The idea here is to demonstrate the self-inverse property of the transform; however, rounding and truncation issues interfere. As an example, here we see that applying the Walsh–Hadamard step to two 8-bit values produces a value outside the 8-bit range: >>> a, b = 68, 102 >>> a, b = (a + b), (a - b) >>> a, b (170, -34) Repeating the same transformation again we recover the original numbers, but scaled by a factor of 2: >>> a, b = (a + b), (a - b) >>> a, b (136, 204) By dividing by 2 we can recover the original numbers: >>> a/2, b/2 (68, 102) But in machine words of any fixed size (8 bits, 32 bits, whatever) dividing by 2 is an information-losing operation. So, we could do our transforms mod 241, which is a prime, thus giving us a Galois field. The multiplicative inverse of 2 mod 241 is 121, which has two square roots in this field: the usual 11 and also 230. So, each time we compute a sum and difference, we multiply by 11 (one of the square roots of 2⁻¹) to “normalize” the result: >>> a, b = 68, 102 >>> a, b = (a + b) * 11 % 241, (a - b) * 11 % 241 >>> a, b (183, 108) >>> a, b = (a + b) * 11 % 241, (a - b) * 11 % 241 >>> a, b (68, 102) If we want the transform to be information-preserving, doing the internal steps of the algorithm in higher precision doesn’t help to avoid this modulo and renormalization step, because the final output of the first transform is the thing that has to be in higher precision. Mod 241 preserves printable ASCII but would of course err on bytes with values over 240. So it only handles 7.91 bits of data per byte, not 8. Mod 251 includes more byte values (7.97 bits of data), but there’s no √½ mod 251. But there *is* a √-½: 80 or 171; so a single level of the FWHT isn’t self-inverse (when done twice it produces the additive inverse of the input) but any even number of levels is: >>> a, b = 68, 102 >>> a, b = (a + b) * 80 % 251, (a - b) * 80 % 251 >>> a, b (46, 41) >>> a, b = (a + b) * 80 % 251, (a - b) * 80 % 251 >>> a, b (183, 149) These are the negatives of the original values: >>> -68 % 251, -102 % 251 (183, 149) So, repeating the transform step twice more we recover the original values: >>> a, b = (a + b) * 80 % 251, (a - b) * 80 % 251 >>> a, b (205, 210) >>> a, b = (a + b) * 80 % 251, (a - b) * 80 % 251 >>> a, b (68, 102) Or, more briefly: >>> f = lambda a, b: ((a + b) * 80 % 251, (a - b) * 80 % 251) >>> f(*f(*f(*f(68, 102)))) (68, 102) This means that to get a self-inverse FWHT with GF(251) the size of the input (and output) needs to be a power of 4, not just 2. It would be more efficient to do the renormalization just once, for example by multiplying by 51, or once per transform, for example by multiplying by 94 twice, instead of on every transform step, by multiplying by 80 12 times, as we do. But maybe it would actually be more efficient to do the whole thing in GF(2⁸) instead! Unfortunately in a finite field like this the renormalization factors totally kill the continuity property of the usual FWHT, which is what I’m actually interested in. That is, in the usual FWHT, an arbitrarily small change to any input point results in a proportionally small change to every output point, but there’s no applicable metric in GF(251). You might think this would be useful for cryptography, because it’s roughly the easiest way to get the avalanche effect, but it’s completely linear (in GF(251)). I’m no cryptographer, but I feel like that linearity could be a big drawback. */ #include #include #include typedef uint8_t u8; typedef uint32_t u32; int main(int argc, char **argv) { u32 n = 0; if (argc == 2) n = atoi(argv[1]); /* see https://stackoverflow.com/questions/3436922/evaluate-whether-a-number-is-integer-power-of-4 */ if ((n & (n - 1)) != 0 || (n & 0x55555555) == 0) { fprintf(stderr, "Usage: %s nbytes < data, where nbytes is a power of 4\n", argv[0]); return -1; } u8 *line = malloc(n); if (!line) { perror("malloc"); return -1; } fprintf(stderr, "Expecting %d char%s of input.\n", n, (n == 1) ? "" : "s"); for (size_t i = 0; i < n; i++) { int c = getchar(); if (c < 0) { fprintf(stderr, "Only %d chars of input\n", (int)i); return -1; } line[i] = c; } /* s, the stride, identifies which dimension of the * lg(n)-dimensional hypercube we’re doing the DFT along in a given * iteration; i and j locate points in the other lg(n)-1 * dimensions. */ for (size_t s = 1; s < n; s <<= 1) { for (size_t i = 0; i < n; i += 2*s) { for (size_t j = i; j < i + s; j++) { size_t k = j + s; u8 a = line[j], b = line[k]; line[j] = (a + b) * 80 % 251; /* The extra + 251 here is because of C’s semantics for division of negatives. */ line[k] = (a - b + 251) * 80 % 251; } } } for (size_t i = 0; i < n; i++) putchar(line[i]); return 0; }