The ebcs PRNG and cipher programs


ecbs is a Psuedo Random Number Generator which generates uniformely distrubed random numbers. It passes all the statistical tests, even in scaled down versions. It can be made as an entirely 8 bit function, running on 6502 chips or 16 bit version on 8086 up to 64 bits on today's end user computers. It's easy to understand and the code is written for simplicty and ease of understanding. There are some accomodations made for compiler warnings that can't apply, but well documented with comments in the source code.

Summary

I think I have a pretty good but very slow PRNG. But I also think its cryptographically secure and in that context, well, its still slow, and that's why I named it ecbs for EC's Big Slow, but you can use some other mnemonic for the bs part. I created a test program from the orginal ecbs64 algorithm to allow for easy setting of various parameters for testing. Turning off some aspects of the algorithm makes it run faster, and often still pass Practrand, GJrand, and TestU01. But the full algorithm passes all statistical tests so far.

My major goal is unpredictibility across the board, and that means that there are few constants and almost every aspect of state is randomly initialized. The initialization is less important for a non-CSPRNG, but for security, it must be as unpredictable as possible, and yet reproducable as well.

I started out basically testing something similar to Fibinacci LFSRs but using all the operations on byte size values, rather than bits. If you start with eight byte elements and create a Fibinacci LFSR what you really have is 8 parallel LFSRs with the 8 bits of output combined to produce a single byte. Using maximal length LFSRs with taps assigned by irreducible primitive polynomials modulo 2, you get a period length of 2^elementcount -1. But if you change some of the operations to multiplication, add, or subract, then you get much different results. The typical period becomes exponentially larger, but multiple periods called "cycles" occur, dependent on seeding. The period is thus dependent on seed, and most of the cycles are very long. This simple form can produce very short cycles too, but I fixed that with later modifications. Either way, the period increases as number of elements increases too.

So that's what I started with. Then I had something to compare to the traditional 2^(n-1) period LFSRs. When adding plus, minus, and multiplication, they become multiperiod with some short cycles. With the structure I finalized, the short cycles were rare, but can be as short 2^(n+1) which is greater than the XOR single period. Typical minimum periods for very small LFSRs using all the math operations are 2^(n+7) or greater, with greater length vastly increasing the period.

I ran into the problem that you can't test periods for LFSRs unless they are really small, and using bytes as the smallest size elements, you can only go down so far. What I found was that no matter what, anything less than 8 taps was just never going to make decent numbers. You can get very long periods but there are just too many correlations. Actually I found that I could get good numbers using only six taps, but eight is better, so I stuck with that. I later found that I could get decent numbers using four taps, but doing the formula twice on them.

But, using 8 taps means you need at least 8 elements of state, otherwise you have to double up some of the taps and resuse the same elements. Thats dicey. I tried it, and it might work but its too much trouble to work around the issues with correlations. Fixing one thing causes another correlation. So with 8 taps you need at least 8 elements and by that point the periods are stupendously large and uncountable. So you just can't test emperically. Even with the shorter sizes I have to stop the input after some cutoff to look for short cycles in less than months at a time.

At that point we are closer, but the length and taps still have to chosen carefully. Thats not what I wanted. But I explored various fixes to allow any mumber of elements and any initial tap positions. The relationship between taps is the distance between them, and they always increment or decrement in lock step so the distance between taps is constant. And, furthermore it's well known and defined because we have to use precalculated irreducible primitive polynoials modulo 2. If we chose just any old taps, then we would have an LFSR with multiple cycles, mostly very long but probably some short cycles in there.

Fixing the output to make statistically good numbers is pretty easy. The multiplication really helps but the rotate is really most essential. Then, for unpredictibility, add some counters that are initialized to random values. Then swap the tap pointers around on every pass. Swapping the taps alone gaurantees we will not stay in a short cycle long. We might hit another short cycle immediately, but so what, next time around we swap taps and boom, a brand new cycle. Will they ALL be short cycles? Who cares, it doesn't matter anymore, we ain't hanging around to find out. One additional thing ecbs does is to reset the state periodically. This is done by comparing some state element to a value of certain length. I chose length 2^20 which is approximately one million. If a certain element == 0 modulo 1048576, then a reset is triggered. The trigger number could be any hardcoded value, when the least 20 bits of the specified element becomes that number, then we call ecbs64setsize whitch resets the lfsralen and all of the tap positions. The adders are never altered other than the incremental update on each call. The power value is a hard coded constant. For cryptographic use, a small power is warranted, but for small amounts or effiency, larger power is better. In fact it's not needed at all. For PRNG statistical testing I use default power 2^24, and for cryptography, I use 2^20. For a small speed gain you could just comment the whole section out really.

So here is the final algorithm. I label the variables with "a" in case I ever need "b" or "c" variants. lfsra[] is set to 128 but can be anything. I chose 128 so I can use signed chars for taps and detect less than 0. This means a maximum of 127 elements, with the final lfsr array element unused. LFSR length will be chosen randomly. The only restriction is that the taps must be unique. They don't have to be any particular starting values or in any particular order.


lfsra[128]
lfsralen   - The number of elements.   Current  MIN = 9   MAX = 127
tapa1      - tapa1 is the primary tap that recycles back into LFSR state.
tapa2
tapa3
tapa4
tapa5     - tapa5 is also a primary tap, recycled back into state. 
tapa6
tapa7
tapa8
shift     - For rotating tapa1 and tapa5 elements.
power     - For calling ecbs64setsize about once every million cycles.
adder1    - These are 64 bit counters, so adder2, adder3, and adder4 will
adder2    - never change. But they will all be set to some value and that
adder3    - value is added to one of the LFSR elements by the taps.
adder4

We just increment the adders by counting, increment the taps, cycling back to 0 when they hit the end. And then apply the operations, including swapping one pair of taps, and finally return a number. It looks long, but its just a bunch of repetitve code. With modern CPU pipelining and branch prediction it does surprisingly fast for that many operations.

/* Increment the adders */
    ++adder1;
    if (adder1 == 0) {
        ++adder2;
        if (adder2 == 0) {
            ++adder3;
            if (adder3 == 0) {
                ++adder4;
            }
        }
    }

/* Add the counter values to current tap elements. */
    lfsra[tapa2] += adder1;
    lfsra[tapa3] += adder2;
    lfsra[tapa6] += adder3;
    lfsra[tapa7] += adder4;

/* Decrement the taps and cycle back to maxtap when it goes below zero */
/* maxtap is always set to lfsralen - 1. I could have incremented like */
/* this but using zero allows CPU branch on zeroflag which is always   */
/* set or cleared so no compare is needed, just a BRZ.                 */ 
/*    ++tapa1;                                                         */
/*    if (tapa1 >= lfsralen) {                                         */
/*       tapa1 = 0;                                                    */
/*    }                                                                */
/* We can't just depend on 8bit unsigned cycling because we have       */
/* variable length LFSRs and cycling only works for 2^8, 2^16, and     */
/* so on. We can be any length between 9 and 127, but never 256.       */

    --tapa1;
    if (tapa1 < 0) {
       tapa1 = maxtap;
    }
    --tapa2;
    if (tapa2 < 0) {
       tapa2 = maxtap;
    }
    --tapa3;
    if (tapa3 < 0) {
       tapa3 = maxtap;
    }
    --tapa4;
    if (tapa4 < 0) {
       tapa4 = maxtap;
    }
    --tapa5;
    if (tapa5 < 0) {
       tapa5 = maxtap;
    }
    --tapa6;
    if (tapa6 < 0) {
       tapa6 = maxtap;
    }
    --tapa7;
    if (tapa7 < 0) {
       tapa7 = maxtap;
    }
    --tapa8;
    if (tapa8 < 0) {
       tapa8 = maxtap;
    }

/* This is first of two formulas. We are updating the value at tapa1. */

    lfsra[tapa1] = ((lfsra[tapa1] ^  (lfsra[tapa2] * (lfsra[tapa3]|1))) + lfsra[tapa4]);

/* Decrement the shift value and random set when it hits zero. I   */
/* know, there's a bias for 63 but that's Ok. Otherwise it would   */
/* have ALWAYS been reset to 63, and thats even MORE biased.       */

    --shift;
    if (shift == 0) {
        shift = (lfsra[0] & 63);
        if (shift == 0) {
            shift = 63;
        }
    }

/* Now rotate tapa1 element shift times */
    lfsra[tapa1] = ((lfsra[tapa1] >> shift) | (lfsra[tapa1] << (64-shift)));

/* Use two elements to select taps to swap. The switch statements is fast. */
    swap = (lfsra[tapa3] ^ lfsra[tapa7]) & 7;
    switch (swap) {
    case 0: tmp = tapa1;
        tapa1 = tapa2;
        tapa2 = tmp;
        break;
    case 1: tmp = tapa2;
        tapa2 = tapa3;
        tapa3 = tmp;
        break;
    case 2: tmp = tapa3;
        tapa3 = tapa4;
        tapa4 = tmp;
        break;
    case 3: tmp = tapa4;
        tapa4 = tapa5;
        tapa5 = tmp;
        break;
    case 4: tmp = tapa5;
        tapa5 = tapa6;
        tapa6 = tmp;
        break;
    case 5: tmp = tapa6;
        tapa6 = tapa7;
        tapa7 = tmp;
        break;
    case 6: tmp = tapa7;
        tapa7 = tapa8;
        tapa8 = tmp;
        break;
    case 7: tmp = tapa8;
        tapa8 = tapa1;
        tapa1 = tmp;
        break;
    default: break;
    }

/* This is second of two formulas. We are updating the value at tapa5 this time. */
    lfsra[tapa5] += ((lfsra[tapa5] ^  (lfsra[tapa6] * (lfsra[tapa7]|1))) + lfsra[tapa8]);

/* Decrement the shift value and cycle back to 63 when it hits zero   */
/* Note the use of hardcoded lfsra[0] and not a tap selected element. */
    --shift;
    if (shift == 0) {
        shift = (lfsra[0] & 63);
        if (shift == 0) {
            shift = 63;
        }
    }

/* This time rotate tapa5 element shift times instead of tapa1. */
    lfsra[tapa5] = ((lfsra[tapa5] >> shift) | (lfsra[tapa5] << (64-shift)));

/* Check and reset state approximately once every million cycles. */
    if ((lfsra[tapa2] & power) == 734551LLU)  {
        setsize = ( lfsra[tapa8] & (MAXLEN-1) );
        setsize |= (MAXLEN/2);
        ecbs64setsize(setsize);
    }

/* Return elements tapa1 and tapa5 XORed together. */
    return(lfsra[tapa1] ^ lfsra[tapa5]);

Links to professional PRNG sites and a lot of discussion about ecbs

Explanation and detailed description of the algorithms and why they are as they are

How to use testecbs8 and testecbs64

A cryptographic cipher program that leaves no trace of itself after use. How ecbscryptogen works.


     I have some files available for download, source code only.
     testecbs8.c     - Source code for the smallest version.
     testecbs64.c    - Source code for the largest version.
     ecbscrypto.c    - Source code for a standalone cipher program.
                       encrypts or decrypts, reads STDIN and writes STDOUT.
     ecbscryptogen.c - Source code for a "standalone, data dependent,
                       cipher program" generator program. This program
                       creates a new program and executes it to encrypt
                       or decrypt. Very mysterious, and harder than $%!^
                       to program and debug. 

View or Download testecbs64.c, testecbs8.c, ecbscrypto.c, ecbscryptogen.c and some minor utilities.