BiEntropy Thread

Article (BiEntropy – The Approximate Entropy of a Finite Binary String)

Some code - GitHub - winternewt/bientropy

So, here’s a new entropy bias amplifier to try. The idea is somewhat similar to the random walker, but I hope it’ll be more sensitive.
The random walker takes K bits on input and returns 1 or 0 (a tie is possible if K is even).
The new amplifier takes these inputs:
ampl_bits - K, i.e. the number of input bits
out_length - length of the output in bytes
&print_raw=1 - option to print the raw unprocessed entropy; it’s printed in eight variants, each shifted by one bit (since selecting bytes from a random bit-stream is actually arbitrary).
It produces 3 outputs:
data_out - Hamming weight - Wikipedia, similar to the random walker but it doesn’t take the order of bits into account, only the number of 1s and 0s; for even K the last bit is taken as the tie-breaker.
data_bd_out - this one is based on the binary derivative notion: XORing the bit sequence with itself shifted by one bit left, which reduces the length by one, and repeating until only one bit is left. It has the nice observed property that for any number of input bits (K), when all possible inputs are tested, it returns equal numbers of 1s and 0s, with no ties.
data_weights - this is what it’s all about: TBientropy weights, from [1305.0954] BiEntropy - The Approximate Entropy of a Finite Binary String. It allows assessing bit sequences (of length K) based on, well, how ordered they are.
Sequences like 11111111 and 00000000 are perfectly ordered and trigger the random walker. Sequences like 01010101 and 10101010 are well ordered too, but for the random walker these are “transparent” because their numbers of 0s and 1s are nearly equal. The TBientropy weight function, based on the TBientropy mean value, scores the input bit sequence independently of the result (0 or 1) and assigns how much “weight” each output bit has:
[0.0…1.0] - input sequence is unordered (~0.8-0.9 is the perfect chaos zone)
[1.0…2.0] - ordered input with low local entropy.
With these weights assigned to each bit, we can also take an average over the output bit sequence (weight per byte, or per nibble, i.e. hex digit) with the same properties: the weight function is based on the mean value, so statistically such an average converges to 1.0.
That is what we have in the output: weight values for each nibble. Based on these we should be able to discriminate MMI signal from noise independently of the output values. The TBientropy function is O(n^2) by default, as CPU-hungry as any other entropy analysis, so making it run on-the-fly took a while. I finally got it to ~30-100 MBps on a single thread.
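To make the first two outputs concrete, here is a minimal Python sketch of a Hamming-weight amplifier with a last-bit tie-breaker and a binary-derivative amplifier, as described above. The function names (`hamming_bit`, `bd_bit`, `count_ones_over_all_inputs`) are made up for illustration and are not taken from the winternewt/bientropy code. The loop at the end checks the balance property of the binary derivative for K = 1..10 by brute force.

```python
from itertools import product


def hamming_bit(bits):
    """Majority bit by Hamming weight; the last bit breaks ties (even K only)."""
    ones = sum(bits)
    zeros = len(bits) - ones
    if ones != zeros:
        return 1 if ones > zeros else 0
    return bits[-1]


def binary_derivative(bits):
    """XOR each bit with its right-hand neighbour; the result is one bit shorter."""
    return [a ^ b for a, b in zip(bits, bits[1:])]


def bd_bit(bits):
    """Repeat the binary derivative until a single bit is left."""
    while len(bits) > 1:
        bits = binary_derivative(bits)
    return bits[0]


def count_ones_over_all_inputs(amplifier, k):
    """Feed every possible K-bit input to the amplifier and count the 1 outputs."""
    return sum(amplifier(list(x)) for x in product((0, 1), repeat=k))


# Brute-force check: exactly half of all K-bit inputs give 1, for K = 1..10.
for k in range(1, 11):
    assert count_ones_over_all_inputs(bd_bit, k) == 2 ** (k - 1)
```

The balance is expected rather than accidental: the final bit is an XOR of a fixed, non-empty subset of the input bits, and any such function is 1 for exactly half of all inputs.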

Each of the outputs consists of K arrays, “channels”, each shifted by one bit. That is because we operate in bytes and the results of these functions depend on bit order. But as I said, the way an infinite random bitstream is fitted into bytes is arbitrary, so if we scan all variants of the input shifted by one bit we have a higher chance of capturing the signal. To grasp this better, think of tuning a radio to the right frequency: these channels are not independent outputs, they’re correlated, and most probably the MMI “signal” can be caught on multiple adjacent ones, but the “bands” might be different each time.
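A small Python sketch of the channel idea: the same bit stream is cut into K-bit frames starting at each of the K possible offsets. The name `channels` and the choice to drop an incomplete trailing frame are assumptions for illustration, not the server's actual behaviour.

```python
def channels(bits, k):
    """Return k framings of the same bit stream; channel j starts at bit j.

    Each channel is a list of complete k-bit frames; a trailing partial
    frame is dropped (an assumption made for this sketch).
    """
    result = []
    for offset in range(k):
        frames = [bits[i:i + k] for i in range(offset, len(bits) - k + 1, k)]
        result.append(frames)
    return result


stream = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
for offset, frames in enumerate(channels(stream, 4)):
    print(offset, frames)
```

Each frame would then go through the chosen amplifier, so adjacent channels share most of their input bits, which is why they are correlated rather than independent.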

(by Newton Winter)

We haven’t decided yet how best to select a channel: by average weight or by the 3 strongest weights in it. One channel in JSON output looks like this

Well, now I’m thinking that it can be formulated better.

There are two parts to this. Firstly, I was looking for an on-the-fly entropy analysis method to measure how ordered the entropy bitstream is, in order to discern deviations from noise. One simple way to do it is by using the random walker, but I was thinking about not only producing an output bit from N input bits but also quantitatively assessing the result of this amplification.
So the first idea was to look at Shannon entropy and this train of thought led me to this article

After I went to the trouble of actually coding this, it turned out to be more versatile than I initially thought, because this measure can be applied to virtually any amplification method of choice that produces 1 output bit for N input bits, be it random walker, majority vote, Hamming weight, binary derivative or other. So it has the nice property of marking every produced bit with a score that hints at how ordered the initial bit sequence was.
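For readers who want to see the score itself, here is a hedged Python sketch of BiEntropy and TBiEntropy as I read the definitions in the cited arXiv paper (1305.0954): the Shannon entropy of the fraction of 1s in each binary derivative, weighted by 2^k for BiEn and by log2(k+2) for TBiEn. The function names are hypothetical, and the thread's weight function (the rescaling to [0.0…2.0] around the mean) is not specified here, so it is not reproduced.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli(p) source."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def derivative_entropies(bits):
    """Entropy of the fraction of 1s in binary derivatives 0 .. n-2."""
    entropies = []
    current = list(bits)
    for _ in range(len(bits) - 1):
        entropies.append(binary_entropy(sum(current) / len(current)))
        current = [a ^ b for a, b in zip(current, current[1:])]
    return entropies


def bien(bits):
    """BiEntropy: derivative k is weighted by 2**k (needs len(bits) >= 2)."""
    h = derivative_entropies(bits)
    return sum(e * 2 ** k for k, e in enumerate(h)) / (2 ** (len(bits) - 1) - 1)


def tbien(bits):
    """TBiEntropy: derivative k is weighted by log2(k + 2) (needs len(bits) >= 2)."""
    h = derivative_entropies(bits)
    weights = [math.log2(k + 2) for k in range(len(h))]
    return sum(e * w for e, w in zip(h, weights)) / sum(weights)


for word in ([1] * 8, [0, 1] * 4, [1, 0, 1, 1, 0, 0, 0, 1]):
    print(word, round(tbien(word), 4))
```

Because the score only looks at the N input bits, it can be attached to the output bit of any amplifier. In this sketch both 11111111 and 01010101 come out close to 0, i.e. strongly ordered, even though only the first one would move a random walker.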

The idea with “channels” kind of came up in the process of working on this, because of how the score depends on the selected “frame” of N bits and how this frame is aligned to the input bit stream. After some thought I figured out that it is also independent of the amplification method used. Our bit production is continuous, but MMI influence doesn’t necessarily have to be aligned with how we lay our infinite row of bits into bytes. The random walker will also produce different results depending on the selected frame, i.e. the chosen zero starting point.

I’m also attaching an excel sheet of calculated tbientropies and respective weight function. (2.8 MB)

Once I have spare time, I’ll calculate all possible outcomes of the walker method and check what distribution the weights have. Intuitively, all “signal” results should have weights > 1.0

I wonder whether the experiment is pure enough, since this method is tested on MED data that is already bias-amplified?

Detecting and quantifying the effect of mind or MMI on a random sequence is a challenge. The fundamental issue is that an MMI generator or REG ideally produces perfectly random sequences, except when mental intention or mental effort causes the sequence to deviate slightly from that natural random output. The signal-to-noise ratio (S/N) of the unaffected sequence is 0 since the signal amplitude is 0. Basic statistical theory shows that when a signal is present, that is, mixed with the noise (the unaffected random sequence), the S/N increases as more data is used in the measurement. In a binary random sequence (MMI generator output) the S/N increases approximately as the square root of the number of bits measured. This brings us to the conclusion that it takes a lot of bits to make a good measurement of MMI effects since the S/N is pretty small. As an example, in an MMI measurement about 150 bits in every million might be switched by mental intention. With a perfectly efficient bias amplifier, about 111,111 of those raw bits would be needed to produce a single output bit with a 10% effect size. Given 300 of those trials, the probability the results could have happened by chance (without mental influence, i.e., the null hypothesis test) would be about 5%, a statistically significant result. If the effect size were twice as big, it would only take about 75 trials to reach a statistically significant result.
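The numbers in this example can be reproduced with a few lines of Python. This is a reconstruction under stated assumptions, not the author's own calculation: a switched fraction f shifts the hit probability to 0.5 + f, effect size is taken as 2p - 1, an ideal amplifier scales a small effect size by the square root of the number of input bits, and significance is a one-tailed normal test. The function names are made up for the sketch.

```python
import math
from statistics import NormalDist


def effect_size_from_switched_fraction(switched_fraction):
    """Effect size 2p - 1 when p = 0.5 + switched_fraction (assumption)."""
    p = 0.5 + switched_fraction
    return 2 * p - 1


def amplified_effect_size(es_in, n_bits):
    """Square-root scaling of an ideal bias amplifier (small effect sizes only)."""
    return es_in * math.sqrt(n_bits)


def one_tailed_p(effect_size, trials):
    """Chance probability of the z-score reached after the given number of trials."""
    z = effect_size * math.sqrt(trials)
    return 1.0 - NormalDist().cdf(z)


es_raw = effect_size_from_switched_fraction(150 / 1_000_000)
es_out = amplified_effect_size(es_raw, 111_111)
print("raw effect size:", es_raw)
print("amplified effect size:", round(es_out, 3))
print("p after 300 trials at 10%:", one_tailed_p(0.1, 300))
print("p after 75 trials at 20%:", one_tailed_p(0.2, 75))
```

Both trial counts give the same z-score, because doubling the effect size cuts the required number of trials by a factor of four.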

These example values are in the range of real MMI test results, although the output of Mind-Enabled Devices can be at the upper end of the range, or even higher with a skilled user. 75 trials can take about one minute to reach a significant result, that is, a result we can start to believe was not produced by random chance.

Rigorous mathematical methods for calculating entropy can require millions of bits, which could take nearly a minute of continuous data at a generation rate of 100,000 bits per second. While entropy can capture nearly every possible pattern or deviation from randomness besides bias, it generally takes too long to provide real-time user feedback in a practical MMI system. Any algorithm that produced a very close estimate of the actual entropy of a short sequence (hundreds to thousands of bits) might be useful for quantifying MMI effects. However, caution is required since the square-root dependence on the number of bits tested still applies to the signal-to-noise ratio, or accuracy, of every algorithm. While BiEntropy appears to provide a measure of disorder for very short sequences, I suspect the results will be fairly noisy, that is, giving a wide range of results for a group of perfectly random sequences. It’s certainly worth looking into, and I am glad to see people thinking about MMI at such a deep level.

If I understand the BiEntropy algorithm correctly, it might take about a minute to test whether there is an influence. However, it can be modified to work with a data stream. So we would achieve detection, but it would have a time lag.

Entropy measurement could be set up to look at a stream of data. However, experience suggests the largest results of MMI measurements are achieved when the user receives real-time feedback of how they are doing. Real-time in this context means about 1 second or less, but 0.2 seconds is preferable because the user experiences little or no delay after making the mental effort to achieve the desired result.

Uhhhhh… That might explain why I had better results with entropy tests when I just meditated near the device. I had far worse results when I tried to influence the REG with intention.
I used a very crude and slow method of entropy detection: I took a sequence of 3 frames from a camera and shoved it into the gzip compressor, then measured the difference between the compression rate and the Shannon entropy.
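A rough Python sketch of that kind of measurement, for anyone who wants to try it. The post doesn't say exactly how the two quantities were compared, so the details here are assumptions: byte-level Shannon entropy and gzip output size are both expressed in bits per input byte and then subtracted. The function names are hypothetical, and `data` stands in for the concatenated camera frames.

```python
import gzip
import math
from collections import Counter


def shannon_bits_per_byte(data):
    """Order-0 Shannon entropy of the byte histogram, in bits per byte."""
    counts = Counter(data)
    total = len(data)
    return sum(c / total * math.log2(total / c) for c in counts.values())


def gzip_bits_per_byte(data):
    """Size of the gzip-compressed data, in bits per input byte."""
    return 8 * len(gzip.compress(data)) / len(data)


def entropy_gap(data):
    """Difference between what gzip achieves and the order-0 entropy."""
    return gzip_bits_per_byte(data) - shannon_bits_per_byte(data)


ordered = bytes(range(256)) * 16      # flat histogram, but highly patterned
print(shannon_bits_per_byte(ordered))  # histogram alone sees maximum entropy
print(gzip_bits_per_byte(ordered))     # gzip finds the repetition
print(entropy_gap(ordered))
```

The gap is strongly negative for patterned data (gzip exploits structure the histogram cannot see) and slightly positive for data that is already random, because of gzip's header and framing overhead.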

The green circle is the target I’m focusing on. Yellow - attractors, blue - voids.
Each trial is initiated by a button click, but the calculation takes 104 seconds. One strongest attractor and one strongest void are shown for each trial. The entropy channel was selected in two different ways: by the average weight of the bytes in the channel, and by the average weight of the 3 bytes with the strongest weights in the channel.

I can confirm that it works the same way for me. Setting an intention and then meditating passively with empty thoughts yields better results than focusing on the desired output constantly.

@D0ublezer0 Interesting, I wasn’t expecting Binary Derivative to produce better results. It caught my attention because of this function’s ability to produce a 1 or 0 reliably, irrespective of the number of input bits (an even number of bits doesn’t need a tie-breaker).

Honestly, I don’t understand what the axes on your graphics mean.

We now have trouble with the output rate of our server. Since Bi-Entropy weights can only be calculated during bias amplification, we made our own amplification algorithm with a factor of 32, but we use MEDs to feed it with entropy, and a MED’s output rate is limited to 100kHz. After amplification in the Bi-Entropy server the rate becomes much lower, and it takes more than 3 minutes to generate a set of coordinates suitable for our attractor search algorithm. Are there any MED-like RNGs with a higher output rate that we could use in our experiments?

They’re not diagrams, so the axes don’t represent anything other than the location itself.

I have a prototype MED with 1Mbps output, but I am too busy to make more of them at the moment. I have found it is possible to stretch the output bits without losing too much in effect size. This takes a setup using a deterministic or pseudorandom component that is constantly seeded by the fully entropic bits from the MED. In one of my patents, US6,862,605, see Figure 3 and its description. Rather than reducing defects, the deterministic processing can be used to multiply the input bits. While one bit (or word) is shifted in from the MED, a number of bits (words) can be produced by holding the input constant while shifting the rest of the bits (words) in the design. Say, input one bit and shift out 10, each of which will have entropy = 0.1 bits/bit.

I suppose the MED output could be stretched the usual way by XOring one bit with 10 sequential bits (or one byte with 10 sequential bytes) from a pseudorandom generator, but I haven’t tested that configuration. The design in the patent uses only entropic bits from the MED to update the seed in the deterministic component. That I have tested extensively and it does provide an advantage over the lower number of bits straight out of the MED. Note, the input, output and registers in the patent can be operated word-wise or bit-wise. Perhaps one could just replace one word in the seed each time a new word is received from the MED while spinning out 10 words from the pseudorandom generator between seed updates.
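As a purely illustrative Python sketch of the simpler, untested configuration mentioned above (XORing each MED bit with several sequential pseudorandom bits): this is not the design from the patent, and `stretch_bits`, its parameters and the use of Python's `random.Random` as the pseudorandom generator are assumptions made for the example.

```python
import random


def stretch_bits(med_bits, factor=10, seed=12345):
    """XOR every entropic input bit with `factor` sequential PRNG bits.

    Produces factor * len(med_bits) output bits, so each output bit carries
    about 1/factor bit of fresh entropy. The seed is fixed here only to make
    the sketch reproducible.
    """
    prng = random.Random(seed)
    out = []
    for bit in med_bits:
        for _ in range(factor):
            out.append(bit ^ prng.getrandbits(1))
    return out


med_bits = [1, 0, 1, 1]               # stand-in for bits read from a MED
stretched = stretch_bits(med_bits)
print(len(med_bits), "->", len(stretched), "bits")
```

Whether any MMI effect survives this kind of stretching is exactly the open question raised in the post; the sketch only shows the data flow.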

The approach is not limited to a 10x increase in output rate, but I don’t know at what point the increase in advantage will be offset by the reduction in entropy per bit. I used this to increase the effect size of hitting single bits, which is a different task than what you are proposing. I think it’s worth a try, but I don’t know how it will work. Algorithmically it will increase the data rate, but MMI is not just an algo.

I had a different idea. We could use some kind of deterministic scrambler to get multidimensional vectors from the MED.
So, we have a bit stream from the MED. We shove it into 10 different scramblers and use them to generate a random walk. As we can suppose, each scrambler’s output will give one coordinate of the random walker, which fits a shifted Poisson distribution. However, if we ensure that these 10 channels have no correlations without mind influence, we would be able to detect mind influence through the appearance of correlations.
Since MMI can surpass the complexity of the scrambler, we can use the emergence of correlations for better detection of influence with multidimensional input.

I did use something like this. I designed multiple pseudorandom generators as described in the patent, each with multiple shift registers continuously seeded with the same MED data. The length of every shift register in every generator was relatively prime to all other registers. That means there were no common factors between any registers’ lengths, and the output sequences from every generator were then guaranteed to be independent from all others. The multiplication of the MED data was equal to the number of independent PRNGs. The entropy in each bit produced this way is still 1/number of generators.

I would caution against assuming how MMI interacts with processes and designs that have not been extensively tested and compared in actual MMI systems. Having tested literally hundreds of designs, I can say they don’t all work as expected. Some provide small marginal improvements and some are less effective than the original unmanipulated data. Certainly there is always room for improvements and I am eager to see better systems developed. It’s just necessary to have a way to assess when an improvement has been made.

Speaking of binary derivatives: is it possible to calculate the n-th binary derivative without calculating the entire bit ladder?

It looks like there is such a method.
So, let’s define x(i) as the i-th bit of the bit sequence we want to process, and y(j,i) as the bits of the binary derivative of order j. To calculate the value of y(j,i), we need j+1 bits from the x sequence. XOR is commutative and associative, so if we expand the formula for y(j,i), we get just a long string of XORs in which each bit of the x sequence appears some number of times. Also a^a=0 and a^0=a. So every bit that appears an even number of times cancels to 0, and every bit that appears an odd number of times contributes its own value. So the simplified formula is:
y(j,i) = (sum{k=0…j} (x(i+k) & mask(j,k))) mod 2

How do we calculate that mask? That’s simple. The number of times x(i+k) occurs in the formula is the binomial coefficient from row j of Pascal’s triangle (the row with j+1 entries). So
mask(n,i) = bc(n,i) mod 2
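A short Python check of the shortcut, assuming the mask bit for x(i+k) is the parity of the binomial coefficient C(n, k), which is the reading consistent with the table of masks in this thread. The function names are made up for the sketch; the loop compares the direct formula against the full derivative ladder for orders 0 to 32 on a fixed pseudorandom input.

```python
import random
from math import comb


def mask_bits(n):
    """Parity of row n of Pascal's triangle; entry k says whether x(i+k) survives."""
    return [comb(n, k) % 2 for k in range(n + 1)]


def mask_int(n):
    """The same mask packed into an integer, as printed in the thread."""
    return int("".join(map(str, mask_bits(n))), 2)


def derivative_direct(x, j, i=0):
    """Bit i of the j-th binary derivative, computed from j+1 input bits."""
    return sum(x[i + k] & m for k, m in enumerate(mask_bits(j))) % 2


def derivative_ladder(x, j):
    """The j-th binary derivative computed the slow way, one order at a time."""
    for _ in range(j):
        x = [a ^ b for a, b in zip(x, x[1:])]
    return x


rng = random.Random(0)
x = [rng.getrandbits(1) for _ in range(40)]
for j in range(33):
    ladder = derivative_ladder(x, j)
    assert all(derivative_direct(x, j, i) == ladder[i] for i in range(len(ladder)))

print(bin(mask_int(5)), bin(mask_int(12)))
```

Since every row of Pascal's triangle is symmetric, the mask reads the same in either bit order, so it does not matter which end of the printed mask is taken as k = 0.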

However, this might mean that the derivative of each order has an informational capacity that is neither linear nor logarithmic in the order.

Masks for the first 32 orders of binary derivatives (mask, then order):
0b11 1
0b101 2
0b1111 3
0b10001 4
0b110011 5
0b1010101 6
0b11111111 7
0b100000001 8
0b1100000011 9
0b10100000101 10
0b111100001111 11
0b1000100010001 12
0b11001100110011 13
0b101010101010101 14
0b1111111111111111 15
0b10000000000000001 16
0b110000000000000011 17
0b1010000000000000101 18
0b11110000000000001111 19
0b100010000000000010001 20
0b1100110000000000110011 21
0b10101010000000001010101 22
0b111111110000000011111111 23
0b1000000010000000100000001 24
0b11000000110000001100000011 25
0b101000001010000010100000101 26
0b1111000011110000111100001111 27
0b10001000100010001000100010001 28
0b110011001100110011001100110011 29
0b1010101010101010101010101010101 30
0b11111111111111111111111111111111 31
0b100000000000000000000000000000001 32

Does it resemble something familiar?

It looks like binary derivatives are a special kind of binary convolutional encoding. However, we convolve the bitstream with layers of the Sierpinski triangle.
Honestly, now I think that BiEntropy isn’t such a good idea after all. Because if we want to estimate the entropy of a system, we need to take into account the entropy of the bonds between elements of the system (An Information-Theoretic Formalism for Multiscale Structure in Complex Systems, Benjamin Allen, Blake C. Stacey, and Yaneer Bar-Yam). A chain of bits is a special case of a system with temporal bonds. So, yes, if we want to calculate the overall information in such a system, we’ll need to calculate the entropy of convolutions.
However, I can’t really get why we need to convolve the bitstring specifically with Sierpinski triangle layers. What’s the meaning of it?
If we take Lucas’s theorem into account, bc(n,k)%2=1 exactly when every set bit of k is also set in n, i.e. (k & ~n) == 0. So mask(n,i) = ((i & ~n) == 0). So each mask is an AND composition of several periodic functions. But it doesn’t help much.
Also, I can complain about the weighting function. Because, as we can notice, only 2^popcnt(n) bits from the original sequence actually contribute to one bit of the binary derivative of order n. So I can’t grasp why we use a linear or logarithmic function of n.
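Both facts are easy to verify by brute force. The Python sketch below checks, for n up to 63, the parity rule that follows from Lucas's theorem (C(n, k) is odd exactly when every set bit of k is also set in n) and the count of contributing bits, 2^popcnt(n). The helper name `is_odd_binomial` is hypothetical.

```python
from math import comb


def is_odd_binomial(n, k):
    """Parity rule from Lucas's theorem: C(n, k) is odd iff k has no bit outside n."""
    return (k & ~n) == 0


for n in range(64):
    row_parity = [comb(n, k) % 2 for k in range(n + 1)]
    # The bitwise rule agrees with the actual binomial coefficients.
    assert all(bool(row_parity[k]) == is_odd_binomial(n, k) for k in range(n + 1))
    # The number of odd entries in row n is 2 ** popcount(n).
    assert sum(row_parity) == 2 ** bin(n).count("1")

print([sum(comb(n, k) % 2 for k in range(n + 1)) for n in range(9)])
```

So a derivative of order 2^m touches only 2 input bits while one of order 2^m - 1 touches all 2^m of them, which is the irregularity the complaint about the weighting function refers to.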

Today I found a critical flaw in the Binary Derivative method that puts its effectiveness into question. It turned out that the bias used for the Bi-Entropy weight calculation somehow leaks through the BD function and creates a correlation between weights and bit patterns. It becomes visible when you try to filter bits by weight and convert them into binary word coordinates. Patterns become visible starting from a filter value of 0.85 (meaning we use only bits with weights higher than 0.85), and they become more biased as the filter value grows.