Word-Amplification

The premise of this study was the assumption that the amount of information transmitted using Psi may be limited by the nature of the bias amplifier. It was suggested earlier that the generation of geographic coordinates in MMI by the Binary Word method is hampered by the fact that Psi can transmit no more than 1 bit of information per second. If that is the case, increasing the amount of entropy will not increase the amount of information; it will only increase the accuracy with which that information is transmitted.

However, it is worth noting that all currently known methods of bias amplification are based on compressing entropy from N input bits to 1 output bit. The very nature of such compression does not allow more than 1 bit of information to pass through it without loss. Of course, if the coordinate generation algorithm is based on a random walk, information can be accumulated and the coordinate obtained over several queries. But is it possible to carry out entropy compression whose output is not 1 bit of information but, for example, 32 bits?

To implement such compression, a slightly different approach to amplification is required, one that assumes the entropy sequence is statistically dominated not by a particular bit value but by a particular bit pattern. There are two obvious problems here:

  1. A pattern can occur at non-constant intervals, which means that if we feed a fixed number of bits to the amplifier input, a significant proportion of the patterns will be “cut into pieces” by the block boundaries.
  2. The pattern may be transmitted with errors, so the sequence will contain many “mutations” of the same pattern differing by 1-2 bits.

We tried to solve the first problem using the “rotation” algorithm. In this algorithm, the bit stream passes through a 32-bit window, shifting 1 bit per step. At each step, a new number is formed from the 32 bits in the window. Thus, we obtain an array of all numbers that could be produced by the Binary Word method, regardless of the size of the intervals between them. In the simplest version of the amplifier, the output is the number that occurs more often than the others.
However, this approach introduces a “regularity problem”: numbers with a highly regular bit pattern, such as 1010101010, are more likely to be duplicated when bit-shifted.
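
A minimal sketch of this rotation step (assuming a 32-bit window, a 1-bit shift per step and the simplest most-frequent-value output; the function name is only illustrative):

```python
from collections import Counter

def rotation_word_amplifier(bits, width=32):
    """Slide a `width`-bit window over the bit stream one bit at a time,
    collect the integer value seen in the window at every step, and
    return the value that occurs most often, with its count."""
    counts = Counter()
    window = 0
    mask = (1 << width) - 1
    for i, b in enumerate(bits):
        window = ((window << 1) | b) & mask  # shift the next bit into the window
        if i >= width - 1:                   # window is full from here on
            counts[window] += 1
    return counts.most_common(1)[0]          # (value, occurrences)

# bits would be a list of 0/1 values read from the MMI entropy source, e.g.
# value, count = rotation_word_amplifier(bits, width=32)
```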

The second problem, concerning errors in patterns, can be solved in two ways. The first is to simply iterate over all the found numbers and calculate how many analogues exist for each of them with a difference of 1-2 bits. This method can be computationally expensive because it involves many nested loops. The second method is to convert all numbers to Gray codes, place their values on a one-dimensional scale, and then find the point on this scale around which the density of values is maximal.
The disadvantage of this method is that it only works if the input bit sequences do not overlap.
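
A rough sketch of the Gray-code variant, taking the description above literally; the "closeness" radius on the one-dimensional scale is an arbitrary illustrative parameter:

```python
def to_gray(n):
    """Standard binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def swa_gray_density(words, radius=8):
    """Place the Gray-coded word values on a one-dimensional scale and
    return the value around which the density of observations is highest."""
    scale = sorted(to_gray(w) for w in words)
    best_value, best_density = None, -1
    for v in scale:
        density = sum(1 for s in scale if abs(s - v) <= radius)
        if density > best_density:
            best_value, best_density = v, density
    return best_value, best_density

# words would be the non-overlapping 8- or 32-bit values read from the stream
```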

When trying to combine the rotation method with the similarity-search method, strong autocorrelation biases appear that cannot be eliminated as long as analogues are searched for among numbers that share bits. To date, this problem has not been resolved.


To solve the regularity problem in RWA (rotation-based), we plan to add a condition that the same numerical value is not counted if it shares bits with its previous occurrence.
However, the possibility of combining RWA with SWA (similarity-based) is still in question. So far, the only viable ways to combine them seem to be either feeding the results of one amplifier into the input of the other, or choosing between their results based on the weight of the output value.

The question also arises of how to organize a mode of variable entropy consumption in RWA. Is it necessary to calculate, for each number, the deviation relative to its inverted counterpart, or is it enough to require that the number of occurrences of a value exceed the average across all values by a factor of two?

One way to enhance SWA is to add channels. That is, we run SWA on 32 channels containing the same entropy stream, each shifted by a number of bits equal to its channel number. Since patterns in SWA are read in 32-bit words, this will not prevent them from being cut by a word boundary, but it does allow us to select the channel with the best density of similar values. In this way we can influence where the sequence starts.
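
A sketch of the multi-channel idea, assuming `swa` is whatever single-channel SWA routine is in use and that it returns a (value, density) pair; "best" here simply means the highest density among the channels:

```python
def multichannel_swa(bits, swa, word_len=32, channels=32):
    """Run the same SWA routine on `channels` copies of the bit list, each
    shifted by its channel number, and keep the channel whose densest
    cluster is strongest."""
    best = None
    for ch in range(channels):
        shifted = bits[ch:]                       # shift the stream by ch bits
        n_words = len(shifted) // word_len
        words = [
            int("".join(map(str, shifted[i * word_len:(i + 1) * word_len])), 2)
            for i in range(n_words)
        ]
        value, density = swa(words)
        if best is None or density > best[2]:
            best = (ch, value, density)
    return best  # (channel, value, density)
```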

The third approach to word amplification is called positional amplification. In this approach, 32-bit binary words are again fed to the input, and each bit is fed into a separate random-walk bias amplifier. When all random walkers have fired, an output pattern is assembled from their results. This approach is the closest to the original 1-bit amplification, but it also works only if the input entropy is divided into binary words.
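
A sketch of positional amplification with a basic random-walk walker per bit position; the firing threshold is an arbitrary illustrative value:

```python
def positional_word_amplifier(words, width=32, threshold=50):
    """Feed bit i of every incoming word into its own random-walk amplifier:
    the walker for position i moves +1 for a 1 bit and -1 for a 0 bit and
    'fires' when it reaches +/-threshold. Once every walker has fired,
    assemble the output word from the signs of the walkers."""
    positions = [0] * width          # walker positions
    decided = [None] * width         # 1, 0 or None per bit position
    for w in words:
        for i in range(width):
            if decided[i] is not None:
                continue
            bit = (w >> (width - 1 - i)) & 1
            positions[i] += 1 if bit else -1
            if abs(positions[i]) >= threshold:
                decided[i] = 1 if positions[i] > 0 else 0
        if all(d is not None for d in decided):
            break
    if any(d is None for d in decided):
        return None                  # not enough input for every walker to fire
    return int("".join(str(d) for d in decided), 2)
```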


What you are discussing is complex, and touches on the fundamentals of practical MMI systems.

I would say that there is nothing that suggests 1 bit per second of information is the max that can be gotten through any type of MMI and bias amplification system. 1 bps is just the maximum theoretical error-free information rate I have measured in real MMI testing.

The question I think you are addressing is, how can one achieve higher information rates in an MMI system? An important additional question is, how much information is available or can be measured in response to mental intention in a stream of MMI bits? Clearly no more information than that can be gotten by any sort of algorithm.

The basic “bias” measurement is what everyone else has used since the beginning. What I call bias amplification literally takes a small bias in many bits and converts that into a large bias in a few bits. The amount of information and the information rate do not change. Information is preserved by the transformation into a form that is easier to measure and use. I have demonstrated that it is possible to gain a higher information rate by using different properties of the bits from an entropy source. The most obvious is to look at the first-order autocorrelation of the bits. This is most efficiently done by converting the autocorrelation into a bias and then applying bias amplification. Again, this conversion and amplification is done without loss of information. The result is that twice as many bits are processed, resulting in square root of 2 times the bias alone, or about a 40% increase in information rate. This is the largest increase theoretically possible by using two properties of the original sequence. That is only possible because bias and first-order autocorrelation are independent of each other.
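
A minimal sketch of these two measurements, assuming the autocorrelation is converted to a bias by comparing each bit with its predecessor (one common way to do the conversion) and using a simple majority vote in place of a full bias amplifier:

```python
def majority_bit(bits):
    """Simplest bias amplification: compress a block of bits to the
    majority value, converting a small bias into a nearly certain bit."""
    return 1 if sum(bits) * 2 > len(bits) else 0

def bias_and_autocorrelation_bits(bits):
    """Derive two measurements from one raw sequence:
    - the raw bits themselves (carrying the bias property), and
    - agreement bits, 1 when a bit equals its predecessor (carrying the
      first-order autocorrelation property, converted into a bias)."""
    agreement = [1 if a == b else 0 for a, b in zip(bits, bits[1:])]
    return majority_bit(bits), majority_bit(agreement)
```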

Any processing that produces dependencies, cross-correlations or mutual entropy between the processed streams cannot provide the theoretical square-root of n increase in information rate (n being the number of different, independent ways of processing the original stream). Even if 100 independent ways of processing a stream could be devised, that doesn’t mean the information rate would automatically increase by a factor of 10. There is still the consideration of how much information is actually available to be found, for which I do not have a theoretical basis to predict. I can say that bias and autocorrelation processing is not the limit of what is possible. I have looked at a number of other properties in the raw sequence and gotten some increase, and certain types of artificial neural networks trained using real MMI testing data can also increase the information rate somewhat. However, I never got more than about a tripling of total information rate beyond a simple bias measurement – nothing like getting a 10x or even higher increase.

The simplest way to create multiple independent sequences from a single one is to combine the original sequence with a number of independent random sequences. The combination is accomplished by XOring the original sequence with the random sequence. This can be done as many times as desired, but to ensure independence, the length of the original (and each random sequence of equal length) should be many times the number of new sequences produced. I don’t have an exact model for this, but I suggest the number of bits in the sequence should be at least 10 times the number of new sequences produced. This method of creating new sequences does not increase the information contained in bias in the original sequence, but that is no indication whether or not more MMI information can be extracted - MMI cannot be simulated or modeled beyond usual statistical theory.
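
A sketch of that construction; the independent sequences here come from an ordinary pseudorandom generator purely for illustration:

```python
import random

def derive_sequences(bits, n_sequences, seed=0):
    """Create n_sequences new sequences by XORing the original bit list
    with n_sequences independent pseudorandom sequences of equal length."""
    # rule of thumb from the discussion above: original length >= ~10x the
    # number of derived sequences
    assert len(bits) >= 10 * n_sequences, "original sequence is too short"
    rng = random.Random(seed)
    derived = []
    for _ in range(n_sequences):
        mask = [rng.getrandbits(1) for _ in bits]
        derived.append([b ^ m for b, m in zip(bits, mask)])
    return derived
```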

The most likely method of multiplying bits to subsequently process is to use a linear feedback shift register and extract multiple bits for each input bit. This approach does not use any deterministic processing or pseudorandom bits. That is, only entropy-containing bits are used as input. If anyone wants to try this approach, I can provide a detailed description of the design. In one of my previous messages I described a complex form of this approach, but that design may be simplified.
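
As a rough, generic illustration of the idea only (several output bits taken from the register for each entropy bit fed in), not the specific circuit from the referenced design; the register length, taps and outputs-per-input below are placeholders:

```python
def lfsr_multiply_bits(entropy_bits, taps=(31, 21, 1, 0), length=32, outputs_per_input=4):
    """Generic Fibonacci-style LFSR sketch: each entropy bit is XORed into
    the feedback, the register is clocked once, and several register bits
    are read out per input bit. No pseudorandom bits are injected; only
    the entropy bits drive the register."""
    state = [1] * length                      # arbitrary non-zero start state
    out = []
    for b in entropy_bits:
        feedback = b
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]       # shift in the new feedback bit
        out.extend(state[:outputs_per_input]) # multiple output bits per input bit
    return out
```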

This is not meant to be a full response to your ideas, only some principles that apply more generally.

Thank you, it will be interesting to read the detailed design description; it may help us design our research environments.

However, the idea I was talking about is based on the assumption that MMI is able to induce patterns of high complexity in the entropy stream. In this assumption, it is the complexity of the bias that determines the amount of information it carries. For example, if we direct our mental effort toward making ones outnumber zeros in the stream, the amount of information is 1 bit, and amplifying such a bias will only allow us to determine with greater accuracy that, at a certain moment, ones statistically predominate over zeros. But if we think, for example, about the number 218, the pattern 11011010 may begin to appear with increased frequency in the entropy stream. Most likely, such a pattern will not pass through a bias amplifier, since it will be compressed to a single bit. The idea of a word amplifier is to detect an increase in the frequency of complex patterns in the entropy stream and output those same patterns, something like this:

000110110100101010101101101001111011010011011010 → 11011010

The main difficulty in creating such an algorithm is that even if we know the length of the pattern to be found, we do not know exactly how to determine its boundaries in the entropy flow. If we read the entropy in blocks of 8 bits, we get the following:

00011011_01001010_10101101_10100111_10110100_11011010 - only one pattern is detected. In reality, the patterns should be counted like this:

000_11011010_010101010_11011010_011_11011010_0_11011010 - 4 patterns detected.
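
A small check of this counting, comparing non-overlapping 8-bit blocks with a sliding window (only the counting step, with the target pattern given explicitly):

```python
def count_fixed_blocks(bits, pattern):
    """Count the pattern only in non-overlapping blocks of its own length."""
    n = len(pattern)
    blocks = [bits[i:i + n] for i in range(0, len(bits) - n + 1, n)]
    return sum(1 for blk in blocks if blk == pattern)

def count_sliding(bits, pattern):
    """Count the pattern at every possible starting position."""
    n = len(pattern)
    return sum(1 for i in range(len(bits) - n + 1) if bits[i:i + n] == pattern)

stream = "000110110100101010101101101001111011010011011010"
print(count_fixed_blocks(stream, "11011010"))  # 1, as in the first breakdown
print(count_sliding(stream, "11011010"))       # 4, as in the second breakdown
```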

Naturally, we don’t know what number we need to find, since its value needs to be extracted from the Psi signal.

The RWA method I described earlier roughly solves the problem of finding patterns with non-constant intervals between them. However, I have not yet figured out how to statistically determine that a pattern has indeed been detected. At the moment, the algorithm simply returns the most frequently occurring pattern without any statistical assessment of how anomalous that frequency is. In addition, RWA will most likely be impossible to use for 32-bit numbers, since the probability of such a complex pattern repeating is extremely small.

Therefore, the second approach to word amplification is based on the assumption that the pattern can be transmitted with errors, for example, we can have a series of patterns like this:

11011010 10011010 11111010 11011011 11011000 - which also carry the pattern 11011010, but with errors of 1 bit.
To detect such a pattern, we need to cluster binary words by similarity and search for the value they are attracted to. I called this approach SWA. Its main problem, however, is that it cannot be combined with the RWA method, which allows patterns with irregular intervals to be extracted. The problem is that RWA uses a bit shift to create a series of numbers, most of which share bits, and searching for similar values among them automatically creates an autocorrelation bias in favor of values with a highly regular pattern, such as 170 (10101010), since a bit shift turns them into copies of themselves with high probability.

I think I finally understand what you are suggesting. My conclusion after some analysis is that it may not be possible to use a totally unconstrained approach to finding complex patterns in a sequence of bits. “Unconstrained” means no bounds are preset for the lengths or positions of the target pattern in the sequence. If we abandon that approach and apply some minimal constraints, it is possible to calculate the probability of the sampled data having occurred by chance if there is no mental influence. That is the definition of the null hypothesis test. Typically a threshold of 5% or less is chosen to indicate the null hypothesis is significantly improbable, suggesting that the alternate hypothesis may be true. Here, the alternate hypothesis is that the observed data was altered by mental influence.

First, assume the binomial distribution describes the sampled data. Then select a length and a starting point for the data samples. The observed number of successes (occurrences) is counted over non-overlapping blocks of the selected length, counting from the starting point. The probability of success is 1/2^n, where n is the length of the blocks in bits. Note, we may do a complete search using different lengths and starting points, but it is not allowed to change these variables while processing the sequence, which would be the unconstrained version.

The probability of the null hypothesis test is the probability, under the binomial distribution, of observing a count equal to or greater than the observed number (the upper tail of the cumulative distribution function). An example will help:

Take 8192 bits, which is 1024 8-bit words, the selected length. Start with the first bit and count the number of times a selected pattern of 8 bits occurs in the 1024 blocks of data in the input sequence. The probability of seeing the selected pattern in any block is 1/256 = 0.00390625, and we should expect to see 4 of that pattern (on average) in the sequence. The probability of seeing 4 or more of the pattern is 0.56691, meaning we cannot reject the null hypothesis if only that number were observed, and we cannot say whether mental influence was present. If 8 occurrences of the pattern were observed, the probability would become p = 0.05078, meaning that we may reject the null hypothesis at the 5% level of significance, suggesting the alternate hypothesis is true (that there was mental influence operating to produce the specific pattern). It would take 10 observations of the pattern to reject the null hypothesis at p = 0.00801, that is, less than 1%.
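
These tail probabilities can be checked with a binomial survival-function call (a sketch assuming scipy is available):

```python
from scipy.stats import binom

n_blocks = 1024          # 8192 bits read as 1024 non-overlapping 8-bit words
p = 1 / 256              # chance of any particular 8-bit pattern in one block

# P(X >= k) for the observed counts discussed above
for k in (4, 8, 10):
    print(k, binom.sf(k - 1, n_blocks, p))
# roughly 0.567, 0.051 and 0.008 respectively
```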

This approach can be used with any block length and any starting point, though there is a slight bias in the statistics due to adding new bits to the sequence for starting points not at the beginning of the sequence while keeping the number of blocks constant. The alternative is to use one less block of data for starting points greater than the first bit, which will also produce a bias when comparing the results for different starting points. This issue can be reduced by using very long input sequences compared to the block size so the average count of any pattern is fairly large. In the example above, the average number is only 4, which is small. This suggests that a pattern length of 8 bits may be on the edge for bit sequences of only 8192 bits.

A little thought will show how to generalize this approach to block lengths from 8 bits down to 1 bit and different starting points. A table of probabilities can be calculated by counting the number of occurrences of every 8-bit word, although there will likely be very many ties in the number of highest counts. For 8-bit words, there are also 8 possible starting points, making 2048 different counts. This is very computationally intensive in software. Also note, picking the highest count from those 2048 possibilities produces very different probabilities of the null hypothesis test, which only works as described above for a single pattern and starting point. A new statistic could be derived, but I will not attempt it now.

Note, for a length of 1 bit the method converges to the simple bias test.

One question I still have is, how does one use the information provided by seeing one significantly unexpected pattern? It’s really still only one bit of information (presence or absence of the pattern) unless all patterns are searched. Then it takes a much higher count of any one pattern to become significant.

@ScottWilber

One question I still have is, how does one use the information provided by seeing one significantly unexpected pattern?

If it were an MMI interface for some console with 256 buttons, the user could choose one of them by intent with a single request.
But in our case, the experiment design, where the user must confirm that the pattern matches the one he expected, exists only to ensure that the amplification works. In further experiments the user will ask questions he does not know the answer to, and the pattern will be the answer. The ultimate goal is to be able to generate a coordinate on a map with one request, using the returned numbers as axis values. In this case the user will ask where on the map he can find a certain something, and MMI will predict the place.

The probability of success is 1/2^n where n is the length of blocks in bits.

Isn’t the probability for RWA just the same, but multiplied by N?
RWA is the method where we shift the entire bitstream through an 8-bit window 1 bit at a time.
10110[10101001]010100101 → 169
1011[01010100]1010100101 → 84

So we have one number per bit. Then we can count how many times each number appears. This method can extract patterns at all possible start positions, so no pattern is sliced by block boundaries. The only problem is that we need to exclude numbers that repeat themselves while sharing bits with their previous occurrence, though their amount can also be counted.

This condition is required to avoid the regularity bias. For example, when the window is shifted by 1 bit, the number 0 (00000000) has a 50% chance of producing another 0.

The biggest question here is what requires more Psi energy: creating many deviations of moderate probability over a long period of time, or creating a few extremely improbable outcomes within a short period of time. In the first case, autocorrelation-based amplifiers will work better due to the persistence of the statistical effect, and in the second case, detection of precise patterns over short time intervals is more important.

Many thousands of trials over years suggest mentally intended deviations are observed as a small bit-by-bit effect on the observation of a number of bits. The size of the effect is on the order of tens to hundreds of bits per million measured bits.

Several variables affect the effect size of the results, which can cause the range of information rate to vary over many orders of magnitude. Some variables relate to the subject performing the trials. Some subjects are more skilled or have learned to be more effective by practice. Another personal factor is how the subject is feeling at the time. Another set of variables relates to the experimental design, such as the quality and immediacy of feedback provided and the environment of the testing setup. Finally, the type of generator, its generation rate and processing method are important.

With respect to your question, experience indicates highly improbable results are accumulated over time by a number of mental efforts. At a constant information rate the accumulated deviation becomes more significant as the number of trials increases. Note, the increase in significance is highly nonlinear: the accumulated deviation grows in proportion to the number of trials, while the z-score grows with its square root.

I also did many real-world tests trying to obtain more complex information. This was usually in connection with picking various lottery numbers. The simplest and most-tested was for Pick-three lotteries, where three numbers from 0-9 were selected. The probability of picking the exact three numbers in proper sequence is 1 in 1000. The number of bits of information is Log1000 (base 2) = 9.9658 bits. (Getting 1 of 256, an 8-bit binary number, is just 8 bits.) A number of approaches were tried, but the two primary ones were: 1) Ask the question for each number, is this one of the numbers? The number of hits for each number was used to rank the 10 numbers and the highest 3 were taken. The numbers were bet so that the order was not important, reducing the number of bits of information needed. 2) The question was asked, that is, the intended outcome was, show or predict the three numbers for the target drawing. A separate block of numbers was used to calculate a z-score for all 10 numbers for each trial. A number of trials were performed and the z-scores were combined for each number across trials. Finally, the 3 highest-ranking numbers, that is, those with the lowest probability of occurrence by chance, were selected. Again, the bet was usually placed not requiring the exact sequence of the 3.
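
A sketch of the second ranking procedure, assuming the per-trial z-scores for the 10 digits are already computed and are combined across trials in a simple sum-over-square-root-of-n way (one reasonable way to combine them; the exact combination I used may differ):

```python
import math

def rank_digits(z_by_trial):
    """z_by_trial: list of trials, each a list of 10 z-scores (digits 0-9).
    Combine the z-scores across trials for each digit and return the three
    digits with the strongest combined scores, plus all combined scores."""
    n_trials = len(z_by_trial)
    combined = [
        sum(trial[d] for trial in z_by_trial) / math.sqrt(n_trials)
        for d in range(10)
    ]
    ranked = sorted(range(10), key=lambda d: combined[d], reverse=True)
    return ranked[:3], combined
```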

After hundreds of real bets using one of these methods, the total winnings exceeded the amount of the bets. We were ahead, but the amount of effort required was certainly not sufficiently rewarded by the relatively small winnings possible. We did this exercise to demonstrate that real, valuable future information could be obtained using an MMI system.

At no time did it seem possible that anything like the nearly 10 bits of future information could be gotten with a single measurement or trial. Such a result could be possible, but very significant improvements in the generator, data processing and methods of practicing with feedback would be required. This is where developments in MMI or Mind-Enabled technology come in. One core goal would be in developing generators that are much more responsive to mental influence. Modeling shows that generators that take less energy to respond (by flipping the result to the intended outcome) would be more responsive. Preferably the energy would be infinitesimally small or theoretically nil. This might be true for a pure quantum generator with only two possible outcomes (not just an analog signal above or below a threshold, which is the usual type). In addition, the generator should probably be reversible, meaning energy is not required whether moving forward or backward in time. This avoids issues with breaking thermodynamic laws, which is assumed to be impossible. Splitting light in a polarization beam splitter is an example of a reversible process. The two output beams can be recombined to recover the original input beam.

A schematic and functional description of the LFSR circuit is in the thread
Increasing MMI Effect Size by LFSR Processing of MMI Bits
Though that is an older thread, I thought it was the appropriate place to add the message.