Though string theory, currently the front-runner for a unified theory, has made a lot of progress, we have still not understood the true essence of what probability means. We are still left with mathematical equations which work but continue to fall short of giving us true understanding.
I have tried defining probability a million times now but I haven't come up with one good definition, and it seems to me that probability is just a reflection of our inability to measure or predict accurately. When a coin is tossed, it IS possible in principle to calculate the outcome, but that is a very cumbersome calculation involving millions of minute variables - and hence we settle for an empirical approximation.
But quantum mechanics says that a particular quantum property of an object does not inherently have a value, but attains a value only when measured. Moreover, if you measure it n times, the object may attain a different value each time, and we cannot predict which value will be attained before a particular measurement is made. What we can predict approximately is the frequency pattern of the values attained. To put it in the perspective of a coin toss: when the coin is tossed we are not sure whether the result is going to be head or tail. Only when we see the result do we know it, and we still cannot predict the result of the next toss. What we can know is approximately the ratio of the number of head and tail results for the tosses done till now. Please note that this information is actually about the history of coin tosses done till now, and we just extrapolate it to the future. In other words, if someone claims that a true unbiased coin will have a (Heads/Tails) ratio of 1, then that is a definition of an unbiased coin and nothing more. If your real coin has a ratio of 0.95 till now, then you could say that your coin is biased as far as what you have observed till NOW. So again, the statistics of the past is one thing and using it to predict the future is totally another. This is different from the laws of physics, which typically have some elementary causality attached to them instead of relying on empirical curve fitting and extrapolation. This empiricism makes me uncomfortable with the concept of probability and randomness.
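The point about the Heads/Tails ratio being a statement about past tosses can be made concrete with a small simulation. This is only a sketch: the function name `heads_tails_ratio`, the `p_heads` parameter and the fixed seed are my own illustrative choices, and a pseudo-random generator stands in for a real coin.

```python
import random

def heads_tails_ratio(num_tosses, p_heads=0.5, seed=0):
    """Toss a simulated coin num_tosses times; return (heads, tails, heads/tails)."""
    rng = random.Random(seed)  # fixed seed so the "history" is reproducible
    heads = sum(1 for _ in range(num_tosses) if rng.random() < p_heads)
    tails = num_tosses - heads
    ratio = heads / tails if tails else float("inf")
    return heads, tails, ratio

# The ratio describes the tosses made so far; it drifts towards 1 as the
# history grows, but it never tells us the outcome of the next toss.
for n in (10, 100, 10_000, 100_000):
    heads, tails, ratio = heads_tails_ratio(n)
    print(f"{n:>7} tosses: heads={heads}, tails={tails}, heads/tails={ratio:.4f}")
```

Setting `p_heads` to something other than 0.5 gives a "biased" coin in the sense used above: the bias only shows up in the accumulated history.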
The coin toss experiment and its associated probability explanation do not exactly explain the concept of quantum mechanical probability. Extending the coin toss example further, suppose the coin has been tossed and is now lying on the ground, but no one has looked at the result. According to quantum mechanics, as no one has observed it, even now it is in a hybrid state of Head-Tail, and it takes on a concrete head or tail state only when someone sees the result. The attainment of the state happens not when the coin settles on the ground but when you see it or measure it. Though the parallel is not exactly perfect, it provides the necessary insight to understand what QMech means by "only when observed does an object attain its quantum property state".
If the question is - but then what state is it in before we measure? The answer is - it is in a fuzzy state defined by the probability function of the (unbiased) coin. It is just a statistical, probabilistic picture of the state, based on the curve which approximately fits the result frequencies. The right answer is that we do not know, or we do not care (the QMech extremists' POV).
Suppose we really do care. Then the question is: can we find out the fuzzy state without measuring? What a meaningless question, some would say. Not so. In fact this was considered one of the most important questions in modern physics, and several people including Einstein and Bohr played a part in answering it. To state it again: as per QMech, when we measure the fuzzy state we disturb the thing being measured; as a result it moves from fuzziness to a concrete state and we lose the fuzzy state. So how do we find out what the fuzzy value was before we measured?
For understanding this we need to discuss another topic called entanglement. Entanglement happens between two particles which are set up in such a manner that if we measure the value of a canonical property of one particle, we automatically know the value of the same property of the other particle - typically particles are entangled in such a manner that they give opposite results. So let us take two such entangled coins. My blind friend takes one coin to an uninhabited galaxy outside the Milky Way, we both toss our coins at the same time, and I look at my result (the canonical property) and see it is heads. My blind friend cannot look at his result (he has not measured), so should I say that my friend's coin is still in a fuzzy state? That is not true: once I see "heads" on my coin, I automatically know that the result on my friend's coin is "tails" because of the entanglement. So my friend's coin, without any "active or disturbing" measurement, seems to have attained a concrete state, which is not what QMech told us in the first place.
Now there was a trick which I quietly slipped into the above paragraph. I assumed that my measuring the result of my coin will not affect or disturb my friend's coin, as it is very far away. In other words, I assumed the measurement I made on my coin will not somehow make my friend's coin attain a concrete state instead of remaining in the fuzzy state (whether my friend looks at it or not). I assume this because nothing can travel faster than the speed of light, and hence my measurement of my coin cannot affect my friend's coin instantly. This is fancifully called "locality". If we remove this assumption and state that these coins are entangled in such a way that measuring one of them means in essence measuring both, however far apart they may be in space, then we are back where we started.
Now that we have explained the topic of entanglement and the complications it brings to the table, let us see what Einstein thought of quantum mechanical measurements - we will get back to entanglement shortly. Einstein claimed that a state (result) already exists, that this is the state which gets revealed when you measure, and that the state is not attained because you measure. Bohr, on the other hand, said no such state exists before measurement and the object attains the state only when measured and because you measure.
But if the "Locality" assumption is really true (nothing travels faster than light), we would have known the result (quantum state) of my friend's coin without measuring it (or without making it attain a state by measuring it), and Einstein argued this is so because it already possesses a predefined concrete classical state and not a fuzzy quantum state. So entanglement and locality combined allowed Einstein to drive home his contention.
Through this argument Einstein tried to prove that quantum mechanics, though successful in accurately predicting result frequencies, is limited as it cannot predict the result itself. More importantly, it cannot hide behind the assumption that there is no concrete result (internal state) before measurement, as that is not true based on the above paragraph. To put it differently, a concrete (non-fuzzy) result does exist before measurement (as in the case of my friend's coin) but quantum mechanics cannot predict it. It has to be improved further so that this limitation is removed.
Having given this argument, Einstein wanted to come up with a theory which, without violating the Locality assumption, provides a mechanism to predict the result itself rather than just the result frequencies. Now we will get into the meta world. Let us for a moment assume such a theory H exists - a theory more complete than QMech yet operating within the Locality constraint. Now for an important point: this theory (as per Einstein) is such that it "reveals" the inherently existing concrete result before a measurement is made, and we do NOT play a part in the attainment of the result. We are NOT in the business of merely predicting result frequencies "if measurements are made". This theory is not about predicting the "result of a measurement", but about revealing the "result or inherent state" which exists whether we measure it or not. I hope that subtlety is appreciated. But this is just conjecture, as we do not know whether such a theory can exist at all. So is it possible to prove that a theory of the kind Einstein wanted can never exist and hence can never be discovered? This is similar in spirit to Gödel's theorem, which states that no formal system can be created in mathematics which can be used to prove all theorems in mathematics. That statement is not fully accurate but is sufficient for our current discussion.
In 1964 a physicist named John Bell thought about proving the existence or non-existence of such inherent states prior to measurement, and of the local hidden variable (or H) theories which could extract such states. He thought of an experiment which can be performed and came up with a set of probability relations (inequalities) which the results of the experiment should conform to if they are to be explained by any local hidden variable theory. By doing this Bell provided a way to check whether such a theory is possible to construct. Please note Bell did not come up with such a theory - he just showed that for any theory which assumes a local internal state these inequalities have to hold. Hence if the inequalities do not hold for the values obtained from actual observations, then the actual observations cannot be explained by these theories. Hence the whole class of such theories is false!
Now before moving to that experiment and the probability inequalities its results should conform to, we need to cover some basic ground.
I have been using coins and their results to explain things till now. To fit the other concepts which Bell is going to discuss, I am going to introduce the coin dice. This coin dice is similar to a normal dice and has six faces, but each face is printed with head or tail instead of a number. Opposing faces have opposite impressions (head or tail), so if the top face has head, the bottom one has tail. The same holds for the other two face pairs. Let us also color the three face pairs Red, Blue and Green. Now say you roll the dice, it settles down and a green face is facing you. If someone asks whether the result is head or tail, you would say it is heads for the green face. But if they prod further and ask about the red face, you do not know what to say. You can get a valid result for only one of the face pairs any time you roll the dice and not for the others (you could think of the other two face pairs as faces of a coin which is standing on its edge, not settling down to a head or a tail). Seems reasonable?
This is similar to what quantum mechanics tells you about noncommuting observables. The red, blue and green faces of the dice are noncommuting observables, as at any point of time you can measure only one of them. When you measure one of them with certainty, like our green face which had a "head", we are totally unsure of the other two variables - in fact we say they do not have a definite state, only a fuzzy state. So isn't quantum mechanics intuitive enough?
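The coin dice described above can be modelled in a few lines. This is a minimal sketch under my own assumptions: the names `COLORS` and `roll_coin_dice` are illustrative, every color and every side is taken to be equally likely, and `None` is used to mark a face pair that has no valid result on that roll.

```python
import random

COLORS = ("red", "blue", "green")

def roll_coin_dice(rng):
    """Roll one coin dice.

    Exactly one color pair settles facing the observer and shows head or tail;
    the other two pairs give no valid result (None), like a coin on its edge.
    """
    settled_color = rng.choice(COLORS)        # which face pair ends up facing us
    side = rng.choice(("head", "tail"))       # which face of that pair shows
    return {color: (side if color == settled_color else None) for color in COLORS}

rng = random.Random(0)
for _ in range(3):
    print(roll_coin_dice(rng))
```

Each roll answers the head-or-tail question for one color only, which is the sense in which the three colors behave like noncommuting observables in this analogy.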
Now we have set the stage for discussing Bell.
Assume that in our previous experiment of entangled coins, we instead have entangled coin dice (described above). They are entangled in such a way that if I get a head on the green face of my dice, then my blind friend will get a tail on his green face (it is as if my friend looks at the opposite face of my dice). This way, if I roll my dice and measure head on the green face, then I know for sure that if my blind friend rolls his dice and it falls on the green face, the result will be tails, whether he sees it or not. But if my friend rolls his dice, gets a red face and sees (my friend is no longer blind - he got eyes) that his result is heads, then we indirectly know that my red face result is tails, as both dice are entangled. This way I now know the result of two faces of my dice - Red and Green - though I personally measured only one. I have used entanglement as a trick and seem to have violated a basic principle of noncommuting variables in quantum mechanics.
Now this is where Bell comes in.
To extend further, assume my friend and I roll a lot of these entangled dice one by one at agreed-upon times, so that we make a lot of repeated simultaneous measurements. After a lot of dice rolling we compare results, and as a first step we strike out all measurements where the same color face appeared in both places. We are left with the measurements where different color faces appeared. If we calculate the probabilities, we see that of the n rolls about 1/3 will involve the same color in both places and the remaining ~0.67n will have different colors, and these are the ones we are interested in. So we take that set of observations and try to compute some inequalities from those trials. Typically measurements are made specifically for chosen faces such as green and red, but in our case I am making random measurements, being true to the dice paradigm. For the discussion below this has no implication, but I think some thought needs to go into this assumption, as I believe it has some potential for explaining something.
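The 1/3 versus 2/3 split can be checked with a quick simulation. This sketch assumes, as in the dice paradigm above, that each of the two dice independently lands on one of the three colors with equal probability; the function name `same_color_fraction` is just an illustrative choice.

```python
import random

COLORS = ("red", "blue", "green")

def same_color_fraction(num_rolls, seed=0):
    """Fraction of paired rolls in which both dice settle on the same color."""
    rng = random.Random(seed)
    same = 0
    for _ in range(num_rolls):
        my_color = rng.choice(COLORS)
        friend_color = rng.choice(COLORS)
        if my_color == friend_color:
            same += 1
    return same / num_rolls

n = 90_000
fraction_same = same_color_fraction(n)
print(f"same color:      {fraction_same:.3f} of {n} rolls (expected about 1/3)")
print(f"different color: {1 - fraction_same:.3f} of {n} rolls (expected about 2/3)")
```

Only the different-color rolls are kept for the counting that follows.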
For those different-color trials (about two thirds of n) we adopt this convention - N(Gh,Rt) means the number of rolls where I had a Green face showing head and my friend had a Red face showing tails.
And we write N(Gh,Rt:Bh) for those trials among N(Gh,Rt) where B, though not explicitly measured, actually has a value of head. This is where the possibility of an inherent concrete state is introduced, and it is a critical assumption: an unmeasured quantity having a definite concrete state, and a count associated with it, might not make sense from the QMech point of view or, for that matter, even from our coin dice roll POV.
Now if we console ourselves, take that assumption and proceed in that direction, we see that:
N(Gh,Rt:Bh) + N(Gt,Rh:Bt) >= 0 - Equation 1
(We are using : to denote the unmeasured quantity. Both numbers on the LHS are counts of trials and cannot be negative.)
Question - can these numbers interfere? Should we use ordinary addition or some kind of interference addition? We postpone thinking about that disturbing thought for now.
Adding N(Gh,Rt:Bt) + N(Gh,Rh:Bt) to both sides of the above equation.
N(Gh,Rt:Bh) + N(Gt,Rh:Bt) + N(Gh,Rt:Bt) + N(Gh,Rh:Bt) >= N(Gh,Rt:Bt) + N(Gh,Rh:Bt) - Equation 2
If we can assume:
N(Gh,Rt) = N(Gh,Rt:Bt) + N(Gh,Rt:Bh) - Equation 3
Meaning the Blue face could have either head or tail in those trials where we measured Green having head and Red having tail, so the above equation holds good. Here again we assume that Blue has only two possible definite concrete states and not a fuzzy state.
Similarly we can write -
N(Rh:Bt) = N(Gt,Rh:Bt) + N(Gh,Rh:Bt) - Equation 4
N(Gh:Bt) = N(Gh,Rt:Bt) + N(Gh,Rh:Bt) - Equation 5
(These additions are even more intriguing to interpret)
So combining equations 2, 3, 4 and 5 we get the following. (My use of : is slightly different from how it is usually done. I want to use it to keep the context in view rather than treating this as just a set theory exercise.)
N(Gh,Rt) + N(Rh:Bt) >= N(Gh:Bt) - Bell's inequality
We could create similarly other inequalities using the same logic.
For deriving this inequality we used logic and apart from that we made two major assumptions which Einstein made of his theory:
a. Measuring the value of one dice does not disturb the other dice, as they are so far apart (locality)
b. Even if a coin dice face value is not measured, it has an inherent concrete state (head or tail)
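The inequality derived above can be sanity-checked by brute force under assumption b. In this sketch every trial is assumed to carry a hidden triple of definite values (Green, Red, Blue), each "h" or "t", and the three Ns are read off as plain counts over those hidden triples. The names `HIDDEN_STATES`, `bell_counts` and `inequality_holds` are illustrative, and the populations are random made-up numbers, not experimental data.

```python
import itertools
import random

# ASSUMPTION (b): every trial has a definite hidden value for all three colors.
HIDDEN_STATES = list(itertools.product("ht", repeat=3))  # tuples (G, R, B)

def bell_counts(population):
    """population maps each hidden (G, R, B) triple to a number of trials.

    Returns (N(Gh,Rt), N(Rh:Bt), N(Gh:Bt)) as counts over the hidden triples.
    """
    n_gh_rt = sum(n for (g, r, b), n in population.items() if g == "h" and r == "t")
    n_rh_bt = sum(n for (g, r, b), n in population.items() if r == "h" and b == "t")
    n_gh_bt = sum(n for (g, r, b), n in population.items() if g == "h" and b == "t")
    return n_gh_rt, n_rh_bt, n_gh_bt

def inequality_holds(population):
    """Check N(Gh,Rt) + N(Rh:Bt) >= N(Gh:Bt) for one population."""
    n_gh_rt, n_rh_bt, n_gh_bt = bell_counts(population)
    return n_gh_rt + n_rh_bt >= n_gh_bt

rng = random.Random(0)
all_ok = True
for _ in range(10_000):
    # Arbitrary non-negative counts for each of the 8 hidden triples.
    population = {state: rng.randint(0, 1000) for state in HIDDEN_STATES}
    all_ok = all_ok and inequality_holds(population)

print("inequality held for every random population:", all_ok)  # prints True
```

No choice of non-negative counts can break the inequality here, because the two hidden triples that make up N(Gh:Bt) are each already counted on the left-hand side. That is why a violation in real data points back at the assumptions rather than at the arithmetic.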
Now there have been experiments for counting these Ns, and they found that the actual Ns from certain experiments (not all) do not conform to the above inequality. So something is definitely wrong in one of the assumptions we made in arriving at these inequalities, as we had a hint or two while deriving them. One way to resolve the paradox is to say that hidden variable theories cannot explain the experimental observations and are false. Another is to say that hidden variable theories exist, but they are non-local. Obviously both assumptions could be wrong. There is another POV which suggests that the logic of set theory we used to derive the inequalities could itself be wrong and does not apply to our current situation, which is like shaking all foundations!
Last but not least, the observations themselves could be wrong. There have been different attempts at understanding the reasoning behind the violations of Bell's inequality in experiments, but I think there is no full consensus.
But one thing which needs to be noted is that the observations of these experiments tie in very well with the predictions of quantum mechanical theory. So even though quantum mechanical theories may be incomplete, they explain the observations of all experiments conducted till now. On the other hand, we do not yet have a credible local hidden variable theory, and Bell has dealt a blow to the hope of finding one. This is quite similar to what Gödel did to the Hilbert program.
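To see how a quantum mechanical prediction can sit outside the inequality, here is a rough sketch under assumptions that go beyond the coin dice picture. I assume the three colors stand for three spin measurement axes at angles of 0, 60 and 120 degrees, that the entangled pair is a spin singlet (perfectly opposite results along the same axis), and I use the standard singlet result that both sides record "up" along axes separated by an angle theta with probability (1/2) sin^2(theta/2). With opposite results on the two sides, these joint probabilities play the role of N(Gh,Rt), N(Rh:Bt) and N(Gh:Bt) divided by the number of trials. The angles and names are my own illustrative choices.

```python
import math

def singlet_joint_probability(angle_a_deg, angle_b_deg):
    """Quantum prediction for a spin singlet: probability that both observers
    record 'up' along axes separated by theta, i.e. 0.5 * sin^2(theta / 2)."""
    theta = math.radians(angle_b_deg - angle_a_deg)
    return 0.5 * math.sin(theta / 2) ** 2

# ASSUMPTION: the colors are measurement axes at these angles (in degrees).
GREEN, RED, BLUE = 0.0, 60.0, 120.0

# Left- and right-hand sides of the inequality, expressed as probabilities.
lhs = singlet_joint_probability(GREEN, RED) + singlet_joint_probability(RED, BLUE)
rhs = singlet_joint_probability(GREEN, BLUE)

print(f"lhs = {lhs:.3f}, rhs = {rhs:.3f}, inequality holds: {lhs >= rhs}")
# prints: lhs = 0.250, rhs = 0.375, inequality holds: False
```

For these particular angles the quantum prediction gives a left-hand side smaller than the right-hand side, which no assignment of pre-existing local head or tail values can reproduce.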