1) Let's say you have a hypothesis about the world, H (example: H = spiders have 8 legs)
2) Let's say you experiment and find n data points (you find n = 300 spiders and count their legs)
3) Then the confidence you can have that H is true should increase with more data. It will always approach, but never reach, 100% (because even after finding 99999999999 spiders, it's always possible that the next spider could have 7 legs).
I suspect this is what the graph should look like:
I'm sure statisticians can flesh this idea out more, but the essence is here. Notice that when you have no data, you should have no confidence in your belief - this is what I would call suspension of belief. So far this seems pretty simple, but wait for it, because it gets controversial.
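One way a statistician might flesh out that curve is Laplace's rule of succession. This is a swapped-in textbook formula, not necessarily the graph I had in mind: it estimates the chance that the next spider has 8 legs, which is a weaker statement than "H is true for all spiders". The function name below is just for illustration. With no data it returns 0.5, which is one way to read "suspension of belief" (complete impartiality), and it climbs towards 1 without ever reaching it.

```python
def rule_of_succession(agreeing: int, total: int) -> float:
    """Estimated probability that the next observation agrees with H.

    Laplace's rule of succession: (agreeing + 1) / (total + 2).
    Assumes a uniform prior, i.e. no opinion before seeing any data.
    """
    if not 0 <= agreeing <= total:
        raise ValueError("need 0 <= agreeing <= total")
    return (agreeing + 1) / (total + 2)


# Every spider found so far has had 8 legs, so agreeing == total == n.
for n in (0, 1, 10, 300, 10**6):
    print(n, rule_of_succession(n, n))
```

For n = 300 spiders this gives 301/302, so about 99.7%. A single 7-legged spider would lower the estimate and, of course, refute the strict version of H outright.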
Some atheists argue that they believe God doesn't exist because there is no evidence to prove his existence. I think this is a mistake - without evidence you should suspend judgement entirely and be completely impartial as to whether a God exists or not. It should go without saying that I'm referring to a generic deity, not the Christian God, who supposedly affects the world (i.e. makes claims we can observe to be true or not).
But what about Bertrand Russell's teapot? Our technology can't conclusively prove there isn't a teapot orbiting the Sun somewhere between Earth and Mars. In the absence of data, should we still suspend opinion? Of course not. The analogy is flawed. Intuitively, we all know that teapots are made of ceramics, which require careful manufacturing on Earth, so we actually do have some data!
To throw confusion into the mix, consider two additional ideas: 1) Occam's razor, the claim that the simplest idea is likely the correct one, and 2) the idea that the person making a claim has the personal responsibility to back up that claim. I disagree with idea 1) because some things which are true are extremely complicated. Is it simpler to believe the ISS is held in orbit by a giant thin rope? Nonetheless, I agree that most true ideas seem simple, but I believe this is only the case because most phenomena with large data sets appear simple (e.g. the sun rising every day). Idea 2) seems wrong to me as well. I concede that it is a good rule of thumb in a practical sense; after all, if your flatmate claims that the broken sink was caused by polar bears, you would hope he backs it up before you need to call the plumber. But from a strictly rational point of view, the Universe doesn't care about personal responsibility - so if you care about truth, you should want to find as much data as you can to prove any hypothesis true or false. Obviously humans are not timeless truth-seeking machines though, so at some point we should stop looking for evidence of polar bears and focus on other, more pressing matters.
What about mathematics? Mathematics doesn't require data, right? Wrong! It requires the most data of all, and it gets it in an entirely different way! Consider the derivation of the quadratic formula: x = (-b +- sqrt(b^2 - 4ac)) / (2a). This formula has been derived with generic variables (a, b, c). The reason we know this proof is correct is that the variables could be (1,1,1) or (1,1,2) or (1,1,3) and so on. We've effectively observed an infinite number of data points for which this formula will work! This means that on our graph above, we actually reach 100%. Now at a very deep level, it's not actually at 100%, because there will always be some uncertainty about whether we've done the algebra correctly. Likewise, how can we really know the axioms of mathematics are true (A = A, 1+1 = 2, etc.)? Some people have tried to prove some of these things using set theory, but how do we know the axioms of set theory hold? The rabbit hole must either go on forever or intersect itself - both are problems for the rationalists. I suspect that even the most fundamental statements (1+1 = 2, A = A, "I think therefore I am", etc.) all fundamentally depend on observation. This makes the shining beacon of mathematics fall into the unclean realm of observational science.
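To make the "data points" reading of the quadratic formula concrete, here is a rough sketch that treats each (a, b, c) triple as one observation: it plugs the formula's roots back into ax^2 + bx + c and records how far the result is from zero. The function names and the small integer grid are my own choices for illustration. Note that the check only ever covers finitely many triples and relies on floating-point arithmetic, whereas the algebraic derivation covers every triple with a != 0 at once.

```python
import cmath
import itertools


def quadratic_roots(a, b, c):
    """Both roots from x = (-b +- sqrt(b^2 - 4ac)) / (2a); requires a != 0."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    root_disc = cmath.sqrt(b * b - 4 * a * c)  # complex sqrt handles negative discriminants
    return (-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)


def check_samples(values=range(-3, 4)):
    """Try every (a, b, c) drawn from `values` with a != 0.

    Returns (number of triples checked, largest |a*x^2 + b*x + c| seen).
    """
    checked = 0
    worst = 0.0
    for a, b, c in itertools.product(values, repeat=3):
        if a == 0:
            continue
        for x in quadratic_roots(a, b, c):
            worst = max(worst, abs(a * x * x + b * x + c))
        checked += 1
    return checked, worst


checked, worst = check_samples()
print(checked, worst)
```

Every triple in the grid gives a residual that is zero up to rounding error, which is exactly the kind of finite, observational evidence the spider example relies on.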
There are a few additional things I wanted to say about AI and causality, but this post has already gone on for long enough. In short: 1) I believe that to learn is to generalize, and 2) causality doesn't exist - it's a flawed way to simplify a complicated phenomenon that is a function of many variables. I'll talk about this in a future post with an example of a spring-mass-damper system.