sfrthomas@yahoo.com (S. F. Thomas) writes:
>> my intuition is led quickly to consider it a likelihood, making
>> each expert's opinion very like an uncertain measurement, if you
>> are thinking in Bayesian terms. In other words, exactly Thomas's
>> "semantic likelihood".
>>
>> Except, I don't understand his problem with "the vain Bayesian attempt
>> to treat likelihood as though it were probability".
>
> The Bayesian inferential procedure is conceptually
>
> Posterior = Likelihood * Prior
>
> Take away the prior

wahh! luckily not necessary for what you go on to say:

> Etc. It is powerful, heady stuff. But I maintain it is ultimately
> vain, because the frequency notion of probability contained in the
> likelihood is different from the belief notion of probability
> contained in the prior,

Well, is it necessary to interpret the likelihood as any kind of "frequency notion" of probability? For serious Bayesians, probabilities are degrees of belief---or perhaps in this context one might better say: degrees of surprisal---, the axioms are normative, and everything is conditional. So to understand a formula like

  p(Wilco says X is tall | X is 127cm from toe to nose) = 0.003

as saying "I would be surprised to a degree 5.809 (that is, -ln 0.003) if Wilco said X was tall when he believed s/he was actually 127cm etc." is not merely acceptable to a Bayesian, it's actually the basic interpretation. The connection which can in many cases be made between belief/surprisal numbers and frequency under repeated experimentation comes later and is not at all what Bayesian probabilities are constructed to mean.

> and the belief notion in the prior is not sufficient to elevate the
> point function that is likelihood to the set function that is the
> posterior.

I don't follow here ... can you expand?

> A lot of people are just not buying it, especially in "scientific"
> inference where belief priors are actively kept to one side, rather
> than incorporated into the analysis.

Um, large and growing numbers of people are buying it, even in industry ...

> What the data say is captured entirely in the likelihood, given the
> probability model that is hypothesized.

Fine, if you have a 1- or 2-dimensional problem, why not plot the likelihood for people and let them draw their own conclusions (i.e. apply Bayes rule for themselves)?

> And the real challenge is to develop a likelihood calculus that is
> as easy of manipulation as the probability calculus, Bayesian or
> otherwise. This was the challenge posed by Fisher many years
> ago.

If the question is "What is it that observations of the outcomes of imperfectly predictable processes in the world tell us, and how can we represent it scientifically?" then Bayes gives a very powerful answer. The posterior = likelihood * prior formula is really a great insight here. The answer is: what observations do is modify our beliefs, in ways captured quantitatively by the probability calculus. Of course the theory is an idealisation, but only in the sense that logic is an idealisation---both are very useful for practical work.

That question seems to me to subsume any other question one might want a theory of likelihoods to answer?

> even though Zadeh himself seemed to get stuck with a maximization
> method of disjunction (evaluation of composite hypotheses) which
> encounters the same difficulties observed many years earlier in the
> different problem domain of statistical inference, and for the same
> reason: putting the value of the set as that of its strongest
> member.

That's certainly a problem (it always seemed rather ad hoc to me, and product-sum is clearly a better bet), but isn't there a deeper problem in the blindness of FST towards the full complexity of disjunctive reasoning in the presence of general conditionality relationships?

>>> There is plenty new semantics in the fuzzy set theory
>>> of which probabilists have been blissfully unaware, and which in
>>> fact helps to illuminate some problems in the foundations at
>>> least of statistical inference theory.
>>
>> Specifically?
>
> One. The notion that data are fuzzy in general. To say that "John is
> tall" is a height measurement, no different in principle from "John
> is 1.92 m". Only the degree of fuzziness is different. [...] If you
> relax the notion of point measurement from which theories of
> statistical inference (classical as well as Bayesian) proceed,

Hmm, Bayes has no problem at all with uncertain measurements. In fact for practical engineering applications, fuzzy measurements (in the colloquial sense!) are precisely where Bayes really shines. Think of Kalman filters, information/communication theory, ...

The Bayesian account "proceeds" from the _concept_ of point measurements because it is built on top of the scientific language of numerical quantities---the whole aim is to explain how to do reasoning about an uncertain reality in terms of our continuum-based, idealised physical theories. It explicitly doesn't force you to make all your _measurements_ point measurements, or collapse your beliefs into point beliefs, or anything.

Why isn't there a very close analogy between the Bayesian (or naively statistical and actually Bayesian) models used in engineering for accounting for sensor measurement uncertainty, and the obvious Bayesian account, in terms of speakers' utterance dispositions, of what "John is tall" means? If the two cases are no different in principle, and you disagree with the latter, then do you also disagree with the former?

To be concrete, what I mean here is models of the form

  p(widget height | sensor measurement, previous hypothesis)
    \propto p(sensor measurement | widget height)
            \times p(widget height | previous hypothesis)

and analogously

  p(Mary's height in microns | John's tallness assertion, my previous hypothesis)
    \propto p(John's tallness assertion | Mary's height in microns)
            \times p(Mary's height in microns | my previous hypothesis)

> one is led to a different set of semantics where, for one thing, the
> partial ordering axiom of subjective probability may be relaxed, and

Right. I can see how that could lead to something different from Bayes, because Bayes (afaic) proceeds initially by saying "_suppose_ we do uncertain reasoning with ordered real numbers, what do the axioms have to be?" But I don't yet agree that there is an unmet need here. For me Bayes over numerical probabilities works really well for the problems we have been discussing.

> likewise the indifference curves of utility may be
> *thick*. Estimates of probability may now also be fuzzy, as can
> expressions of preference. Think of what that would mean for
> eliciting probability judgments from human judges who have
> difficulty putting on the Bayesian strait-jacket of "coherence".

I think at some level there has to be a strait-jacket; any theory of rationality is bound to impose a kind of coherence which is "normative" in the philosophical sense. We can only figure out someone's beliefs (even in principle) to the extent that we take them to be a rational agent. If they're "free" from the "constraints" of rationality then it's arguably not clear that they have beliefs at all ...

> A prior expression of "belief" may be expressed for example in a
> simple statement such as "most Swedes are tall", which imposes a
> (computable, actually) fuzzy restriction on the height distribution
> of Swedes. This notion of quantification was a master-stroke by
> Zadeh, IMO.

OK, I have to show my ignorance here. Why can't "most Swedes are tall" be represented perfectly plausibly by any of a whole variety of Bayesian belief/expectation/probability distributions? Over the height distributions of Swedes, or over the heights of each new person I meet given the hypothesis they are Swedish, or (facetiously) over the output of my "height of person in visual field" neurons---whatever.

I can see that one is faced with the question of _which_ distribution to pick (or which possible distributions to integrate over, etc.), but that surely must be a feature of any semantic account of that sentence: it must at least be a question for psycho-linguistic research, or even to be decided on a speaker-by-speaker basis by a hierarchical model. I don't see a problem of principle here?

> Two. The notion that uncertainty about probability models and model
> parameters is essentially of the fuzzy sort. The data from a sample
> are like a fuzzy term, which gets progressively narrower in its range
> of uncertainty the greater the sample size.

Again, this sounds like a description of Bayesian probability to me :).

> Fisher so very long ago. Ultimately, it allows us to work entirely
> with likelihood, without the inordinate effort that must go into
> Bayesian analysis to develop conjugate priors, improper priors,

I think I mentioned above that for me, the "prior" is a key insight into what it is that observations tell us. As part of a theory of rationality I think it plays an extremely important role; no other account of what rational inference from evidence _is_ has been so complete or so fruitful, afaics.

But you don't need conjugate or improper priors for the theory. In practice there may often be computational reasons why conjugate priors are handy. It's not clear to me how often improper priors are really needed; at least the putative examples I have seen in real-life situations arose from requirements like "we must take care that we impose absolutely no prior beliefs on the mean of this plant species' growth rate", which is clearly ridiculous ...

If you can show examples in which improper, or otherwise "troublesome", priors arise in contexts other than a misguided attempt to construct a "totally noninformative" prior without stopping to think what "totally" means, I would be interested to hear. I reserve the right to say "OK, that's a hard case, but as a theory of what happens in the ideal when an agent reads a voltmeter or hears someone say "X is tall", the prior * likelihood thing is the only game in town so you will have to show a better theory or learn to live with it!" :)

I don't see how you can provide a full story about rationality---with or without fuzzy logic---based only on likelihoods; the likelihoods have to engage with a prior at some point, or at least stack up on top of an "unknown" prior.

> There is more besides. But it's 1:00 am, and like Dodier, I need to go
> to sleep.

Thank you for your long, readable and thought-provoking posting!

Best wishes,
William
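
P.S. To make the sensor/utterance analogy above a little more concrete, here is a minimal sketch in Python of the two updates, done by brute force on a grid of heights. Everything numerical in it is an assumption made up for illustration: the Gaussian prior (mean 170cm, sd 10cm), the Gaussian sensor noise (sd 3cm), and especially the logistic curve standing in for p(speaker says "tall" | height). The function names are hypothetical too. It is only meant to show that the same posterior \propto likelihood \times prior step handles both cases, not to claim that these are the right models.

```python
import math

def gaussian(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior(grid, prior, likelihood):
    """Bayes' rule on a grid: multiply prior by likelihood, then normalise."""
    unnorm = [likelihood(h) * p for h, p in zip(grid, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def mean_and_sd(grid, probs):
    """Mean and standard deviation of a discrete distribution on the grid."""
    m = sum(h * p for h, p in zip(grid, probs))
    v = sum((h - m) ** 2 * p for h, p in zip(grid, probs))
    return m, math.sqrt(v)

# Candidate heights in cm (assumed range and resolution).
grid = list(range(120, 221))

# Assumed prior belief about the height: roughly N(170, 10), discretised.
raw = [gaussian(h, 170.0, 10.0) for h in grid]
prior = [r / sum(raw) for r in raw]

# Case 1: a sensor reads 181cm, with assumed Gaussian noise of sd 3cm.
# Likelihood: p(sensor measurement | height).
sensor_post = posterior(grid, prior, lambda h: gaussian(181.0, h, 3.0))

# Case 2: a speaker says "tall".
# Likelihood: p(says "tall" | height), here an assumed logistic curve
# centred on 180cm; a real one would have to come from data on speakers.
def says_tall(h):
    return 1.0 / (1.0 + math.exp(-(h - 180.0) / 5.0))

tall_post = posterior(grid, prior, says_tall)

for name, dist in [("prior", prior), ("after sensor", sensor_post), ("after 'tall'", tall_post)]:
    m, s = mean_and_sd(grid, dist)
    print(f"{name:>13}: mean {m:.1f} cm, sd {s:.1f} cm")
```

The only thing that differs between the two cases is the likelihood function handed to posterior(); the utterance is just a much blunter instrument than the sensor.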