From @helios.physics.utoronto.ca:LISS@FNALD.FNAL.GOV Mon Oct 4 17:08:58 1993
Received: from helios.physics.utoronto.ca by cepheid.physics.utoronto.ca with SMTP id AA25868; Mon, 4 Oct 93 17:08:58 -0400
Received: from FNALJ.FNAL.GOV ([131.225.108.4]) by helios.physics.utoronto.ca with SMTP id <1709>; Mon, 4 Oct 1993 17:08:54 -0400
Date: Mon, 4 Oct 1993 17:08:40 -0400
From: LISS@FNALD.FNAL.GOV
To: pekka@physics.utoronto.ca
Message-Id: <931004160840.20a0ad1b@FNALD.FNAL.GOV>
Subject: Oh god...
Status: R

From: SMTP%"BRANDENBURG@HUHEPL.HARVARD.EDU" 4-OCT-1993 15:24:53.02
To: LISS
CC:
Subj: Found my marbles - PLEASE READ THIS!

Date: Mon, 4 Oct 1993 16:23:42 -0400 (EDT)
From: BRANDENBURG@HUHEPL.HARVARD.EDU (George - (617)495-2824)
Message-Id: <931004162342.202004d3@HUHEPL.HARVARD.EDU>
Subject: Found my marbles - PLEASE READ THIS!
To: liss@fnald.fnal.gov
X-Vmsmail-To: FNALD::ALVIN,HUTH,LISS,FNALD::CLAUDIOC

After thinking things over I decided that the best way to estimate the significance of my observation is to make use of all the information I have.

So far what I have figured out is that the probability of observing 12 or more marbles when only 5 "normal" marbles are expected is 0.6%. Although this is a small number it is still above the three standard deviation line (0.3%), so I can hardly claim to have highly significant evidence for the existence of "special" marbles.

But I want to take the colors I have observed into account. Namely, I observed 2 red, 3 blue, 4 green and 3 blue-green marbles, and my best estimate of what would be expected from normal marbles is 0.6 red, 1.4 blue, 2.5 green, and 0.5 blue-green. Counting each blue-green marble once as blue and once as green, this totals 15 observed color occurrences against an expected 5.5.

I assume that each of the four categories can be treated as independent, so I construct a likelihood for my observation which is just the product of the four Poisson probabilities, P(2;0.6)*P(3;1.4)*P(4;2.5)*P(3;0.5).
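
As a cross-check of the arithmetic above, here is a minimal Python sketch. It is not part of the original message, and the names `poisson_pmf`, `poisson_tail` and `categories` are illustrative assumptions. It evaluates the counting-only probability of 12 or more marbles when 5 are expected, and the product of the four Poisson probabilities.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability P(k; mu) of observing k counts when mu are expected."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def poisson_tail(k_min, mu):
    """Probability of observing k_min or more counts when mu are expected."""
    return 1.0 - sum(poisson_pmf(k, mu) for k in range(k_min))


# Numbers quoted in the message: (observed, expected) per color category.
categories = {
    "red": (2, 0.6),
    "blue": (3, 1.4),
    "green": (4, 2.5),
    "blue-green": (3, 0.5),
}

n_observed = sum(k for k, _ in categories.values())    # 12 marbles
n_expected = sum(mu for _, mu in categories.values())  # 5.0 marbles

# Counting-only significance: chance of 12 or more when 5 are expected.
p_count_only = poisson_tail(n_observed, n_expected)

# Joint likelihood of the observed colors, treating categories as independent.
likelihood = math.prod(poisson_pmf(k, mu) for k, mu in categories.values())

print(f"P(N >= {n_observed}; {n_expected:.1f}) = {p_count_only:.4f}")
print(f"likelihood of observed colors = {likelihood:.3e}")
```

Under these assumptions the counting-only probability comes out between 0.5% and 0.6%, in line with the 0.6% quoted above. The likelihood value on its own is not a significance; it only becomes one when compared with the likelihoods of all other possible outcomes.
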

I then define a Confidence Level as the probability that, if I were to do the experiment again, I would get a worse or equal likelihood. This is similar to constructing a chisq, but it correctly deals with the statistics of small numbers. Calculating with a FORTRAN program (included below) I find that CL = 0.45%. This is less than the 0.6% CL obtained by ignoring the colors and only counting marbles, but still more than the three sigma level of 0.3%. It is also considerably more than the CL of 0.06% I erroneously obtained by applying Poisson statistics to color occurrences.

Having used all the information at hand, I regretfully conclude that my observation is strongly suggestive of the existence of special marbles, but that a CL of 0.45% for the normal marble hypothesis is still too large for me to conclude that I have in fact discovered special marbles.

Appendix:

Sorry for the metaphor, but I sometimes find it easier to think about tangible objects. I have used the event numbers as presented in the three drafts with one exception, namely the expectation that the background has 0.5 double tag events (blue-green marbles). In the absence of other information zero seemed too small and one seemed too big, so I tried 0.5. It is trivial to redo once that number has been estimated. Otherwise I think this is as good as it gets until you start adding event properties (besides tags) into the mix.

Below is the program used to calculate the CL for four independent statistical samples.

George

c     Expected (xn) and observed (kn) counts for the four categories
      real xn1/0.6/,xn2/1.4/,xn3/2.5/,xn4/0.5/
      integer kn1/2/,kn2/3/,kn3/4/,kn4/3/
      real*8 sums/0./,sumt/0./
c     pcut: log-probability below which a category's tail is dropped
      real factn(0:32),pcut/-40./
c     Table of factorials 0! .. 32!
      factn(0)=1.
      do n=1,32
        factn(n)=n*factn(n-1)
      enddo
c     Likelihood of the observed outcome (product of four Poissons)
      probs=exp(kn1*alog(xn1)-xn1-alog(factn(kn1))
     >         +kn2*alog(xn2)-xn2-alog(factn(kn2))
     >         +kn3*alog(xn3)-xn3-alog(factn(kn3))
     >         +kn4*alog(xn4)-xn4-alog(factn(kn4)))
c     Loop over all outcomes; each loop exits once it is past the
c     mean and its log-probability has fallen below pcut
      do 40 i1=0,32
        probl1=i1*alog(xn1)-xn1-alog(factn(i1))
        if (i1.gt.ifix(xn1).and.probl1.lt.pcut) go to 50
        do 30 i2=0,32
          probl2=i2*alog(xn2)-xn2-alog(factn(i2))
          if (i2.gt.ifix(xn2).and.probl2.lt.pcut) go to 40
          do 20 i3=0,32
            probl3=i3*alog(xn3)-xn3-alog(factn(i3))
            if (i3.gt.ifix(xn3).and.probl3.lt.pcut) go to 30
            do 10 i4=0,32
              probl4=i4*alog(xn4)-xn4-alog(factn(i4))
              if (i4.gt.ifix(xn4).and.probl4.lt.pcut) go to 20
              prob=exp(probl1+probl2+probl3+probl4)
c             sums: outcomes no more likely than the observed one (CL)
c             sumt: all outcomes visited (should come out close to 1)
              if (prob.le.probs) sums=sums+prob
              sumt=sumt+prob
   10       continue
   20     continue
   30   continue
   40 continue
   50 print 90,xn1,xn2,xn3,xn4,kn1,kn2,kn3,kn4,i1,i2,i3,i4,sums,sumt
      stop
   90 format(' Expected N = ',4f5.1/' Observed N =',4i5/
     >  ' Maximum N =',4i5/' CL =',f8.5,' Tot Prob =',f8.5)
      end
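
For readers without a FORTRAN compiler, the same calculation can be sketched in Python. This is an illustrative re-implementation under stated assumptions, not the program that produced the 0.45% quoted above. It works in double precision, drops the `pcut` early exit, and simply sums over every count from 0 to `n_max` in each category. The function name `confidence_level` and the parameter `n_max` are assumptions made for this sketch.

```python
import itertools
import math


def poisson_pmf(k, mu):
    """Poisson probability P(k; mu) of observing k counts when mu are expected."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def confidence_level(observed, expected, n_max=32):
    """Return (CL, total probability) for independent Poisson categories.

    CL is the summed probability of every outcome whose joint likelihood
    is less than or equal to that of the observed outcome.
    """
    # Tabulate each category's probabilities once, for counts 0..n_max.
    tables = [[poisson_pmf(n, mu) for n in range(n_max + 1)] for mu in expected]

    # Likelihood of the observed outcome, multiplied in the same order as in
    # the loop below so that the observed outcome compares equal to itself.
    p_obs = 1.0
    for table, k in zip(tables, observed):
        p_obs *= table[k]

    cl = 0.0
    total = 0.0
    # Every combination of per-category probabilities is one possible outcome.
    for outcome in itertools.product(*tables):
        p = 1.0
        for q in outcome:
            p *= q
        if p <= p_obs:
            cl += p
        total += p
    return cl, total


# Observed and expected counts from the message (red, blue, green, blue-green).
cl, total = confidence_level([2, 3, 4, 3], [0.6, 1.4, 2.5, 0.5])
print(f"CL = {cl:.5f}  Tot Prob = {total:.5f}")
```

The total probability is returned as a check that the truncation at `n_max` loses nothing significant; it should be very close to 1. With the default `n_max` the four-category case visits about 1.2 million outcomes, so the run takes on the order of a second.
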