From @helios.physics.utoronto.ca:LISS@FNALD.FNAL.GOV  Mon Oct  4 17:08:58 1993
Received: from helios.physics.utoronto.ca by cepheid.physics.utoronto.ca with SMTP id AA25868; Mon, 4 Oct 93 17:08:58 -0400
Received: from FNALJ.FNAL.GOV ([131.225.108.4]) by helios.physics.utoronto.ca with SMTP id <1709>; Mon, 4 Oct 1993 17:08:54 -0400
Date: 	Mon, 4 Oct 1993 17:08:40 -0400
From: LISS@FNALD.FNAL.GOV
To: pekka@physics.utoronto.ca
Message-Id: <931004160840.20a0ad1b@FNALD.FNAL.GOV>
Subject: Oh god...
Status: R

From:	SMTP%"BRANDENBURG@HUHEPL.HARVARD.EDU"  4-OCT-1993 15:24:53.02
To:	LISS
CC:	
Subj:	Found my marbles - PLEASE READ THIS!

Date:    Mon, 4 Oct 1993 16:23:42 -0400 (EDT)
From:    BRANDENBURG@HUHEPL.HARVARD.EDU (George - (617)495-2824)
Message-Id: <931004162342.202004d3@HUHEPL.HARVARD.EDU>
Subject: Found my marbles - PLEASE READ THIS!
To:      liss@fnald.fnal.gov
X-Vmsmail-To: FNALD::ALVIN,HUTH,LISS,FNALD::CLAUDIOC

   After thinking things over I decided that the best way to estimate the
significance of my observation is to make use of all the information I have.
So far what I have figured out is that the probability that an observation of
12 marbles can be explained by the expected 5 "normal" marbles is only 0.6%.
Although this is a small number it is still above the three standard deviation
line (0.3%), so I can hardly claim to have highly significant evidence for the
existence of "special" marbles. But I want to take the colors I have observed
into account. Namely, I observed 2 red, 3 blue, 4 green and 3 blue-green
marbles, and my best estimate of what would be expected from normal marbles
is 0.6 red, 1.4 blue, 2.5 green, and 0.5 blue-green. Counting each blue-green
marble as two color occurrences, this totals 15 observed color occurrences
against an expected 5.5.
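
   For anyone who wants to cross-check the counting-only numbers, here is a
minimal Python sketch (not the program used for this note). It assumes a
plain Poisson upper tail, P(N >= n_obs), and the helper name poisson_tail is
made up for illustration. For 12 marbles against 5 expected it should come
out at roughly half a percent, close to the figure quoted above; the second
call applies the same tail to the color-occurrence totals (15 against 5.5).

```python
import math


def poisson_tail(n_obs, mu):
    """Return P(N >= n_obs) for a Poisson distribution with mean mu."""
    # Sum the probabilities of 0 .. n_obs-1 in log space, then take the complement.
    below = sum(
        math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        for k in range(n_obs)
    )
    return 1.0 - below


# Marble counting only: 12 observed against 5.0 expected.
print(f"P(N >= 12 | mu = 5.0) = {poisson_tail(12, 5.0):.3%}")

# Color occurrences: 15 observed against 5.5 expected.
print(f"P(N >= 15 | mu = 5.5) = {poisson_tail(15, 5.5):.3%}")
```
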
   I assume that each of the four categories can be treated as being
independent, so I construct a likelihood for my observation which is just
the product of the four Poisson probabilities, P(2;0.6)*P(3;1.4)*P(4;2.5)*
P(3;0.5). I then define a Confidence Level as the probability that, if I
were to do the experiment again, I would get a likelihood less than or equal
to the one I observed. This is similar to constructing a chisq, but it
correctly deals with the statistics of small numbers. Calculating using a
FORTRAN program (included below) I find that CL = 0.45%. This is less than
the 0.6% CL obtained by ignoring the colors and only counting marbles, but
still more than the three sigma level of 0.3%. It is also considerably more
than the CL of 0.06% I erroneously obtained by applying Poisson statistics
to color occurrences.
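
   The likelihood-ordering CL described above can also be sketched in a few
lines of Python. This is an illustrative port under stated assumptions, not
the FORTRAN program itself: the function names are made up, each category is
summed from 0 to nmax = 32 counts without the early tail cut, and everything
is done in double precision. With the observed and expected numbers above it
should land close to the quoted CL, with the total probability close to 1.

```python
import math
from itertools import product


def poisson_pmf(k, mu):
    """Poisson probability of observing k counts when mu are expected."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def likelihood_cl(observed, expected, nmax=32):
    """Sum the probability of every outcome no more likely than the observed one.

    Returns (cl, total), where total is the summed probability of all outcomes
    visited and serves as a check that the 0..nmax range is wide enough.
    """
    # One lookup table of Poisson probabilities per independent category.
    tables = [[poisson_pmf(k, mu) for k in range(nmax + 1)] for mu in expected]

    # Likelihood of the observed configuration (product over categories).
    p_obs = 1.0
    for table, k in zip(tables, observed):
        p_obs *= table[k]

    cl = 0.0
    total = 0.0
    for counts in product(range(nmax + 1), repeat=len(expected)):
        p = 1.0
        for table, k in zip(tables, counts):
            p *= table[k]
        if p <= p_obs:  # "equal or smaller likelihood" outcomes
            cl += p
        total += p
    return cl, total


# Red, blue, green, blue-green: observed counts and expected normal marbles.
cl, total = likelihood_cl((2, 3, 4, 3), (0.6, 1.4, 2.5, 0.5))
print(f"CL = {cl:.5f}  Tot Prob = {total:.5f}")
```

   The brute-force sum visits about 1.2 million outcomes, which takes a
second or two in Python; the tail cut in the FORTRAN version exists only to
skip outcomes whose probability is negligible.
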
   Having used all the information at hand I regretfully conclude that
my observation is strongly suggestive of the existence of special marbles, 
but that a CL of 0.45% for the normal marble hypothesis is still too large 
for me to conclude that I have in fact discovered special marbles.

   Appendix: Sorry for the metaphor, but I sometimes find it easier to
think about tangible objects. I have used the event numbers as presented
in the three drafts with one exception, namely the expectation that the
background has 0.5 double tag events (blue-green marbles). In the absence
of other information zero seemed too small and one seemed too big so I
tried 0.5. It is trivial to redo once that number has been estimated.
Otherwise I think this is as good as it gets until you start adding 
event properties (besides tags) into the mix. Below is the program used
to calculate the CL for four independent statistical samples.
					George

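c	Expected (xn) and observed (kn) counts for the four categories:
c	red, blue, green, blue-green.
	real xn1/0.6/,xn2/1.4/,xn3/2.5/,xn4/0.5/
	integer kn1/2/,kn2/3/,kn3/4/,kn4/3/
c	sums = probability of outcomes no more likely than observed (CL)
c	sumt = total probability summed (should come out close to 1)
	real*8 sums/0./,sumt/0./
c	pcut = log-probability below which a category's tail is dropped
	real factn(0:32),pcut/-40./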
	factn(0)=1.
	do n=1,32
	factn(n)=n*factn(n-1)
	enddo
c	Likelihood of the observed configuration
	probs=exp(kn1*alog(xn1)-xn1-alog(factn(kn1))
     >	         +kn2*alog(xn2)-xn2-alog(factn(kn2))
     >	         +kn3*alog(xn3)-xn3-alog(factn(kn3))
     >	         +kn4*alog(xn4)-xn4-alog(factn(kn4)))
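c	Loop over all outcomes; once a category is past its mean and its
c	log-probability is below pcut, the rest of that loop is skipped
	do 40 i1=0,32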
	probl1=i1*alog(xn1)-xn1-alog(factn(i1))
	if (i1.gt.ifix(xn1).and.probl1.lt.pcut) go to 50
	do 30 i2=0,32
	probl2=i2*alog(xn2)-xn2-alog(factn(i2))
	if (i2.gt.ifix(xn2).and.probl2.lt.pcut) go to 40
	do 20 i3=0,32
	probl3=i3*alog(xn3)-xn3-alog(factn(i3))
	if (i3.gt.ifix(xn3).and.probl3.lt.pcut) go to 30
	do 10 i4=0,32
	probl4=i4*alog(xn4)-xn4-alog(factn(i4))
	if (i4.gt.ifix(xn4).and.probl4.lt.pcut) go to 20
	prob=exp(probl1+probl2+probl3+probl4)
	if (prob.le.probs) sums=sums+prob
	sumt=sumt+prob
   10	continue
   20	continue
   30	continue
   40   continue
   50	print 90,xn1,xn2,xn3,xn4,kn1,kn2,kn3,kn4,i1,i2,i3,i4,sums,sumt
   	stop 
   90	format(' Expected N = ',4f5.1/' Observed N =',4i5/
     >         ' Maximum N  =',4i5/' CL =',f8.5,'  Tot Prob =',f8.5)
	end