Information Theory Workshop - January 2011

Short Course on Model Selection and Multimodel Inference

Dates: Wednesday and Thursday, January 12th & 13th, 2011

Time: 8:00 am – 5:00 pm

Location: The College of William and Mary, Mason School of Business, Williamsburg, VA 23187

Organizers: Romuald Lipcius and Gina Ralph

Goals/Objectives: This short course is an overview of information-theoretic methods for model selection and multimodel inference and of their underlying philosophy. Several examples will demonstrate the application of these methods. A hands-on session will take place on Thursday.


Overview:
A substantial paradigm shift is occurring in science and application. The past century relied on null hypothesis testing, asymptotic distributions of the test statistic, P-values, and an arbitrary ruling concerning “significant” or “not significant.” Under this analysis paradigm a test statistic (T) is computed from the data. The P-value is the focus of the analysis and is defined as Prob{T or more extreme, given the null hypothesis}. With this definition in mind, we can abbreviate it slightly to Prob{X|H0}, where it is understood that X is the data or more extreme (unobserved) data. This is a so-called “tail probability.”
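As a minimal sketch of the tail probability Prob{X|H0} described above, the following computes an exact one-sided P-value for a binomial test statistic. The function name tail_probability and the data (8 successes in 10 trials, null value 0.5) are hypothetical and chosen only for illustration.

```python
from math import comb


def tail_probability(n: int, t_obs: int, p0: float) -> float:
    """Upper-tail P-value: Prob{T >= t_obs | H0: success probability is p0}."""
    # Sum the binomial probabilities of the observed count and of every
    # more extreme (unobserved) count.
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(t_obs, n + 1))


# Hypothetical data: 8 successes in 10 trials, null hypothesis p0 = 0.5.
p_value = tail_probability(n=10, t_obs=8, p0=0.5)
print(f"P-value = {p_value:.4f}")  # prints "P-value = 0.0547"
```

Note that the result is a statement about the data given H0, not about H0 given the data.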

The null hypothesis (H0) takes center stage but is often trivial or even silly. The alternative hypothesis (HA) is not the subject of the test; support for the alternative occurs only if the P-value (for the null hypothesis) is low (often < 0.05). Support for the alternative hypothesis thus comes only by default.

The proper interpretation of the P-value is quite strained; this might explain why so many people erroneously pretend it means something quite different (i.e., the probability that the null hypothesis is true). This is not what is meant by a P-value.

These traditional methods are being replaced by “information-theoretic” methods (and, to a lesser extent at least at this time, by a variety of Bayesian methods). They are termed “information-theoretic” because they are based on Kullback-Leibler information theory. These approaches focus on an a priori set of plausible science hypotheses, H1, H2, …, HR. Evidence for or against members of this set of “multiple working hypotheses” consists of a set of probabilities, specifically Prob{H1, H2, …, HR, given the data}, or Prob{Hj|X}. These probabilities are direct evidence, where evidence = information = -entropy.
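One common way to obtain such model probabilities in the Burnham and Anderson framework is through Akaike weights, computed from differences in AIC among the R models. The sketch below assumes that approach; the function name akaike_weights and the three AIC values are made up for illustration and do not come from the course material.

```python
from math import exp


def akaike_weights(aic_values):
    """Convert AIC values into model probabilities (Akaike weights)."""
    best = min(aic_values)
    # Relative likelihood of each model given the data: exp(-delta_j / 2),
    # where delta_j is the AIC difference from the best model.
    rel_lik = [exp(-(aic - best) / 2.0) for aic in aic_values]
    total = sum(rel_lik)
    # Normalize so the weights sum to 1 over the a priori model set.
    return [r / total for r in rel_lik]


# Made-up AIC values for three hypotheses H1, H2, H3.
weights = akaike_weights([100.0, 102.0, 110.0])
for j, w in enumerate(weights, start=1):
    print(f"Prob{{H{j}|X}} is approximately {w:.3f}")
```

The weights are conditional on the model set: adding or dropping a hypothesis changes every probability.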

Simple evidence ratios measure the strength of the evidence for one hypothesis relative to another. Note the radical difference in the probability statements (above) stemming from either a P-value or the probability of hypothesis j. Statistical inference should be about models and parameters, conditional on the data; however, P-values are probability statements about the data, conditional on the null hypothesis.
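An evidence ratio for hypotheses i and j is the ratio Prob{Hi|X} / Prob{Hj|X}. Assuming the model probabilities are Akaike weights, the normalizing constant cancels and the ratio depends only on the AIC difference. The sketch below uses a hypothetical function name and invented AIC values.

```python
from math import exp


def evidence_ratio(aic_i: float, aic_j: float) -> float:
    """Evidence for model i relative to model j, assuming Akaike weights."""
    # w_i / w_j = exp(-delta_i / 2) / exp(-delta_j / 2) = exp((AIC_j - AIC_i) / 2)
    return exp((aic_j - aic_i) / 2.0)


# Invented values: model i has an AIC two units lower than model j.
ratio = evidence_ratio(100.0, 102.0)
print(f"Evidence ratio = {ratio:.2f}")  # prints "Evidence ratio = 2.72"
```

Here the data support model i about 2.7 times as strongly as model j, which is a statement about hypotheses rather than a tail probability.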

These new approaches allow statistical inference to be based on all (or some) of the models in the a priori set (multimodel inference), and this is useful in prediction as well as in getting robust estimates of parameters of particular interest. Alternative science hypotheses take center stage in these approaches and will require much more attention than in the past century (where one started with an alternative and the null was merely the nothing/naïve position; thus little science thinking was called for).
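A minimal sketch of multimodel inference is a model-averaged estimate: each model's estimate of the same parameter is weighted by that model's probability. The example assumes Akaike weights as the model probabilities; the function name, the AIC values, and the survival estimates are all hypothetical.

```python
from math import exp


def model_averaged_estimate(aic_values, estimates):
    """Weight each model's estimate of a shared parameter by its Akaike weight."""
    best = min(aic_values)
    rel_lik = [exp(-(aic - best) / 2.0) for aic in aic_values]
    total = sum(rel_lik)
    weights = [r / total for r in rel_lik]
    # Weighted average over every model in the a priori set.
    return sum(w * est for w, est in zip(weights, estimates))


# Hypothetical inputs: three models, each estimating the same survival rate.
aic = [100.0, 102.0, 110.0]
survival = [0.80, 0.74, 0.90]
averaged = model_averaged_estimate(aic, survival)
print(f"Model-averaged survival = {averaged:.3f}")
```

The average is dominated by the best-supported models, so a poorly supported model (the third one here) contributes very little. This only makes sense when the parameter has the same meaning in every model.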

The set of science hypotheses “evolves” through time as implausible hypotheses are eventually dropped from consideration, new hypotheses are added, and existing hypotheses are further refined. Rapid progress in the theoretical or applied sciences can be realized as this set evolves, based on careful inferences from new data. This is an exciting time to be in science and biostatistics. There are important philosophies involved here; these approaches go well beyond methods for “data analysis.”


Schedule of Events: TBA

This overview course is based on the reference book,
Burnham, K. P., and D. R. Anderson. 2002. Model selection and multimodel inference: a practical information-theoretic approach. 2nd Ed., Springer-Verlag, New York, NY. 488 pp.
and on the recent textbook (supplied as part of the registration fee),
Anderson, D. R. 2008. Model based inference in the life sciences: a primer on evidence. Springer, New York, NY. 184 pp.

Cost: TBA

To register: Please complete the online Information Theory Workshop Registration Form.

The deadline for registration is December 22, 2010.

If you have any questions about registration please contact Gina Ralph at ginaralph@vims.edu


Speaker: David Anderson, Ph.D., is the President and Chief Executive Officer of Applied Information Company (AIC), a small company based in Fort Collins, Colorado. David Anderson spent most of his professional life as a research scientist with the U.S. Department of the Interior. He holds a Ph.D. in Theoretical Ecology from the University of Maryland and has worked in a wide variety of quantitative areas in the biological sciences. He has worked intensively on model selection and related subjects since 1990, beginning with joint work with Drs. Jean-Dominique Lebreton and Jean Clobert (France) and Kenneth Burnham (USA). During this time he published 18 journal papers and two editions of the Springer-Verlag book on model selection, multimodel inference, and closely related topics. Much of this work has been done in close collaboration with Dr. Kenneth P. Burnham.
Dr. Anderson has published 15 books and research monographs; 99 papers in peer-reviewed national/international scientific journals; 45 book chapters, government scientific report series, and conference proceedings and transactions; and 15 technical reports in ecology and other life sciences and statistical science. He was also a Senior Scientist with the U.S. Geological Survey. Additionally, David Anderson is a retired Unit Leader of the Colorado Cooperative Fish and Wildlife Research Unit and the Department of Fishery and Wildlife Biology, where he holds a professorship.