Briefly compare and contrast the Null Hypothesis versus the Alternative Hypothesis.
Regardless of the method used, the P-value derived from a test for differences between proportions answers the following question: what is the probability that the two experimental samples were derived from the same population? Put another way, the null hypothesis states that both samples are derived from a single population and that any differences between the sample proportions are due to chance sampling. Much like statistical tests for differences between means, proportions tests can be one- or two-tailed, depending on the nature of the question. For the purposes of most experiments in basic research, however, two-tailed tests are more conservative and tend to be the norm. In addition, analogous to tests with means, one can compare an experimentally derived proportion against a historically accepted standard, although this is rarely done in our field and comes with the possible caveats discussed earlier. Finally, some software programs will report a 95% CI for the difference between two proportions. In cases where no statistically significant difference is present, the 95% CI for the difference will always include zero.
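As a rough illustration of the two-tailed two-proportion test described above, here is a minimal sketch in Python; the counts are invented for illustration, and the normal-approximation formulas assume reasonably large samples:

```python
from math import sqrt
from scipy.stats import norm

# Hypothetical counts: 60/100 "successes" in sample 1, 45/100 in sample 2.
x1, n1 = 60, 100
x2, n2 = 45, 100
p1, p2 = x1 / n1, x2 / n2

# Pooled proportion under the null hypothesis (both samples from one population).
p_pool = (x1 + x2) / (n1 + n2)
se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se_pool
p_two_tailed = 2 * norm.sf(abs(z))  # two-tailed P-value

# 95% CI for the difference uses the unpooled standard error.
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lo = (p1 - p2) - 1.959964 * se_diff
hi = (p1 - p2) + 1.959964 * se_diff

print(f"z = {z:.3f}, two-tailed P = {p_two_tailed:.4f}")
print(f"95% CI for difference: ({lo:.3f}, {hi:.3f})")
```

Note that because the test uses the pooled standard error while the CI uses the unpooled one, the correspondence between "P < 0.05" and "CI excludes zero" is close but not exact.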
Entire books are devoted to the statistical method known as analysis of variance (ANOVA). This section will contain only three paragraphs. This is in part because of the view of some statisticians that ANOVA techniques are somewhat dated, or at least redundant with other methods. In addition, a casual perusal of the worm literature will uncover relatively scant use of this method. Traditionally, an ANOVA answers the following question: are any of the mean values within a dataset likely to be derived from populations that are truly different? Correspondingly, the null hypothesis for an ANOVA is that all of the samples are derived from populations whose means are identical and that any differences in their means are due to chance sampling. Thus, an ANOVA will implicitly compare all possible pairwise combinations of samples to each other in its search for differences. Notably, in the case of a positive finding, an ANOVA will not directly indicate which of the populations are different from each other. An ANOVA tells us only that at least one sample is likely to be derived from a population that is different from at least one other population.
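To make the idea concrete, here is a minimal one-way ANOVA sketch in Python using SciPy; the three samples are invented for illustration:

```python
from scipy.stats import f_oneway

# Three hypothetical samples; the null hypothesis is that all three
# were drawn from populations with identical means.
sample_a = [10.1, 11.2, 9.8, 10.5, 10.9]
sample_b = [10.4, 9.9, 10.8, 11.0, 10.2]
sample_c = [14.8, 15.5, 14.9, 15.2, 15.7]

f_stat, p_value = f_oneway(sample_a, sample_b, sample_c)
print(f"F = {f_stat:.2f}, P = {p_value:.2g}")

# A small P-value says only that at least one population mean likely
# differs from at least one other; it does not say which. Follow-up
# (post hoc) pairwise comparisons are needed for that.
```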
... The questionnaires used to assess potential sexual problems in the two cited randomized controlled trials in Kenya and Uganda were not presented in detail in the original publications.^{4,5} Rather than blindly accepting such findings as any more trustworthy than other findings in the literature, it should be recalled that a strong study design, such as a randomized controlled trial, does not offset the need for high-quality questionnaires. Having obtained the questionnaires from the authors (RH Gray and RC Bailey, personal communication), I am not surprised that these studies provided little evidence of a link between circumcision and various sexual difficulties.^{4,5} Several questions were too vague to capture possible differences between circumcised and not-yet circumcised participants (e.g. lack of a clear distinction between intercourse- and masturbation-related sexual problems and no distinction between premature ejaculation and trouble or inability to reach orgasm). Thus, non-differential misclassification of sexual outcomes in these African trials probably favoured the null hypothesis of no difference, whether an association was truly present or not.
Introduction and Objective: Controversy continues to exist about the effect of circumcision on penile sensitivity and sexual satisfaction. This study was designed to evaluate penile sensitivity in both circumcised and uncircumcised males. We evaluated both large and small axon nerve fibers using vibration, pressure, spatial perception, and warm and cold thermal thresholds. Measurements both in functional men and men with erectile dysfunction (ED) were obtained to evaluate for differences in penile sensitivities.
Methods: Seventy-nine patients were evaluated. In the cohort evaluated, 54% (43/79) were uncircumcised, while 46% (36/79) were circumcised. All patients completed the erectile function domain of the International Index of Erectile Function (IIEF) questionnaire. Vibration (Biothesiometer), pressure (Semmes-Weinstein monofilaments), spatial perception (Tactile Circumferential Discriminator), and warm and cold thermal thresholds (Physitemp NTE-2) were subsequently measured. Bivariate relationships were assessed using chi-square, t-test, and Pearson correlations. Composite null hypotheses were assessed with mixed-models repeated-measures analysis of variance, allowing us to covary for age, diabetes, and hypertension.
Results: For the functional group, t-test analysis demonstrated a significant (p = 0.048) difference only for warm thermal thresholds, with a higher threshold (worse sensation) for uncircumcised men. However, significance was lost when we controlled for age, hypertension, and diabetes. For the dysfunctional group, t-test analysis demonstrated a significant (p = 0.01) difference only for vibration (biothesiometry), with a higher threshold (worse sensation) for uncircumcised men. Again, this also lost significance (p = 0.08) when controlling for age, hypertension, and diabetes. We also found that, overall, race is related to circumcision status, with Caucasian men 25 times and African American men 8 times more likely to be circumcised than Hispanics.
Multinomial proportions or distributions refer to data sets where outcomes are divided into three or more discrete categories. A common textbook example involves the analysis of genetic crosses, where either genotypic or phenotypic results are compared with what would be expected based on Mendel's laws. The standard prescribed statistical procedure in these situations is the Chi-square goodness-of-fit test, an approximation method that is analogous to the normal approximation test for binomials. The basic requirements for multinomial tests are similar to those described for binomial tests. Namely, the data must be acquired through random sampling, and the outcome of any given trial must be independent of the outcome of other trials. In addition, a minimum of five outcomes is required for each category for the Chi-square goodness-of-fit test to be valid. To run the Chi-square goodness-of-fit test, one can use standard software programs or websites. These will require that you enter the number of expected or control outcomes for each category along with the number of experimental outcomes in each category. This procedure tests the null hypothesis that the experimental data were derived from the same population as the control or theoretical population and that any differences in the proportion of data within individual categories are due to chance sampling.
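As a concrete sketch with hypothetical counts, a Chi-square goodness-of-fit test of a Mendelian 9:3:3:1 dihybrid ratio might look like this in Python:

```python
from scipy.stats import chisquare

# Hypothetical F2 phenotype counts from a dihybrid cross (n = 160).
observed = [95, 27, 31, 7]
# Expected counts under Mendel's 9:3:3:1 ratio; note that every
# expected count is at least 5, as the test requires.
expected = [160 * r / 16 for r in (9, 3, 3, 1)]  # [90, 30, 30, 10]

chi2, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")
# A large P-value means the data are consistent with the null
# hypothesis that deviations from 9:3:3:1 are due to chance sampling.
```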
A biologist's guide to statistical thinking and analysis
Interestingly, there is considerable debate, even among statisticians, regarding the appropriate use of one- versus two-tailed tests. Some argue that because in reality no two population means are ever identical, all tests should be one-tailed, as one mean must in fact be larger (or smaller) than the other. Put another way, the null hypothesis of a two-tailed test is always a false premise. Others encourage standard use of the two-tailed test, largely on the basis of its being more conservative. Namely, the P-value will always be higher, and therefore fewer false-positive results will be reported. In addition, two-tailed tests impose no preconceived bias as to the direction of the change, which in some cases could be arbitrary or based on a misconception. A universally held rule is that one should never choose a one-tailed test after determining which direction is suggested by your data. In other words, if you are hoping to see a difference and your two-tailed P-value is 0.06, don't then decide that you really intended to do a one-tailed test to reduce the P-value to 0.03. Alternatively, if you were hoping for no significant difference, choosing the one-tailed test that happens to give you the highest P-value is an equally unacceptable practice.
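The arithmetic behind the 0.06-versus-0.03 example is simply that, for a symmetric test statistic such as z, the one-tailed P-value in the favorable direction is half the two-tailed P-value. A quick sketch in Python:

```python
from scipy.stats import norm

# Pick the z-score whose upper-tail probability is exactly 0.03.
z = norm.isf(0.03)

p_two_tailed = 2 * norm.sf(abs(z))   # ~0.06
p_one_tailed = norm.sf(z)            # ~0.03

print(f"z = {z:.3f}: two-tailed P = {p_two_tailed:.3f}, "
      f"one-tailed P = {p_one_tailed:.3f}")
# Switching to a one-tailed test after seeing the data would turn a
# "non-significant" 0.06 into a "significant" 0.03, which is exactly
# the practice the text warns against.
```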
1. Single Sample

We've collected data on a single variable which we assume has a normal distribution with mean mu and SD sigma. We are interested in the value of the mean of the population, mu. The conventional wisdom says that mu is a particular value, let's call it mu0. (For example, conventional wisdom claims that the mean is 0.) We believe instead that something else is true. We have three options:

1) mu is not mu0.
2) mu is greater than mu0.
3) mu is less than mu0.

(Note that our options are much more vague than the conventional wisdom. This gives us an advantage over the conventional wisdom that some would say is unfair in some contexts. But that's a fine point for later.)

The "conventional wisdom" is called the Null Hypothesis. Our alternative is called the "Alternative Hypothesis." Because the null hypothesis represents conventional wisdom, it gets the benefit of the doubt and will be overturned only with exceptional evidence to the contrary.

Because we want to know about the mean of the population, we estimate it using one of our favorite estimators, the average Xbar (sum of the observations divided by the number of observations). We know, however, that Xbar is NOT equal to mu; it is merely an estimate that is unbiased (so it likes to "hang out" around mu) and has a relatively small standard error (so it doesn't stray too far from mu, and in fact, the larger the sample size, the closer Xbar tends to stay to mu). But we can be even more precise: because we assumed the population was normal, Xbar is also normal with mean mu and SD equal to sigma/sqrt(n). (The SD of an estimator, like Xbar, is called the Standard Error, or SE.) So we can compute, for example, that the probability that Xbar is within 1 SE of mu is roughly 68%, within two SEs is roughly 95%, and within three is roughly 99.7%.

So here's our approach:

1) Choose a significance level: the probability that we will reject the null hypothesis even though it's true (oops!).
5% is a popular choice, as is 1%, and even 10% if you can live with yourself knowing you are wrong 10% of the time.

2) Calculate your test statistic. In the opinion of the null hypothesis, is this test statistic "extreme"?

3) Calculate the p-value: what's the probability of getting a test statistic as extreme or MORE extreme than the one you got?

4) If the p-value is less than or equal to the significance level, reject the null hypothesis. When this happens, it means you just saw an event that happens very rarely if the null hypothesis is true, so maybe it isn't true! If I said I could predict the outcome of 100 flips of a coin, and I got 52 of them right, you probably wouldn't be too impressed, because your null hypothesis expected this. But if I got 90 of them correct, this would be so unusual that you would have to change your assumptions! (Which doesn't mean I'm psychic, of course; I might have cheated.)

5) If the p-value is NOT less than or equal to the significance level, we do not say that we "accept" the null hypothesis. (Never admit defeat!) The reason for this is that our alternative might still be right. Perhaps there's just too much variability in our estimator to get a clear picture of what's going on. (The technical term for this is "power". The power is the probability that you correctly reject the null hypothesis when the null hypothesis is false, a good thing. Power increases with sample size, so if you don't reject the null hypothesis today, go back and collect more data to increase your power.)

Details: If sigma is a known number, like 3, then it's easy to calculate the p-value:
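Following the "Details" above, if sigma is known (say sigma = 3), the p-value computation can be sketched in Python; the data here are invented for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Null hypothesis: mu = mu0 = 0; known population SD sigma = 3.
mu0, sigma = 0.0, 3.0
data = [2.0, 3.5, 1.0, 4.5, 2.5, 4.0, 1.5, 3.0]  # hypothetical sample
n = len(data)

xbar = sum(data) / n          # our estimator of mu
se = sigma / sqrt(n)          # standard error of Xbar: sigma/sqrt(n)
z = (xbar - mu0) / se         # test statistic

# Two-sided alternative (option 1: mu is not mu0).
p_value = 2 * norm.sf(abs(z))
print(f"Xbar = {xbar:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
```

With these made-up numbers the p-value falls below 0.05, so at the 5% significance level we would reject the null hypothesis; the level must be chosen before looking at the data, not after.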