The Difference Between Significance Levels and P Values in Statistics.docx

Understanding Hypothesis Tests: Significance Levels (Alpha) and P values in Statistics

What do significance levels and P values mean in hypothesis tests? What is statistical significance anyway? In this post, I'll continue to focus on concepts and graphs to help you gain a more intuitive understanding of how hypothesis tests work in statistics.

To bring it to life, I'll add the significance level and P value to the graph in my previous post in order to perform a graphical version of the 1 sample t-test. It's easier to understand when you can see what statistical significance truly means!

Here's where we left off in my last post. We want to determine whether our sample mean (330.6) indicates that this year's average energy cost is significantly different from last year's average energy cost of $260.

The probability distribution plot above shows the distribution of sample means we'd obtain under the assumption that the null hypothesis is true (population mean = 260) and we repeatedly drew a large number of random samples.

I left you with a question: where do we draw the line for statistical significance on the graph? Now we'll add in the significance level and the P value, which are the decision-making tools we'll need.

We'll use these tools to test the following hypotheses:

Null hypothesis: The population mean equals the hypothesized mean (260).
Alternative hypothesis: The population mean differs from the hypothesized mean (260).

What Is the Significance Level (Alpha)?

The significance level, also denoted as alpha or α, is the probability of rejecting the null hypothesis when it is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.

These types of definitions can be hard to understand because of their technical nature. A picture makes the concepts much easier to comprehend!

The significance level determines how far out from the null hypothesis value we'll draw that line on the graph. To graph a significance level of 0.05, we need to shade the 5% of the distribution that is furthest away from the null hypothesis.

In the graph above, the two shaded areas are equidistant from the null hypothesis value and each area has a probability of 0.025, for a total of 0.05. In statistics, we call these shaded areas the critical region for a two-tailed test. If the population mean is 260, we'd expect to obtain a sample mean that falls in the critical region 5% of the time. The critical region defines how far away our sample statistic must be from the null hypothesis value before we can say it is unusual enough to reject the null hypothesis.

Our sample mean (330.6) falls within the critical region, which indicates it is statistically significant at the 0.05 level.

We can also see if it is statistically significant using the other common significance level of 0.01.

The two shaded areas each have a probability of 0.005, which adds up to a total probability of 0.01. This time our sample mean does not fall within the critical region and we fail to reject the null hypothesis. This comparison shows why you need to choose your significance level before you begin your study. It protects you from choosing a significance level because it conveniently gives you significant results!

Thanks to the graph, we were able to determine that our results are statistically significant at the 0.05 level without using a P value. However, when you use the numeric output produced by statistical software, you'll need to compare the P value to your significance level to make this determination.

What Are P values?

P-values are the probability of obtaining an effect at least as extreme as the one in your sample data, assuming the truth of the null hypothesis.

This definition of P values, while technically correct, is a bit convoluted. It's easier to understand with a graph!

To graph the P value for our example data set, we need to determine the distance between the sample mean and the null hypothesis value (330.6 - 260 = 70.6). Next, we can graph the probability of obtaining a sample mean that is at least as extreme in both tails of the distribution (260 +/- 70.6).

In the graph above, the two shaded areas each have a probability of 0.01556, for a total probability 0.03112. This probability represents the likelihood of obtaining a sample mean that is at least as extreme as our sample mean in both tails of the distribution if the population mean is 260. That's our P value!

When a P value is less than or equal to the significance level, you reject the null hypothesis. If we take the P value for our example and compare it to the common significance levels, it matches the previous graphical results. The P value of 0.03112 is statistically significant at an alpha level of 0.05, but not at the 0.01 level.

If we stick to a significance level of 0.05, we can conclude that the average energy cost for the population is greater than 260.

A common mistake is to interpret the P-value as the probability that the null hypothesis is true. To understand why this interpretation is incorrect, please read my blog post How to Correctly Interpret P Values.

Discussion about Statistically Significant Results

A hypothesis test evaluates two mutually exclusive statements about a population to determine which statement is best supported by the sample data. A test result is statistically significant when the sample statistic is unusual enough relative to the null hypothesis that we can reject the null hypothesis for the entire population. "Unusual enough" in a hypothesis test is defined by:

The assumption that the null hypothesis is true: the graphs are centered on the null hypothesis value.
The significance level: how far out do we draw the line for the critical region?
Our sample statistic: does it fall in the critical region?

Keep in mind that there is no magic significance level that distinguishes between the studies that have a true effect and those that don't with 100% accuracy. The common alpha values of 0.05 and 0.01 are simply based on tradition. For a significance level of 0.05, expect to obtain sample means in the critical region 5% of the time when the null hypothesis is true. In these cases, you won't know that the null hypothesis is true but you'll reject it because the sample mean falls in the critical region. That's why the significance level is also referred to as an error rate!

This type of error doesn't imply that the experimenter did anything wrong or require any other unusual explanation. The graphs show that when the null hypothesis is true, it is possible to obtain these unusual sample means for no reason other than random sampling error. It's just luck of the draw.

Significance levels and P values are important tools that help you quantify and control this type of error in a hypothesis test. Using these tools to decide when to reject the null hypothesis increases your chance of making the correct decision.

If you like this post, you might want to read the other posts in this series that use the same graphical framework:

Previous: Why We Need to Use Hypothesis Tests
Next: Confidence Intervals and Confidence Levels

If you'd like to see how I made these graphs, please read: How to Create a Graphical Version of the 1-sample t-Test.
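
If you prefer numbers to pictures, here is a minimal Python sketch of the two comparisons made in this post: checking whether the sample mean lands in the critical region, and comparing a two-tailed P value to alpha. The null value (260) and sample mean (330.6) come from the post. The standard error (30.84) and degrees of freedom (24) are not stated here and are assumptions chosen only for illustration, as are the function names. The critical t-values are standard t-table entries for 24 degrees of freedom. The P value is found by integrating the t density numerically, so nothing beyond the standard library is needed.

```python
import math

# Values taken from the post.
NULL_MEAN = 260.0
SAMPLE_MEAN = 330.6

# Assumptions for illustration only (not stated in the post).
STD_ERROR = 30.84  # assumed standard error of the sample mean
DF = 24            # assumed degrees of freedom (a sample of 25)

# Two-tailed critical t-values for 24 degrees of freedom (standard t-table).
CRITICAL_T = {0.05: 2.064, 0.01: 2.797}


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Area in both tails beyond +/-|t_stat|, using Simpson's rule (steps must be even)."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_b = total * h / 3
    return 1 - 2 * area_zero_to_b


def critical_bounds(alpha):
    """Lower and upper edges of the two-tailed critical region, in cost units."""
    margin = CRITICAL_T[alpha] * STD_ERROR
    return NULL_MEAN - margin, NULL_MEAN + margin


t_stat = (SAMPLE_MEAN - NULL_MEAN) / STD_ERROR
p_value = two_tailed_p(t_stat, DF)

for alpha in (0.05, 0.01):
    lower, upper = critical_bounds(alpha)
    in_region = SAMPLE_MEAN < lower or SAMPLE_MEAN > upper
    decision = "reject" if p_value <= alpha else "fail to reject"
    print(f"alpha={alpha}: critical region is below {lower:.1f} or above {upper:.1f}; "
          f"sample mean in region: {in_region}; P={p_value:.4f} -> {decision} the null")
```

Under these assumed inputs, the two checks agree with each other at both significance levels, which is the point of the post: the critical region and the P value are two views of the same decision rule.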

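The idea that the significance level is an error rate can also be checked by simulation. The sketch below repeatedly draws random samples from a population where the null hypothesis really is true (mean 260) and counts how often a two-tailed test at alpha = 0.05 rejects it anyway. The population standard deviation (154.2), the sample size (25), the normal population, and the function name are assumptions made for illustration and are not given in the post. The critical value 2.064 is the standard t-table entry for 24 degrees of freedom.

```python
import random
import statistics

NULL_MEAN = 260.0   # from the post
POP_SD = 154.2      # assumed population standard deviation (illustrative)
SAMPLE_SIZE = 25    # assumed sample size (illustrative)
CRITICAL_T = 2.064  # two-tailed critical t-value for alpha = 0.05, df = 24


def false_positive_rate(trials=5000, seed=1):
    """Share of samples drawn under a true null that still get rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Draw one sample from a population whose mean equals the null value.
        sample = [rng.gauss(NULL_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
        std_error = statistics.stdev(sample) / SAMPLE_SIZE ** 0.5
        t_stat = (statistics.mean(sample) - NULL_MEAN) / std_error
        # Reject when the t statistic falls in either tail of the critical region.
        if abs(t_stat) > CRITICAL_T:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()
print(f"Share of true-null samples rejected at alpha = 0.05: {rate:.3f}")
```

The printed share should land close to 0.05, not exactly on it, because the simulation itself is subject to random sampling error.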