[Business Management] 6SIGMA Statistical Concepts Training.ppt


Hence, P(4.5 ≤ x ≤ 8.5) = 0.9864 - 0.3745 = 0.6119 (approximately equal to the 0.6123 given by the binomial distribution).

Introduction to Sampling

What is a population in statistics? A population refers to all the items that have been chosen for study.

What is a sample in statistics? A sample is a portion chosen from a population; the data obtained from it can be used to infer the actual performance of the population.

[Diagram: a population from which Samples 1 to 8 are drawn]

Sampling distribution

A sampling distribution is a distribution of sample means. If you take 10 samples out of the same population, you will most likely end up with 10 different sample means and sample standard deviations. A sampling distribution describes the probability of all possible means of the samples taken from the same population.

Sampling distribution (cont)

When the sample size increases, the standard error (the standard deviation of the sampling distribution) gets smaller.

Central Limit Theorem

Example of sampling distribution

The population distribution of annual income of engineers is negatively skewed. This distribution has a mean of $48,000 and a standard deviation of $5,000. If we draw a sample of 100 engineers, what is the probability that their average annual income is $48,700 or more?

Example of sampling distribution (cont)

Population mean = 48,000
Standard error (sigma of the sample mean) = 5,000 / √100 = 500
Sample mean X̄ = 48,700
Z = (48,700 - 48,000) / 500 = 700 / 500 = 1.4

From the standardized normal distribution table, P(X̄ ≤ $48,700) = 0.9192. Therefore, P(X̄ > $48,700) = 1 - 0.9192 = 0.0808.
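The calculation above can be reproduced without a printed table. The following is a minimal Python sketch, assuming the normal approximation from the Central Limit Theorem applies; the function name `prob_sample_mean_exceeds` is a hypothetical helper introduced here for illustration and is not part of the original deck.

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_exceeds(threshold, pop_mean, pop_sd, n):
    """Return (z, P(sample mean > threshold)) using the normal approximation."""
    standard_error = pop_sd / sqrt(n)           # sigma of the sampling distribution
    z = (threshold - pop_mean) / standard_error
    upper_tail = 1.0 - NormalDist().cdf(z)      # area to the right of z
    return z, upper_tail


# Figures from the engineers' income example: mean 48,000, sd 5,000, n = 100.
z, p = prob_sample_mean_exceeds(48_700, 48_000, 5_000, 100)
print(f"z = {z:.2f}, P = {p:.4f}")  # z = 1.40, P = 0.0808
```

Using the standard library's `NormalDist` keeps the sketch free of third-party dependencies; a statistics package would give the same tail area.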

Thus, there is only an 8.08% chance that the average annual income of 100 engineers is more than $48,700.

Central Limit Theorem Exercise

Break into 4 groups as below:
Group 1: the population group.
Groups 2 to 4: the sample sub-groups.

The population group will have 3 of its members each throw a single die 60 times. A total of 180 throws will be recorded, and this data will be the population data.

Each sample sub-group will have 3 of its members throw 5 dice at a time and record the sum and average value of each throw. Each member is to conduct 20 throws and obtain the sample mean of each throw. At the end of the exercise, a total of 180 sample means will have been collected from the 3 sub-groups.

From the collected data, plot the histograms and comment on the distributions of both the population and the samples.

The finite population multiplier

Previously we said that the standard error of the mean is σ / √n. For a finite population of size N, this is multiplied by the finite population multiplier, √((N - n) / (N - 1)).

[Table: finite population multiplier with respect to population and sample size]

Rule of thumb: the finite population multiplier need only be included if the ratio of population size to sample size is less than 25.

Confidence Interval

"Point estimates": a point estimate is a single number that is used to estimate an unknown population parameter.

What does a 95% confidence interval mean? It means that, for about 95% of random samples, the interval extending +/- 1.96 standard errors from the sample mean will contain the population mean.

Why 1.96 standard errors? Referring to the standardized normal distribution table, when z = 1.96 the associated cumulative probability is 0.975 (or 97.5%). But this is a two-tailed interval (i.e. +/- 1.96), so we need to subtract another 2.5% from the other end, giving a total coverage of 95%.

Equation for computing confidence intervals
For continuous data: x̄ +/- z (σ / √n)
For discrete data: p +/- z √(pq / n)
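The reasoning behind the 1.96 multiplier can be checked numerically. This is a small Python sketch, assuming a standard normal distribution; `two_sided_z` is a hypothetical helper name used only for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def two_sided_z(confidence):
    """z multiplier that leaves (1 - confidence) / 2 in each tail."""
    tail = (1.0 - confidence) / 2.0
    return STANDARD_NORMAL.inv_cdf(1.0 - tail)


z95 = two_sided_z(0.95)
z99 = two_sided_z(0.99)
# Area between -1.96 and +1.96: 97.5% minus the 2.5% in the lower tail.
coverage = STANDARD_NORMAL.cdf(1.96) - STANDARD_NORMAL.cdf(-1.96)

print(f"z for 95% = {z95:.2f}")            # z for 95% = 1.96
print(f"z for 99% = {z99:.2f}")            # z for 99% = 2.58
print(f"coverage of +/-1.96 = {coverage:.3f}")  # coverage of +/-1.96 = 0.950
```

The same helper gives the multiplier for any other confidence level, which is why it takes the level as a parameter rather than hard-coding 1.96.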

Confidence interval for continuous data

A large disk drive manufacturer needs an estimate of the mean life it can expect from the magnetic media when its digital state is switched back and forth at a frequency of 1 MHz. The development team has previously determined that the variance of the population life is 36 months² (a standard deviation of 6 months), and has conducted reliability testing on 100 samples, collecting data on their useful life.

[Table: useful-life data for the 100 samples]

From the above data, what is the 95% confidence interval for the mean useful life of the magnetic media in a disk drive? What does this mean?

Confidence interval for continuous data (cont)
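One way to work this example in code is sketched below in Python. It assumes a known population variance of 36 and a sample of 100, as stated in the problem, and takes the sample mean of 27.75 months from the deck's worked solution; `mean_confidence_interval` is a hypothetical helper name, not something defined in the original material.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(sample_mean, pop_variance, n, confidence=0.95):
    """Two-sided z interval for a mean when the population variance is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    standard_error = sqrt(pop_variance / n)   # sqrt(36 / 100) = 0.6 here
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin


lower, upper = mean_confidence_interval(27.75, 36, 100)
print(f"{lower:.3f} to {upper:.3f} months")  # 26.574 to 28.926 months
```

The sketch uses the exact normal quantile rather than the rounded 1.96, which does not change the limits at three decimal places.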

Applying the confidence interval equation:
Upper confidence limit = 27.75 + 1.96 (0.6) = 28.926
Lower confidence limit = 27.75 - 1.96 (0.6) = 26.574

As such, there is a 95% confidence level that the average useful life of the magnetic media falls between 26.574 and 28.926 months.

Confidence interval for discrete data

Break into 4 or 5 teams, combine all the M&Ms within each team, and calculate the confidence intervals for each color type using the table provided.

Then combine all the data from the 4 teams. What are the changes to the confidence intervals?

Determine the sample size for confidence intervals (continuous data)

Given: sample mean = 21, S = 6.
What is the sample size required such that there is a 95% confidence level that the average value will fall within +/- 1.176 of its mean?
Ans: Applying the equation, n = ((1.96 x 6) / 1.176)^2 = (10)^2 = 100. The sample size must be 100.

Determine the sample size for confidence intervals (discrete data)

Given: p = 0.4 (proportion agree), q = 0.6 (proportion not agree).
What is the sample size required such that there is a 99% confidence level that the proportion agree will fall within +/- 0.146 of p?
Ans: Applying the equation, n = (2.58)^2 (0.4)(0.6) / (0.146)^2 = 74.95, rounded up to 75. The sample size must be 75.

Revision: 1.00 Date: June 2001

Day 1: Tests of Hypotheses
Week 1 recap of Statistics Terminology
Introduction to Student T distribution
Example in using Student T distribution
Summary of formula for Confidence Limits
Introduction to Hypothesis Testing
The elements of Hypothesis Testing
-Break-
Large sample Test of Hypothesis about a population mean
p-Values, the observed significance levels
Small sample Test of Hypothesis about a population mean
Measuring the power of hypothesis testing
Calculating Type II Error probabilities
Hypothesis Exercise I
-Lunch-
Hypothesis Exercise I Presentation
Comparing 2 population Means: Independent Sampling
Comparing 2 population Means: Paired Difference Experiments
Comparing 2 population Proportions: F-Test
-Break-
Hypothesis Testing Exercise II (paper clip)
Hypothesis Testing Presentation
Day 1 wrap up

Day 2: Analysis of variance and simple linear regression
Chi-square: A test of independence
Chi-square: Inferences about a population variance
Chi-square exercise
ANOVA - Analysis of variance
ANOVA - Analysis of variance case study
-Break-
Testing the fitness of a probability distribution
Chi-square: a goodness of fit test
The Kolmogorov-Smirnov Test
Goodness of fit exercise using dice
Result and discussion on exercise
-Lunch-
Probabilistic relationship of a regression model
Fitting model with least squares approach
Assumptions and variance estimator
Making inference about the slope
Coefficient of Correlation and Determination
Example of simple linear regression
Simple linear regression exercise (using statapult)
-Break-
Simple linear regression exercise (cont)
Presentation of results
Day 2 wrap up

Day 3: Multiple regression and model building
Introduction to multiple regression model
Building a model
Fitting the model with least squares approach
Assumptions for model
Usefulness of a model
Analysis of variance
Using the model for estimation and prediction
Pitfalls in prediction model
-Break-
Multiple regression exercise (statapult)
Presentation for multiple regression exercise
-Lunch-
Qualitative data and dummy variables
Models with 2 or more quantitative independent variables
Testing the model
Models with one qualitative independent variable
Comparing slopes and response curve
-Break-
Model building example
Stepwise regression: an approach to screen out factors
Day 3 wrap up

Day 4: Design of Experiment
Overview of Experimental Design
What is a designed experiment
Objective of experimental design and its capability in identifying the effect of factors
One factor at a time (OFAT) versus design of experiment (DOE) for modelling
Orthogonality and its importance to DOE
Hand calculation for building simple linear model
Types and uses of DOE (i.e. linear screening, linear modelling, and non-linear modelling)
OFAT versus DOE and its impact in a screening experiment
Types of screening DOEs
-Break-
Points to note when conducting DOE
Screening DOE exercise using statapult
Interpreting the screening DOE results
-Lunch-
Modelling DOE (full factorial with interactions)
Interpreting interaction of factors
Pareto of factors significance
Graphical interpretation of DOE results
Some rules of thumb in DOE
Example of Modelling DOE and its analysis
-Break-
Modelling DOE exercise with statapult
Target practice and confirmation run
Day 4 wrap up

Day 5: Statistical Process Control
What is Statistical Process Control
Control chart: the voice of the process
Process control versus process capability
Types of control chart available and their application
Observing trends for control chart
Out of Control reaction
Introduction to Xbar R Chart
Xbar R Chart example
Assignable and Chance causes in SPC
Rule of thumb for SPC run test
-Break-
Xbar R Chart exercise (using Dice)
Introduction to Xbar S Chart
Implementing Xbar S Chart
Why Xbar S Chart?
Introduction to Individual Moving Range Chart
Implementing Individual Moving Range Chart
Why Xbar S Chart?
-Lunch-
Choosing the sub-group
Choosing the correct sample size
Sampling frequency
Introduction to control charts for attribute data
np Charts, p Charts, c Charts, u Charts
-Break-
Attribute control chart exercise (paper clip)
Out of control is not necessarily bad
Day 5 wrap up

Recap of Statistical Terminology

Area under a Normal Distribution

Process capability potential, Cp
Based on the assumptions that:
The process is normal
It is a 2-sided specification
The process mean is centered on the device specification
Cp = spread in specification / natural tolerance = (USL - LSL) / 6σ

Process Capability Index, Cpk
1. Based on the assumption that the process is normal and in control.
2. An index that compares the process center with the specification center.
Therefore:
when Cpk < Cp, the process is not centered;
when Cpk = Cp, the process is centered.

Descriptive Statistics
The process of collecting, presenting and describing sample data, using graphical tools and numbers.
Pareto Chart
Population mean
Histogram
Population standard deviation

Probability Theory
Probability is the chance for an event to occur.
Statistical dependence / independence
Posterior probability
Relative frequency
Make decisions through probability distributions (i.e. Binomial, Poisson, Normal)

Inferential Statistics
The process of interpreting the sample data to draw conclusions about the population from which the sample was taken.
Confidence Interval (determine the confidence level for a sampling mean to fluctuate)
T-Test and F-Test (determine if the underlying populations are significantly different in terms of the means and variations)
Chi-Square Test of Independence (test if the sample proportions are significantly different)
Correlation and Regression (determine if a relationship between variables exists, and generate a model equation to predict the outcome of a single output variable)

Central Limit Theorem
Some takeaways for sample size and sampling distribution

Percentiles of the t Distribution
Whereby:
df = degrees of freedom = n (sample size) - 1
Shaded area = one-tailed probability of occurrence
α = 1 - shaded area
Applicable when:
Sample size < 30
Standard deviation is unknown
Population distribution is at least approximately normal

Percentiles of the Normal Distribution / Z Distribution
Whereby:
Shaded area = one-tailed probability of occurrence
α = 1 - shaded area

Student t Distribution example

FDA requires pharmaceutical companies to perform extensive tests on all new drugs before they can be marketed to the public. The first phase of testing is on animals, while the second phase is on humans on a limited basis. PWD is a pharmaceutical company currently in the second phase of testing a new antibiotic. The chemists want to know the effect of the new antibiotic on human blood pressure, and they are only allowed to test on 6 patients. The increases in blood pressure of the 6 tested patients are as below:
(1.7, 3.0, 0.8, 3.4, 2.7, 2.1)
Construct a 95% confidence interval for the average increase in blood pressure for patients taking the new antibiotic, using both the normal and t distributions.

Student t Distribution example (cont)

From the data: x̄ = 2.283, S = 0.950, standard error = S / √6 = 0.388.
Using the normal or z distribution: 2.283 +/- 1.96 (0.388), i.e. 1.52 to 3.04
Using the Student t distribution (df = 5, t = 2.571): 2.283 +/- 2.571 (0.388), i.e. 1.29 to 3.28

Although the confidence level is the same, using the t distribution results in a wider interval, because:
The standard deviation S for a small sample size is probably not accurate
The standard deviation S for a small sample size is probably too optimistic
A wider interval is therefore necessary to achieve the required confidence level

Summary of formula for confidence limit

6 Sigma Process and 1.5 Sigma Shift in Mean

Statistically, a process that is 6 Sigma with respect to its specifications is:
DPM = 0.002, Cp = 2, Cpk = 2
But Motorola defines 6 Sigma with a scenario of 1.5 Sigma shift in mean:
DPM = 3.4, Cp = 2, Cpk = 1.5

Some Explanations on 1.5 Sigma Mean Shift

Motorola has conducted a lot of experiments and found that, in the long term, the process mean will shift within 1.5 sigma if the process is under control. A 1.5 sigma mean shift in a 3 Sigma process control plan translates to a data point being out of control approximately 14% of the time, and this is deemed acceptable in statistical process control (SPC) practice.

Our Explanation

The most frequently used sample size for SPC in industry is 3 to 5 units per sampling. Take the middle value of 4 as the average sample size used in the sampling. Assuming the process is
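The figures quoted for Motorola's definition of 6 Sigma (DPM = 3.4, Cp = 2, Cpk = 1.5) can be checked numerically. This is a rough Python sketch, assuming a normally distributed process with two-sided specification limits placed symmetrically around the target; `defects_per_million` and `cp_cpk` are hypothetical helper names introduced only for this illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def defects_per_million(sigma_level, mean_shift=0.0):
    """DPM for spec limits at +/- sigma_level, with the mean shifted upward
    by mean_shift (both expressed in process standard deviations)."""
    # By symmetry, P(Z > a) equals P(Z < -a).
    beyond_upper = STANDARD_NORMAL.cdf(-(sigma_level - mean_shift))
    beyond_lower = STANDARD_NORMAL.cdf(-(sigma_level + mean_shift))
    return (beyond_upper + beyond_lower) * 1_000_000


def cp_cpk(sigma_level, mean_shift=0.0):
    """Cp and Cpk for the same set-up, in units of process sigma."""
    cp = (2 * sigma_level) / 6                  # spec spread / natural tolerance
    cpk = (sigma_level - abs(mean_shift)) / 3   # distance to nearest limit / 3 sigma
    return cp, cpk


print(f"{defects_per_million(6, 1.5):.1f} DPM")  # 3.4 DPM
print(cp_cpk(6, 1.5))                            # (2.0, 1.5)
print(cp_cpk(6, 0.0))                            # (2.0, 2.0)
```

With no shift the two indices coincide, and with a 1.5 sigma shift only Cpk drops, which matches the earlier statement that Cpk < Cp indicates a process that is not centered.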
