Multilevel Models (English original) chap(8).docx

Chapter 1 Introduction

1.1 Multilevel data

Many kinds of data, including observational data collected in the human and biological sciences, have a hierarchical or clustered structure. For example, animal and human studies of inheritance deal with a natural hierarchy where offspring are grouped within families. Offspring from the same parents tend to be more alike in their physical and mental characteristics than individuals chosen at random from the population at large. For instance, children from the same family may all tend to be small, perhaps because their parents are small or because of a common impoverished environment. Many designed experiments also create data hierarchies, for example clinical trials carried out in several randomly chosen centres or groups of individuals. For now, we are concerned only with the fact of such hierarchies, not their provenance. The principal applications we shall deal with are those from the social sciences, but the techniques are of course applicable more generally. In subsequent chapters, as we develop the theory and techniques with examples, we shall see how a proper recognition of these natural hierarchies allows us to seek more satisfactory
answers to important questions.

We refer to a hierarchy as consisting of units grouped at different levels. Thus offspring may be the level 1 units in a 2-level structure where the level 2 units are the families; students may be the level 1 units clustered within schools that are the level 2 units.

The existence of such data hierarchies is neither accidental nor ignorable. Individual people differ, as do individual animals, and this necessary differentiation is mirrored in all kinds of social activity where the latter is often a direct result of the former, for example when students with similar
motivations or aptitudes are grouped in highly selective schools or colleges. In other cases, the groupings may arise for reasons less strongly associated with the characteristics of individuals, such as the allocation of young children to elementary schools, or the allocation of patients to different clinics. Once groupings are established, even if their establishment is effectively random, they will tend to become differentiated, and this differentiation implies that the group and its members both influence and are influenced by the group membership. To ignore this relationship risks overlooking the importance of group effects, and may also render invalid many of the traditional statistical analysis techniques used for studying data relationships.

We shall be looking at this issue of statistical validity in the next chapter, but one simple example will show its importance. A well-known and influential study of primary (elementary) school children carried out in the 1970s (Bennett, 1976) claimed that children exposed to so-called 'formal' styles of teaching reading exhibited more progress than those who were not. The data were analysed using traditional multiple regression techniques which recognised only the individual children as the units of analysis and ignored their groupings within teachers and into classes. The results were statistically significant. Subsequently, Aitkin et al. (1981) demonstrated that when the analysis accounted properly for the grouping of children into classes, the significant differences disappeared and the 'formally' taught children could not be shown to differ from the others. This reanalysis is the first important example of a multilevel analysis of social science data. In essence what was occurring here was that
the children within any one classroom, because they were taught together, tended to be similar in their performance. As a result they provide rather less information than would have been the case if the same number of students had been taught separately by different teachers. In other words, the basic unit for purposes of comparison should have been the teacher, not the student. The function of the students can be seen as providing, for each teacher, an estimate of that teacher's effectiveness. Increasing the number of students per teacher would increase the precision of those estimates but
not change the number of teachers being compared. Beyond a certain point, simply increasing the numbers of students in this way hardly improves things at all. On the other hand, increasing the number of teachers to be compared, with the same or somewhat smaller number of students per teacher, considerably improves the precision of the comparisons.

Researchers have long recognised this issue. In education, for example, there has been much debate (see Burstein et al., 1980) about the so-called 'unit of analysis' problem, which is the one just outlined. Before multilevel modelling became well developed as a research tool, the problems of ignoring hierarchical structures were reasonably well understood, but they were difficult to solve because powerful general purpose tools were unavailable. Special purpose software, for example for the analysis of genetic data, has been available longer but this was restricted to 'variance components' models (see chapter 2) and was not suitable for handling general linear models. Sample survey workers have recognised this issue in another form. When population surveys are carried out, the sample design typically mirrors the hierarchical population structure, in terms of geography and household membership. Elaborate procedures have been developed to take such structures into account when carrying out statistical analyses. We return to this in a little more detail in a later section. In the remainder of this chapter we shall look at
the major areas explored in this book.

1.2 School effectiveness

Schooling systems present an obvious example of a hierarchical structure, with pupils grouped or nested or clustered within schools, which themselves may be clustered within education authorities or boards. Educational researchers have been interested in comparing schools and other educational institutions, most often in terms of the achievements of their pupils. Such comparisons have several aims, including the aim of public accountability (Goldstein, 1992) but, in research terms, interest usually is focused upon studying the factors that explain school differences. Consider the common example where test or examination results at the end of a period of schooling are collected for each school in a randomly chosen sample of schools. The researcher wants to know whether a particular kind of subject streaming practice in some schools is associated with improved examination performance. She also has good measures of the pupils' achievements when they started the period of schooling so that she can control for this in the analysis. The traditional approach to the analysis of these data would be to carry out a regression analysis, using performance score as response, to study the relationship with streaming practice, adjusting for the initial achievements. This is very similar to the initial teaching styles analysis described in the previous section, and suffers from the same lack of validity through failing to take
account of the school level clustering of students. An analysis that explicitly models the manner in which students are grouped within schools has several advantages. First, it enables data analysts to obtain statistically efficient estimates of regression coefficients. Secondly, by using the clustering information it provides correct standard errors, confidence intervals and significance tests, and these generally will be more 'conservative' than the traditional ones which are obtained simply by ignoring the presence of clustering, just as Bennett's previously statistically significant results became non-significant on reanalysis. Thirdly, by allowing the use of covariates measured at any of the levels of a hierarchy, it enables the researcher to explore the extent to which differences in average examination results between schools can be accounted for by factors such as organisational practice or possibly in terms of other characteristics of the students. It also makes it possible to study the extent to which schools differ for different kinds of students, for example to see whether the variation between schools is greater for initially high scoring students than for
initially low scoring students (Goldstein et al., 1993) and whether some factors are better at accounting for or 'explaining' the variation for the former students than for the latter. Finally, there is often considerable interest in the relative ranking of individual schools, using the performances of their students after adjusting for intake achievements. This can be done straightforwardly using a multilevel modelling approach.

To fix the basic notion of a level and a unit, consider figures 1 and 2, based on hypothetical relationships. Figure 1 shows the exam score and intake achievement scores for five students in a school, together with a simple regression line fitted to the data points. The residual variation in the exam scores about this line is the level 1 residual variation, since it relates to level 1 units (students) within a sample level 2 unit (school). In figure 2 the three lines are the simple regression lines for three schools, with the individual student data points removed. These vary in both their slopes and their intercepts (where they would cross the exam axis), and this variation is level 2 variation. It is an example of multiple or complex level 2 variation since both the intercept and slope parameters vary.

[Figure 1] [Figure 2]

The other extreme to an analysis which ignores the hierarchical structure is one which treats each school completely separately by fitting a different regression model within each one. In some circumstances, for example where we have very few schools and moderately large numbers of students in each, this may be efficient. It may also be appropriate if we are interested in making inferences about just those schools. If, however, we regard these schools as a (random) sample from a population of schools and we wish to make inferences about the variation between schools in general, then a full multilevel approach is called for. Likewise, if some of our schools have very few students, fitting a separate model for each of these will not yield reliable estimates: we can obtain more precision by regarding the schools as
a sample from a population and using the information available from the whole sample data when making estimates for any one school. This approach is especially important in the case of repeated measures data where we typically have very few level 1 units per level 2 unit. We introduce the basic procedures for fitting multilevel models to hierarchically structured data in chapter 2 and discuss the design problem of choosing the numbers of units at each level in chapter 11.

1.3 Sample survey methods

We have already mentioned sample survey data which will be discussed in many of the examples of this book. The standard literature on surveys, reflected in survey practice, recognises the importance of taking account of the clustering in complex sample designs. Thus, in a household survey, the first stage sampling unit will often be a well-defined geographical unit. From those which are randomly
chosen, further stages of random selection are carried out until the final households are selected. Because of the geographical clustering exhibited by measures such as political attitudes, special procedures have been developed to produce valid statistical inferences, for example when comparing mean values or fitting regression models (Skinner et al., 1989). While such procedures usually have been regarded as necessary they have not generally merited serious substantive interest. In other words, the population structure, insofar as it is mirrored in the sampling design, is seen as a 'nuisance factor'. By contrast, the multilevel modelling approach views the population structure as of potential interest in itself, so that a sample designed to reflect that structure is not merely a matter of saving costs as in traditional survey design, but can be used to collect and analyse data about the higher level units in the population. The subsequent modelling can then incorporate this information and obviate the need to carry out special adjustment procedures, which are built into the analysis model directly.

Although the direct modelling of clustered data is statistically efficient,
it will generally be important to incorporate weightings in the analysis which reflect the sample design or, for example, patterns of non-response, so that robust population estimates can be obtained and so that there will be some protection against serious model misspecification. A procedure for introducing external unit weights into a multilevel analysis is discussed in chapter 3.

1.4 Repeated measures data

A different example of hierarchically structured data occurs when the same individuals or units are measured on more than one occasion. A common example occurs in studies of animal and human growth. Here the occasions are clustered within individuals that represent the level 2 units with measurement occasions the level 1 units. Such structures are typically strong hierarchies because there is much more variation between individuals in general than between occasions within individuals.
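The strength of a hierarchy can be summarised by the share of total variance that lies between level 2 units, commonly called the intraclass correlation. The sketch below is an illustration rather than a procedure from the text: the function name is hypothetical, and the inputs assume a within-individual variance equal to 5% of the between-individual variance, the upper figure this section quotes for child height after adjusting for age.

```python
def intraclass_correlation(between_var: float, within_var: float) -> float:
    """Return the share of total variance that lies between level 2 units."""
    if between_var < 0 or within_var < 0:
        raise ValueError("variances must be non-negative")
    total = between_var + within_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_var / total


# Assumed illustration: between-individual variance scaled to 1,
# within-individual (between-occasion) variance set to 5% of it.
between_individuals = 1.0
within_individuals = 0.05 * between_individuals

rho = intraclass_correlation(between_individuals, within_individuals)
print(f"intraclass correlation: {rho:.3f}")
```

Under these assumed inputs roughly 95% of the variation lies between individuals, which is what makes such a structure a strong hierarchy.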

In the case of child height growth, for example, once we have adjusted for the overall trend with age, the variance between successive measurements on the same individual is generally no more than 5% of the variation in height between children. There is a considerable past literature on procedures
for the analysis of such repeated measurement data (see for example Goldstein, 1979), which has more or less successfully confronted the statistical problems. It has done so, however, by requiring that the data conform to a particular, balanced, structure. Broadly speaking these procedures require that the measurement occasions are the same for each individual. This may be possible to arrange, but often in practice individuals will be measured irregularly, some of them a great number of times and some perhaps only once. By considering such data as a general 2-level structure we can apply the standard set of multilevel modelling techniques that allow any pattern of measurements while providing statistically efficient parameter estimation. At the same time modelling a 2-level structure presents a simpler conceptual understanding of such data and leads to a number of interesting extensions that will be explored in chapter 6.

One particularly important extension occurs in the study of growth where the aim is to fit growth curves to measurements over time. In a multilevel framework this involves, in th
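The unit-of-analysis argument of section 1.1 (more students per teacher adds little, more teachers adds a lot) can be illustrated with the standard design-effect approximation for equal-sized clusters, 1 + (n - 1)ρ, where n is the cluster size and ρ the intraclass correlation. This is a sketch under assumptions that are not in the text: the function names are hypothetical, the clusters are equal in size, the quantity estimated is a simple mean, and ρ = 0.2 is an invented value chosen only for illustration.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Variance inflation from clustering, assuming equal-sized clusters."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters: int, cluster_size: int, icc: float) -> float:
    """Number of independent observations carrying the same information."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


ICC = 0.2  # assumed value, for illustration only

# Baseline: 20 teachers with 30 students each.
baseline = effective_sample_size(20, 30, ICC)
# Double the students per teacher, same number of teachers.
more_students = effective_sample_size(20, 60, ICC)
# Double the number of teachers, same students per teacher.
more_teachers = effective_sample_size(40, 30, ICC)

print(f"20 teachers x 30 students: {baseline:.1f}")
print(f"20 teachers x 60 students: {more_students:.1f}")
print(f"40 teachers x 30 students: {more_teachers:.1f}")
```

With these assumed numbers, doubling the students per teacher raises the effective sample size only from about 88 to about 94, while doubling the number of teachers doubles it to about 176. For a fixed number of teachers the effective sample size can never exceed the number of teachers divided by ρ, here 100.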
