References
[1]. Alderson, J. C., Clapham, C. & Wall, D. (1995).
Language test construction and evaluation. Cambridge:
Cambridge University Press.
[2]. Bachman, L. F. (1990). Fundamental considerations in
language testing. Oxford: Oxford University Press.
[3]. Brennan, R. L. (1984). Estimating the dependability of
scores. In R. A. Berk (Ed.), A guide to criterion-referenced
test construction (pp. 292-334). Baltimore, MD: The Johns
Hopkins University Press.
[4]. Cronbach, L. J. (1951). Coefficient alpha and the
internal structure of tests. Psychometrika, 16, 297-334.
[5]. Cronbach, L. J. (1984). Essentials of psychological
testing (4th ed.). New York: Harper and Row.
[6]. Cronbach, L. J., Gleser, G. C., Nanda, H., &
Rajaratnam, N. (1972). The dependability of behavioral
measurements: Theory of generalizability for scores and
profiles. New York: John Wiley.
[7]. Ebel, R. L. (1951). Estimation of the reliability of ratings.
Psychometrika, 16, 407-424.
[8]. Farhady, H. (1980). Justification, development, and
validation of functional language tests. Unpublished
doctoral dissertation, University of California at Los
Angeles.
[9]. Fisher, R. A. (1925). Statistical methods for research
workers. London: Oliver & Boyd.
[10]. Kane, M. T. (1982). A sampling model for validity.
Applied Psychological Measurement, 6, 125-160.
[11]. Kane, M. T., & Brennan, R. L. (1980). Agreement
coefficients as indices of dependability for domain-referenced
tests. Applied Psychological Measurement, 4,
219-240.
[12]. Lindquist, E. F. (1953). Design and analysis of
experiments in psychology and education. Boston:
Houghton Mifflin.
[13]. Lord, F. M. (1957). Do tests of the same length have
the same standard error of measurement? Educational
and Psychological Measurement, 17, 511-521.
[14]. Messick, S. (1988). Validity. In R. L. Linn (Ed.),
Educational measurement (pp. 13-103). New York:
American Council on Education/Macmillan.
[15]. Rasch, G. (1980). Probabilistic models for some
intelligence and attainment tests. Chicago: University of
Chicago Press.
[16]. Shavelson, R., & Webb, N. (1981). Generalizability
theory: 1973-1980. British Journal of Mathematical and
Statistical Psychology, 34, 133-166.
[17]. Shavelson, R. J., Webb, N., & Rowley, G. L. (1989).
Generalizability theory. American Psychologist, 44, 922-932.
[18]. Spearman, C. (1910). Correlation calculated from
faulty data. British Journal of Psychology, 3, 271-295.
[19]. University of South Florida. Item Response Theory.
Paper retrieved from:
http://luna.cas.usf.edu/~mbrannic/files/pmet/irt.htm.
[20]. Zumbo, B. D. (1999). A handbook on the theory and
methods of differential item functioning (DIF): Logistic
regression modeling as a unitary framework for binary
and Likert-type (ordinal) scores. Ottawa, ON: Directorate
of Human Resources Research and Evaluation,
Department of National Defense.