Measurement Theory and Applications for the Social Sciences (Methodology in the Social Sciences) - Hardcover

Bandalos, Deborah L.

ISBN 10: 1462532136 ISBN 13: 9781462532131

Publisher: Guilford Press, 2018

Which types of validity evidence should be considered when determining whether a scale is appropriate for a given measurement situation? What about reliability evidence? Using clear explanations illustrated by examples from across the social and behavioral sciences, this engaging text prepares students to make effective decisions about the selection, administration, scoring, interpretation, and development of measurement instruments. Coverage includes the essential measurement topics of scale development, item writing and analysis, and reliability and validity, as well as more advanced topics such as exploratory and confirmatory factor analysis, item response theory, diagnostic classification models, test bias and fairness, standard setting, and equating. End-of-chapter exercises (with answers) emphasize both computations and conceptual understanding to encourage readers to think critically about the material. The companion website (www.guilford.com/bandalos-materials) provides annotated examples, syntax, and datasets in both SPSS and SAS (for most chapters), so that readers can redo the analyses in each chapter.

About the Author

Deborah L. Bandalos, PhD, is Professor and Director of the Assessment and Measurement Doctoral Program in the Department of Graduate Psychology at James Madison University, where she teaches courses in exploratory factor analysis, measurement theory, and missing data methodologies. Her research areas include structural equation modeling and the effects of item wording changes in instrument development. Dr. Bandalos has published articles and book chapters in the areas of structural equation modeling, exploratory factor analysis, and item and scale development. She is an associate editor of Multivariate Behavioral Research and a past associate editor of Structural Equation Modeling. In addition, Dr. Bandalos serves on the editorial boards of Psychological Methods and Applied Measurement in Education, is on the Executive Committee of Division 5 (Quantitative and Qualitative Methods) of the American Psychological Association, and has been elected 2019 President of the Society for Multivariate Experimental Psychology.

Excerpt. © Reprinted by permission. All rights reserved.

Measurement Theory and Applications for the Social Sciences

By Deborah L. Bandalos

The Guilford Press

Part I. Instrument Development and Analysis
1 • Introduction
2 • Norms and Standardized Scores
3 • The Test Development Process
4 • Writing Cognitive Items
5 • Writing Noncognitive Items
6 • Item Analysis for Cognitive and Noncognitive Items
7 • Introduction to Reliability and the
8 • Methods of Assessing Reliability
9 • Interrater Agreement and Reliability
10 • Generalizability Theory
11 • Validity
Part III. Advanced Topics in Measurement Theory
12 • Exploratory Factor Analysis
13 • Confirmatory Factor Analysis
14 • Item Response Theory with Christine E. DeMars
15 • Diagnostic Classification Models with Laine P. Bradshaw
16 • Bias, Fairness, and Legal Issues in Testing
17 • Standard Setting
18 • Test Equating
Answers to Exercises
References
Author Index
Subject Index
About the Author

CHAPTER 1

Introduction

When I told my relatives and friends I was writing a book about measurement, the most common reaction was confusion. How hard can measurement be, they asked, that you need to write an entire book about it? Just take a ruler or thermometer and measure whatever it is you want to measure. For many physical measurements, it is indeed this simple, although the measurement devices we now take for granted, such as rulers and thermometers, took some time to develop and become accepted. In the social sciences, however, measurement is not so simple. This is because most of the things social scientists want to measure are not physical but mental attributes. That is, social scientists are interested in such things as people's intellectual abilities, attitudes, personality characteristics, and values. Such attributes do not lend themselves easily (or at all) to physical measurement. We cannot look at a person and discern his or her attitudes or values, nor is there any ruler- or thermometer-like device we can use to measure them. Instead, we hypothesize the existence of theoretical entities known as constructs (also called latent constructs, factors, or unobserved variables) to account for certain characteristics or behaviors. For example, a researcher might recognize, based on long experience working with people, that some seem to learn faster and adapt more quickly to new situations than others. The researcher hypothesizes a construct now known as intelligence to account for this difference. Note that the researcher cannot directly observe people's intelligence, but must infer the existence of intelligence from observations of their behavior. Other examples of constructs include creativity, anxiety, attitudes toward gun control, altruism, propensity to buy a product, and aptitude for learning. All of these are latent in the sense that we cannot measure them directly but must devise some way of getting at them indirectly.

Measurement of constructs is therefore indirect, relying on samples of behavior such as responses to test items or observations of behavior. A test is a way of eliciting such behaviors. Here, and throughout this book, I use the term test to refer to a procedure for obtaining a sample of behavior that can be used to infer a person's level or status on a construct of interest. The terms measure, instrument, and scale are often used in the same way as "test," and although some authors make distinctions among them, the terms are generally used somewhat interchangeably, a practice I follow in this book. As an example of a test, suppose a researcher theorizes that the ability to apply knowledge learned in one context to a new context is one aspect of intelligence. To measure this ability, the researcher would have to devise a series of tasks requiring test takers to apply knowledge learned in one subject area to other subject areas. These tasks could make up a test of the ability to apply knowledge to new contexts, which could then be administered to test takers.

I note several things about this procedure. First, the researcher would likely be able to think up many tasks that could elicit the desired ability. In fact, there may be limitless tasks that can measure many constructs (consider, as an example, the ability to add two-digit numbers). This implies that the tasks included on a test are typically a sample of all the possible tasks that might have been used. Second, the researcher would have to put some limitations on the manner in which test takers are allowed to complete the tasks. For example, the researcher might stipulate that test takers cannot consult outside resources, such as websites or friends, to help them complete the tasks. The researcher might also impose a time limit so that some test takers do not have more time than others to complete the tasks. Thus, the test would likely be administered under controlled, or standardized, conditions. Third, although the researcher would like to assume that correct completion of the tasks was an indication of the ability to apply knowledge in new contexts, this may not be the case. Suppose some test takers are able to complete the task in a new subject area by some other means than generalizing knowledge from the original subject area. For example, suppose some test takers are completely unable to generalize their knowledge to the new subject area but know a great deal about the new subject area and are able to answer correctly based on that knowledge. Would the test still measure the ability to apply knowledge learned in one context to another? Probably not, because the test takers did not use that ability to answer the questions. This example points to a persistent problem in the measurement of constructs: it is always possible that the tasks used do not actually elicit the construct of interest.

Problems in Social Science Measurement

In the previous section, I discussed some of the issues inherent in measurement in the social sciences. One issue is that tests are usually based on limited samples of behavior; we cannot ask every possible question or observe every instance of behavior. A related issue is that there is no one "correct" method of measuring a construct. In the previous example, the ability to apply knowledge in new contexts could have been measured by performance assessments in which test takers are given problems to solve in different subject areas, by multiple-choice tests asking test takers to choose the most likely outcome of a theory if applied in a new context, or by interviewing test takers about how they would solve the problem in a new context, just to name a few. Use of the different measurement methods would get at somewhat different aspects of the ability, and each method would have its own advantages and disadvantages. The researcher developing the test would therefore have to carefully consider which type of test would be best aligned with the purpose(s) of testing. For example, different methods might be appropriate for testing theories about the ability than for selecting students for an advanced educational program.

Another testing issue is that many things can (and likely will) interfere with our measurement of the construct of interest. As indicated in the previous section, test takers may be able to complete the tasks using skills or abilities other than those the test was designed to measure. Or some test takers may have the requisite abilities but may be so anxious about the test that they fail to complete any of the tasks correctly. Other test takers may have the ability to complete the tasks but have limited English proficiency, causing them to misinterpret the tasks or instructions. Other types of interference are more relevant to attitude measurement. For example, respondents to attitude items may not answer truthfully because they know their attitudes are not politically correct and they do not want to draw attention to this; this tendency is known as socially desirable responding. Some respondents may have a tendency to choose a "neutral" or middle response option, whereas others may tend to choose more extreme response options. Such response styles are ubiquitous in the measurement of attitudes, personality characteristics, and psychological disorders. Those measuring psychological disorders must also contend with malingering — the tendency to exaggerate one's symptoms in an effort to obtain a particular diagnosis.

All of these issues in the measurement of constructs come under the broad heading of errors of measurement. It is important to understand that such measurement errors are part and parcel of most social science measurement. As a result, our measures are not perfect, but they should instead be thought of as approximations. Although some tests may provide quite good approximations, none are error-free. One of the tasks of those developing and using tests is therefore to be aware of the possibilities for error in test scores and to interpret and use test results with these possibilities in mind.

What is Measurement Theory?

Another important task for those involved in social science measurement is to investigate the impact of measurement error on test results and to use the findings from these investigations to improve the tests and testing procedures. Such investigations are part of the broad field of measurement or test theory, known in psychology as psychometrics.

These terms refer broadly to the study of methods for measuring constructs, and of their attendant problems. Measurement theory is therefore the study of how to develop tests that are as free as possible of measurement error and that yield the most appropriate measures of the desired constructs. Without good tests, social scientists would be unable to diagnose many learning disabilities and personality disorders, to study individual differences in constructs of interest, or to test theories involving these constructs. Our measurements are the basis of our diagnoses and of our ability to test theories, which are therefore only as good as the measures underlying them. Good tests are thus crucial to both practical applications and to theory development in the social sciences.

Measurement Defined

So what do I mean by measurement? Stanley S. Stevens's (1946) definition of measurement as the "assignment of numerals to objects or events according to rules" (p. 677) is commonly cited. This definition was later amended to clarify that it is the properties of objects (usually people), such as the strength of their attitudes or their levels of altruism, and not the objects themselves that are measured. Note that, according to Stevens's definition, coding responses on a questionnaire with a "1" for male and a "2" for female constitutes measurement because this process involves assigning numerals (1 and 2) to properties of objects (male and female), according to the rule that 1 means male and 2 means female. Stevens defined four levels of measurement: nominal, ordinal, interval, and ratio. These levels are distinguished by the properties they include and are hierarchical in the sense that a higher level of measurement includes the properties of those lower in the hierarchy. In the sections that follow I describe the four levels and their properties. I also indicate the statistical operations for which Stevens felt each level was appropriate.
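The hierarchical relationship among the four levels can be sketched in a few lines of Python. This is only an illustration of the cumulative structure described above; the names `LEVELS`, `ADDED_PROPERTY`, and `properties` are hypothetical and not part of the text.

```python
# Stevens's four levels, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# The one property each level adds to those of the levels below it.
ADDED_PROPERTY = {
    "nominal": "distinctiveness",
    "ordinal": "order",
    "interval": "equal intervals",
    "ratio": "true zero point",
}


def properties(level):
    """Return every property a level has, including those it inherits."""
    idx = LEVELS.index(level)
    return [ADDED_PROPERTY[name] for name in LEVELS[: idx + 1]]


interval_properties = properties("interval")
ratio_properties = properties("ratio")
```

Here `properties("interval")` lists distinctiveness, order, and equal intervals, while the ratio level adds the true zero point to all three.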

The Nominal Level of Measurement

Nominal measures are those for which the numbers serve only to distinguish different categories and do not have any real numerical meaning. The previous example of coding males as 1 and females as 2 exemplifies the nominal level of measurement. The only property of this type of measurement is that of distinctiveness. That is, the numbers distinguish the two categories of male and female but have no quantitative meaning. The coding of most demographic variables, such as political party or hair or eye color, constitutes nominal measurement. Such measures can be transformed by applying any one-to-one substitution. In other words, we could substitute any other pairs of numbers, such as 3 and 4, or 65 and 83, for 1 and 2 because they serve to distinguish the categories equally well. The only transformation we cannot use is one in which the same number is used to represent both male and female categories because this would destroy the property of distinctiveness. Because the numbers used in nominal measurement have no numeric meaning, it is not appropriate to add, subtract, or otherwise manipulate them numerically. The only statistical indices appropriate at this level are those based on counts, such as the mode or the chi-square tests of independence and goodness of fit.
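A short Python sketch can make the point about one-to-one substitution concrete. The response data below are invented for illustration, and the helper names are hypothetical; the substitution of 65 and 83 for 1 and 2 follows the example in the text.

```python
from collections import Counter

# Invented nominal codes: 1 = male, 2 = female, as in the example above.
codes = [1, 2, 2, 1, 2, 2, 1, 2]

# A one-to-one substitution: 65 replaces 1 and 83 replaces 2.
recode = {1: 65, 2: 83}
recoded = [recode[c] for c in codes]


def category_counts(values):
    """Return the category frequencies, ignoring the labels themselves."""
    return sorted(Counter(values).values())


def mode(values):
    """Return the most frequent code."""
    return Counter(values).most_common(1)[0][0]


# Count-based indices survive the substitution.
same_counts = category_counts(codes) == category_counts(recoded)
mode_maps = recode[mode(codes)] == mode(recoded)

# The mean does not: it depends entirely on the arbitrary labels.
mean_before = sum(codes) / len(codes)
mean_after = sum(recoded) / len(recoded)
```

The frequencies and the modal category are the same under either coding, whereas the two means differ only because the labels differ, which is why arithmetic on nominal codes is not meaningful.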

The Ordinal Level of Measurement

At the ordinal level of measurement the numbers represent a rank order of the properties of objects. The rank order could be based on size, speed, importance, correctness, or any other property capable of being ranked. Common examples of ordinal measures are the outcomes of a race (first place, second place, etc.), military ranks, and class rank. To the property of distinctiveness, ordinal measures therefore add the property of order. Although the numbers used in ordinal measurement imply rank order, the intervals between adjacent scale points are not assumed to be equal. Taking the outcomes of a race as an example, we know that the person finishing first is faster than the person finishing second, but we do not know how much faster because ordinal measurement does not tell us anything about the amounts by which scale points differ. We also do not know whether the time difference between those finishing first and second is the same as that between those finishing second and third because for ordinal measurement these intervals are not assumed to be equal. Ordinal measures can be transformed in any way that preserves the original order. Thus, we could substitute the numbers 3, 18, and 21 for the numbers 1, 2, and 3 without losing the properties of distinctiveness and order. Because ordinal measures cannot be assumed to have equal intervals, it is not meaningful or appropriate to perform arithmetic operations such as addition or subtraction on them. Statistical indices such as the mode, median or interquartile range are appropriate because these do not assume equal intervals.
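The following Python sketch illustrates an order-preserving transformation. The finishing places are invented, and the relabeling extends the 3, 18, 21 example from the text with two assumed values (40 and 41) so that five runners can be shown.

```python
from statistics import median

# Invented finishing places for five runners, listed by runner.
places = [2, 1, 3, 5, 4]

# An order-preserving relabeling of the places 1 through 5.
relabel = {1: 3, 2: 18, 3: 21, 4: 40, 5: 41}
relabeled = [relabel[p] for p in places]


def rank_order(values):
    """Return runner indices sorted from first place to last."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Order, and an order-based index such as the median, are preserved.
order_preserved = rank_order(places) == rank_order(relabeled)
median_maps = relabel[median(places)] == median(relabeled)

# Gaps between adjacent places are not preserved.
gaps_before = [2 - 1, 3 - 2]
gaps_after = [relabel[2] - relabel[1], relabel[3] - relabel[2]]
```

Both codings rank the runners identically and identify the same median finisher, but the gaps between adjacent places change from equal to unequal, so differences between ordinal codes carry no information.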

Items measured on the commonly used "strongly disagree" to "strongly agree" Likert-type scale (see Chapter 5) are, strictly speaking, at the ordinal level of measurement. This is because we do not know whether respondents consider the psychological distance between "strongly agree" and "agree" to be the same as that between "strongly disagree" and "disagree," or any other adjacent scale points. Having said this, researchers differ in their willingness to treat data from Likert scales as having equal intervals. Some argue that such data probably have equal or nearly equal intervals and that little is lost, statistically speaking, by treating these data as interval. Others argue that this does not make sense unless we know that respondents do treat the intervals as equal and we generally do not have such knowledge.

The Interval Level of Measurement

In addition to the properties of distinctiveness and order, interval measures have the property of equal intervals. This means that the intervals between adjacent scale points are assumed to be the same across the entire scale continuum. A common example of interval level measurement is temperature as measured by the Fahrenheit or Centigrade scales, in which the difference in heat between scale points of 50° and 51° is the same as the difference between 90° and 91°. Interval measures can be transformed through any linear transformation of the form y = a + bX. Because of their equal-interval property, it is appropriate to calculate nearly all parametric statistics, such as the mean, standard deviation, and correlation from interval level data.
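As a rough illustration of a linear transformation, the sketch below converts Centigrade readings to Fahrenheit, which has the form y = a + bX with a = 32 and b = 9/5. The temperature values are invented, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction


def c_to_f(c):
    """Linear transformation y = a + bX with a = 32 and b = 9/5."""
    return 32 + Fraction(9, 5) * c


celsius = [10, 20, 30, 40]
fahrenheit = [c_to_f(c) for c in celsius]

# Ratios of differences are preserved, so equal intervals stay equal.
diff_ratio_c = Fraction(celsius[3] - celsius[2], celsius[1] - celsius[0])
diff_ratio_f = (fahrenheit[3] - fahrenheit[2]) / (fahrenheit[1] - fahrenheit[0])

# Ratios of the values themselves are not preserved.
value_ratio_c = Fraction(celsius[1], celsius[0])
value_ratio_f = fahrenheit[1] / fahrenheit[0]
```

Equal intervals on one scale remain equal on the other, but 20° is "twice" 10° only in Centigrade. Statements about ratios of scale values are therefore not meaningful at the interval level.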

The Ratio Level of Measurement

The ratio level is the highest of Stevens's (1946) levels of measurement. In addition to the properties of distinctiveness, order, and equal intervals, ratio-level measurement has the property of a true zero point. A true zero point is one that represents the absolute lack of the property being measured. For example, $0 indicates the absolute lack of any money.

The Kelvin temperature scale is on a ratio scale because, on that scale, zero degrees indicates a complete lack of heat. This is not the case for the Fahrenheit and Centigrade scales, which is why they are relegated to the interval level of measurement. Many physical scales, such as height, weight, and time, as well as things that can be counted, such as the number of test items correct or the number of students in a classroom, are at the ratio level of measurement. Numbers on a ratio scale can be legitimately transformed only through multiplication of scale points by a constant. Adding a constant, as is permissible at the interval level of measurement, is not permissible at the ratio level because this would change the value of the zero point, rendering it nonabsolute. All parametric statistical operations are permissible for variables at the ratio level of measurement.
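The difference between multiplying by a constant and adding one can be sketched as follows. The weights are invented for illustration; converting kilograms to grams stands in for any multiplication by a constant.

```python
from fractions import Fraction

# Invented weights in kilograms, including a true zero.
kilograms = [0, 2, 4, 8]

# Permissible: multiply every scale point by a constant.
grams = [1000 * kg for kg in kilograms]

# Not permissible at the ratio level: add a constant.
shifted = [kg + 5 for kg in kilograms]

# Ratios between scale points survive multiplication but not addition.
ratio_kg = Fraction(kilograms[3], kilograms[2])
ratio_g = Fraction(grams[3], grams[2])
ratio_shifted = Fraction(shifted[3], shifted[2])

# The zero point survives multiplication but not addition.
zero_kept_by_scaling = grams[0] == 0
zero_kept_by_shift = shifted[0] == 0
```

After multiplication, 8 units is still twice 4 units and zero still means no weight at all. After adding 5, neither statement holds.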

It may seem that test scores are at the ratio level of measurement. This depends, however, on how we want to interpret the scores. If we are content to interpret a test score as the number of points obtained on the test, the scores can be considered as ratio level. This is because the zero point can be appropriately interpreted as the absolute absence of any points obtained. However, if we want to interpret the test score as an indication of a particular level of knowledge or achievement, achieving the ratio level of measurement becomes much more problematic. This is because it is difficult to argue that a test score of zero means the absolute absence of any knowledge or achievement. A more likely interpretation is that a student earning a score of zero has some knowledge but does not have knowledge of the particular questions included on the test. It may be that if different questions had been asked, the student would have obtained a higher score. Or the student may have suffered from test anxiety, may have been unable to correctly interpret the test questions, or may have marked the answers incorrectly on the bubble sheet. As you can see, it is much more difficult to make ratio-level interpretations for abstract constructs such as achievement than for more concrete entities such as the number of points earned.

(Continues...)

Excerpted from Measurement Theory and Applications for the Social Sciences by Deborah L. Bandalos. Copyright © 2018 The Guilford Press. Excerpted by permission of The Guilford Press.
All rights reserved. No part of this excerpt may be reproduced or reprinted without permission in writing from the publisher.
Excerpts are provided by Dial-A-Book Inc. solely for the personal use of visitors to this web site.

Publisher: Guilford Press
Publication date: 2018
Language: English
ISBN 10: 1462532136
ISBN 13: 9781462532131
Binding: Hardcover
Edition: 1
Number of pages: 661
