Frederick J. Gravetter, Larry B. Wallnau, Statistics for the Behavioral Sciences, 10e
I. Statistics, Science, and Observations
Specifically, statistics serve two general purposes:
1. Statistics are used to organize and summarize the information so that the researcher can see what happened in the research study and can communicate the results to others.
2. Statistics help the researcher to answer the questions that initiated the research by determining exactly what general conclusions are justified based on the specific results that were obtained.
The term statistics refers to a set of mathematical procedures for organizing, summarizing, and interpreting information.
A population is the set of all the individuals of interest in a particular study.
A sample is a set of individuals selected from a population, usually intended to represent the population in a research study.
A variable is a characteristic or condition that changes or has different values for different individuals.
Descriptive statistics are statistical procedures used to summarize, organize, and simplify data.
Descriptive statistics are techniques that take raw scores and organize or summarize them in a form that is more manageable. Often the scores are organized in a table or a graph so that it is possible to see the entire set of scores. Another common technique is to summarize a set of scores by computing an average. Note that even if the data set has hundreds of scores, the average provides a single descriptive value for the entire set.
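As a minimal sketch of the two techniques just described (organizing scores in a table and summarizing them with an average), the following Python snippet builds a frequency table and computes a mean. The scores are invented for illustration and do not come from the text.

```python
from collections import Counter
from statistics import mean

# Hypothetical quiz scores, invented for illustration only.
scores = [8, 9, 7, 8, 10, 6, 8, 9, 7, 8]

# Organize: a frequency table listing each score value and how often it occurs.
frequency_table = sorted(Counter(scores).items())
for value, count in frequency_table:
    print(f"score {value}: {count}")

# Summarize: one descriptive value that stands for the entire set.
average = mean(scores)
print("average:", average)
```

However many scores the list holds, `average` remains a single number, which is the point made above.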
The second general category of statistical techniques is called inferential statistics. Inferential statistics are methods that use sample data to make general statements about a population.
Inferential statistics consist of techniques that allow us to study samples and then make generalizations about the populations from which they were selected.
Because populations are typically very large, it usually is not possible to measure everyone in the population. Therefore, a sample is selected to represent the population. By analyzing the results from the sample, we hope to make general statements about the population. Typically, researchers use sample statistics as the basis for drawing conclusions about population parameters. One problem with using samples, however, is that a sample provides only limited information about the population. Although samples are generally representative of their populations, a sample is not expected to give a perfectly accurate picture of the whole population. There usually is some discrepancy between a sample statistic and the corresponding population parameter. This discrepancy is called sampling error, and it creates the fundamental problem inferential statistics must always address.
Sampling error is the naturally occurring discrepancy, or error, that exists between a sample statistic and the corresponding population parameter.
The concept of sampling error is illustrated in Figure 1.2. The figure shows a population of 1,000 college students and 2 samples, each with 5 students who were selected from the population. Notice that each sample contains different individuals who have different characteristics. Because the characteristics of each sample depend on the specific people in the sample, statistics will vary from one sample to another. For example, the five students in sample 1 have an average age of 19.8 years and the students in sample 2 have an average age of 20.4 years. It is also very unlikely that the statistics obtained for a sample will be identical to the parameters for the entire population. In Figure 1.2, for example, neither sample has statistics that are exactly the same as the population parameters. You should also realize that Figure 1.2 shows only two of the hundreds of possible samples. Each sample would contain different individuals and would produce different statistics. This is the basic concept of sampling error: sample statistics vary from one sample to another and typically are different from the corresponding population parameters.
Example 1.1
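The idea can be made concrete with a small simulation. This is only a sketch: the population of ages is randomly generated (it is not the data behind Figure 1.2), and the helper name `sampling_error` is a hypothetical choice for this illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: ages of 1,000 college students (invented values).
population = [rng.randint(18, 25) for _ in range(1000)]
population_mean = mean(population)  # a population parameter


def sampling_error(sample, parameter):
    """Discrepancy between a sample statistic (here the mean) and the parameter."""
    return mean(sample) - parameter


# Draw two samples of 5 students each from the same population.
sample_1 = rng.sample(population, 5)
sample_2 = rng.sample(population, 5)

for name, sample in [("sample 1", sample_1), ("sample 2", sample_2)]:
    print(name, "mean:", mean(sample),
          "sampling error:", sampling_error(sample, population_mean))
```

Running it with a different seed would be expected to give different sample means, each typically a little off from `population_mean`.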
Figure 1.3 shows an overview of a general research situation and demonstrates the roles that descriptive and inferential statistics play. The purpose of the research study is to address a question that we posed earlier: Do college students learn better by studying text on printed pages or on a computer screen? Two samples are selected from the population of college students. The students in sample A are given printed pages of text to study for 30 minutes and the students in sample B study the same text on a computer screen. Next, all of the students are given a multiple-choice test to evaluate their knowledge of the material. At this point, the researcher has two sets of data: the scores for sample A and the scores for sample B (see the figure). Now is the time to begin using statistics.
First, descriptive statistics are used to simplify the pages of data. For example, the researcher could draw a graph showing the scores for each sample or compute the average score for each sample. Note that descriptive methods provide a simplified, organized
II. Data Structures, Research Methods, and Statistics
In the correlational method, two different variables are observed to determine whether there is a relationship between them.
The independent variable is the variable that is manipulated by the researcher. In behavioral research, the independent variable usually consists of the two (or more) treatment conditions to which subjects are exposed.
The independent variable consists of the antecedent conditions that were manipulated prior to observing the dependent variable.
The dependent variable is the one that is observed to assess the effect of the treatment.
III. Variables and Measurement
Constructs and Operational Definitions
Constructs are internal attributes or characteristics that cannot be directly observed but are useful for describing and explaining behavior.
An operational definition identifies a measurement procedure (a set of operations) for measuring an external behavior and uses the resulting measurements as a definition and a measurement of a hypothetical construct. Note that an operational definition has two components. First, it describes a set of operations for measuring a construct. Second, it defines the construct in terms of the resulting measurements.
The scores that make up the data from a research study are the result of observing and measuring variables. For example, a researcher may finish a study with a set of IQ scores, personality scores, or reaction-time scores. In this section, we take a closer look at the variables that are being measured and the process of measurement.
Some variables, such as height, weight, and eye color, are well-defined, concrete entities that can be observed and measured directly. On the other hand, many variables studied by behavioral scientists are internal characteristics that people use to help describe and explain behavior. For example, we say that a student does well in school because he or she is intelligent. Or we say that someone is anxious in social situations, or that someone seems to be hungry. Variables like intelligence, anxiety, and hunger are called constructs, and because they are intangible and cannot be directly observed, they are often called hypothetical constructs.
Although constructs such as intelligence are internal characteristics that cannot be directly observed, it is possible to observe and measure behaviors that are representative of the construct. For example, we cannot “see” intelligence but we can see examples of intelligent behavior. The external behaviors can then be used to create an operational definition for the construct. An operational definition defines a construct in terms of external behaviors that can be observed and measured. For example, your intelligence is measured and defined by your performance on an IQ test, or hunger can be measured and defined by the number of hours since last eating.
Discrete and Continuous Variables
A discrete variable consists of separate, indivisible categories. No values can exist between two neighboring categories.
Discrete variables are commonly restricted to whole, countable numbers, such as the number of children in a family or the number of students attending class. If you observe class attendance from day to day, you may count 18 students one day and 19 students the next day. However, it is impossible ever to observe a value between 18 and 19. A discrete variable may also consist of observations that differ qualitatively. For example, people can be classified by gender (male or female) or by occupation (nurse, teacher, lawyer, etc.), and college students can be classified by academic major (art, biology, chemistry, etc.). In each case, the variable is discrete because it consists of separate, indivisible categories.
On the other hand, many variables are not discrete. Variables such as time, height, and weight are not limited to a fixed set of separate, indivisible categories. You can measure time, for example, in hours, minutes, seconds, or fractions of seconds. These variables are called continuous because they can be divided into an infinite number of fractional parts.
For a continuous variable, there are an infinite number of possible values that fall between any two observed values. A continuous variable is divisible into an infinite number of fractional parts.
Suppose, for example, that a researcher is measuring weights for a group of individuals participating in a diet study. Because weight is a continuous variable, it can be pictured as a continuous line (Figure 1.7). Note that there are an infinite number of possible points on the line without any gaps or separations between neighboring points. For any two different points on the line, it is always possible to find a third value that is between the two points. Two other factors apply to continuous variables:
1. When measuring a continuous variable, it should be very rare to obtain identical measurements for two different individuals. Because a continuous variable has an infinite number of possible values, it should be almost impossible for two people to have exactly the same score. If the data show a substantial number of tied scores, then you should suspect that the measurement procedure is very crude or that the variable is not really continuous.
2. When measuring a continuous variable, each measurement category is actually an interval that must be defined by boundaries. For example, two people who both claim to weigh 150 pounds are probably not exactly the same weight. However, they are both around 150 pounds. One person may actually weigh 149.6 and the other 150.3. Thus, a score of 150 is not a specific point on the scale but instead is an interval (see Figure 1.7). To differentiate a score of 150 from a score of 149 or 151, we must set up boundaries on the scale of measurement. These boundaries are called real limits and are positioned exactly halfway between adjacent scores. Thus, a score of X = 150 pounds is actually an interval bounded by a lower real limit of 149.5 at the bottom and an upper real limit of 150.5 at the top. Any individual whose weight falls between these real limits will be assigned a score of X = 150.
Students often ask whether a value of exactly 150.5 should be assigned to the X = 150 interval or the X = 151 interval. The answer is that 150.5 is the boundary between the two intervals and is not necessarily in one or the other. Instead, the placement of 150.5 depends on the rule that you are using for rounding numbers. If you are rounding up, then 150.5 goes in the higher interval (X = 151), but if you are rounding down, then it goes in the lower interval (X = 150).
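The real limits described above can be computed directly. The sketch below assumes scores are recorded to the nearest whole unit by default; the function name `real_limits` and its `unit` parameter are hypothetical, introduced here only for illustration.

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score recorded to the nearest `unit`.

    The limits sit exactly halfway between adjacent scores.
    """
    half = unit / 2
    return score - half, score + half


lower, upper = real_limits(150)
print(lower, upper)  # 149.5 150.5

# Both weights mentioned in the text fall inside the interval for X = 150.
for weight in (149.6, 150.3):
    print(weight, lower < weight < upper)
```

The comparison uses strict inequalities on purpose: a value of exactly 150.5 sits on the boundary, and which interval it joins depends on the rounding rule in use.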
Scales of Measurement
It should be obvious by now that data collection requires that we make measurements of our observations. Measurement involves assigning individuals or events to categories. The categories can simply be names such as male/female or employed/unemployed, or they can be numerical values such as 68 inches or 175 pounds. The categories used to measure a variable make up a scale of measurement, and the relationships between the categories determine different types of scales. The distinctions among the scales are important because they identify the limitations of certain types of measurements and because certain statistical procedures are appropriate for scores that have been measured on some scales but not on others. If you were interested in people’s heights, for example, you could measure a group of individuals by simply classifying them into three categories: tall, medium, and short. However, this simple classification would not tell you much about the actual heights of the individuals, and these measurements would not give you enough information to calculate an average height for the group. Although the simple classification would be adequate for some purposes, you would need more sophisticated measurements before you could answer more detailed questions. In this section, we examine four different scales of measurement, beginning with the simplest and moving to the most sophisticated.
■ The Nominal Scale
A nominal scale consists of a set of categories that have different names. Measurements on a nominal scale label and categorize observations, but do not make any quantitative distinctions between observations.
The word nominal means “having to do with names.” Measurement on a nominal scale involves classifying individuals into categories that have different names but are not related to each other in any systematic way. For example, if you were measuring the academic majors for a group of college students, the categories would be art, biology, business, chemistry, and so on. Each student would be classified in one category according to his or her major. The measurements from a nominal scale allow us to determine whether two individuals are different, but they do not identify either the direction or the size of the difference. If one student is an art major and another is a biology major, we can say that they are different, but we cannot say that art is “more than” or “less than” biology and we cannot specify how much difference there is between art and biology. Other examples of nominal scales include classifying people by race, gender, or occupation.
Although the categories on a nominal scale are not quantitative values, they are occasionally represented by numbers. For example, the rooms or offices in a building may be identified by numbers. You should realize that the room numbers are simply names and do not reflect any quantitative information. Room 109 is not necessarily bigger than Room 100 and certainly not 9 points bigger. It also is fairly common to use numerical values as a code for nominal categories when data are entered into computer programs. For example, the data from a survey may code males with a 0 and females with a 1. Again, the numerical values are simply names and do not represent any quantitative difference. The scales that follow do reflect an attempt to make quantitative distinctions.
■ The Ordinal Scale
An ordinal scale consists of a set of categories that are organized in an ordered sequence. Measurements on an ordinal scale rank observations in terms of size or magnitude.
The categories that make up an ordinal scale not only have different names (as in a nominal scale) but also are organized in a fixed order corresponding to differences of magnitude.
Often, an ordinal scale consists of a series of ranks (first, second, third, and so on) like the order of finish in a horse race. Occasionally, the categories are identified by verbal labels like small, medium, and large drink sizes at a fast-food restaurant. In either case, the fact that the categories form an ordered sequence means that there is a directional relationship between categories. With measurements from an ordinal scale, you can determine whether two individuals are different and you can determine the direction of difference. However, ordinal measurements do not allow you to determine the size of the difference between two individuals. In a NASCAR race, for example, the first-place car finished faster than the second-place car, but the ranks don’t tell you how much faster. Other examples of ordinal scales include socioeconomic class (upper, middle, lower) and T-shirt sizes (small, medium, large). In addition, ordinal scales are often used to measure variables for which it is difficult to assign numerical scores. For example, people can rank their food preferences but might have trouble explaining “how much” they prefer chocolate ice cream to steak.
■ The Interval and Ratio Scales
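A short sketch may help show why ranks carry direction but not size. The finish times below are invented for illustration; only the ranking idea comes from the text.

```python
# Hypothetical finish times in seconds (invented for illustration).
finish_times = {"car A": 100.0, "car B": 100.1, "car C": 130.0}

# Rank 1 goes to the fastest (smallest) time.
ordered = sorted(finish_times, key=finish_times.get)
ranks = {car: position for position, car in enumerate(ordered, start=1)}
print(ranks)

# The ranks are evenly spaced (1, 2, 3), yet the underlying gaps are very different.
gap_1_2 = finish_times[ordered[1]] - finish_times[ordered[0]]
gap_2_3 = finish_times[ordered[2]] - finish_times[ordered[1]]
print("gap between ranks 1 and 2:", gap_1_2)
print("gap between ranks 2 and 3:", gap_2_3)
```

Someone who sees only `ranks` can tell which car was faster, but has no way to recover either gap.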
An interval scale consists of ordered categories that are all intervals of exactly the same size. Equal differences between numbers on the scale reflect equal differences in magnitude. However, the zero point on an interval scale is arbitrary and does not indicate a zero amount of the variable being measured.
A ratio scale is an interval scale with the additional feature of an absolute zero point. With a ratio scale, ratios of numbers do reflect ratios of magnitude.
Both an interval scale and a ratio scale consist of a series of ordered categories (like an ordinal scale) with the additional requirement that the categories form a series of intervals that are all exactly the same size. Thus, the scale of measurement consists of a series of equal intervals, such as inches on a ruler. Other examples of interval and ratio scales are the measurement of time in seconds, weight in pounds, and temperature in degrees Fahrenheit. Note that, in each case, one interval (1 inch, 1 second, 1 pound, 1 degree) is the same size, no matter where it is located on the scale. The fact that the intervals are all the same size makes it possible to determine both the size and the direction of the difference between two measurements. For example, you know that a measurement of 80° Fahrenheit is higher than a measure of 60°, and you know that it is exactly 20° higher.
The factor that differentiates an interval scale from a ratio scale is the nature of the zero point. An interval scale has an arbitrary zero point. That is, the value 0 is assigned to a particular location on the scale simply as a matter of convenience or reference. In particular, a value of zero does not indicate a total absence of the variable being measured. For example, a temperature of 0° Fahrenheit does not mean that there is no temperature, and it does not prohibit the temperature from going even lower. Interval scales with an arbitrary zero point are relatively rare. The two most common examples are the Fahrenheit and Celsius temperature scales. Other examples include golf scores (above and below par) and relative measures such as above and below average rainfall. A ratio scale is anchored by a zero point that is not arbitrary but rather is a meaningful value representing none (a complete absence) of the variable being measured. The existence of an absolute, non-arbitrary zero point means that we can measure the absolute amount of the variable; that is, we can measure the distance from 0. This makes it possible to compare measurements in terms of ratios. For example, a gas tank with 10 gallons (10 more than 0) has twice as much gas as a tank with only 5 gallons (5 more than 0). Also note that a completely empty tank has 0 gallons. To recap, with a ratio scale, we can measure the direction and the size of the difference between two measurements and we can describe the difference in terms of a ratio. Ratio scales are quite common and include physical measures such as height and weight, as well as variables such as reaction time or the number of errors on a test.
Example 1.2
A researcher obtains measurements of height for a group of 8-year-old boys. Initially, the researcher simply records each child’s height in inches, obtaining values such as 44, 51, 49, and so on. These initial measurements constitute a ratio scale. A value of zero represents no height (absolute zero). Also, it is possible to use these measurements to form ratios. For example, a child who is 60 inches tall is one and a half times as tall as a child who is 40 inches tall. Now suppose that the researcher converts the initial measurement into a new scale by calculating the difference between each child’s actual height and the average height for this age group. A child who is 1 inch taller than average now gets a score of +1; a child 4 inches taller than average gets a score of +4. Similarly, a child who is 2 inches shorter than average gets a score of -2. On this scale, a score of zero corresponds to average height. Because zero no longer indicates a complete absence of height, the new scores constitute an interval scale of measurement. Notice that the original scores and the converted scores both involve measurement in inches, and you can compute differences, or distances, on either scale. For example, there is a 6-inch difference in height between two boys who measure 57 and 51 inches tall on the first scale. Likewise, there is a 6-inch difference between two boys who measure +9 and +3 on the second scale. However, you should also notice that ratio comparisons are not possible on the second scale. For example, a boy who measures +9 is not three times as tall as a boy who measures +3.
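One way to see why ratios are meaningful only with an absolute zero is to re-express the same measurements in different units. The sketch below is illustrative only: the temperatures 80° and 40° and the two conversion helpers are assumptions added here, while the 10-gallon and 5-gallon tanks come from the text.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius (both have arbitrary zeros)."""
    return (f - 32) * 5 / 9


def gallons_to_liters(g):
    """Convert US liquid gallons to liters (both have an absolute zero)."""
    return g * 3.785411784


# Interval scale: the ratio of the numbers changes with the choice of zero point.
ratio_f = 80.0 / 40.0
ratio_c = fahrenheit_to_celsius(80.0) / fahrenheit_to_celsius(40.0)
print("temperature ratio in F:", ratio_f, "in C:", round(ratio_c, 6))

# Ratio scale: the ratio survives a change of units because zero means "none".
ratio_gallons = 10.0 / 5.0
ratio_liters = gallons_to_liters(10.0) / gallons_to_liters(5.0)
print("gas ratio in gallons:", ratio_gallons, "in liters:", round(ratio_liters, 6))
```

The same two temperatures give a ratio of 2 in Fahrenheit and about 6 in Celsius, so "twice as hot" has no fixed meaning, whereas the 10-gallon tank holds twice as much gas in either unit.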
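The arithmetic in this example can be checked with a few lines of Python. The text does not state the average height for the age group; the value of 48 inches below is an assumption, chosen only because it turns the heights 57 and 51 into the scores +9 and +3 used in the example.

```python
heights = [57, 51]       # inches, original ratio scale (from the example)
average_height = 48      # ASSUMED average; not given in the text

# Convert to the new interval scale: deviation from the average height.
deviations = [h - average_height for h in heights]
print("deviation scores:", deviations)

# Differences (distances) are identical on both scales.
difference_raw = heights[0] - heights[1]
difference_dev = deviations[0] - deviations[1]
print("differences:", difference_raw, difference_dev)

# Ratios are not: +9 is three times +3, but 57 inches is nowhere near 3 x 51 inches.
ratio_raw = heights[0] / heights[1]
ratio_dev = deviations[0] / deviations[1]
print("ratio of heights:", round(ratio_raw, 3), "ratio of deviation scores:", ratio_dev)
```

Only `ratio_raw` describes how the two boys' heights actually compare; `ratio_dev` is an artifact of where the zero point was placed.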