How can we picture the distribution of a quantitative variable? In this section, we present several types of graphs that can be used to display quantitative data. Here are data on the number of goals scored in 20 games played by the U.
Draw and label the axis. Be sure to include units of measurement.
Scale the axis. Look at the smallest and largest values in the data set. Start the horizontal axis at a convenient number equal to or less than the smallest value and place tick marks at equal intervals until you equal or exceed the largest value.
Plot the values. Mark a dot above the location on the horizontal axis corresponding to each data value. Try to make all the dots the same size and space them out equally as you stack them.
Remember what we said in Section 1. Unfortunately, the team lost to Sweden on penalty kicks in the Summer Olympics.
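The three dotplot steps above (draw the axis, scale it, stack the dots) can be sketched in a few lines of Python. This is only a rough text rendering under stated assumptions: the function name `text_dotplot` is hypothetical, the data are whole numbers, and the sample values are invented for illustration rather than taken from the text.

```python
from collections import Counter

def text_dotplot(values, label):
    """Return a text dotplot: one stacked column of dots per whole-number value."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    axis = list(range(lo, hi + 1))        # equal tick spacing from smallest to largest value
    width = len(str(hi)) + 1              # column width wide enough for the largest tick label
    rows = []
    # Build the stacks from the top level down so the tallest column sets the height.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("*" if counts[x] >= level else " ").rjust(width) for x in axis)
        rows.append(row.rstrip())
    rows.append("".join(str(x).rjust(width) for x in axis))  # the scaled axis
    rows.append(label)                                       # axis label with variable name
    return "\n".join(rows)

# Hypothetical data, invented for illustration only (not the data from the text).
goals = [1, 3, 1, 2, 2, 5, 1, 4, 1, 2]
print(text_dotplot(goals, "Number of goals scored"))
```

Every dot is the same size and every column is equally spaced, which is what the hand-drawing advice asks for.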
To estimate fuel economy, the EPA performs tests on several vehicles of the same make, model, and year. Here are data on the highway fuel economy ratings for a sample of 25 model year Toyota 4Runners tested by the EPA. Make a dotplot of these data.
Toyota reports the highway gas mileage of its model year 4Runners as 22 mpg. To make the dotplot: Draw and label the axis. Note the variable name and units in the label. The smallest value is So we choose a scale from
Describing Shape
When you describe the shape of a dotplot or another graph of quantitative data, focus on the main features.
Look for major peaks, not for minor ups and downs in the graph. Look for clusters of values and obvious gaps. Decide if the distribution is roughly symmetric or clearly skewed. A distribution is skewed to the right if the right side of the graph is much longer than the left side. A distribution is skewed to the left if the left side of the graph is much longer than the right side.
The drawing is a cute but corny way to help you keep this straight. To avoid danger, Mr. Starnes skis on the gentler slope, in the direction of the skewness.
Graph (a) shows the scores of 21 statistics students on a point quiz. Graph (b) shows the results of rolls of a 6-sided die. Describe the shape of each distribution. The distribution of statistics quiz scores is skewed to the left, with a single peak at 20 (a perfect score). There are two small gaps at 12 and The distribution of die rolls is roughly symmetric. It has no clear peak.
Some quantitative variables have distributions with easily described shapes. But many distributions have irregular shapes that are neither symmetric nor skewed.
Some distributions show other patterns, like the dotplot in Figure 1. This graph shows the durations in minutes of eruptions of the Old Faithful geyser. The dotplot has two distinct clusters and two peaks: one at about 2 minutes and one at about 4. When you examine a graph of quantitative data, describe any pattern you see as clearly as you can. This graph has two distinct clusters and two clear peaks. Some quantitative variables have distributions with predictable shapes.
Many biological measurements on individuals from the same species and gender—lengths of bird bills, heights of young women—have roughly symmetric distributions. Salaries and home prices, on the other hand, usually have right-skewed distributions. There are many moderately priced houses, for example, but the few very expensive mansions give the distribution of house prices a strong right skew.
Knoebels Amusement Park in Elysburg, Pennsylvania, has earned acclaim for being an affordable, family-friendly entertainment venue. Knoebels does not charge for general admission or parking, but it does charge customers for each ride they take.
How much do the rides cost at Knoebels? The table shows the cost for each ride in a sample of 22 rides in a recent year.
Describing Distributions
Here is a general strategy for describing a distribution of quantitative data.
In any graph, look for the overall pattern and for clear departures from that pattern. You can describe the overall pattern of a distribution by its shape, center, and variability.
An important kind of departure is an outlier, an observation that falls outside the overall pattern. Variability is sometimes referred to as spread. There are several ways to measure the variability (spread) of a distribution, including the range. This means using the variable name, not just the units the variable is measured in. We will discuss more formal ways to measure center and variability and to identify outliers in Section 1.
For now, just use the median (the middle value in the ordered data set) when describing center and the minimum and maximum when describing variability.
Shape: The distribution of goals scored is skewed to the right, with a single peak at 1 goal. There is a gap between 5 and 9 goals.
Outliers: The games when the team scored 9 and 10 goals appear to be outliers.
Center: The median is 2 goals scored.
Variability: The data vary from 1 to 10 goals scored.
Describe the distribution.
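The interim approach described above (median for center, minimum and maximum for variability) can be sketched as a small helper. The function name `describe_center_and_variability` and the sample data are assumptions made for illustration. Shape and outliers still have to be judged from the graph itself.

```python
from statistics import median

def describe_center_and_variability(values, variable, units):
    """Report the median, minimum, and maximum in context (variable name plus units)."""
    ordered = sorted(values)
    center = median(ordered)              # middle value of the ordered data
    low, high = ordered[0], ordered[-1]   # minimum and maximum
    return (
        f"Center: The median {variable} is {center} {units}. "
        f"Variability: The data vary from {low} to {high} {units}."
    )

# Hypothetical data, invented for illustration only.
sample = [3, 1, 2, 10, 1]
print(describe_center_and_variability(sample, "number of goals scored", "goals"))
```

Passing the variable name and units as arguments is a design choice that forces the description to stay in context, as the strategy recommends.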
Shape: The distribution of highway fuel economy ratings is roughly symmetric, with a single peak at There are two clear gaps: between Be sure to include context by discussing the variable of interest, highway fuel economy ratings. And give the units of measurement: miles per gallon (mpg).
Comparing Distributions
Some of the most interesting statistics questions involve comparing two or more groups. Which of two popular diets leads to greater long-term weight loss? Who texts more: males or females?
As the following example suggests, you should always discuss shape, outliers, center, and variability whenever you compare distributions of a quantitative variable. Here are dotplots of the household sizes reported by the survey respondents. Compare the distributions of household size for these two countries. Shape: The distribution of household size for the U. The distribution of household size for the South Africa sample is skewed to the right, with a single peak at 4 people and a clear gap between 15 and The South African distribution seems to have two outliers: the households with 15 and 26 people.
Variability: The household sizes for the South African students vary more (from 3 to 26 people) than for the U. You need to mention the variable of interest, household size. Notice that in the preceding example, we discussed the distributions of household size only for the two samples of 50 students. We might be interested in whether the sample data give us convincing evidence of a difference in the population distributions of household size for South Africa and the United Kingdom.
Each student in the contest was given a small cup of ice cream and instructed to eat it as fast as possible. Compare the distributions of eating times for males and females. The stems are ordered from lowest to highest and arranged in a vertical column. The leaves are arranged in increasing order out from the appropriate stems. According to the American Heart Association, a resting pulse rate above beats per minute is considered high for this age group.
Also, the distribution of pulse rates for these 19 students is skewed to the right (toward the larger values). Stemplots give us a quick picture of a distribution that includes the individual observations in the graph. It is fairly easy to make a stemplot by hand for small sets of quantitative data.
Make stems. Separate each observation into a stem, consisting of all but the final digit, and a leaf, the final digit. Write the stems in a vertical column with the smallest at the top. Draw a vertical line at the right of this column. Do not skip any stems, even if there is no data value for a particular stem.
Add leaves. Write each leaf in the row to the right of its stem.
Order leaves. Arrange the leaves in increasing order out from the stem.
Add a key. Provide a key that identifies the variable and explains what the stems and leaves represent.
Making and interpreting stemplots. A football coach plans to obtain specially made helmets for his players that are designed to reduce the chance of getting a concussion. Here are the measurements of head circumference (in inches) for the 30 players on the team. Make a stemplot of these data.
Describe the shape of the distribution. Are there any obvious outliers? To make the stemplot: Make stems. The smallest head circumference is We use the first two digits as the stem and the final digit as the leaf. So we need stems from 20 to The distribution of head circumferences for the 30 players on the team is roughly symmetric, with a single peak on the inch stem.
There are no obvious outliers. We can get a better picture of the head circumference data by splitting stems. In Figure 1. This time, values with leaves from 0 to 4 are placed on one stem, while those with leaves from 5 to 9 are placed on another stem. Now we can see the shape of the distribution even more clearly—including the possible outlier at The graph in b improves on the graph in a by splitting stems.
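The stemplot procedure, including the option of splitting stems, can be sketched as follows. This is a minimal illustration and not the textbook's own method: the function name `stemplot` is hypothetical, the data are assumed to be non-negative whole numbers (decimal data such as 22.3 would first be rescaled, for example to 223), and the pulse rates are invented.

```python
from collections import defaultdict

def stemplot(values, split=False):
    """Return stemplot rows: stem = all but the final digit, leaf = final digit."""
    def key(v):
        stem, leaf = divmod(v, 10)
        # With split stems, leaves 0-4 go on the first row and 5-9 on the second.
        return (stem, 1 if split and leaf >= 5 else 0)

    rows = defaultdict(list)
    for v in sorted(values):              # sorting puts leaves in increasing order
        rows[key(v)].append(str(v % 10))

    first, last = key(min(values)), key(max(values))
    halves = (0, 1) if split else (0,)
    # List every stem between the smallest and largest, even if it has no leaves.
    all_keys = [(s, h) for s in range(first[0], last[0] + 1) for h in halves]
    all_keys = [k for k in all_keys if first <= k <= last]
    return [f"{s:>3} | {''.join(rows[(s, h)])}".rstrip() for (s, h) in all_keys]

# Hypothetical pulse rates, invented for illustration only.
pulse = [71, 68, 75, 92, 70]
print("\n".join(stemplot(pulse)))
print("Key: 7 | 1 means a pulse rate of 71 beats per minute")
```

Calling `stemplot(pulse, split=True)` gives each half-stem the same number of possible leaf digits (five), in line with the tip about splitting stems evenly.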
Here are a few tips to consider before making a stemplot: There is no magic number of stems to use. Five stems is a good minimum. If you split stems, be sure that each stem is assigned an equal number of possible leaf digits. When the data have too many digits, you can get more flexibility by rounding or truncating the data. See Exercises 61 and 62 for an illustration of rounding data before making a stemplot. You can use a back-to-back stemplot with common stems to compare the distribution of a quantitative variable in two groups.
The leaves are placed in order on each side of the common stem. For example, Figure 1. Write a few sentences comparing the distributions of resting and after-exercise pulse rates in Figure 1. The low outlier is Alaska. What percent of Alaska residents are 65 or older? Ignoring the outlier, the shape of the distribution is a. The center of the distribution is close to a.
Histograms
You can use a dotplot or stemplot to display quantitative data.
Both graphs show every individual data value. For large data sets, this can make it difficult to see the overall pattern in the graph. We often get a clearer picture of the distribution by grouping together nearby values. Doing so allows us to make a new type of graph: a histogram. The heights of the bars show the frequencies or relative frequencies of values in each interval. Notice how the histogram groups together nearby values.
Choose equal-width intervals that span the data. Five intervals is a good minimum.
Make a table that shows the frequency (count) or relative frequency (percent or proportion) of individuals in each interval. Put values that fall on an interval boundary in the interval containing larger values.
Draw horizontal and vertical axes. Put the name of the quantitative variable under the horizontal axis. To the left of the vertical axis, indicate whether the graph shows the frequency (count) or relative frequency (percent or proportion) of individuals in each interval.
Scale the axes. Place equally spaced tick marks at the smallest value in each interval along the horizontal axis. On the vertical axis, start at 0 and place equally spaced tick marks until you exceed the largest frequency or relative frequency in any interval.
Draw bars above the intervals. Make the bars equal in width and leave no gaps between them. Be sure that the height of each bar corresponds to the frequency or relative frequency of individuals in that interval. An interval with no data values will appear as a bar of height 0 on the graph.
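The table-making step is where most of the work in a histogram happens, so here is a hedged sketch of it. The function name `frequency_table`, its parameters, and the sample data are all assumptions for illustration. Whole-number data are used because floating-point boundaries can be misclassified by floor division.

```python
def frequency_table(values, start, width, n_intervals):
    """Count values in equal-width intervals [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_intervals
    for v in values:
        # Floor division sends a boundary value to the interval containing larger values.
        index = int((v - start) // width)
        if not 0 <= index < n_intervals:
            raise ValueError(f"{v} falls outside the chosen intervals")
        counts[index] += 1
    return [(start + i * width, start + (i + 1) * width, c) for i, c in enumerate(counts)]

# Hypothetical data, invented for illustration only.
data = [0, 1, 2, 2, 3, 4, 4, 4, 7]
for low, high, count in frequency_table(data, start=0, width=2, n_intervals=4):
    # A row with no stars corresponds to a bar of height 0.
    print(f"{low:>2} to <{high:<2} | {'*' * count} ({count})")
```

The value 2 lands in the interval from 2 to less than 4, not in the interval from 0 to less than 2, matching the boundary rule stated above.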
It is possible to choose intervals of unequal widths when making a histogram. Such graphs are beyond the scope of this book. The table shows data on the average total tax rate for each of the remaining 46 states and the District of Columbia. Make a frequency histogram to display the data.
What percent of values in the distribution are less than 6. Interpret this result in context. To make the histogram: Choose equal-width intervals that span the data. The data vary from 1. So we choose intervals of width 1.
Make a table. Record the number of states in each interval to make a frequency histogram. The distribution has a single peak in the 6. Graph b uses intervals half as wide: 1. Now we see a distribution with more than one distinct peak. The choice of intervals in a histogram can affect the appearance of a distribution.
Histograms with more intervals show more detail but may have a less clear overall pattern. You can use a graphing calculator, statistical software, or an applet to make a histogram. Set up a histogram in the Statistics Plots menu. Adjust the settings as shown. Use ZoomStat to let the calculator choose intervals and make a histogram. Adjust the intervals to match those in Figure 1. But is this really how such scores are distributed? The IQ scores of 60 fifth-grade students chosen at random from one school are shown here.
Construct a histogram that displays the distribution of IQ scores effectively. Is the distribution bell-shaped?
Using Histograms Wisely
We offer several cautions based on common mistakes students make when using histograms. Although histograms resemble bar graphs, their details and uses are different. A histogram displays the distribution of a quantitative variable. Its horizontal axis identifies intervals of values that the variable takes. A bar graph displays the distribution of a categorical variable.
Its horizontal axis identifies the categories. Be sure to draw bar graphs with blank space between the bars to separate the categories. Draw histograms with no space between bars for adjacent intervals. Use percents or proportions instead of counts on the vertical axis when comparing distributions with different numbers of observations.
Mary was interested in comparing the reading levels of a biology journal and an airline magazine. She counted the number of letters in the first words of an article in the journal and of the first words of an article in the airline magazine. Mary then used statistical software to produce the histograms shown in Figure 1. This figure is misleading—it compares frequencies, but the two samples were of very different sizes and By using relative frequencies, this figure makes the comparison of word lengths in the two samples much easier.
In graph (a), the vertical scale uses frequencies. Graph (b) fixes the problem of different sample sizes by using percents (relative frequencies) on the vertical scale. The 15 students in a small statistics class recorded the number of letters in their first names. What kind of graph is this? But first-name length is a quantitative variable, so a bar graph is not an appropriate way to display its distribution. The histogram on the right is a much better choice because the graph makes it easier to identify the shape, center, and variability of the distribution of name length.
Questions 2 and 3 refer to the following setting. The graph displays data on the percent of first-year students who plan to major in several disciplines. Is this a bar graph or a histogram?
Would it be correct to describe this distribution as right-skewed? Why or why not? A dotplot displays individual values on a number line. Stemplots separate each observation into a stem and a one-digit leaf. Histograms plot the frequencies (counts) or relative frequencies (proportions or percents) of values in equal-width intervals. Some distributions have simple shapes, such as symmetric, skewed to the left, or skewed to the right. The number of peaks is another aspect of overall shape.
So are distinct clusters and gaps. When examining any graph of quantitative data, look for an overall pattern and for clear departures from that pattern. Shape, center, and variability describe the overall pattern of the distribution of a quantitative variable. Outliers are observations that lie outside the overall pattern of a distribution. When comparing distributions of quantitative data, be sure to compare shape, center, variability, and possible outliers.
Remember: histograms are for quantitative data; bar graphs are for categorical data. Be sure to use relative frequencies when comparing data sets of different sizes.
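The advice to use relative frequencies when sample sizes differ comes down to one division per interval. The sketch below assumes a hypothetical helper name, `relative_frequencies`, and invented counts. It is meant only to show why percents make two samples comparable.

```python
def relative_frequencies(counts):
    """Convert interval counts to percents of the sample total."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Hypothetical interval counts for two samples of very different sizes.
large_sample = [10, 30, 10]   # 50 observations
small_sample = [2, 6, 2]      # 10 observations

print(relative_frequencies(large_sample))
print(relative_frequencies(small_sample))
```

On a frequency scale the larger sample would tower over the smaller one. On a percent scale the two invented samples turn out to have the same distribution across intervals.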
Making histograms Page Students in a high school statistics class responded to a survey designed by their teacher. Experts recommend that high school students sleep at least 9 hours per night. What proportion of students in this class got the recommended amount of sleep?
Easy reading? Long words can make a book hard to read. What percentage of words in the sample have 8 or more letters? The following dotplot displays the goal differential for those same games, computed as U. What does the graph tell us about how well the team did in ? Be specific. What does the graph tell us about fuel economy in the city versus on the highway for these car models? Prudential Insurance Company asked people to place a blue sticker on a huge wall next to the age of the oldest person they have ever known.
An image of the graph is shown here. Feeling sleepy? Refer to Exercise Fuel efficiency Refer to Exercise Compare the distributions of total family incomes in these two samples. Healthy streams Nitrates are organic compounds that are a main ingredient in fertilizers. When those fertilizers run off into streams, the nitrates can have a toxic effect on fish.
An ecologist studying nitrate pollution in two streams measures nitrate concentrations at 42 places on Stony Brook and 42 places on Mill Brook.
The parallel dotplots display the data. Compare the distributions of nitrate concentration in these two streams. Enhancing creativity Do external rewards (things like money, praise, fame, and grades) promote creativity? Researcher Teresa Amabile recruited 47 experienced creative writers who were college students and divided them at random into two groups.
The students in one group were given a list of statements about external reasons (E) for writing, such as public recognition, making money, or pleasing their parents. Both groups were then instructed to write a poem about laughter. Is the variability in creativity scores similar or different for the two groups? Justify your answer. Do the data suggest that external rewards promote creativity? Healthy cereal? Researchers collected data on 76 brands of cereal at a local supermarket.
A dotplot of the data is shown here. Is the variability in sugar content of the cereals on the three shelves similar or different? Critics claim that supermarkets tend to put sugary cereals where kids can see them. Do the data from this study support this claim? Note that Shelf 2 is at about eye level for kids in most supermarkets. Here are the weights (in grams) of 17 Snickers Fun Size bars from a single bag. What interesting feature does the graph reveal?
The advertised weight of a Snickers Fun Size bar is 17 grams. What proportion of candy bars in this sample weigh less than advertised? Eat your beans! Beans and other legumes are a great source of protein. The following data give the protein content of 30 different varieties of beans, in grams per grams of cooked beans. What proportion of these bean varieties contain more than 9 grams of protein per grams of cooked beans?
South Carolina counties Here is a stemplot of the areas of the 46 counties in South Carolina. Note that the data have been rounded to the nearest 10 square miles (mi²). Describe the distribution of area for the 46 South Carolina counties. Shopping spree The stemplot displays data on the amount spent by 50 shoppers at a grocery store. Note that the values have been rounded to the nearest dollar. What was the smallest amount spent by any of the shoppers?
Describe the distribution of amount spent by these 50 shoppers. Where do the young live? Here is a stemplot of the percent of residents aged 25 to 34 in each of the 50 states. Are there any outliers? Watch that caffeine! The U. That translates to a maximum of 48 milligrams of caffeine per 8-ounce serving. Data on the caffeine content of popular soft drinks (in milligrams per 8-ounce serving) are displayed in the stemplot. Give an appropriate key for this graph.
Acorns and oak trees Of the many species of oak trees in the United States, 28 grow on the Atlantic Coast and 11 grow in California. The back-to-back stemplot displays data on the average volume of acorns in cubic centimeters for these 39 oak species.
Who studies more? Researchers asked the students in a large first-year college class how many minutes they studied on a typical weeknight. The back-to-back stemplot displays the responses from random samples of 30 women and 30 men from the class, rounded to the nearest 10 minutes.
Write a few sentences comparing the male and female distributions of study time. The table displays CO2 emissions per person from countries with populations of at least 20 million. Make a histogram of the data using intervals of width 2, starting at 0. Which countries appear to be outliers?
Country CO2 Algeria 3. Traveling to work How long do people travel each day to get to work? AL Make a histogram to display the travel time data using intervals of width 2 minutes, starting at 14 minutes. What is the most common interval of travel times? DRP test scores There are many ways to measure the reading ability of children.
In a research study on third-grade students, the DRP was administered to 44 students. Make a histogram to display the data. Write a few sentences describing the distribution of DRP scores. Country music The lengths, in minutes, of the 50 most popular mp3 downloads of songs by country artist Dierks Bentley are given here.
Write a few sentences describing the distribution of song lengths. Return is usually expressed as a percent of the beginning price. The figure shows a histogram of the distribution of monthly returns for the U.
Explain why you cannot find the exact value for the minimum return. Between what two values does it lie? A return less than 0 means that stocks lost value in that month. About what percent of all months had returns less than 0? Researchers collected data on calories per serving for 77 brands of breakfast cereal.
The histogram displays the data. Describe the overall shape of the distribution of calories. What is the approximate center of this distribution? About what percent of the cereal brands have or more calories per serving? Paying for championships Does paying high salaries lead to more victories in professional sports? And over the years, the team has won many championships. The figure shows histograms of the salary distributions for the two teams during the season.
Paying for championships Refer to Exercise Here is a better graph of the salary distributions for the Yankees and the Phillies. Write a few sentences comparing these two distributions. Value of a diploma Do students who graduate from high school earn more money than students who do not? To find out, we took a random sample of U. The educational level and total personal income of each person were recorded.
The data for the 57 non-graduates (No) and the graduates (Yes) are displayed in the relative frequency histograms. Would it be appropriate to use frequency histograms instead of relative frequency histograms in this setting? Explain why or why not. Compare the distributions of total personal income for the two groups. Two of Mr. They selected a random sample of 30 Bounty paper towels and a random sample of 30 generic paper towels and measured their strength when wet.
To do this, they uniformly soaked each paper towel with 4 ounces of water, held two opposite edges of the paper towel, and counted how many quarters each paper towel could hold until ripping, alternating brands. The data are displayed in the relative frequency histograms.
Compare the distributions. Compare the distributions of number of quarters until breaking for the two paper towel brands. Birth months Imagine asking a random sample of 60 students from your school about their birth months.
Draw a plausible (believable) graph of the distribution of birth months. Should you use a bar graph or a histogram to display the data? Die rolls Imagine rolling a fair, six-sided die 60 times.
Draw a plausible graph of the distribution of die rolls. Grade 5 4 3 2 1 Total Calculus AB 76, 53, 53, 30, 94, , Statistics 29, 44, 51, 32, 48, , Write a few sentences comparing the two distributions of exam grades.
Here are the amounts of money (in cents) in coins carried by 10 students in a statistics class: 50, 35, 0, 46, 86, 0, 5, 47, 23,
The given observational study infers that vitamin E therapy reduces the risk of heart disease. This suggestion can be obtained by the presence of some lurking variables. One of the lurking variables is the people who are more health-conscious. The researcher would have selected the individuals who are more health-conscious. In other words, the study may be more likely towards people who are more health-conscious than people who are not. In experiments, people under all types are randomly assigned to the treatments.