SECTION I.

Introduction

In engineering education, developing a conceptual understanding of mathematical and statistical concepts is pivotal. One important concept in this context is the sample mean, which computer engineering students need to master and with which, as prior research indicates, they have conspicuous difficulty [1], [2]. The sample mean, with its fundamental properties and many applications [3], [4], [5], [6], is a cornerstone of probability and statistics, so computer engineers must understand it in order to access the underlying theories and ideas of the discipline at a conceptual level. Nevertheless, research has shown that the sample mean is one of the most challenging topics for students to comprehend (e.g., [7] and [8]), leading to persistent errors in their understanding.

This lack of understanding of the sample mean can impede progress in conceptual learning. For instance, students frequently use the formula for a confidence interval for the population mean without a conceptual grasp of the significance and origin of the factor $1/\sqrt{n}$, which stems from the standard deviation of the sample mean. Furthermore, students often conflate the most probable range of the sample mean with that of an individual observation, that is, they fail to distinguish between confidence intervals and prediction intervals [9], [10]. Additionally, students recurrently confuse the reasons for the normality of the sample mean, which may follow either from the central limit theorem or from the linearity of the sample mean [11], [12].
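The difference between the two intervals mentioned above can be illustrated numerically. The following is a minimal sketch, not part of the mini-course, assuming a normal population with known standard deviation; the function names, the sample values, and the value sigma = 0.3 are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist, fmean


def mean_confidence_interval(sample, sigma, level=0.95):
    """Interval for the population mean: half-width z * sigma / sqrt(n)."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    centre = fmean(sample)
    half_width = z * sigma / sqrt(n)
    return centre - half_width, centre + half_width


def prediction_interval(sample, sigma, level=0.95):
    """Interval for one new observation: half-width z * sigma * sqrt(1 + 1/n)."""
    n = len(sample)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = fmean(sample)
    half_width = z * sigma * sqrt(1 + 1 / n)
    return centre - half_width, centre + half_width


# Hypothetical data: ten measurements, sigma assumed known.
sample = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 4.7, 5.2, 5.1, 4.5]
sigma = 0.3

print("confidence interval:", mean_confidence_interval(sample, sigma))
print("prediction interval:", prediction_interval(sample, sigma))
```

Under these assumptions both intervals share the same midpoint, but the confidence interval shrinks with $1/\sqrt{n}$ while the prediction interval never becomes narrower than the spread of a single observation.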

The analysis in this study employs the action-process-object-schema (APOS) framework developed by Dubinsky and colleagues [13], offering a structured framework to explore how learners engage with and acquire proficiency in statistical concepts, thereby facilitating the identification of instructional enhancements and the evaluation of teaching methodologies. The APOS framework has a well-established track record of utility within undergraduate mathematics education, albeit with limited applications in the realm of statistics, as evidenced by prior research (e.g., [14], [15], and [16]). Prior studies predominantly adopted qualitative methodologies. Findings from teaching interventions at preuniversity levels recommend the effective use of diverse mathematical representations to bolster students’ conceptual understanding [16]. However, among freshmen, results indicate that students often struggle to attain an object conception of fundamental statistical notions, such as mean and standard deviation, with even more advanced topics like the central limit theorem often eluding an action conception [14]. Moreover, studies involving in-service teachers reveal that only a minority of participants demonstrated an object conception with standard scores, whereas the majority remained at the action or process conceptions [15]. These findings in statistics are in accordance with the general literature on the APOS framework, emphasizing the inherent challenges associated with the transition from a process to an object conception [17]. Furthermore, the indication that in-service teachers may perceive the development of an object conception as relatively more attainable aligns with the idea that individuals with strong mathematical aptitude tend to find this transition more manageable [14], [18]. Nevertheless, the observation that only a minority of in-service teachers demonstrated an object conception highlights the enduring complexity associated with fostering robust schema.

The APOS framework posits a cognitive structure that systematically categorizes the conceptual comprehension of undergraduate mathematics, delineating a continuum from actions to schemas. The framework’s particular value lies in its ability to facilitate an in-depth examination of the myriad ways in which learners conceptualize the “sample mean” and employ diverse strategies to address statistical challenges. Additionally, the employment of this framework empowers researchers to identify areas of instruction amenable to improvement and to evaluate the efficacy of pedagogical interventions.

The primary objective of this study is to offer insight into students’ conceptual comprehension of the sample mean through the implementation of an interactive, blended mini-course specifically designed to enhance their grasp of this foundational statistical concept. This undertaking is supported by the observation that computer engineering students display enthusiasm for interactive and online learning environments, as reported in existing studies [19], [20]. The novelty of the current study lies in its application of the APOS framework to the sample mean, given that only a handful of prior studies (among them [14], [15], [16]) have used this framework in statistics education.

This research contributes to the academic community by concentrating on refining the conceptual understanding of the sample mean, an essential statistical concept, among computer engineering students enrolled in a university-level statistics course in the Netherlands. Its contributions are the implementation of a digital mini-course, the application of the APOS theoretical framework, and the presentation of an analytical approach to assess students’ comprehension of the sample mean. Furthermore, it investigates the relatively underexplored application of the APOS framework in statistics education.

SECTION II.

Methodology

A. Mini Course

The mini-course consists of digital self-tests based on common misconceptions and educational sources accompanied by instructional videos. The eight questions in the digital self-tests were designed to reflect the findings of existing literature (e.g., [21]) and the teachers’ experiences. The questions consisted of both multiple-choice with up to two correct answers and true-false formats. The educational sources are selected parts of videos that are 10 to 20 min long, followed by relevant questions. These questions make up the digital self-tests that students must answer. The mini-course was implemented and presented to students through Wooclap. The students were given one week to finish the mini-course, which entailed responding to the questions in the digital self-tests and viewing the instructional videos at their own convenience. Due to space constraints, a selection of questions from this mini-course can be found in Appendix A.

B. Implementation

The present study included 97 second-year Computer Engineering students from the University of Twente, all of whom were enrolled in a mandatory Statistics course. Attendance of the mini-course, which required a commitment of approximately 2 h, was voluntary. Data were collected through student responses to the self-test embedded in the mini-course.

C. Data Collection Through Interviews

Approximately four weeks after the conclusion of the mandatory statistics course, 13 self-selected students completed an online survey, and seven of these students took part in semi-structured interviews. The interviews followed an organized approach: research objectives were defined in advance, participants were recruited with ethical considerations in mind, and a logically organized interview guide was used. During the interviews, the researchers built rapport, asked open-ended questions, and used probing techniques; the interviews were recorded and notes were taken for accuracy. Each interview concluded by addressing the participant’s questions and reaffirming confidentiality.

The survey consisted of eight open-ended questions (see Appendix B), informed by the results of the mini-course self-tests. While the survey questions demonstrate a higher degree of specificity in alignment with their intended purpose, they are nevertheless structured in a manner analogous to the questions posed within the self-test of the mini-course. For instance, the three examples of self-test questions given in Appendix A correspond to a majority of the survey questions in Appendix B, which validates the alignment between the content of the mini-course and the survey.

Purposive sampling was employed to select students for interviews based on their higher retention of knowledge as indicated by survey results. The eight survey questions formed the basis of the interviews.

The semi-structured student interviews allowed researchers to further explore students’ experiences and knowledge of the mini-course. Each interview was approximately 20 min in length.

Following data transcription and familiarization, initial codes for key concepts were generated through open coding and later organized into themes through axial coding. Constant comparison and concise notes were used to refine the themes and document insights. Thematic analysis deepened understanding, supported by triangulation and member checking for credibility. The APOS framework guided the analysis, and findings are reported with illustrative quotes.

D. APOS Framework for Analysis

In this study, the APOS framework was chosen to analyze the knowledge that students display when being involved in a specific activity due to its suitability for this purpose [13]. According to the APOS framework, an action is an operation on an established object. When an individual perceives a stimulus from the environment, this is the first step in a transformation that needs to be explicitly carried out by following certain instructions. An example of an action would be utilizing a formula to calculate a mean of a data set. A person is said to have an action conception of a concept if their engagement with that concept is limited to performing actions [13].

An instance of action for the sample mean would be to calculate the “confidence interval for the population mean” by routinely applying the given formula. Another example is concluding, without reasoning, that the sample mean is normally distributed when the sample is drawn from a normal population. Although it is preferable for students to proceed to process and object conceptions of the sample mean and not to remain at an action conception level, a thorough foundation of actions needs to be established before that development can occur. A lack of proficiency in, for example, calculating the expectation of the sample mean using linearity makes it difficult for students to go on to derive the variance of the sample mean distribution or, later on, to interpret the $1/\sqrt{n}$ in the confidence interval formula.
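The step from the variance of the sample mean to the factor $1/\sqrt{n}$ can be written out explicitly. The following derivation is a sketch under the assumption that $X_{1},\ldots,X_{n}$ are independent with common variance $\sigma^{2}$ (a random sample); independence is what allows the variance of the sum to be split into a sum of variances.

```latex
\begin{align*}
\operatorname{Var}\left(\overline{X}\right)
  &= \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} X_{i}\right)
   = \frac{1}{n^{2}}\operatorname{Var}\left(\sum_{i=1}^{n} X_{i}\right) \\
  &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_{i})
   = \frac{n\sigma^{2}}{n^{2}}
   = \frac{\sigma^{2}}{n},
\qquad\text{so}\qquad
\operatorname{sd}\left(\overline{X}\right) = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

The constant $1/n$ leaves the variance squared, which is why the variance carries $1/n$ and the standard deviation carries $1/\sqrt{n}$.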

Reflection on how and why actions work helps students to abstract the main characteristics of the concept, and incorporate them into procedures or algorithms. This is referred to as interiorizing actions into a mental process. Processes are seen by the individual as being internal and under their control, in contrast to actions which are carried out in response to external prompts. Having a process conception of a transformation enables the individual to think about, explain, or reverse the steps of the transformation without taking the actual steps. With a process conception, it is enough to simply contemplate the steps without necessarily performing them. A person with a process conception of standard deviation, for example, is able not only to carry out the action of using the standardization formula $z = (x - \mu)/\sigma$ to find $z$-values but also can manipulate it to find $x$ if $z$ is given [15].
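The difference between applying the standardization formula and reversing it can be made concrete. The following is a minimal sketch; the function names and the numerical values are hypothetical and serve only as an illustration.

```python
def standardize(x: float, mu: float, sigma: float) -> float:
    """Forward step: apply z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def unstandardize(z: float, mu: float, sigma: float) -> float:
    """Reversed step: solve the same formula for x, giving x = mu + z * sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return mu + z * sigma


# Hypothetical values: an observation of 130 with mu = 100 and sigma = 15.
z = standardize(130.0, mu=100.0, sigma=15.0)
x = unstandardize(z, mu=100.0, sigma=15.0)
print(z, x)
```

Being able to write the second function from the first, without being handed a separate formula, corresponds to reversing the steps of the transformation as described above.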

When an individual is aware of a process as a whole and can construct transformations on it, it is said to be encapsulated into a cognitive object, therefore cognitive objects are those items which can be acted on. For example, a student who understands the standard deviation of a data set as a measure of the spread of the data around the mean, and not simply as something related to a formula, is said to have an object conception of standard deviation [14].

The development of an object conception involves breaking a concept into its component parts and being able to manipulate those parts to achieve a desired outcome. When manipulating an object, it is often necessary to de-encapsulate it back to the processes from which it originated in order to use its properties. For example, de-encapsulation is evident when considering the effects of changing sample size on the sample mean and the shape of the corresponding sampling distribution. Encapsulating processes into objects is particularly difficult [17]. It is thought that those who have a strong aptitude for mathematics hold this conception without much outside help [14], [18], but for others, the development of an object conception must be scaffolded and takes time and effort.

A schema is a collection of cognitive objects and internal processes used to manipulate them. It can aid students in better understanding, organizing, and making sense of a problem and its context. A schema is a network connecting actions, processes and objects perceived by the individual as being related to the concept. In Dubinsky’s [13] framework, mathematical concepts can be organized into schemas, which are characterized by their dynamism and continual reconstruction based on the subject’s mathematical activity in specific contexts. The depth and complexity of a student’s understanding of a concept depends on their ability to establish connections among the mental structures that compose it (Actions, Processes, Objects, and other Schemas), forming the foundation of a coherent schema. The schema’s coherence is vital for making sense of mathematical situations related to the concept. Once a schema is constructed with interconnected structures and coherence, it can be either transformed into a static structure (object) or utilized as a dynamic structure that assimilates other related objects or schemas. The importance of schemas in an individual’s mathematical ability is clear, yet the specifics of these schemas and how they impact one’s mathematical performance have not been studied in sufficient detail [22]. Additionally, it is likely that different individuals have different schemas for the same mathematical concept.

Thematization occurs when a schema is integrated into a larger organizational system. A good example is the understanding that the sample mean is the best linear unbiased estimator of the population mean and coincides with the least squares estimator of the population mean. A schema is considered thematized when one is able to move between these concepts and properties and to compare and contrast them. According to the APOS framework, thematized schemas and encapsulated processes are the only mental constructs that can be considered objects.

SECTION III.

Results and Discussion

The results of the study are divided into two categories: 1) results from the implementation of the mini-course and 2) results from the interviews. Each category includes a discussion of the relevant findings.

A. Results From the Mini-Course

The percentages of the students’ correct answers to the questions in the mini-course are shown in Fig. 1. Most of the questions had a correct answer percentage of over 40%, with the exception of two questions with lower percentages. The lowest percentage of correct answers is 35%, whereas the highest is 67% (see Appendix A). It is believed that strong distractors are the cause of this result. For example, one of the questions asked for an understanding of the probability distribution of the sample mean. The majority (51%) of the students selected the distractor that involved calculating the mean of one sample (see Appendix A).

Fig. 1. Percentages of correct answers.

Fig. 2 displays the correct answer counts of the seven interviewees, showing a generally high rate of correct answers. Interestingly, the question with the lowest percentage of correct responses among these seven students aligns with the lowest-performing question in the entire class. This particular question, found in Appendix A, is characterized by its complex and conceptual structure.

Fig. 2. Number of interviewees answering survey questions correctly.

For example, the self-test question with the lowest correct response rate (see Appendix A) suggested that students found the probability distribution of the sample mean, and the reasoning behind determining its parameters, problematic; this prompted survey questions 1, 2, 3, and 5. The survey results reflect the same weakness: students do not have a good command of the sample mean as an object, which requires a good command of its probability distribution and the parameters of that distribution.

B. Results From the Interviews

The results of the interviews are presented under four subheadings, each covering one component of the APOS framework. For each subheading, pertinent excerpts are provided. All interviews were reviewed in full, but only key points are presented here.

  1. Action: It is reassuring to note that all seven students interviewed had progressed from an action conception to, at minimum, a process conception regarding the sample mean. Their command of actions was demonstrated through their ability to calculate the confidence interval and to describe how it is found. All students demonstrated an action conception by correctly applying the procedures or formulas. Some examples of students’ explanations about their calculations of a confidence interval are given below:

    [S1]: By taking the mean of the sample with 100 observations, I added the standard error and subtracted it. Thus, I could calculate the confidence interval.

    [S2]: Now this is simple, the midpoint is the mean and then $\cdots $ Well, what we do is make this an interval by plus and minus the sample standard deviation divided by the square root of the sample size.

    One student confused the sample mean with a single observation and provided the definition of the mean of a single sample. This is consistent with the previous findings of [9] and [10]. When asked what the probabilistic distribution of the sample mean was, the student gave a more suitable answer.

    [S7]: Then (pauses), it is the same, we do not add the values for our sample, we are going to add the other things… They were X’s and we add them together and divide them by sample size… Oh no, that is not the sample size. We aren’t talking about a sample. Well, they are then the umm… the number of the X’s we have.

    When the students were asked to calculate the expectation of the sample mean for a given random sample of size $n$ with $X_{i}\sim N(\mu,\sigma ^{2})$, they were able to correctly compute and explain the procedure for computing the expected value of the sample mean. Some excerpts are given as follows:\begin{align*} E\left(\overline{X}\right) &= E\left(\frac{1}{n}(X_{1} + X_{2} + \cdots + X_{n})\right) \tag{1}\\ &= \frac{1}{n}E\left(X_{1} + X_{2} + \cdots + X_{n}\right) \\ &= \frac{1}{n}\left(E(X_{1}) + E(X_{2}) + \cdots + E(X_{n})\right) \\ &= \frac{1}{n}\left(\mu + \mu + \cdots + \mu\right) = \frac{n\mu}{n} = \mu. \tag{2}\end{align*}

    [S1]: Here, these (means (1) and (2)) can be written. The rest is, to sum up, and divide by n, we have n pieces of $\mu $ .

    [S6]: Well all the expectations of $X_{1},X_{2},\ldots,X_{n}$ are the same, so as far as we can write this (means (1)) step then we see there are n of the same expectations. Also, we learned that this $\overline {X}$ is unbiased. Expectation is linear so we can write it (2).

  2. Process: Of the seven students interviewed, four had clearly achieved a process conception of the sample mean. The following excerpts from the interviews with Students 3 and 4 illustrate their grasp of the probability distribution of the sample mean. When asked what “the probability distribution of the sample mean” is, these students showed a process conception.

    [S3]: It’s the distribution of values of a lot of samples. But they are (taken) from the same population.

    [S4]: I think it was $\cdots $ The mean of all samples that we can take from the same population. The means of the samples are all different but the (probability) distribution shows the same mean of the population.

    The definitions provided above suggest a process conception of the probability distribution of the sample mean, as the definition is within the control of the students. However, when asked what the characteristics of the sample mean are, the students consistently responded by describing its value. When asked about the shape or variability of this distribution, their responses revealed a dependence on the mean value and an evident uncertainty concerning the shape or the variation.

    [S3]: As I said, we collect many samples and we get the same mean as that of the population value which is mu. It has the shape of, umm $\cdots $ the population $\cdots $ umm I think. I.. will now make a guess, I think we use the same variance basically with the population too. Maybe divided with the sample size.

    [S5]: Yes I remember that the shape is always (a) normal (distribution). We always get the normal (distribution) histogram. Because $\cdots $ well I do not remember why $\cdots $ The dispersion is, well that does not matter, I guess $\cdots $ We estimate from the sample variance.

    For these students, their understanding of the sample mean does not include the notion that it is roughly normally distributed or that its standard deviation (known as the standard error) decreases as the sample size increases. Their conception is not an object with properties; rather, they have only a process conception.

    The interviews showed a recurring confusion between prediction intervals and confidence intervals, in line with prior research [9], [10]. Confidence intervals provide an estimate of the mean value of a response variable based on given values of predictor variables, whereas prediction intervals estimate the value of a response variable for a single new observation based on the predictor variables. Prediction intervals resemble confidence intervals, but a prediction interval is always wider than the corresponding confidence interval because it also accounts for the variability of the individual observation.

    [S7]: This confidence interval means that we are 95% sure of the value will take place in this interval, if we repeated the experiment 100 times 95 will give the asked value in this range.

    [S4]: Well, here there are these values (means lower and upper bounds of the 95% confidence interval) and we are 95% certain that we will see that next time our value will be between these.

    When asked whether a future observation or the population parameter was being predicted, student 4 showed greater clarity in their thought process.

    [S4]: It would make sense to predict but I guess here we do not, we collected samples (pauses), or did we? No, we have only one sample. Umm $\cdots $ Let me think. We want to see the population value, do not we? So we cannot make predictions for the future. Fine, then I would say we know by 95% confidence that the population mean will be in this interval.

  3. Object: Within the sample of seven interviewed students, the presence of just one individual who had already developed an object conception is encouraging, especially considering the acknowledged challenges associated with encapsulating processes as objects, as underscored in previous research [17]. The object conception of an interviewee is particularly noteworthy in light of the potential inclination of individuals with strong mathematical aptitude toward naturally embracing this conceptual framework, as indicated in previous studies [14], [18]. When asked about the connections between the distribution of the sample mean and the formula used to compute the confidence interval, the student said:

    [S6]: Well, we have this mid-point, right?.. This is a starting point and we add some error bars $\cdots $ Or subtract, of course. These are, umm $\cdots $ They come from the standardization of these bars. We can use the standard normal values because the shape is normal.

    While this is clearly an object conception, following the interview the student was unable to satisfactorily explain how the square root of the sample size in the confidence interval formula was linked to the sample mean as a linear function.

    [S6]: There is a square root in the formula that is because we must divide with the standard deviation. I remember that comes from our distribution, that was in the video…. “Why do we have the square root in the formula?” Yeah, that’s because we have to take the square root of the variance, the $\sigma ^{2}$ becomes $\sigma $ . It is simply the variance of the distribution.

    This student has a robust understanding of the distribution of the sample mean, as demonstrated by their ability to apply the linearity of the expectation (see the excerpt in Action). However, when this concept is used in another context, they are unable to connect the underlying concepts and properties that would enable them to take into account the population variance and incorporate the sample size into the formula for the confidence interval. The student’s understanding of the concepts is not yet organized, which limits their ability to explain, outside the original context, why the confidence interval formula contains the square root of the sample size. The inability to use cognitive objects and internal processes related to the concept suggests that the student has not yet developed a robust schema of the topic.

  4. Schema: The attainment of a coherent schema poses a significant challenge, as indicated by prior research (e.g., [23]), and the outcomes of the present study are consistent with this difficulty. Nonetheless, one student made notable progress toward a schema, coming close to stating that the sample mean is unbiased. This observation prompted a follow-up question on whether alternative estimators might exist, and the student’s response is given in the following excerpt:

    [S6]: You mean of $X_{1} $ to $X_{n}$ ? Well, in fact we can but… umm I do not know if we really can. I am more into computers and this is different umm… abstract. Why would we take another estimator? The sample mean does the job for us.

SECTION IV.

Conclusion

In this study a mini-course was designed on the concept of the sample mean that students of computer engineering could take at their own pace. Thirteen students answered an online survey on the sample mean and seven of these students were chosen to be invited for semi-structured interviews. Questions arising from performance on the mini-course formed the basis of the interviews that were analyzed to gain insights on understanding of the sample mean and related properties or concepts. The APOS theoretical framework was used as a lens to interpret varying levels of understanding.

All seven students responded accurately to interview questions that required an action conception of the sample mean, but several had difficulty with questions that required a deeper understanding of the core concepts. Difficulties in answering more challenging questions revealed how the participating students’ level of understanding of the sample mean and its properties had consequences for their interpretations of basic methods and results in statistics. When actions are not interiorized into processes, and processes are not encapsulated into objects, the connections and progressions that depend on the concept of the sample mean become more difficult and complex to navigate.

The mini-course was an optional additional learning opportunity for students enrolled in a mandatory statistics course, for which the first author is the teacher. The course instruction places emphasis on the meaning and importance of the sample mean, yet despite this emphasis only one of the interviewed students exhibited an object conception of the concept. A similar disconnect was found in [14], where the majority of students in the study had no comprehension of the central limit theorem in spite of the instructors highlighting this subject. It appears that the standard instructional modes and resources, despite including what the teachers considered considerable emphasis on the sample mean, are inadequate.

The statistics course in question is a standard course; courses very like it are taught in many universities. The findings of this study have implications for teaching and the authors offer recommendations for teachers of similar courses. The authors suggest allocating additional instructional time to facilitate the development of cognitive frameworks at the process, object, and schema levels for students. The authors are contemplating the following modifications to their instructional approach: 1) providing ample practice opportunities for students to review and reconsider the sample mean in various contexts throughout the statistics course; 2) deconstructing symbolic representations through real-world examples to illustrate and explain the applications of the sample mean; and 3) constructing possible schema conceptions of the sample mean that organize and link the relevant actions, processes and objects.

The APOS theoretical framework was used to analyze the interview data. The framework was found to be suitable for this sort of data and exhibited satisfactory explanatory power. Given the relative paucity of research that has applied the APOS framework to statistics education (e.g., [14], [15], and [16]), the authors strongly encourage statistics education researchers to make use of this framework in their investigations of a range of statistics concepts.

As previously mentioned, it has been argued that students highly proficient in mathematics or statistics may be able to attain an object conception of a concept without much external support. Nevertheless, most students require a teaching and learning environment rich in appropriate resources in order to reach this goal. The authors’ institutions place significant emphasis on diversity and inclusivity. Outside the classroom this position is clear. For instance, each programme has at least one dedicated study advisor who fills the role of a confidential counselor, and students with special needs are offered suitable assessment support. By providing a diverse range of learning opportunities and educational resources, which help students with a wide range of proficiency develop deep understanding, this study extends the institutional support structures into the classroom and thereby observes principles of inclusivity.

The current study is focused on students’ comprehension of a single, yet vital, statistics concept. The study observed that even though the interviewees were chosen from among the more proficient students, only one showed evidence of reaching an object conception of the sample mean. These findings will inform the continued efforts to enhance instructional and learning techniques to aid cognitive development in a more diverse body of students than those who can develop object conceptions with relative ease. Furthermore, because the mini-course is taken digitally at a time and place of the students’ choosing, this study has the potential to inform the development of teaching strategies that are tailored to the needs of diverse student populations.

An inherent limitation of the study is that the data were gathered only after the mini-course. Because the purpose of the study was to investigate the conceptual understanding of the sample mean, the mini-course was used to inform the survey and interview questions. In future iterations, the researchers intend to repeat the mini-course with upcoming cohorts and to collect data both before and after the mini-course in order to ascertain its effectiveness.

Appendix A

The question from the mini-course with the lowest percentage of correct answers (35%): Let $X_{1},X_{2},\ldots,X_{n}$ be given and let $X_{i} \sim N(\mu,\sigma^{2})$. Which of the following is always correct for the sample mean?

  1. $\overline{X} \sim N(\mu,\sigma^{2})$ by the law of large numbers

  2. $\overline{X} \sim N(\mu,\sigma^{2})$ because of linearity

  3. $\overline{X} \sim N(\mu,\sigma^{2}/n)$ by the law of large numbers

  4. $\overline{X} \sim N(\mu,\sigma^{2}/n)$ because of linearity (correct answer)

  5. $\overline{X} \sim N(\mu,\sigma^{2}/n^{2})$ by the law of large numbers

  6. $\overline{X} \sim N(\mu,\sigma^{2}/n^{2})$ because of linearity.
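The variance stated in option 4 can be checked by simulation. The following is a minimal sketch, assuming independent normal observations; the function name and the values mu = 10, sigma = 2, and n = 25 are hypothetical and were not part of the mini-course.

```python
import random
from statistics import fmean, pvariance


def simulate_sample_means(mu, sigma, n, repetitions, seed=0):
    """Draw `repetitions` normal samples of size n and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(repetitions)
    ]


mu, sigma, n = 10.0, 2.0, 25
means = simulate_sample_means(mu, sigma, n, repetitions=20000)

print("mean of sample means:    ", fmean(means))
print("variance of sample means:", pvariance(means))
print("sigma^2       (options 1, 2):", sigma**2)
print("sigma^2 / n   (options 3, 4):", sigma**2 / n)
print("sigma^2 / n^2 (options 5, 6):", sigma**2 / n**2)
```

Under these assumptions the simulated variance should lie close to $\sigma^{2}/n$ and far from the other two candidates. The simulation checks only the parameters; the justification "because of linearity" rests on the fact that a linear combination of independent normal variables is itself normal.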

The question from the mini-course with the highest percentage of correct answers (67%): If all possible random samples of size $n$ are taken from a population that is not normally distributed, and the mean of each sample is determined, what can you say about the sampling distribution of sample means?

  1. It is approximately normal provided that $n$ is large enough. (correct answer)

  2. It is positively skewed.

  3. It is negatively skewed.

  4. None of the above.
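The behaviour described in option 1 can be explored by simulation. The following is a minimal sketch, assuming an exponential population as one example of a non-normal (positively skewed) distribution; the function names, the choice of population, and the sample sizes are hypothetical.

```python
import random
from statistics import fmean, pstdev


def skewness(values):
    """Standardized third central moment; 0 for a symmetric distribution."""
    m = fmean(values)
    s = pstdev(values, mu=m)
    return fmean([((v - m) / s) ** 3 for v in values])


def exponential_sample_means(n, repetitions, rate=1.0, seed=0):
    """Means of `repetitions` samples of size n from an exponential population."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        fmean([rng.expovariate(rate) for _ in range(n)])
        for _ in range(repetitions)
    ]


# n = 1 reproduces the skewed population itself; n = 50 gives sample means.
for n in (1, 50):
    means = exponential_sample_means(n, repetitions=10000)
    print(f"n = {n:2d}: skewness of sample means = {skewness(means):.2f}")
```

Under these assumptions the skewness of the sample means moves toward zero as $n$ grows, which is consistent with the sampling distribution becoming approximately, though not exactly, normal.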

The question from the mini-course that asked for an understanding of the probability distribution of the sample mean: Which of the following represents the probability distribution for the sample mean?

  A. It represents the standard normal distribution. (12% response rate)

  B. It is the stochastic distribution for all possible values of the sample mean computed for the samples of size $n$. (correct answer; 37% response rate)

  C. If we have a sample of five with observations as follows: 3, 4, 5, 6, and 9 then the sample mean will be \begin{equation*} \frac{3 + 4 + 5 + 6 + 9}{5} = \frac{27}{5} = 5.4.\end{equation*} (51% response rate)

Appendix B

The Survey Questions:

  1. What is the probability distribution of the sample mean?

  2. Name three connections between the distribution of the sample mean and the formula used to compute the confidence interval.

  3. How does the sample size affect the sampling distribution of the sample mean?

  4. What is the difference between the standard deviation of the population and the standard error of the sample mean?

  5. Why is there a square root of the sample size in the confidence interval formula?

  6. How does the shape of the probability distribution of the sample mean change as the sample size increases?

  7. What is the difference between prediction and confidence intervals?