Exam 1 5380 BUAL

School

Lamar University *

Course

BUAL-538

Subject

Statistics

Date

Feb 20, 2024

Pages

10

Uploaded by ProfessorPuppyMaster2839

Exam 1

1. Researchers may gain insight into the characteristics of a population by examining a
   a. Mathematical model describing the population
   b. Sample of the population
   c. Description of the population
   d. Replica

2. Coding males as 1 and females as 0 in a data set illustrates the use of
   a. Nominal variables
   b. Dummy variables
   c. Numerical variables
   d. Ordinal variables

3. The interquartile range (IQR) represents what percent of the observations?
   a. Lower 25%
   b. Middle 50%
   c. Upper 75%
   d. Upper 90%
   e. 100%

4. Expressed in percentiles, the interquartile range is the difference between the
   a. 10th and 60th percentiles
   b. 15th and 65th percentiles
   c. 20th and 70th percentiles
   d. 25th and 75th percentiles
   e. 35th and 85th percentiles

5. Data that arise from counts are called
   a. Continuous data
   b. Nominal data
   c. Counted data
   d. Discrete data

6. How is the median defined if the number of observations is even?
   a. The average of the two middle observations
   b. The difference between the two middle observations
   c. The most frequent observation
   d. The difference between the highest and smallest observation

7. A sample of a population taken at one particular point in time is categorized as:
   a. Categorical
   b. Discrete
   c. Cross-sectional
   d. Time-series

8. What is the most common type of chart for showing the distribution of a numerical variable?
   a. Time series graph
   b. Histogram
   c. Bin
   d. Box plot

9. In a generic box plot, the x inside the box indicates the location of the
   a. Mean
   b. Median
   c. Minimum value
   d. Maximum value

10. Which of the following are the three most common measures of central tendency?
   a. Mean, median, and mode
   b. Mean, variance, and standard deviation
   c. Mean, median, and variance
   d. Mean, median, and standard deviation
   e. First quartile, second quartile, and third quartile

11. In order for the characteristics of a sample to be generalized to the entire population, it should be:
   a. Symbolic of the population
   b. Atypical of the population
   c. Representative of the population
   d. Illustrative of the population

12. Categorizing age variables as “young,” “middle-aged,” and “elderly” is an example of
   a. Counting
   b. Ordering
   c. Value adding
   d. Binning
   e. Categorizing

13. The average score for a class of 30 students was 75. The 20 male students in the class averaged 70. The 10 female students in the class averaged
   a. 75
   b. 85
   c. 60
   d. 70
   e. 80

14. Gender and State are examples of which type of data?
   a. Discrete data
   b. Continuous data
   c. Categorical data
   d. Ordinal data

15. If a value represents the 95th percentile, this means that
   a. 95% of all values are below this value
   b. 95% of all values are above this value
   c. 5% of the time you will observe this value
   d. There is a 5% chance that this value is incorrect
   e. There is a 95% chance that this value is correct

16. In a regression analysis, the variables used to help explain or predict the response variable are called the
   a. Independent variables
   b. Dependent variables
   c. Regression variables
   d. Statistical variables

17. A multiple regression analysis including 50 data points and 5 independent variables results in $\sum e_i^2 = 40$. The multiple standard error of estimate will be:
   a. 0.901
   b. 0.888
   c. 0.800
   d. 0.953
   e. 0.894

18. The percentage of variation ($R^2$) can be interpreted as the fraction (or percent) of variation of the
   a. Explanatory variable explained by the independent variable
   b. Explanatory variable explained by the regression line
   c. Response variable explained by the regression line
   d. Error explained by the regression line

19. Outliers are observations that
   a. Lie outside the sample
   b. Render the study useless
   c. Lie outside the typical pattern of points on a scatterplot
   d. Disrupt the entire linear trend

20. ______ is/are especially helpful in identifying outliers
   a. Linear regression
   b. Regression analysis
   c. Normal curves
   d. Scatterplots
   e. Multiple regression

21. In linear regression, we can have an interaction variable. Algebraically, the interaction variable is the ______ of other variables in the regression equation
   a. Sum
   b. Ratio
   c. Product
   d. Mean

22. The correlation value ranges from
   a. 0 to +1
   b. -1 to +1
   c. -2 to +2
   d. -Y to +Y

23. The percentage of variation ($R^2$) ranges from
   a. 0 to +1
   b. -1 to +1
   c. -2 to +2
   d. -1 to 0

24. In regression analysis, if there are several explanatory variables, it is called:
   a. Simple regression
   b. Multiple regression
   c. Compound regression
   d. Composite regression

25. In multiple regression, the coefficients reflect the expected change in:
   a. Y when the associated X value increases by one unit
   b. X when the associated Y value increases by one unit
   c. Y when the associated X value decreases by one unit
   d. X when the associated Y value decreases by one unit

26. The appropriate hypothesis test for an ANOVA test is:
   a. $H_0$: all $\beta_i \neq 0$, $H_a$: at least one $\beta_i = 0$
   b. $H_0$: all $\beta_i = 0$, $H_a$: at least one $\beta_i \neq 0$
   c. $H_0$: at least one $\beta_i \neq 0$, $H_a$: all $\beta_i = 0$
   d. $H_0$: at least one $\beta_i = 0$, $H_a$: all $\beta_i \neq 0$

27. In the standardized value $(b_i - \beta_i)/s_{b_i}$, the symbol $s_{b_i}$ represents the:
   a. Mean of $b_i$
   b. Variance of $b_i$
   c. Standard error of $b_i$
   d. Degrees of freedom of $b_i$

28. Many statistical packages have three types of equation-building procedures. They are:
   a. Forward, linear, and non-linear
   b. Forward, backward, and stepwise
   c. Simple, complex, and stepwise
   d. Inclusion, exclusion, and linear

29. The appropriate hypothesis test for a regression coefficient is:
   a. $H_0$: $\beta \neq 0$, $H_a$: $\beta = 0$
   b. $H_0$: $\beta = 0$, $H_a$: $\beta \neq 0$
   c. $H_0$: $\beta = 1$, $H_a$: $\beta \neq 1$
   d. None of these options

30. Which of the following definitions best describes parsimony?
   a. Explaining the most with the least
   b. Explaining the least with the most
   c. Being able to explain all of the change in the response variable
   d. Being able to predict the value of the response variable far into the future

31. Another term for constant error variance is:
   a. Homoscedasticity
   b. Heteroscedasticity
   c. Autocorrelation
   d. Multicollinearity

32. The test statistic in an ANOVA analysis is:
   a. The t-statistic
   b. The z-statistic
   c. The F-statistic
   d. The Chi-square statistic

33. Which of the following is not one of the assumptions of regression?
   a. There is a population regression line
   b. The response variable is not normally distributed
   c. The response variable is normally distributed
   d. The errors are probabilistically independent

34. If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to:
   a. Average it
   b. Reduce it
   c. Delete it
   d. Leave it

35. Which of the following is not one of the assumptions of regression?
   a. The standard deviation of the response variable increases as the explanatory variables increase

36. What is the decision-making process?
   a. A purposeful and goal-directed effort that uses a systematic process to choose among options
      i. Identify the problem and define it
      ii. Gather data
      iii. Analyze data
      iv. Identify the options/solutions
      v. Weigh the pros and cons of each option
      vi. Selection: make the decision

37. What is the modelling process?
   a. Involves creating a simplified representation of a real-world system or phenomenon using mathematical, statistical, or computational techniques.
   b. Models are used to gain insights, make predictions, or understand complex relationships within the system.

38. Nominal data
   a. Data that consists of names, labels, or categories

39. Ordinal data
   a. Categorical data with a natural order or ranking

40. Cross-sectional data
   a. Data collected at a single point in time

41. Time-series data
   a. Data collected over a sequence of time

42. Continuous data
   a. Data that can take on an infinite number of values within a given range (height, weight, temperature)

43. Discrete data
   a. Data that can only take on specific, separate values, typically integers (counts)

44. Frequency table
   a. A table for organizing a set of data that shows the number of times each item or number appears

45. Measure of central location
   a. A central value that best represents a distribution of data
   b. Measures of central location include the mean, median, and mode. They are also called measures of central tendency.

46. Mean
   a. The average of a distribution, obtained by adding the scores and then dividing by the number of scores

47. Median
   a. The middle score in a distribution; half the scores are above it and half are below it

48. Mode
   a. The most frequently occurring score(s) in a distribution

49. Quartiles
   a. Values that divide a data set into four equal parts

50. Population standard deviation
   a. The square root of the population variance

51. Interquartile Range (IQR)
   a. The difference between the first and third quartiles

52. Empirical Rule
   a. The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is approximately a normal curve

53. Regression analysis asks:
   a. How a single variable depends on other relevant variables

54. In regression analysis, the variable we are trying to explain or predict is called the
   a. Dependent variable

55. In regression analysis, which of the following causal relationships are possible?
   a. X causes Y to vary
   b. Y causes X to vary
   c. Other variables cause X and Y to vary

56. An error term represents the vertical distance from any point to the
   a. Population regression line

57. A scatterplot that appears as a shapeless mass of data points indicates
   a. No relationship among the variables

58. Correlation is a summary measure that indicates
   a. The strength of the linear relationship between pairs of variables

59. A correlation value of zero indicates
   a. No linear relationship

60. The covariance is not used as much as the correlation because
   a. It is difficult to interpret

61. A single variable X can explain a large percentage of the variation in some other variable Y when the two variables are
   a. Highly correlated

62. The term autocorrelation refers to
   a. The fact that time-series variables are usually related to their own past values

63. The weakness of scatterplots is that they
   a. Do not actually quantify the relationships between variables

64. In linear regression, we fit the least squares line to a set of values (or points on a scatterplot). The distance from the line to the point is called a
   a. Residual

65. In linear regression, the fitted value is
   a. The predicted value of the dependent variable

66. In choosing the “best-fitting” line through a set of points in linear regression, we choose the one with the
   a. Smallest sum of squared residuals

67. The standard error of the estimate (Se) is essentially the
   a. Standard deviation of the residuals
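
The ideas in questions 64 through 67 (fitted values, residuals, the smallest sum of squared residuals, and the standard error of estimate) can be sketched in a few lines of Python. This is only an illustration: the data points and the helper name `fit_least_squares` are hypothetical and do not come from the exam.

```python
import math

# Hypothetical (x, y) observations, chosen only for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the line with the smallest sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_least_squares(xs, ys)

# Fitted value: the predicted Y for each observed X.
fitted = [intercept + slope * x for x in xs]

# Residual: observed Y minus fitted Y (vertical distance to the line).
residuals = [y - f for y, f in zip(ys, fitted)]

# Sum of squared residuals, the quantity the least squares line minimizes.
sse = sum(e ** 2 for e in residuals)

# Standard error of estimate with one explanatory variable: n - 2 degrees of freedom.
std_error = math.sqrt(sse / (len(xs) - 2))

print(f"intercept={intercept:.3f} slope={slope:.3f} se={std_error:.3f}")
```

Any other line through the same points, for example one with intercept 0 and slope 2, gives a larger sum of squared residuals than the fitted line.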

68. Approximately what percentage of the observed Y values are within one standard error of the estimate of the corresponding fitted Y values?
   a. 67%

69. The percentage of variation can be interpreted as the fraction of variation of the
   a. Response variable explained by the regression line

70. Given the least squares regression line y = 8 - 3x, which statement is true?
   a. The relationship between X and Y is negative

71. In multiple regression, the constant
   a. Is the expected value of the dependent variable Y when all of the independent variables have the value zero
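
The arithmetic behind questions 13 and 17 can be checked with a short Python sketch. It rests on two assumptions that the text does not state outright: that the garbled quantity in question 17 is the sum of squared residuals, equal to 40, and that the standard error of estimate uses n - k - 1 degrees of freedom, as is usual for multiple regression. The variable names are illustrative.

```python
import math

# Question 13: recover the female average from the class and male averages.
class_size, class_mean = 30, 75
male_count, male_mean = 20, 70
female_count = class_size - male_count
# Total points for the class minus total points for the males.
female_total = class_size * class_mean - male_count * male_mean
female_mean = female_total / female_count

# Question 17 (assumption: the sum of squared residuals is 40).
n, k = 50, 5  # data points, independent variables
sse = 40
# Standard error of estimate: sqrt(SSE / (n - k - 1)).
std_error = math.sqrt(sse / (n - k - 1))

print(female_mean)
print(round(std_error, 3))
```

Under these assumptions the sketch gives 85 for question 13 and about 0.953 for question 17, both of which appear among the listed options.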