Computer Assignment Problem
Answer all FIVE Questions
Some critics have complained that the amount of violence shown on television contributes to violence in our society. Others have pointed out that the high level of obesity among children may be attributed to their exposure to television. Now, we may have to add financial problems to the list. A sociologist has theorised that people who watch television frequently are exposed to many commercials, which in turn lead them to buy more, resulting in increasing debt. To test this belief, a researcher plans to survey a sample of families across the country.
QUESTION 1 – 3 marks
Briefly explain (using no more than 300 words in total for this Question 1)
(a) What type of survey method could the researcher use, and why? 0.5 marks
(b) What sampling method could the researcher use to select his/her sample and why? 0.5 marks
(c) What variables should the researcher collect data on for the purpose of the analysis, and why? Identify the data type(s) of the variables. 1.5 marks
(d) What kinds of issues might the researcher face in this data collection? 0.5 marks
Suppose the researcher collected data from 395 randomly selected families. For each family, the number of hours the television is turned on per week and the total debt were recorded. The data are stored in the file DATA FILE.XLS, which is available in the “Assessment” → “Computing Assignment” section of the 1304AFE unit website. Using this data and EXCEL, answer Questions 2, 3 and 4 below.
QUESTION 2 – 5 marks
First, the researcher wishes to use the graphical descriptive methods to present the data.
(a) He suggests using class intervals such as 0-6, 6-12, 12-18, … for one variable and class intervals 0-30000, 30000-60000, 60000-90000, … for the other variable. Explain how he would decide on the number of classes and the above class intervals. 1 mark
(b) Use appropriate BIN values to draw a histogram for each variable and comment on the shape of the two distributions. 2 marks
(c) Use an appropriate plot to investigate the relationship between the two variables. Briefly explain which variable you placed on the X axis and which on the Y axis, and why. On the same plot, fit a linear trend line, including the equation and the coefficient of determination. 2 marks
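As a rough guide for part (a), the number of classes is often chosen with Sturges' rule, and the class width then follows from the range of the data. The sketch below assumes that rule is the one intended here, which the assignment does not state; the function names and the 0 to 60 hour range are hypothetical and used only for illustration.

```python
import math

def sturges_classes(n):
    """Suggested number of classes for n observations (Sturges' rule)."""
    return math.ceil(1 + 3.3 * math.log10(n))

def class_width(smallest, largest, num_classes):
    """Approximate class width; in practice round up to a convenient value."""
    return (largest - smallest) / num_classes

# 395 families, as stated in the assignment
num_classes = sturges_classes(395)

# Hypothetical range of 0 to 60 hours per week (an assumption, not the real data)
width = class_width(0, 60, num_classes)
print(num_classes, width)
```

The upper limits of the resulting classes are the values that would go into the BIN column in EXCEL.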
QUESTION 3 – 4 marks
Second, the researcher wishes to use the numerical descriptive measures to summarize the data.
(a) Prepare a numerical summary report about the data on the two variables the researcher has considered by including the summary measures, mean, median, range, variance, standard deviation, smallest and largest values and the three quartiles, for each variable. 2 marks
(b) Compute a numerical summary measure to identify the direction and to measure the strength of the relationship between the two variables. Interpret this value. 2 marks
QUESTION 4 – 4 marks
The researcher considers using regression analysis to establish a linear relationship between the two variables.
(a) What are his dependent and independent variables? Why? 1 mark
(b) Estimate a simple linear regression model and present the estimated linear equation. Interpret the intercept and slope coefficient estimates of the linear equation. 2 marks
(c) Interpret the coefficient of determination, R-squared (R^{2}) value. 1 mark
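For reference, the quantities asked for in Question 4 can be cross-checked outside EXCEL with the usual least-squares formulas. The sketch below is a minimal illustration, not the required method: `fit_line` is a hypothetical helper, the toy values are invented and are NOT the assignment data, and treating hours as the independent variable is an assumption made only for this example.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5  # correlation coefficient
    return intercept, slope, r, r * r

# Hypothetical toy values for illustration only (not from DATA FILE.XLS)
hours = [10, 20, 30, 40, 50]
debt = [20000, 45000, 55000, 90000, 100000]

intercept, slope, r, r_squared = fit_line(hours, debt)
print(intercept, slope, r, r_squared)
```

In simple linear regression, the R-squared value equals the square of the correlation coefficient, which is why both are returned together here.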
QUESTION 5 (Show all working in EXCEL by setting up a table) 4 marks
A shopping mall estimates the probability distribution of the number of stores mall customers actually enter (X), as shown below:
X    | 0    | 1    | 2    | 3    | 4  | 5    | 6
p(x) | 0.03 | 0.25 | 0.14 | 0.30 | 2k | 0.10 | k
(a) Calculate the value of k. 1 mark
(b) Calculate the mean number of stores entered. 1.5 marks
(c) Calculate the standard deviation of the number of stores entered. 1.5 marks
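The working for Question 5 rests on three standard facts: the probabilities must sum to 1, the mean is the sum of x·p(x), and the variance is the sum of (x − mean)²·p(x). The sketch below shows one way to check an EXCEL table against those formulas; `solve_k` and `mean_and_sd` are hypothetical helper names, and the values are taken from the table in the question.

```python
import math

def solve_k(known_probs, k_multiples):
    """Solve for k, given that all probabilities must sum to 1."""
    return (1 - sum(known_probs)) / sum(k_multiples)

def mean_and_sd(values, probs):
    """Mean and standard deviation of a discrete random variable."""
    mean = sum(x * p for x, p in zip(values, probs))
    variance = sum((x - mean) ** 2 * p for x, p in zip(values, probs))
    return mean, math.sqrt(variance)

# Known probabilities from the table, plus the multiples of k (2k and k)
k = solve_k([0.03, 0.25, 0.14, 0.30, 0.10], [2, 1])

values = [0, 1, 2, 3, 4, 5, 6]
probs = [0.03, 0.25, 0.14, 0.30, 2 * k, 0.10, k]

mean, sd = mean_and_sd(values, probs)
print(k, mean, sd)
```

The same layout maps directly onto an EXCEL table with columns for x, p(x), x·p(x) and (x − mean)²·p(x).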
