ECO 480 Econometrics I
Problem Set 5
Due: Monday, November 30, 2015 (beginning of the class)
Instructions: The problem sets are designed to be difficult and very time-intensive, so plan ahead. The problem sets consist of solving theoretical problems and analyzing real data. You may discuss the questions with your classmates, but you are required to hand in your own independently written solutions, do-files, and log-files. No late work will be accepted, and I do NOT accept electronic copies. All the data necessary for the problem set are available on UBlearns.
Important: It is extremely important to write a clean, well-commented program for transparency and replication purposes. In any empirical work, you should always be able to reproduce your results from the raw data to support your claims.
What to hand in: A typed write-up answering the assigned questions and interpreting your findings, a do-file, and a log-file (you MUST use Stata). For questions involving data analysis, you will NOT get any credit if you do not provide your program code. You may NOT use Excel.
1. How are returns on common stocks in overseas markets related to returns in U.S. markets? Consider measuring U.S. returns by the annual rate of return on the Standard & Poor’s (S&P) 500 stock index and overseas returns by the annual rate of return on the Morgan Stanley Europe, Australia, Asia, Far East (EAFE) index. Both are recorded in percent. Consider regressing the EAFE returns on the S&P 500 returns for the 20 years 1989 to 2008. Here is part of the output for this regression:
The regression equation is EAFE = -2.58 + 0.775 S&P
Analysis of Variance
Source      df      SS        MS
Model               4560.6
Residual
Total       19      8556.0
a. Complete the analysis of variance table by filling in the missing entries. Show (Justify) how you came up with that answer.
b. What are the values of the regression standard error and the R²?
c. Find the standard error for the least-squares slope. Show your work.
d. Suppose I tell you that the standard deviation of the S&P 500 returns for these years is 19.99%. How would you use this information to find the standard error for the least-squares slope you found in part c?
e. Give a 95% confidence interval for the slope β1 of the population regression line. Show your work.
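The identities behind parts a through e can be checked numerically. The following is a minimal Python sketch for illustration only, not a substitute for the required Stata do-file. It assumes a simple regression with one predictor, so the model has 1 degree of freedom. The function name `anova_summary` and the hard-coded critical value t* = 2.101 (read from a t table with 18 degrees of freedom) are illustrative choices, not part of the assignment.

```python
import math

def anova_summary(ss_model, ss_total, n, slope, sd_x, t_star):
    """Complete a simple-regression ANOVA table and derive slope inference."""
    df_model = 1                       # one predictor (assumption)
    df_total = n - 1
    df_resid = df_total - df_model
    ss_resid = ss_total - ss_model     # SS(Total) = SS(Model) + SS(Residual)
    ms_model = ss_model / df_model
    ms_resid = ss_resid / df_resid
    s = math.sqrt(ms_resid)            # regression standard error
    r2 = ss_model / ss_total
    f_stat = ms_model / ms_resid
    # In simple regression t^2 = F, so SE(b1) = |b1| / sqrt(F).
    se_from_f = abs(slope) / math.sqrt(f_stat)
    # Equivalent route through the standard deviation of x.
    se_from_sd = s / (sd_x * math.sqrt(n - 1))
    ci = (slope - t_star * se_from_f, slope + t_star * se_from_f)
    return {
        "df_resid": df_resid, "ss_resid": ss_resid,
        "ms_model": ms_model, "ms_resid": ms_resid,
        "s": s, "r2": r2, "f_stat": f_stat,
        "se_from_f": se_from_f, "se_from_sd": se_from_sd, "ci": ci,
    }

# Numbers taken from the problem statement; t_star = 2.101 is from a t table.
result = anova_summary(4560.6, 8556.0, 20, 0.775, 19.99, 2.101)
for name, value in result.items():
    print(name, value)
```

The two standard-error routes agree only up to the rounding in the reported slope and standard deviation, which is a useful sanity check on hand calculations.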
2. An electronic survey of 7,061 players of Guitar Hero and Rock Band reported that 67% of those who do not currently play a musical instrument said that they are likely to begin playing a real musical instrument in the next two years. The reports describing the survey do not give the number of respondents who do not currently play a musical instrument.
a. Explain why it is important to know the number of respondents who do not currently play a musical instrument.
b. Assume that half of the respondents do not currently play a musical instrument. Find the count of players who said that they are likely to begin playing a real musical instrument in the next two years.
c. Give a 99% confidence interval for the population proportion who would say that they are likely to begin playing a real musical instrument in the next two years.
d. How would the result that you reported in part (c) change if only 25% of the respondents said that they do not currently play a musical instrument?
e. Do the same calculations if 75% of the respondents said that they do not currently play a musical instrument.
f. The main conclusion of the survey that appeared in many news stories was that 67% of players of Guitar Hero and Rock Band who do not currently play a musical instrument said that they are likely to begin playing a real musical instrument in the next two years. What can you conclude about the effect of the three scenarios [parts c, d, and e] on the margin of error for the main result?
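How the unknown subgroup size drives the margin of error can be sketched in a few lines. This is a hedged Python illustration, not the required Stata work. It assumes the large-sample (normal approximation) interval for a proportion, and rounding each subgroup size to a whole count is an assumption made here. The function name `proportion_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat, n, level=0.95):
    """Large-sample confidence interval for a single proportion."""
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin, margin

total_respondents = 7061
margins = {}
for share in (0.25, 0.50, 0.75):
    # Assumed number of respondents who do not play an instrument.
    n = round(total_respondents * share)
    low, high, margin = proportion_ci(0.67, n, level=0.99)
    margins[share] = margin
    print(f"share={share:.2f} n={n} CI=({low:.4f}, {high:.4f}) margin={margin:.4f}")
```

Because the standard error scales with one over the square root of n, the same 67% yields a different margin of error under each assumed share.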
3. The average undergraduate GPA for American colleges and universities was estimated based on a sample of institutions that published this information. Here are the data for public schools in that report:
Year    1992    1996    2002    2007
GPA     2.85    2.90    2.97    3.01
a. Find the equation of the least-squares regression line by hand or with a calculator for predicting GPA from year. Show your work.
b. Verify your results with a software package. Show the output.
c. Compute a 95% confidence interval for the slope by hand and interpret what this interval tells you about the increase in GPA over time. Show your work.
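The hand calculation in parts a and c follows the usual sums-of-squares formulas, which can be sketched as below. This is an illustrative Python sketch under the standard simple-regression assumptions, not the Stata output requested in part b. The function name `least_squares` is hypothetical.

```python
import math

def least_squares(x, y):
    """Return intercept, slope, and slope standard error for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and regression standard error (n - 2 df).
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    se_slope = s / math.sqrt(sxx)
    return intercept, slope, se_slope

# Data from the table in the problem statement.
years = [1992, 1996, 2002, 2007]
gpa = [2.85, 2.90, 2.97, 3.01]
intercept, slope, se_slope = least_squares(years, gpa)
print(intercept, slope, se_slope)
```

A confidence interval for the slope then takes the form slope ± t* × se_slope, with t* read from a t table with n − 2 = 2 degrees of freedom.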
4. A study that evaluated the effects of a reduction in exposure to traffic-related air pollutants compared respiratory symptoms of 283 residents of an area with congested streets with those of 165 residents in a similar area where the congestion was removed because a bypass was constructed. The symptoms of the residents of both areas were evaluated at baseline and again a year after the bypass was completed. For the residents of the congested streets, 17 reported that their symptoms of wheezing improved between baseline and one year later, while 35 of the residents of the bypass streets reported improvement.
a. Find the two sample proportions.
b. Report the difference in the proportions and the standard error of the difference.
c. What are the appropriate null and alternative hypotheses for examining the question of interest? Be sure to explain your choice of the alternative hypothesis.
d. Find the test statistic. Construct a sketch of the distribution of the test statistic under the assumption that the null hypothesis is true. Find the P-value and use your sketch to explain its meaning.
e. Is no evidence of an effect the same as evidence that there is no effect? Use a 95% confidence interval to answer this question. Summarize your ideas in a way that could be understood by someone who has very little experience with statistics.
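The quantities asked for in parts a through e can be organized as in the following Python sketch. It is an illustration under the large-sample normal approximation, not the required Stata analysis. It assumes a pooled standard error for the test statistic and an unpooled standard error for the confidence interval. The function name `two_proportion_test` is hypothetical, and both one-sided and two-sided P-values are reported because the choice of alternative is left to part c.

```python
import math
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, level=0.95):
    """Compare two sample proportions (group 2 minus group 1)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    # Unpooled standard error, used for the confidence interval.
    se_unpooled = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Pooled standard error, used under H0: p1 = p2.
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_pooled
    std_normal = NormalDist()
    p_upper = 1 - std_normal.cdf(z)                 # Ha: p2 > p1
    p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))  # Ha: p2 != p1
    z_star = std_normal.inv_cdf(1 - (1 - level) / 2)
    ci = (diff - z_star * se_unpooled, diff + z_star * se_unpooled)
    return {"p1": p1, "p2": p2, "diff": diff, "se": se_unpooled,
            "z": z, "p_upper": p_upper, "p_two_sided": p_two_sided, "ci": ci}

# Counts from the problem: congested streets (group 1), bypass streets (group 2).
wheeze = two_proportion_test(17, 283, 35, 165)
for name, value in wheeze.items():
    print(name, value)
```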
5. According to the literature on brand loyalty, consumers who are loyal to a brand are likely to consistently select the same product. This type of consistency could come from a positive childhood association. To examine brand loyalty among fans of the Chicago Cubs, 371 Cubs fans among patrons of a restaurant located in Wrigleyville were surveyed prior to a game at Wrigley Field, the Cubs’ home field. The respondents were classified as “die-hard fans” or “less loyal fans.” Of the 134 die-hard fans, 90.3% reported that they had watched or listened to Cubs games when they were children. Among the 237 less loyal fans, 67.9% said that they had watched or listened as children.
a. Find the numbers of die-hard Cubs fans who watched or listened to games when they were children. Do the same for the less loyal fans.
b. Use a significance test to compare the die-hard fans with the less loyal fans with respect to their childhood experiences relative to the team.
c. Express the results with a 95% confidence interval for the difference in proportions.
6. In a survey of 1430 undergraduate students, 1087 reported that they had one or more credit cards. Give a 95% confidence interval for the proportion of all college students who have at least one credit card.
7. The summary of the survey described in question 6 reported that 43% of undergraduates had four or more credit cards.
a. Give a 95% confidence interval for the proportion of all college students who have four or more credit cards.
b. Would a 99% confidence interval be wider or narrower than the one that you found in part a? Verify your results by computing the interval.
c. Would a 90% confidence interval be wider or narrower than the one that you found in part a? Verify your results by computing the interval.
8. Use the data in WAGE2.dta to estimate a simple regression explaining monthly salary (wage) in terms of IQ score (IQ).
a. Find the average salary and average IQ in the sample. What is the sample standard deviation of IQ? (IQ scores are standardized so that the average in the population is 100 with a standard deviation equal to 15.)
b. Estimate a simple regression model where a one-point increase in IQ changes wage by a constant dollar amount. Use this model to find the predicted increase in wage for an increase in IQ of 15 points. Does IQ explain most of the variation in wage? Is the coefficient statistically significantly different from zero at the 1% level?
c. Now, estimate a model where each one-point increase in IQ has the same percentage effect on wage. If IQ increases by 15 points, what is the approximate percentage increase in predicted wage? Use the 95% confidence interval to conclude whether the coefficient is statistically significantly different from zero at the 5% level, and explain.
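A constant percentage effect corresponds to a log-level model, log(wage) = β0 + β1·IQ + u, in which the percentage change in wage is approximately 100·β1·ΔIQ. The Python sketch below compares that approximation with the exact change, 100·(exp(β1·ΔIQ) − 1). The coefficient 0.01 is a made-up placeholder, not an estimate from WAGE2.dta, and the function name `pct_change_log_level` is hypothetical.

```python
import math

def pct_change_log_level(beta1, delta_x):
    """Approximate and exact percentage change in y when log(y) is linear in x."""
    approx = 100 * beta1 * delta_x
    exact = 100 * (math.exp(beta1 * delta_x) - 1)
    return approx, exact

# beta1 = 0.01 is a hypothetical coefficient used only for illustration.
approx, exact = pct_change_log_level(0.01, 15)
print(approx, exact)
```

The approximation is close for small changes and understates the exact effect as β1·ΔIQ grows.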
9. Use MEAP93.dta to explore the relationship between the math pass rate (math10) and spending per student (expend). math10 denotes the percentage of tenth graders at a high school receiving a passing score on a standardized mathematics exam. Suppose we wish to estimate the effect of spending per student (expend) on student performance. If anything, we expect the spending per student to have a positive ceteris paribus (i.e., all other factors being equal) effect on performance because money can go toward purchasing more books and computers, hiring better qualified teachers, implementing nicer facilities, and so on.
a. Do you think each additional dollar spent has the same effect on the pass rate, or does a diminishing effect seem more appropriate? Explain.
b. In the population model math10 = β0 + β1log(expend) + u, argue that β1/10 is the percentage point change in math10 given a 10% increase in expend.
c. Use the data in MEAP93.dta to estimate the model from part (b). Report the estimated equation in the usual way, including the standard errors, sample size, and R-squared.
d. Is the effect statistically significant at the 10, 5, and 1% levels? Explain.
e. Interpret the coefficient and the R-squared.
f. How big is the estimated spending effect? Namely, if spending increases by 10%, what is the estimated percentage point increase in math10?
g. One might worry that regression analysis can produce fitted values for math10 that are greater than 100. Why is this not much of a worry in this data set? [Hint: What is the largest value of math10?]
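In the level-log model of part (b), a proportional change in expend shifts math10 by β1·log(1 + change), which is roughly β1 times the proportional change. The Python sketch below contrasts the β1/10 rule for a 10% increase with the exact log difference. The value β1 = 10 is a made-up placeholder, not an estimate from the MEAP93 data, and the function name `level_log_effect` is hypothetical.

```python
import math

def level_log_effect(beta1, pct_increase):
    """Change in y (in its own units) when y = b0 + b1*log(x) and x rises by pct_increase percent."""
    approx = beta1 * pct_increase / 100          # uses log(1 + r) ~ r
    exact = beta1 * math.log(1 + pct_increase / 100)
    return approx, exact

# beta1 = 10 is a hypothetical coefficient used only for illustration.
approx_pp, exact_pp = level_log_effect(10.0, 10)
print(approx_pp, exact_pp)
```

With a 10% increase, the approximation gives β1/10 percentage points, slightly above the exact value because log(1.10) is a little less than 0.10.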