Computational Astronomy Tut 2 ==================================== Due date: 1500 Wed 10 Mar. - Proof 1: Suppose you plan to fit a zeroth-order polynomial to N Gaussian-random data values y_i, which all have the same variance. You plan to do this by minimizing the Sum of Squared Residuals (aka chi squared, or chi^2 for short) objective function. Show that this is the same as calculating the simple average of the y_i values. - Question 1: How many model parameters are there? - Question 2: So how many degrees of freedom have the data for such a model? - Proof 2: Use the propagation of variables formula to show that the variance of chi^2 is 2 times chi^2, when all the y_i are independent (= uncorrelated), and when they all have the same variance (ie when sigma_i = sigma for all i). - Write a python program to perform a Monte Carlo of 10^6 iterations. The purpose of the MC is to generate a million samples of chi^2. Within each iteration of the MC you should do the following: * Generate a set of 10 independent random values, these each following a Gaussian PDF of variance=1. Both scipy and numpy I believe have functions which will return Gaussian randoms (sometimes called normally-distributed randoms). * Calculate the average (which, as we saw above, is the same as fitting a zeroth-order polynomial to the data). * Calculate chi^2 for this 'best-fit' value of the model parameter. At the end of the MC, you should have accumulated 10^6 values of chi^2. - Now make a frequency histogram of these values. Choose bin boundaries which result in the most informative graph (in your opinion). * Hint: there is not much point plotting histogram values which are less than 3 or 4. # Question 3: Say why you think this might be. - Plot a graph which compares the FH with the 'perfect model' theoretical curve for chi^2 at the given degrees of freedom. This graph should have: * A logarithmic Y scale # Question 4: You should say why you think this is best. * Correct error bars on the histogram. * So long as you include the error information in some way, you can plot the histogram in the way which seems best to you. I haven't found how to make pylab plot error bars on a log Y scale. (If you find out, please tell me!) Workarounds include: # Using another plotting package entirely; # Using pylab, but plotting log10(FH_i-sigma_FH_i) and log10(FH_i+sigma_FH_i) separately. - On a separate graph, but over the same range of chi^2, and also using a log Y scale, plot the 'perfect model' survival function for chi^2 at the same value of degrees of freedom. * Question 5: By inspecting this graph, estimate the value of chi^2 which would be exceeded, just by random fluctuations, 1% of the time. ##################################################### You should submit the following for full assessment: 1 python program. Note that your submitted program must contain the MC and also the creation of the FH; however I leave the plotting up to you. I don't require to see the code you use to make the plots, and I won't assess it if you choose to include it with the MC etc. 2 files containing plots (postscript, png, pdf, any reasonable format) 1 text file containing 2 proofs and 5 answers.