
Math Help - Guidance to solve House Prices Data Set Statistical Analysis Problem

  1. #1
    Newbie
    Joined
    Sep 2013
    From
    hyderabad
    Posts
    1

    Guidance to solve House Prices Data Set Statistical Analysis Problem

    Dear all,

    Thank you for reading my thread. Experts, please pardon me if this question seems silly, out of place, or too lengthy to answer. Your valuable time is highly appreciated, as it would guide me in solving parts (A) and (B) of my problem.

    Here is the description of the data set:
    A real estate agent is trying to understand the nature of housing stock and
    home prices in and around a medium sized town in upstate New York. She
    has collected data from a random sample of 1047 homes sold in the last 12
    months. Data was collected on the following variables, and is available in the
    attached houseprices.csv file.
    • Price – the sale price of the house in $
    • Living Area – in Sq. ft.
    • Bathrooms – number of bathrooms in the house (powder rooms with
    no tub or shower area are considered 0.5 baths)
    • Bedrooms – the number of bedrooms
    • Lot Size – size of the property on which the house sits (in acres).
    • Age – of the house in years
    • Fireplace – whether or not the house has a fireplace (Yes = 1, No = 0)

    ===============
    Part (A)
    1. Prepare a brief report summarizing the home values (prices) in this area.
    Use both graphical and numerical summaries. Your report should briefly
    describe what those summaries tell you, and anything of particular
    note/interest.
    2. Does the normal model provide a good description of the prices? Use a
    Normal Quantile plot to frame your response.
    3. Irrespective of your response to Q2, assume that Price ~ N(164K, (68K)^2).
    Given this:
    A. Calculate the following probabilities – P(Price > 92.8K), P(Price <
    255.5K). Do these numbers agree with what you see in the data?
    B. Once again, assuming the above normal distribution, what
    percentage of houses should have a value less than 232K? Does that
    agree with the data?
    C. Based on the theoretical model, what do you expect should be the
    price of a house that is exactly on the 3rd quartile (75th percentile)?
    How does that compare to the actual?
    4. Create a histogram and boxplot for the Living Area variable. What does
    the histogram tell you that the boxplot does not, and vice-versa? Is the
    distribution symmetric? Check the skewness measure to see if it is
    consistent with your observation.
    5. Create a new column in the dataset by taking the logarithm of the Living
    Area variable. Is the normal distribution a better fit for this variable or the
    original (Living Area) variable? Why do you think this is the case?
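
    The normal-model calculations in question 3 could be sketched as follows. This is only a minimal sketch, assuming Python 3.8+ and the standard-library statistics.NormalDist class; the variable names are illustrative and not part of the assignment, and the comparison with the actual data in houseprices.csv still has to be done separately.

```python
from statistics import NormalDist

# Theoretical model stated in question 3: Price ~ N(164K, (68K)^2)
price_model = NormalDist(mu=164_000, sigma=68_000)

# 3A: P(Price > 92.8K) and P(Price < 255.5K)
p_above_92_8k = 1 - price_model.cdf(92_800)
p_below_255_5k = price_model.cdf(255_500)

# 3B: share of houses expected to be valued below 232K
p_below_232k = price_model.cdf(232_000)

# 3C: price at the 3rd quartile (75th percentile) under the model
q3_price = price_model.inv_cdf(0.75)

print(f"P(Price > 92.8K)  = {p_above_92_8k:.4f}")
print(f"P(Price < 255.5K) = {p_below_255_5k:.4f}")
print(f"P(Price < 232K)   = {p_below_232k:.4f}")
print(f"Model Q3 price    = {q3_price:,.0f}")
```

    Note that 232K is exactly one standard deviation above the mean (164K + 68K), so the model puts roughly 84% of houses below it.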


    ===========

    Part (B)
    1. Create the 90%, 95%, and 99% confidence intervals for the average home
    price and explain what these mean. How do the margins of error for these
    three confidence intervals compare? Does that make sense? Before
    creating the confidence intervals, be sure to check the conditions
    necessary to create confidence intervals (and briefly describe this in your
    submission).
    2. Your friend has asked you to provide an estimate for the 95th percentile of
    home prices in this market. Which (if any) of the above confidence
    intervals can you use to give an answer? Describe briefly.
    3. The sample data given to you all come from home sales within the past 12
    months. Suppose you had sample data of the same size each year going
    back several years, and calculated the average sale price for each year.
    What kind of distribution do you expect to see for these averages and
    why? (Include the parameters of the distribution in your response,
    assuming that house prices do not change, i.e. neither go up nor down,
    over time. Clearly, this is not a great assumption but make it anyway.)
    4. The architecture changed significantly in this geographical area about
    30 years ago. So any houses aged more than 30 years are considered
    “old” houses. What proportion of the houses in the sample is old?
    Provide the 95% and 99% confidence intervals for the proportion of
    old houses in this area, and interpret them. Once again, make sure
    that the necessary conditions are satisfied before creating confidence
    intervals.
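
    For questions 1 and 4 of Part (B), the interval calculations could be sketched as below. This assumes a large-sample normal (z) approximation, which is only appropriate if the conditions mentioned in the questions hold. The function names are hypothetical, and the count of old houses in the usage lines is made up for illustration, not taken from houseprices.csv.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def z_critical(level):
    """Two-sided critical value of the standard normal for a confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def mean_ci(values, level):
    """z-based confidence interval for a population mean (large-sample assumption)."""
    n = len(values)
    centre = mean(values)
    margin = z_critical(level) * stdev(values) / sqrt(n)
    return centre - margin, centre + margin


def proportion_ci(successes, n, level):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    margin = z_critical(level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Usage with a HYPOTHETICAL count: suppose 400 of the 1047 houses were older than 30 years.
for level in (0.95, 0.99):
    low, high = proportion_ci(400, 1047, level)
    print(f"{level:.0%} CI for proportion of old houses: ({low:.3f}, {high:.3f})")
```

    Because the critical value grows with the confidence level, the margin of error widens from 90% to 95% to 99% for the same sample.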

    Warm regards
    Ravin

  2. #2
    MHF Contributor
    Joined
    Sep 2012
    From
    Australia
    Posts
    4,172
    Thanks
    765

    Re: Guidance to solve House Prices Data Set Statistical Analysis Problem

    Hey mymat.

    Obviously this looks like an assignment (or homework) where you have not shown any effort whatsoever. Why don't you show us some effort?
