Math Help - Regression Analysis - Transformations and Non-Normality

  1. #1
    Newbie
    Joined
    Oct 2012
    From
    UK
    Posts
    7

    Regression Analysis - Transformations and Non-Normality

    Hello,

    I am fairly new to regression analysis and have a few questions.

    I have a fairly large dataset containing information about 144 municipalities in a country. I am trying to create a model that uses a number of independent variables (urban ratio, literacy rate etc.) to model the migration in each municipality. My independent variables are a mix of binary (0,1) and continuous. Could anyone point me in the right direction on where to start analysing my variables? I have run a simple linear regression and it produced no significant results. It has been suggested to me that I could try transforming some of the variables (as they are non-normal) or use a model that does not assume normality.

    What sort of transformations are there and why might they be useful? What are the other models available that do not assume normality?

    I have also been told to look at GLMs.

    Sorry for the long message, but I desperately need help!

  2. #2
    MHF Contributor
    Joined
    Sep 2012
    From
    Australia
    Posts
    4,163
    Thanks
    761

    Re: Regression Analysis - Transformations and Non-Normality

    Hey larrikinlover.

    There are what we call GLMs (generalized linear models) with link functions. You can also use additive models, which transform your input variables (this sounds like something you should check out).

    The thing though is that you need to describe what you are modelling: what are the variable types of your response and predictor variables? The answer makes a big difference to which analytic techniques you choose, as well as to whether the assumptions of those techniques are valid.

    With ordinary multiple linear regression, you can transform variables using various transformations (as is commonly done), but ultimately you need to say what you will be doing with the results and how you will use the model.
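
    One common reason to transform a variable is strong right skew. The sketch below is only an illustration under assumed values: `distance_km` is made-up data, not the poster's, and `skewness` is a helper written for this example. It compares the skewness of the raw values with that of their logs.

```python
import math


def skewness(values):
    """Moment-based sample skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, strongly right-skewed predictor (distances in km).
distance_km = [5, 8, 12, 20, 35, 60, 110, 200, 380, 700]

# The log transform compresses the long right tail.
log_distance = [math.log(d) for d in distance_km]

raw_skew = skewness(distance_km)
log_skew = skewness(log_distance)

print(f"skewness of raw values: {raw_skew:.2f}")
print(f"skewness of log values: {log_skew:.2f}")
```

    A log transform only applies to strictly positive values, so it cannot be used directly on a variable that can be zero or negative.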

  3. #3
    Senior Member MaxJasper
    Joined
    Aug 2012
    From
    Canada
    Posts
    482
    Thanks
    55

    Re: Regression Analysis - Transformations and Non-Normality

    Please attach a sample of the data with all variables and their definitions, plus any relevant summary statistics.

  4. #4
    Newbie
    Joined
    Oct 2012
    From
    UK
    Posts
    7

    Re: Regression Analysis - Transformations and Non-Normality

    Thank you for your help, chiro.

    I have data from 2000 and 2010. I am trying to find out whether an increase in an industrial activity has had a more than proportionate effect on population growth (or migration). I am going to do this by comparing municipalities that contain employment in an industrial activity to those without. I have been asked to construct a simple model first with population change between 2000 and 2010 (percentage increase) as the dependent variable and the presence of the industrial activity (as a percentage of total employment), the distance from the capital (in km) and the presence of a federal road (binary - 0,1) as independent variables. I am still unsure how this will help in deciding whether there is a more than proportionate population increase in industrial municipalities but perhaps that is something I will get to later.

    I suppose my problem is deciding what analytical techniques to use for the data I have. Could you possibly explain a little more about additive models? This is something that has been brought up but I am having a hard time understanding what they are. Also, is there a systematic way of finding out which model/transformations are best, or am I just trying to maximise the R^2 value?

    Thank you again, this is a big help!

  5. #5
    Newbie
    Joined
    Oct 2012
    From
    UK
    Posts
    7

    Re: Regression Analysis - Transformations and Non-Normality

    Also, I have uploaded a sample of my data (PopChange is DV)

  6. #6
    MHF Contributor
    Joined
    Sep 2012
    From
    Australia
    Posts
    4,163
    Thanks
    761

    Re: Regression Analysis - Transformations and Non-Normality

    Since you are using count data, you probably want to use a GLM with your Y being Poisson distributed with the appropriate link function (we will discuss this later).

    As for having categorical variables, you simply set up the right dummy variables in your model to take care of these. How you do this depends on how many categories there are and on whether the variables are independent of each other (if they are dependent you will have interaction terms, and you can check whether these are significant or not).
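
    A minimal sketch of what dummy variables and an interaction term look like in one row of a design matrix. The predictor names follow the variables described earlier in the thread; the `regions` category and both helper functions are hypothetical and only there for illustration.

```python
def dummy_code(level, levels):
    """Return k-1 indicator values for a categorical level.

    levels[0] is treated as the baseline and gets all zeros.
    """
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return [1.0 if level == other else 0.0 for other in levels[1:]]


def design_row(industry_pct, distance_km, has_federal_road):
    """Build one design-matrix row including an interaction term."""
    road = 1.0 if has_federal_road else 0.0
    return [
        1.0,                  # intercept
        industry_pct,         # continuous predictor
        distance_km,          # continuous predictor
        road,                 # dummy for the binary predictor
        industry_pct * road,  # interaction: industry effect may differ with a road
    ]


regions = ["north", "centre", "south"]  # hypothetical three-level category

print(dummy_code("centre", regions))
print(design_row(12.5, 80.0, True))
```

    A binary variable that is already coded 0/1 is its own dummy. A category with k levels needs k-1 dummies, because the baseline level is absorbed by the intercept.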

    Additive models are basically Y = f1(X1) + f2(X2) + ... + fn(Xn) so they try to find general functions to fit the best model given the data.
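
    Written out next to ordinary multiple linear regression, the difference is that each straight-line term is replaced by a general function estimated from the data. The intercept $\beta_0$ and the error term $\varepsilon$ do not appear in the line above; they are added here as the usual convention.

```latex
\begin{align*}
\text{linear model:}   \quad & Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_n X_n + \varepsilon \\
\text{additive model:} \quad & Y = \beta_0 + f_1(X_1) + f_2(X_2) + \dots + f_n(X_n) + \varepsilon
\end{align*}
```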

    Personally I think you should think about transforming your data rather than looking at these initially, since the above is more complex and, if you are new to this, you want to go with the simplest thing that you are comfortable with. I have to stress that it's better to use a simpler technique that you understand than a complex one that you have no idea about, because stuff you don't understand tends to blow up in people's faces.

    I've just noticed your pop-change data is in decimal form (so not integers). Can you explain what this variable refers to? Is it some kind of relative change?

    I would take a look at either the Poisson or the exponential distribution for rate distributions and see what kind of link functions are used for these as well as what these link functions are used for in the context of your experiment (and please share your thoughts with us). The link function will relate directly to the model you are using (i.e. the predictor variables).

    The main difference between the Poisson and the exponential is that the Poisson is discrete (counts) while the exponential is continuous.
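
    To make the discrete/continuous distinction concrete, here is a small sketch of the two distributions using their textbook formulas. The function names and the rate value are chosen for this example only. Note that neither distribution puts any probability on negative values.

```python
import math


def poisson_pmf(k, rate):
    """P(Y = k) for a Poisson variable; defined only for counts k = 0, 1, 2, ..."""
    if not (isinstance(k, int) and k >= 0):
        raise ValueError("Poisson outcomes must be non-negative integers")
    return math.exp(-rate) * rate ** k / math.factorial(k)


def exponential_pdf(y, rate):
    """Density of an exponential variable; zero for y < 0."""
    if y < 0:
        return 0.0
    return rate * math.exp(-rate * y)


rate = 2.0  # arbitrary illustrative rate

# Discrete: probabilities over the counts sum to 1.
total = sum(poisson_pmf(k, rate) for k in range(50))
print(f"sum of Poisson probabilities: {total:.6f}")

# Continuous: any non-negative real value has a density.
print(f"exponential density at y = 0.75: {exponential_pdf(0.75, rate):.4f}")
```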

  7. #7
    Newbie
    Joined
    Oct 2012
    From
    UK
    Posts
    7

    Re: Regression Analysis - Transformations and Non-Normality

    I would like to use a Poisson distribution with a log link function. However, when I run the model in R it comes up with a warning because some of the DV data is negative. Could I get round this by transforming the DV (log, sqrt etc.)?

    The PopChange data is the percentage increase in population between 2000 and 2010. This could always be changed to an integer value if this is easier.
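
    On the question of transforming the DV: the log of a zero or negative percentage change is undefined, so a plain log does not help. One possible workaround, shown here only as a sketch and not something recommended in the thread, is to take the log of the growth ratio (2010 population divided by 2000 population), which is always positive. The population figures and the `safe_log` helper below are made up for illustration.

```python
import math

# Hypothetical populations for three municipalities.
pop_2000 = [1200, 5400, 800]
pop_2010 = [1500, 5100, 800]

# Percentage change can be negative or zero.
pct_change = [100.0 * (new - old) / old for old, new in zip(pop_2000, pop_2010)]

# The growth ratio new/old is positive whenever both populations are positive,
# so its log is always defined.
log_growth = [math.log(new / old) for old, new in zip(pop_2000, pop_2010)]


def safe_log(value):
    """Return log(value), or None where the log is undefined (value <= 0)."""
    return math.log(value) if value > 0 else None


print("percentage change:        ", pct_change)
print("log of percentage change: ", [safe_log(p) for p in pct_change])
print("log of growth ratio:      ", log_growth)
```

    The log growth ratio is continuous and is negative wherever the population fell, so it would be a response for an ordinary linear regression rather than for a Poisson model.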

    I do not know how to determine which transformations are appropriate for the IVs... what do I need to look at?

    Thanks again, and apologies, I'm really new to all this.

  8. #8
    Senior Member MaxJasper
    Joined
    Aug 2012
    From
    Canada
    Posts
    482
    Thanks
    55

    Re: Regression Analysis - Transformations and Non-Normality

    Industry employment values are positive only, and I suspect you have not included layoffs! So I use this fact and take industry employment as the dependent variable; you can then find the inverse function and parameters.

    Case numbers are indicated on outliers but they are not excluded in this model.



    A quick check indicates a pattern when industry employment is modelled in terms of the other variables using a Tweedie distribution with a log link.



    Each parameter seems to be significant in this model:




    It is interesting to note that industry employment is higher in localities with no federal roads.
    This is a quick & crude analysis & model just as a guide and should be properly checked in detail.
    [Attached thumbnails: scatterplot.png, parameters1.png, parameters2.png]
    Last edited by MaxJasper; October 18th 2012 at 11:58 AM.
