# Thread: Bayes classifier using Parzen windows

1. ## Bayes classifier using Parzen windows

Hello, I have a project in which I am given 2 classes of data (males, females), each described by 2 variables (height, weight). I have to build a Bayesian classifier that uses Parzen windows (a kernel density estimator) to classify the data.

What I did was apply the kernel density estimator to the females' heights, then to the females' weights, and multiply the two results, thinking this would give the probability of being female given height and weight (but I guess I was wrong). It is not clear to me what probability the kernel density estimation produces. I don't really know how to apply the Bayes classifier next, because I don't understand which probability (prior, conditional, etc.) I get as the result of the kernel density estimator.

The formula I used is the one in the definition in the Wikipedia article on kernel density estimation, with the Gaussian function as the kernel.

I hope my point is understood, as English is not my first language.

What do you think?
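
For reference, the one-dimensional estimator with a Gaussian kernel can be written as below. The symbols are assumptions chosen for illustration: $x_i$ are the $n$ training values of one variable from one class, and $h_n$ is the window width. The value it returns is a density estimate (per unit of $x$), so it is not itself a probability and can exceed 1.

$$
\hat{f}(x) = \frac{1}{n\,h_n} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h_n}\right),
\qquad
K(u) = \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}
$$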

2. Because height and weight are correlated you need to do the KDE with a bivariate Gaussian Kernel.

CB

3. I did that also, but my confusion is how do I use that result from KDE in my Bayes formula? What probability is it?

4. What the KDE gives you are estimates of the class-conditional densities p((h,w)|m) and p((h,w)|f); you use these in your classifier as though they were the true conditionals. In simple Bayes this would allow you to say:

p(m|(h,w))=[p((h,w)|m) p(m)]/[p((h,w)|m)p(m)+p((h,w)|f)p(f)]

CB
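
The formula above can be sketched in C as follows. This is an illustration only: the function name `posterior_male` and its parameters are hypothetical, the two likelihoods are assumed to come from whatever KDE is used, and P(f) is assumed to be 1 - P(m) because there are only two classes.

```c
/* Posterior P(m | (h,w)) from Bayes' rule.
   lik_m, lik_f: density estimates of p((h,w)|m) and p((h,w)|f);
   prior_m: P(m); P(f) is taken as 1 - prior_m (two-class assumption). */
double posterior_male(double lik_m, double lik_f, double prior_m)
{
    double num = lik_m * prior_m;
    double den = num + lik_f * (1.0 - prior_m);

    /* If both densities are zero at the query point the posterior is
       undefined; falling back to the prior is an arbitrary choice here. */
    if (den <= 0.0)
        return prior_m;

    return num / den;
}
```

A point would then be labelled male when the returned value exceeds 0.5. Since the denominator is the same for both classes, comparing the two numerators p((h,w)|m)p(m) and p((h,w)|f)p(f) gives the same decision.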

5. I am programming this Bayesian classifier in C. I have the feeling I'm doing something wrong. I will post the code here; could you take a look?

This is the KDE:

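    #include <math.h>

    #define SQRT2PI 2.5066282746310002 /* sqrt(2*pi); definition not shown in the original post */

    double fctExp(double x);

    /* Parzen-window (kernel) density estimate at x from the n samples
       in bufVal, with window width hn */
    double estimantFDP(double x, double hn, double *bufVal, unsigned int n)
    {
        double val = 0;
        double inv_hn = 1/hn;
        unsigned int k;

        for (k = 0; k < n; k++)
            val += fctExp((x-bufVal[k])/hn);

        return (val*inv_hn/n);
    }

    /* exponential-type window function (standard Gaussian kernel) */
    double fctExp(double x)
    {
        return (exp(-x*x/2)/SQRT2PI);
    }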

This is the function that returns the class, where cateC1/cate is the prior probability P(F), x=height, x=weight, valoriX01 is a buffer with the female height values, valoriX02 the male heights, valoriY01 the female weights, and valoriY02 the male weights:

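    /* Returns 0 for class 1 (female) and 1 for class 2 (male).
       cateC1, cateC2, cate, hn, n and the valori* buffers are defined
       elsewhere in the program. */
    int Bayes_Parzen_Classifier(double x, double *valoriX, double *valoriY)
    {
        double inm1, inm2, a;

        /* prior * estimate over heights * estimate over weights, per class */
        inm1 = (cateC1/cate) * estimantFDP(x, hn, valoriX01, n) * estimantFDP(x, hn, valoriY01, n);
        inm2 = (cateC2/cate) * estimantFDP(x, hn, valoriX02, n) * estimantFDP(x, hn, valoriY02, n);

        if (inm1 > inm2)
            a = 0;
        else
            a = 1;
        printf("%lf\t%lf\t%lf\n", inm1, inm2, a);
        return a;
    }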

6. I have already indicated what is wrong with this: you are treating the weight and height as independent when they are not. So you are summing Gaussian weights for height and Gaussian weights for weight and then multiplying them. The summing and multiplication should be in the other order.

(that is as far as I can tell, I am not going to go through your 'C' code in detail)

CB
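
To illustrate the point about ordering, write $h_n$ for the window width as in the posted code and $K$ for the one-dimensional Gaussian kernel. The notation $(h_i, w_i)$ for the height and weight of training sample $i$ is an assumption made here, as is the use of the same width on both axes. Multiplying two one-dimensional estimates, which is what the posted classifier aims to do, gives a product of sums:

$$
\hat{p}_{\text{indep}}(h, w) =
\left[\frac{1}{n\,h_n}\sum_{i=1}^{n} K\!\left(\frac{h - h_i}{h_n}\right)\right]
\left[\frac{1}{n\,h_n}\sum_{j=1}^{n} K\!\left(\frac{w - w_j}{h_n}\right)\right]
$$

A joint estimate with a product kernel (one simple form of bivariate Gaussian kernel) multiplies first, per sample, and then sums:

$$
\hat{p}_{\text{joint}}(h, w) =
\frac{1}{n\,h_n^{2}}\sum_{i=1}^{n}
K\!\left(\frac{h - h_i}{h_n}\right) K\!\left(\frac{w - w_i}{h_n}\right)
$$

The first expression contains cross terms that pair the height of one person with the weight of another ($i \neq j$), which is where the dependence between the two variables is lost.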

7. Yes, you are right. Thanks for the help.