I'm running Stata and have a bit of a problem. I have employment data for minority-owned and non-minority-owned firms. The employment data is all in one column, and I've split it using dummy variables, so I have "majorityemp" with about 350,000 observations and "minorityemp" with about 1,000 observations. The "minority" column contains Y or N, with a blank for companies that did not disclose that information.
I want to test whether there is a statistically significant difference in the means of the two groups, but whenever I run "ttest majorityemp=minorityemp" I get a "no observations r(2000);" error.
When I try running "ttest minority, by(emp)" I get "more than 2 groups found, only 2 allowed r(420);". I assume this is because the non-reporting companies are counted as a third group.
My question: is there a way to run a ttest on employment using ONLY the firms where minority is Y or N, thereby filtering out the non-responses?
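
To make the goal concrete, here is a rough, untested sketch of the kind of approach that might work. It assumes the single employment column is named emp (as in the by(emp) attempt above) and that minority is a string variable holding "Y", "N", or an empty string. The name minority_flag is hypothetical and introduced only for this sketch.

```stata
* Assumption: emp is the single employment column; minority is a string
* variable containing "Y", "N", or "" (blank = not disclosed).
* minority_flag is a hypothetical name introduced for this sketch.

* 1 = minority-owned, 0 = non-minority-owned, missing (.) for blanks
generate byte minority_flag = (minority == "Y") if inlist(minority, "Y", "N")

* Check that only two groups remain, plus the missing category
tabulate minority_flag, missing

* Two-sample t test of mean employment; rows with a missing flag are excluded
ttest emp, by(minority_flag)

* Same test without assuming equal variances, which may be worth
* considering with group sizes of roughly 350,000 and 1,000
ttest emp, by(minority_flag) unequal
```

Note that in this form the variable being tested (emp) comes first and the grouping variable goes inside by(). The "ttest majorityemp=minorityemp" form is a paired test that needs both variables to be non-missing in the same row, which presumably never happens when each firm has a value in only one of the two columns.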