Hello! I conduct a survey that produces a very large set of numerical
data (approximately 800,000 records of financial data collected from
various businesses). All of the data is run through a
current-year/prior-year (CY/PY) edit check: if the current-year amount
is out of tolerance relative to what was reported in the prior year,
the amount is said to have "failed" the CY/PY edit check. I also keep a
record of which data amounts are changed from what was originally
reported by the businesses. I want to conduct a statistical test to
confirm that the proportion of amounts changed among those that
"passed" the CY/PY edit check is significantly different from the
proportion changed among those that "failed" it. What kind of
statistical test should I use? Would it just be the basic two-sample
proportion test, or should I use something else?
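
To make the comparison concrete, here is a minimal sketch of the basic
pooled two-sample proportion z-test I have in mind. It is written in
Python purely for illustration (not SAS), the function name
two_proportion_z_test is hypothetical, and the counts are made-up
placeholders rather than my actual survey figures.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test of H0: p1 == p2, two-sided.

    x1, x2: number of changed amounts in each group
    n1, n2: total number of amounts in each group
    """
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion under the null hypothesis of equal proportions
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Placeholder counts only (assumption): "passed" group vs. "failed" group
z, p_value = two_proportion_z_test(x1=900, n1=700_000, x2=190, n2=100_000)
print(f"z = {z:.3f}, two-sided p = {p_value:.4g}")
```

This relies on the normal approximation to the binomial, which is
exactly the part I am unsure about given how rare changes are.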

What (I think) complicates things is that the proportions here are
extremely low (i.e., fewer than 1 in 500 amounts end up getting
changed, both among amounts that "pass" and among amounts that "fail"
the CY/PY edit check). Would such low proportions affect the accuracy
of the test? Also, I have doubts that this data comes from a normal
distribution. What test would be best if the data is normal, and
what test would be best if the data is not normal? Is it possible to
do any of these proportion tests in SAS, or would I have to do them
manually?


I'm really just looking for the names of some tests or SAS procedures
(PROCs) that would be appropriate given the above, and I can take it
from there. Thanks!
Julie