I have a sample of data for a number of UK cities that records how often each city is mentioned in the media, along with each city's population size:
City, Population, Mentions
I'd like to compare the mentions on a like-for-like basis, adjusting for population so that city size doesn't drive the comparison.
I think the correct way to normalise the 'mentions' for each city is:
city_mentions * (1-(city_population/total_population_of_all_cities))
However, I can't explain why this would be correct. I'd appreciate any alternative suggestions for how this could or should be done, or confirmation that the above method is correct.
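
To make the formula concrete, here is a minimal Python sketch that applies it to a tiny made-up data set. The city names, populations and mention counts are invented purely for illustration, and the function names are hypothetical. For comparison it also computes mentions per 100,000 residents, which is one common per-capita adjustment; it is included only as a point of reference, not as a claim about which approach is right.

```python
# Hypothetical sample rows: (city, population, mentions).
# All figures are invented for illustration only.
cities = [
    ("A", 1_000_000, 500),
    ("B", 500_000, 300),
    ("C", 100_000, 100),
]


def weighted_mentions(rows):
    """Apply the formula from the question:
    mentions * (1 - population / total_population_of_all_cities)."""
    total_population = sum(population for _, population, _ in rows)
    return {
        city: mentions * (1 - population / total_population)
        for city, population, mentions in rows
    }


def mentions_per_100k(rows):
    """Per-capita rate (an assumed alternative): mentions per 100,000 residents."""
    return {
        city: mentions * 100_000 / population
        for city, population, mentions in rows
    }


if __name__ == "__main__":
    weighted = weighted_mentions(cities)
    per_100k = mentions_per_100k(cities)
    for city, _, _ in cities:
        print(city, weighted[city], per_100k[city])
```

With these invented figures the formula gives 187.5, 206.25 and 93.75 for A, B and C, while the per-100,000 rate gives 50, 60 and 100, so the two adjustments rank the cities differently.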