I am unclear about a detail of this process. Since each boy has to wait their turn, does a boy dip into their stock whenever they need to throw a ball? Do the balls, then, "disappear" with probability 1-p? Maybe you should detail a few iterations of this process. For instance,
A has 5 balls, B has 5 balls.
A throws a ball and has 4 balls remaining.
B receives A's ball and now has 6 balls.
B throws a ball and has 5 balls remaining.
A does not receive the ball and still has 4 balls.
I would keep a running tally in an ordered pair (A, B), starting at (k, n-k), and adjust it at each iteration as needed. If you run a simulation of this many times, stopping each run once either component of (A, B) reaches 0 and recording the number of turns it took to reach that point, you should get a good sample that represents q. Are you looking for a formal closed-form answer, or is an approximation from such a simulation alright?
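To make the idea concrete, here is a rough sketch of such a simulation in Python. It rests on my reading of the process, which may not match yours: A throws first, turns alternate strictly, each throw is caught with probability p, and otherwise the ball disappears. The function names, the max_turns cap, and the choice to average the turn counts are all my own, not something from your question.

```python
import random


def simulate_turns(k, n, p, rng, max_turns=1_000_000):
    """Count the throws until one boy has no balls left.

    Assumptions (not confirmed by the question): A throws first, turns
    alternate, and a thrown ball is caught with probability p; otherwise
    it disappears.
    """
    a, b = k, n - k            # running tally (A, B) = (k, n - k)
    a_throws = True            # assumed: A goes first
    turns = 0
    # The cap guards against runs that never end, e.g. p = 1 with both
    # boys holding at least 2 balls, where the tally just oscillates.
    while a > 0 and b > 0 and turns < max_turns:
        caught = rng.random() < p
        if a_throws:
            a -= 1
            if caught:
                b += 1
        else:
            b -= 1
            if caught:
                a += 1
        a_throws = not a_throws
        turns += 1
    return turns


def estimate_mean_turns(k, n, p, trials=10_000, seed=0):
    """Average the turn count over many independent runs."""
    rng = random.Random(seed)
    total = sum(simulate_turns(k, n, p, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # The example above: A and B start with 5 balls each.
    print(estimate_mean_turns(k=5, n=10, p=0.5))
```

If q is something other than the turn count (say, the probability that A runs out first), the same loop works; you would just return which of a or b hit 0 instead of turns.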