# Thread: Drawing black tiles! Tough one

1. Originally Posted by dervast
I would like to thank you in advance for your help and support.
I think at first we should try to change your reasoning a little, just for the 100*100 area scenario. I will start studying now and I'll try to post back later this evening.
Best Regards
Alex
I believe I'm making progress on the exact problem. I have come up with a flawed algorithm that gives answers that are "close to" correct. The algorithm is for a general m-by-n board (it is easier to test that way). I also wrote a brute-forcer for smaller boards to test against. The flawed algorithm is simple (in a manner of speaking), concise, and very fast, but the answers given are always a bit high (except for the 2-by-2 board); for example, for your 8-by-8 question the answer given is 0.00030602613... (more decimal places). I'm working on a way to fix it.

Monte Carlo is at about 1.2 * 10^11 trials (!) and the answer seems to be in the range [0.000305835, 0.000305875], again as a rough estimate.

I'll hold off on explaining how it works for the time being; I'm just focused on trying to fix it.

Here is the complete code up to this point:

Code:
import java.util.Random;

public class ChessBlackSquare22 {
    static Random g = new Random();
    static double g1;

    public static void main(String[] args) {
        //monteCarlo();
        solve(0.05, 3, 4);
        naive(0.05, 3, 4);
    }

    // The fast (admittedly flawed) algorithm: sweep the board, add the
    // probability that each 2x2 block is the first all-black one found,
    // then rescale the three already-visited cells by conditioning on
    // "no square completed here".
    static void solve(double p, int m, int n) {
        long t = time();
        double[][] z = new double[m][n];
        double q, r, a, b, c, x = 0, y = 1;
        int i, j;
        for (i = 0; i < m; i++)
            z[i][0] = p;
        for (i = 1; i < n; i++)
            z[0][i] = p;
        for (i = 1; i < m; i++)
            for (j = 1; j < n; j++) {
                a = z[i - 1][j - 1];
                b = z[i - 1][j];
                c = z[i][j - 1];
                x += y * (q = a * b * c * p); // P(this 2x2 is all black)
                z[i][j] = (p - q) / (r = 1 - q);
                z[i - 1][j] = (b - q) / r;
                z[i][j - 1] = (c - q) / r;
                y *= r; // running P(no square so far)
            }
        System.out.println(x);
        System.out.println("Elapsed: " + (time() - t) / 1000.0 + " s");
    }

    // Exhaustive enumeration over all 2^(m*n) colourings; only for m*n <= 25.
    static void naive(double p, int m, int n) {
        long t = time();
        if (m * n > 25) return;
        g1 = 0;
        naive2(m, n, p, 0, 0, 1, new boolean[m][n]);
        System.out.println(g1);
        System.out.println("Elapsed: " + (time() - t) / 1000.0 + " s");
    }

    // Fill cell (a,b) both ways, carrying the probability q of the partial
    // colouring; at a full board, add q if it contains a 2x2 black square.
    static void naive2(int m, int n, double p, int a, int b, double q, boolean[][] z) {
        int i, j, a2, b2;
        boolean[][] y;
        if (a == m) {
            outer:
            for (i = 0; i < m - 1; i++)
                for (j = 0; j < n - 1; j++)
                    if (z[i][j] && z[i][j + 1] && z[i + 1][j] && z[i + 1][j + 1]) {
                        g1 += q;
                        break outer;
                    }
            return;
        }
        if (b < n - 1) {
            a2 = a;
            b2 = b + 1;
        } else {
            a2 = a + 1;
            b2 = 0;
        }
        naive2(m, n, p, a2, b2, q * (1 - p), deepcopy(z));
        y = deepcopy(z);
        y[a][b] = true;
        naive2(m, n, p, a2, b2, q * p, y);
    }

    static boolean[][] deepcopy(boolean[][] a) {
        boolean[][] b = new boolean[a.length][];
        for (int i = 0; i < a.length; i++)
            b[i] = a[i].clone();
        return b;
    }

    // Monte Carlo for the 8x8 board with p = 1/20; runs until interrupted.
    static void monteCarlo() {
        int i, j;
        long s = 0, t, t1 = time();
        boolean b;
        boolean[][] c;
        for (t = 0; ; t++) {
            if (t > 0 && (t % 500000) == 0)
                System.out.println("Trials: " + t + ", Probability: " + (s / (double) t)
                        + ", Elapsed: " + ((time() - t1) / 1000.0));
            b = false;
            c = new boolean[8][8];
            for (i = 0; i < 8; i++)
                for (j = 0; j < 8; j++)
                    if (g.nextInt(20) == 0) c[i][j] = true;
            scan:
            for (i = 0; i < 7; i++)
                for (j = 0; j < 7; j++)
                    if (c[i][j] && c[i][j + 1] && c[i + 1][j] && c[i + 1][j + 1]) {
                        b = true;
                        break scan; // one square is enough
                    }
            if (b) s++;
        }
    }

    static long time() {
        return System.currentTimeMillis();
    }
}

2. Just an update since it's been a while. I've stopped my Monte Carlo after about 2.5 * 10^11 trials, and it seems my latest interval wasn't reliable, since the values fluctuated more than I guessed they would. But the previous interval should still be okay, and could be narrowed a bit if desired. I took screenshots of the program output and can upload them sometime.

As for the main problem: I believe my approach involves more conditional probabilities than I accounted for originally. Currently I have code that quickly gets the answer for an m-by-n board if m=2 or n=2, but gets wrong answers otherwise. It is significant that m=2 and n=2 both give right answers, since it essentially means I get the same answer using two different sub-algorithms within my algorithm (it's a sanity check). I can think of two potential ways to remedy the program's flaw at the moment: one involves some brute force and will work for 8-by-8 but not for 100-by-100. The other is recursive and could work for 100-by-100 (assuming it works at all), but would be harder to code. And if there's a better way, I'm also trying to find it. Anyway, I only have so much time per day I can allot to this, so I'll keep everyone posted.

A brief overview of my approach: Let $A_1$ be the event that a8,b8,a7,b7 are all black. Let $A_2$ be the event that b8,c8,b7,c7 are all black. Etc. So I'm looking at

$P(\text{desired})=P(A_1)+P(\overline{A_1})P(A_2|\overline{A_1})+P(\overline{A_1}\cap\overline{A_2})P(A_3|\overline{A_1}\cap\overline{A_2})+\dots$
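
For reference, this conditioning chain is just the chain rule applied to complements; written out for $k$ events, it telescopes to the complement of "no square anywhere" (a standard identity, not from the posts above):

$P(A_1\cup\dots\cup A_k)=\sum_{i=1}^{k}P(\overline{A_1}\cap\dots\cap\overline{A_{i-1}})\,P(A_i|\overline{A_1}\cap\dots\cap\overline{A_{i-1}})=1-P(\overline{A_1}\cap\dots\cap\overline{A_k})$

The hard part is of course evaluating each conditional term, since the events $A_i$ overlap spatially.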

3. Thanks again for the update. Actually, according to my reading this is a spatial chaos exercise. There are so many spatial dependencies in this problem.

Can I ask you something more?
Assume that the possibility of having one square formed is 0.0003. What is the possibility of having at least one square formed after 10 years, 100 years, or 1000 years? This is a Poisson problem, isn't it? I assume that in every turn I kill previous versions of this game.

Originally Posted by undefined
Just an update since it's been a while. I've stopped my Monte Carlo after about 2.5 * 10^11 trials, and it seems my latest interval wasn't reliable, since the values fluctuated more than I guessed they would. But the previous interval should still be okay, and could be narrowed a bit if desired. I took screenshots of the program output and can upload them sometime.

As for the main problem: I believe my approach involves more conditional probabilities than I accounted for originally. Currently I have code that quickly gets the answer for an m-by-n board if m=2 or n=2, but gets wrong answers otherwise. It is significant that m=2 and n=2 both give right answers, since it essentially means I get the same answer using two different sub-algorithms within my algorithm (it's a sanity check). I can think of two potential ways to remedy the program's flaw at the moment: one involves some brute force and will work for 8-by-8 but not for 100-by-100. The other is recursive and could work for 100-by-100 (assuming it works at all), but would be harder to code. And if there's a better way, I'm also trying to find it. Anyway, I only have so much time per day I can allot to this, so I'll keep everyone posted.

A brief overview of my approach: Let $A_1$ be the event that a8,b8,a7,b7 are all black. Let $A_2$ be the event that b8,c8,b7,c7 are all black. Etc. So I'm looking at

$P(\text{desired})=P(A_1)+P(\overline{A_1})P(A_2|\overline{A_1})+P(\overline{A_1}\cap\overline{A_2})P(A_3|\overline{A_1}\cap\overline{A_2})+\dots$

4. Originally Posted by dervast
Thanks again for the update. Actually, according to my reading this is a spatial chaos exercise. There are so many spatial dependencies in this problem.

Can I ask you something more?
Assume that the possibility of having one square formed is 0.0003. What is the possibility of having at least one square formed after 10 years, 100 years, or 1000 years? This is a Poisson problem, isn't it? I assume that in every turn I kill previous versions of this game.
Well I'm pretty convinced I can come up with an exact solution to 8-by-8; the brute force I was referring to is considering two adjacent rows, which results in a reduction from (about) 2^64 to 7*2^16 if I'm thinking about it properly.
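
The "two adjacent rows" reduction can be realized as a bitmask dynamic program: process the board row by row, keeping for each possible row pattern the probability that the board so far contains no 2x2 black square. This is my own sketch of that idea, not code from this thread (the name `pNoSquare` is made up); it is exact, but needs 2^n states per row, so it only works when one dimension is small.

```java
import java.util.Arrays;

public class NoSquareDP {
    // Exact probability that an m-by-n board, each cell independently black
    // with probability p, contains NO 2x2 all-black square.
    // dp[s] = P(no square so far AND current row has black-cell bitmask s).
    static double pNoSquare(double p, int m, int n) {
        int S = 1 << n;
        double[] rowProb = new double[S]; // P(a single row equals bitmask s)
        for (int s = 0; s < S; s++) {
            double pr = 1.0;
            for (int j = 0; j < n; j++)
                pr *= ((s >> j & 1) == 1) ? p : 1.0 - p;
            rowProb[s] = pr;
        }
        double[] dp = rowProb.clone(); // first row: any pattern is fine
        for (int row = 1; row < m; row++) {
            double[] next = new double[S];
            for (int prev = 0; prev < S; prev++) {
                if (dp[prev] == 0.0) continue;
                for (int cur = 0; cur < S; cur++) {
                    int both = prev & cur; // columns black in both rows
                    // two adjacent such columns would form a 2x2 black square
                    if ((both & (both >> 1)) != 0) continue;
                    next[cur] += dp[prev] * rowProb[cur];
                }
            }
            dp = next;
        }
        return Arrays.stream(dp).sum();
    }

    public static void main(String[] args) {
        // P(at least one 2x2 black square) on the 8x8 board with p = 0.05
        System.out.println(1.0 - pNoSquare(0.05, 8, 8));
    }
}
```

For 8x8 this visits at most 8 * 2^16 transitions, in the same spirit as the 7*2^16 figure above; for 100x100 the 2^100 row states are again out of reach.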

I think regarding Poisson you're correct that it applies (but maybe not for the 10 years case), and I selected this quote from Wikipedia as relevant:

"The Poisson distribution can be derived as a limiting case to the binomial distribution as the number of trials goes to infinity and the expected number of successes remains fixed — see law of rare events below. Therefore it can be used as an approximation of the binomial distribution if n is sufficiently large and p is sufficiently small. There is a rule of thumb stating that the Poisson distribution is a good approximation of the binomial distribution if n is at least 20 and p is smaller than or equal to 0.05, and an excellent approximation if n ≥ 100 and np ≤ 10."
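
To make that rule of thumb concrete for the numbers in the question (taking p = 0.0003 per year and assuming independent yearly boards, as stated), one can compare the exact binomial "at least one success" value with the Poisson approximation; this is a sketch of my own, not from the thread:

```java
public class PoissonApprox {
    // Exact P(at least one success in n independent trials), binomial view.
    static double exactAtLeastOne(double p, int n) {
        return 1.0 - Math.pow(1.0 - p, n);
    }

    // Poisson approximation with lambda = n * p.
    static double poissonAtLeastOne(double p, int n) {
        return 1.0 - Math.exp(-n * p);
    }

    public static void main(String[] args) {
        double p = 0.0003; // assumed per-year probability of a square forming
        for (int n : new int[] {10, 100, 1000})
            System.out.printf("%d years: exact=%.8f poisson=%.8f%n",
                    n, exactAtLeastOne(p, n), poissonAtLeastOne(p, n));
    }
}
```

Since np stays well under 10 even at 1000 years, the Poisson figure tracks the exact one closely for all three horizons; the 10-year case is borderline only because n < 20 under the quoted rule of thumb.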

Edit: After more consideration, I don't think the recursive approach I had in mind would work for 100-by-100, since it seems it would require visiting 2^100 states per row (or maybe 2^200, the same as the brute force idea). But I'll keep thinking about it; the problem interests me.

Edit 2: Actually, there might be something I overlooked that would render the idea I have in mind unworkable, haha. Anyway I'll keep working on it when I have time.

5. Okay, major update. I believe I've solved the 8-by-8 problem, although the value obtained is pretty far from the Monte Carlo estimate. This could be due to a bug, a misinterpretation of the Monte Carlo output, or a lack of precision arising from double-precision floating point arithmetic. A remote possibility is some issue with Java's pseudo-random generator, but I mention this only in passing because I think it highly unlikely. I decided to make a blog entry for this, and the code can be obtained here.

Output:

Code:
3.048504586321702E-4
Elapsed: 0.055 s
In terms of style, I wrote the program more for compactness and speed than legibility, but the way it works is similar to what I wrote in the last post, although I changed my approach a little. Now it's: Let $A_1$ be the event that there is a 2x2 black square in rows 7 and 8. Let $A_2$ be the event that there is a 2x2 black square in rows 6 and 7. Etc. So

$P(\text{desired})=P(A_1)+P(\overline{A_1})P(A_2|\overline{A_1})+P(\overline{A_1}\cap\overline{A_2})P(A_3|\overline{A_1}\cap\overline{A_2})+\dots$

I have some confidence in the algorithm because it passes a bunch of sanity tests: allowing some loss of precision, values obtained by solve() are identical with values obtained by brute(), and when testMode is set to true, it is possible to verify that solve(p,m,n)=solve(p,n,m). I'm thinking of rewriting using BigDecimal to investigate precision loss, and maybe looking more closely at the Monte Carlo results.

A bit sad it doesn't scale to 100-by-100... always interested in whether there's a better algorithm.

6. I see this thread is going on.

Is the problem to determine the probability of at least one 2*2 square formed in a 100*100 box? Or is it to determine the probability of exactly one 2*2 box being formed in a 100*100 box?

7. Originally Posted by jamix
I see this thread is going on.

Is the problem to determine the probability of at least one 2*2 square formed in a 100*100 box? Or is it to determine the probability of exactly one 2*2 box being formed in a 100*100 box?
At least.

8. Originally Posted by undefined
At least.
Yes, at least, which will be interpreted as the lower bound (worst case). We would like to solve the 8*8 problem first and then try to change it to be applied in a 100*100 scenario.

I would like to thank all of you.
Best Regards
Alex

9. Originally Posted by undefined
Okay, major update. I believe I've solved the 8-by-8 problem, although the value obtained is pretty far from the Monte Carlo estimate. This could be due to a bug, a misinterpretation of the Monte Carlo output, or a lack of precision arising from double-precision floating point arithmetic. A remote possibility is some issue with Java's pseudo-random generator, but I mention this only in passing because I think it highly unlikely. I decided to make a blog entry for this, and the code can be obtained here.

Output:

Code:
3.048504586321702E-4
Elapsed: 0.055 s
In terms of style, I wrote the program more for compactness and speed than legibility, but the way it works is similar to what I wrote in the last post, although I changed my approach a little. Now it's: Let $A_1$ be the event that there is a 2x2 black square in rows 7 and 8. Let $A_2$ be the event that there is a 2x2 black square in rows 6 and 7. Etc. So

$P(\text{desired})=P(A_1)+P(\overline{A_1})P(A_2|\overline{A_1})+P(\overline{A_1}\cap\overline{A_2})P(A_3|\overline{A_1}\cap\overline{A_2})+\dots$

I have some confidence in the algorithm because it passes a bunch of sanity tests: allowing some loss of precision, values obtained by solve() are identical with values obtained by brute(), and when testMode is set to true, it is possible to verify that solve(p,m,n)=solve(p,n,m). I'm thinking of rewriting using BigDecimal to investigate precision loss, and maybe looking more closely at the Monte Carlo results.

A bit sad it doesn't scale to 100-by-100... always interested in whether there's a better algorithm.

Thank you for your contribution. It seems that your blog post is down at the moment. I was trying yesterday to make your calculation scale to the 100x100 scenario but I didn't manage to do it.
Perhaps the monte carlo simulation might do that.
What do you think?

10. Originally Posted by dervast
Thank you for your contribution. It seems that your blog post is down at the moment. I was trying yesterday to make your calculation scale to the 100x100 scenario but I didn't manage to do it.
Perhaps the monte carlo simulation might do that.
What do you think?
Well, I put many hours into this previously, and when I found a solution that seemed to work I deprioritised it. I'm still thinking about it but haven't found a way to solve the 100x100 case. The blog post is currently up and has a Monte Carlo routine that allows you to input m and n for a general m-by-n board, and 100x100 will work, but the results would have to be analysed carefully to get a proper confidence interval (unlike my earlier estimates, which were just from reading the output without further analysis). I haven't gotten to the BigDecimal thing or re-examining the Monte Carlo results, but I expect to do so within a week.
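
On analysing the Monte Carlo output properly: with successes and failures both enormous, a normal-approximation interval for the binomial proportion is reasonable. A minimal sketch of my own (the trial/success counts below are illustrative stand-ins consistent with the estimates quoted earlier in the thread, not actual run data):

```java
public class BinomialCI {
    // Approximate 95% Wald confidence interval for a binomial proportion;
    // appropriate here since both successes and failures number in the millions.
    static double[] waldInterval(long successes, long trials) {
        double phat = successes / (double) trials;
        double se = Math.sqrt(phat * (1.0 - phat) / trials); // standard error
        double z = 1.959963984540054; // 97.5th percentile of the standard normal
        return new double[] { phat - z * se, phat + z * se };
    }

    public static void main(String[] args) {
        // Illustrative: a hit rate near 3.0585e-4 after 1.2e11 trials
        long trials = 120_000_000_000L;
        long successes = 36_702_000L;
        double[] ci = waldInterval(successes, trials);
        System.out.printf("95%% CI: [%.9f, %.9f]%n", ci[0], ci[1]);
    }
}
```

At 1.2 * 10^11 trials the half-width is on the order of 1e-7, so an interval like [0.000305835, 0.000305875] is about the best resolution one can expect without many more trials.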
