I believe this coding does assign unique numbers to programs, but it can be made more economical. First, you can encode a program by (this allows unambiguous recovering of ). Also, instead of , , ..., you can have , , ..., .
Does this procedure give an effective way of assigning unique Godel numbers to Turing programs?
Let denote the th prime number. Let be internal states, let denote tape symbols and let denote direction symbols. Assign a code number to each of these symbols as follows: .
Now suppose is a line of a Turing program . Then where and . Assign a code number to as follows:
Lastly suppose that is a Turing program. Assign a code number to as follows: .
I believe this coding does assign unique numbers to programs, but it can be made more economical. First, you can encode a program by (this allows unambiguous recovering of ). Also, instead of , , ..., you can have , , ..., .
So when we are assigning a Godel numbering to the Turing program, the only important thing is that each program be assigned a unique natural number, correct? My text is really vauge on this point, so I was uncertain whether each symbol, formula, etc. needed a unique Godel number.
We need two facts: (1) each program is assigned a unique natural number and (2) we can determine the same properties given a program and given its number. The first fact is well-defined mathematically while the second is not. We have intuition about what a computer can find out about a Turing machine (TM): e.g., it can find the number of characters and instructions, run it for 100 steps to see if it stops, determine if it loops within the first 10^5 steps, etc. However, we don't have a mathematical definition of a computer yet; that's why we are working with TMs to begin with, and TMs work with numbers, or, rather, with two-character strings. Therefore, we have to encode a machine as such a string, but we want another machine to be able to discover all facts starting from this string code that our intuition says a computer is able to discover about the original TM. In other words, a TM as a sequence of instructions, i.e., tuples , must contain the same information as its two-character encoding. This claim is not a theorem; rather, it is the idea behind the definition of the encoding. Note that (1) above is, in fact, a consequence of (2).
To get an example of a bad encoding, consider an injection whose inverse is not computable and encode instructions as . Then another TM would not be able to parse such instructions and, therefore, determine the run-time properties of the TM. So, we can extract fewer properties from the code than from the original TM. For another example, let if a TM stops on the empty tape and otherwise. Then let where is your encoding. Now, we can get more information from encodings than from TMs: we can find out if they halt, which is impossible (halting problem).
Ideally, instead of TMs we should have a model of computation that works not with binary strings but with arbitrary datatypes (lists, trees, etc.). Then we could feed TMs as input to other TMs directly, without encoding through prime numbers and so on.