# Orthogonality Question

• Dec 3rd 2009, 08:02 PM
bambamm
Suppose you are given a set of $n$ documents numbered 1 through $n$. Suppose there is a list of $m$ words numbered 1 through $m$ that are of interest to you. Define vectors $x_1$ through $x_n$ as follows: the $j$th entry of $x_i$ is the number of times the $j$th word appears in the $i$th document. You can think of $x_i$ as a summary of the information in the $i$th document.

I selected three articles from Wikipedia and counted the number of times the words fur, blood, bone, feather appeared in each. Here are the results:

| Article | fur | blood | bone | feather |
|---------|-----|-------|------|---------|
| Mammal  | 6   | 4     | 15   | 0       |
| Reptile | 1   | 5     | 0    | 3       |
| Bird    | 0   | 7     | 5    | 43      |

Suppose you want to find the document that puts the most emphasis on word $j$. One way of doing this is to find the $i$ that gives the largest value of

$$\frac{x_i \cdot e_j}{\|x_i\|}.$$

Do this with the word blood. Note that the document you obtain is not the one in which the word appears the most times.

I think I know how to approach this question, but I don't know how to define $e_j$. I thought it meant the vector for blood, but then the vector would contain only 3 entries, while each document vector contains 4 entries. Could somebody please help me? Thank you very much.
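
For what it's worth, here is a rough sketch of the computation under one reading of the notation. The problem never defines $e_j$, so this is an assumption: $e_j$ is taken to be the $j$th standard basis vector of $\mathbb{R}^m$, with a 1 in position $j$ and 0 everywhere else. That gives it $m = 4$ entries, the same as each document vector. The names `basis_vector` and `emphasis` are made up for illustration, and the counts are copied from the table in the post.

```python
import math

# Word order and counts as given in the post's table.
words = ["fur", "blood", "bone", "feather"]
counts = {
    "Mammal": [6, 4, 15, 0],
    "Reptile": [1, 5, 0, 3],
    "Bird": [0, 7, 5, 43],
}


def basis_vector(j, m):
    """Assumed meaning of e_j: 1 in position j (1-based), 0 elsewhere, length m."""
    return [1 if k == j - 1 else 0 for k in range(m)]


def emphasis(x, j):
    """Compute (x . e_j) / ||x|| for a document vector x and word index j."""
    e_j = basis_vector(j, len(x))
    dot = sum(a * b for a, b in zip(x, e_j))   # picks out the j-th entry of x
    norm = math.sqrt(sum(a * a for a in x))    # Euclidean length of x
    return dot / norm


j = words.index("blood") + 1                   # blood is word number 2
scores = {name: emphasis(x, j) for name, x in counts.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.4f}")
print("largest value:", best)
```

With this reading, $x_i \cdot e_j$ is just the blood count of document $i$, so each score is that count divided by the length of $x_i$. The largest value comes from Reptile (5 occurrences), not Bird (7 occurrences), which matches the remark in the problem statement.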