Word Embedding for Computer Vision

Kalyanaraman_Nemam · December 18, 2020, 12:22pm

An embedding is a method of converting a word into a numeric value that carries the word's meaning. Does embedding in computer vision mean converting photos into numeric values?

If so, how is an embedding different from a pixel value, since pixel values are also numeric?

sgiri · December 19, 2020, 1:13pm

Suppose you take two photos of one person, p11 and p12, and one photo of a different person, p21, and calculate the corresponding embeddings: ep11, ep12 and ep21.

Then the embeddings of the same person (ep11 and ep12) will lie close to each other, while ep21 will be farther from both. You can picture this in 2D space, though the real embedding space is going to be around 128-dimensional.

Comparing raw pixels would not tell you much, whereas by comparing embeddings you can identify people.
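To make this concrete, here is a rough sketch in plain Python. The embedding values, the 4-dimensional size, the `THRESHOLD` value and the helper names `euclidean` and `is_same_person` are all made up for illustration. A real face model would produce the embeddings, and the threshold would have to be tuned on labeled pairs. The tiny striped "photo" at the end is also a toy, meant only to show how fragile pixel comparison is.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical 4-dimensional embeddings, hand-picked for illustration only.
# A real model would compute these from the photos (in ~128 dimensions).
ep11 = [0.9, 0.1, 0.3, 0.7]  # person 1, photo 1
ep12 = [0.8, 0.2, 0.3, 0.6]  # person 1, photo 2
ep21 = [0.1, 0.9, 0.8, 0.2]  # person 2, photo 1

same_person = euclidean(ep11, ep12)       # small distance
different_person = euclidean(ep11, ep21)  # much larger distance

# Hypothetical cut-off; in practice it is tuned on labeled pairs.
THRESHOLD = 0.6

def is_same_person(a, b, threshold=THRESHOLD):
    """Treat two embeddings as the same person if they are close enough."""
    return euclidean(a, b) < threshold

# Toy 1x6 grayscale "photo" and the same stripe pattern shifted by one pixel.
photo = [0, 255, 0, 255, 0, 255]
shifted = [255, 0, 255, 0, 255, 0]
pixel_distance = euclidean(photo, shifted)  # huge, although the pattern is the same

print(f"same person:      {same_person:.3f}")
print(f"different person: {different_person:.3f}")
print(f"shifted pixels:   {pixel_distance:.1f}")
```

With these toy numbers, ep11 and ep12 land close together and ep21 lands far away, so a simple distance threshold separates the two people. The pixel vectors differ at every position after a one-pixel shift, which is why raw pixel distance is a poor measure of who is in the photo.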

Kalyanaraman_Nemam · December 19, 2020, 3:04pm

Thanks for your answer. I understand now.