When creating a vector index in Upstash Vector, you have the flexibility to choose from different vector similarity functions. Each function yields distinct query results, catering to specific use cases. Here are the three supported similarity functions:
The score returned from query requests is a normalized value between 0 and 1, where 1 indicates the highest similarity and 0 the lowest regardless of the similarity function used.
Cosine similarity measures the cosine of the angle between two vectors. It is particularly useful when the magnitude of the vectors is not essential, and the focus is on the orientation.
Use Cases:
Score calculation:
(1 + cosine_similarity(v1, v2)) / 2;
Euclidean distance calculates the straight-line distance between two vectors in a multi-dimensional space. It is well-suited for scenarios where the magnitude of vectors is crucial, providing a measure of their spatial separation.
Use Cases:
Score calculation:
1 / (1 + squared_distance(v1, v2))
The dot product measures the similarity by multiplying the corresponding components of two vectors and summing the results. It provides a measure of alignment between vectors. Note that to use dot product, the vectors needs to be normalized to be of unit length.
Use Cases:
Score calculation:
(1 + dot_product(v1, v2)) / 2