A Gentle Introduction to Vector Space Models


Last Updated on October 23, 2023

Vector space models consider the relationship between data that are represented by vectors. They are popular in information retrieval systems but also useful for other purposes. Generally, this allows us to compare the similarity of two vectors from a geometric perspective.

In this tutorial, we will see what a vector space model is and what it can do.

After completing this tutorial, you will know:

  • What a vector space model is and the properties of cosine similarity
  • How cosine similarity can help you compare two vectors
  • What the difference is between cosine similarity and L2 distance

Let’s get started.

A Gentle Introduction to Vector Space Models
Photo by liamfletch, some rights reserved.

Tutorial overview

This tutorial is split into 3 parts; they are:

  1. Vector space and cosine formula
  2. Using vector space model for similarity
  3. Common use of vector space models and cosine distance

Vector space and cosine formula

A vector space is a mathematical term that defines some vector operations. In layman's terms, we can think of it as an $n$-dimensional metric space where each point is represented by an $n$-dimensional vector. In this space, we can do any vector addition or scalar-vector multiplication.

It is useful to consider a vector space because it is useful to represent things as vectors. For example, in machine learning we usually have a data point with multiple features. Therefore, it is convenient for us to represent a data point as a vector.

With a vector, we can compute its norm. The most common one is the L2-norm, or the length of the vector. With two vectors in the same vector space, we can find their difference. Assuming a 3-dimensional vector space, let the two vectors be $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$. Their difference is the vector $(y_1-x_1, y_2-x_2, y_3-x_3)$, and the L2-norm of the difference is the distance, or more precisely the Euclidean distance, between these two vectors:

$$
\sqrt{(y_1-x_1)^2+(y_2-x_2)^2+(y_3-x_3)^2}
$$
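As a quick check of the formula, the snippet below computes the Euclidean distance both term by term and through the L2-norm. It is only a sketch: the two vectors `x` and `y` are made-up values chosen for illustration.

```python
import numpy as np

# Two hypothetical 3-dimensional vectors, chosen only for illustration
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 6.0, 3.0])

diff = y - x                               # the difference vector (3, 4, 0)
dist_manual = np.sqrt(np.sum(diff ** 2))   # square root of the sum of squares
dist_norm = np.linalg.norm(diff)           # the same quantity as an L2-norm

print(dist_manual, dist_norm)
```

Both expressions give the same number, which is why the Euclidean distance is often written simply as the L2-norm of the difference.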

Besides distance, we can also consider the angle between two vectors. If we consider the vector $(x_1, x_2, x_3)$ as a line segment from the point $(0,0,0)$ to $(x_1,x_2,x_3)$ in the 3D coordinate system, then there is another line segment from $(0,0,0)$ to $(y_1,y_2,y_3)$. They make an angle where they meet at the origin.

The angle between the two line segments can be found using the cosine formula:

$$
\cos \theta = \frac{a\cdot b}{\lVert a\rVert_2 \lVert b\rVert_2}
$$

where $a\cdot b$ is the vector dot-product and $\lVert a\rVert_2$ is the L2-norm of vector $a$. This formula arises from considering the dot-product as the projection of vector $a$ onto the direction pointed to by vector $b$. The nature of cosine tells us that, as the angle $\theta$ increases from 0 to 90 degrees, cosine decreases from 1 to 0. Sometimes we call $1-\cos\theta$ the cosine distance because it runs from 0 to 1 as the two vectors move further away from each other. This is an important property that we will exploit in the vector space model.
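The cosine formula translates almost directly into numpy. The helper names `cosine_similarity` and `cosine_distance` below are our own, and the test vectors are made up to show the two extremes (same direction and perpendicular) plus one case in between.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def cosine_distance(a, b):
    """1 - cos(theta): 0 for the same direction, 1 for perpendicular vectors."""
    return 1.0 - cosine_similarity(a, b)

a = np.array([1.0, 0.0])
same_direction = np.array([3.0, 0.0])   # angle of 0 degrees, different length
at_45_degrees = np.array([1.0, 1.0])    # angle of 45 degrees
perpendicular = np.array([0.0, 2.0])    # angle of 90 degrees

for b in (same_direction, at_45_degrees, perpendicular):
    print(b, cosine_distance(a, b))
```

Note that the length of each vector does not affect the result; only the direction does.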

Using vector space model for similarity

Let's look at an example of how the vector space model is useful.

The World Bank collects various data about countries and regions in the world. While every country is different, we can try to compare countries under a vector space model. For convenience, we will use the pandas_datareader module in Python to read data from the World Bank. You may install pandas_datareader using the pip or conda command:

The data series collected by the World Bank are named by an identifier. For example, “SP.URB.TOTL” is the total urban population of a country. Many of the series are yearly. When we download a series, we have to put in the start and end years. Usually the data are not updated on time. Hence it is best to look at the data from a few years back rather than the most recent year to avoid missing data.

In the following, we try to collect some economic data of every country in 2010:

In the above we obtained some economic metrics of each country in 2010. The function wb.download() will download the data from the World Bank and return a pandas dataframe. Similarly, wb.get_countries() will get the names of the countries and regions as identified by the World Bank, which we use to filter out the non-country aggregates such as “East Asia” and “World”. Pandas allows filtering rows by boolean indexing: df["country"].isin(non_aggregates) gives a boolean vector of which rows are in the list of non_aggregates and, based on that, df[df["country"].isin(non_aggregates)] selects only those. For various reasons not all countries will have all data. Hence we use dropna() to remove those with missing data. In practice, we may want to apply some imputation techniques instead of merely removing them. But as an example, we proceed with the 174 remaining data points.
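The boolean indexing and dropna() steps can be tried offline on a toy table. This is a minimal sketch: the column names `metric_1` and `metric_2` and all values are invented, and `non_aggregates` is typed in by hand here, whereas in the tutorial it is derived from wb.get_countries().

```python
import numpy as np
import pandas as pd

# Toy table standing in for the downloaded data (all values are made up)
df = pd.DataFrame({
    "country": ["World", "Australia", "East Asia", "Mexico", "Colombia"],
    "metric_1": [100.0, 3.0, 40.0, 3.5, np.nan],
    "metric_2": [200.0, 6.0, 90.0, 5.0, 1.5],
})

# Hand-written stand-in for the list of real countries
non_aggregates = pd.Series(["Australia", "Mexico", "Colombia"])

mask = df["country"].isin(non_aggregates)   # boolean Series, one entry per row
df_nonagg = df[mask].dropna()               # keep countries, drop rows with NaN

print(mask.tolist())
print(df_nonagg)
```

The aggregates are removed by the mask, and the row with a missing metric is removed by dropna(), leaving only rows that are complete.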

To better illustrate the idea rather than hiding the actual manipulation in pandas or numpy functions, we first extract the data for each country as a vector:

The Python dictionary we created has the name of each country as a key and the economic metrics as a numpy array. There are 5 metrics, hence each is a vector of 5 dimensions.
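One way such a dictionary could be built is sketched below. The dataframe, its two metric columns, and the numbers are placeholders (the tutorial uses five World Bank series); only the shape of the result matters here.

```python
import numpy as np
import pandas as pd

# Placeholder dataframe with made-up values and only two metrics
df_nonagg = pd.DataFrame({
    "country": ["Australia", "Mexico"],
    "metric_1": [3.0, 3.5],
    "metric_2": [6.0, 5.0],
})
metric_columns = ["metric_1", "metric_2"]

# Map each country name to a numpy vector of its metrics
vectors = {}
for _, row in df_nonagg.iterrows():
    vectors[row["country"]] = row[metric_columns].to_numpy(dtype=float)

print(vectors)
```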

What this helps us with is that we can use the vector representation of each country to see how similar it is to another. Let's try both the L2-norm of the difference (the Euclidean distance) and the cosine distance. We pick one country, such as Australia, and compare it to all other countries on the list based on the selected economic metrics.

In the for-loop above, we set vecA as the vector of the target country (i.e., Australia) and vecB as that of the other country. Then we compute the L2-norm of their difference as the Euclidean distance between the two vectors. We also compute the cosine similarity using the formula and subtract it from 1 to get the cosine distance. With more than 100 countries, we can see which one has the shortest Euclidean distance to Australia:

By sorting the result, we can see that Mexico is the closest to Australia under Euclidean distance. However, with cosine distance, it is Colombia that is the closest to Australia.
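The comparison loop can be reproduced on synthetic data. In the sketch below the names "Country A" to "Country D" and their three metrics are invented so that one neighbor is close in magnitude and another is close in proportion; they are not World Bank figures.

```python
import numpy as np

# Synthetic vectors (not real data): B is close to A in magnitude,
# C is much smaller but has almost the same proportions as A
vectors = {
    "Country A": np.array([10.0, 20.0, 30.0]),
    "Country B": np.array([11.0, 18.0, 33.0]),
    "Country C": np.array([1.0, 2.0, 3.1]),
    "Country D": np.array([30.0, 5.0, 2.0]),
}

target = "Country A"
vecA = vectors[target]
euclid, cosine = {}, {}
for name, vecB in vectors.items():
    if name == target:
        continue
    # Euclidean distance: L2-norm of the difference
    euclid[name] = np.linalg.norm(vecA - vecB)
    # Cosine distance: 1 minus the cosine similarity
    cos_sim = np.dot(vecA, vecB) / (np.linalg.norm(vecA) * np.linalg.norm(vecB))
    cosine[name] = 1 - cos_sim

closest_euclid = min(euclid, key=euclid.get)
closest_cosine = min(cosine, key=cosine.get)
print("Closest by Euclidean distance:", closest_euclid)
print("Closest by cosine distance:", closest_cosine)
```

With these numbers the two metrics disagree in the same way as in the tutorial: the Euclidean distance picks the vector of similar size, while the cosine distance picks the vector of similar proportions.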

To understand why the two distances give different results, we can observe how the three countries' metrics compare to each other:

From this table, we see that the metrics of Australia and Mexico are very close to each other in magnitude. However, if you compare the ratio of each metric within the same country, it is Colombia that matches Australia better. In fact, from the cosine formula, we can see that

$$
\cos \theta = \frac{a\cdot b}{\lVert a\rVert_2 \lVert b\rVert_2} = \frac{a}{\lVert a\rVert_2} \cdot \frac{b}{\lVert b\rVert_2}
$$

which means the cosine of the angle between the two vectors is the dot-product of the corresponding vectors after they have been normalized to a length of 1. Hence cosine distance is virtually applying a scaler to the data before computing the distance.
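The identity can be checked numerically. The sketch below uses two made-up vectors and also verifies a related fact: for unit-length vectors, the squared Euclidean distance equals $2(1-\cos\theta)$, which is one way to see the cosine distance as a distance computed after rescaling.

```python
import numpy as np

# Made-up vectors, both of length 5
a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

# Normalize each vector to a length of 1
a_hat = a / np.linalg.norm(a)
b_hat = b / np.linalg.norm(b)

cos_direct = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
cos_normalized = np.dot(a_hat, b_hat)

# Squared Euclidean distance between the unit vectors
sq_dist_unit = np.sum((a_hat - b_hat) ** 2)

print(cos_direct, cos_normalized)
print(sq_dist_unit, 2 * (1 - cos_direct))
```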

Putting it all together, the following is the complete code:

Common use of vector space models and cosine distance

Vector space models are common in information retrieval systems. We can represent documents (e.g., a paragraph, a long passage, a book, or even a sentence) as vectors. This vector can be as simple as a count of the words that the document contains (i.e., a bag-of-words model) or a complicated embedding vector (e.g., Doc2Vec). Then a query to find the most relevant document can be answered by ranking all documents by the cosine distance. Cosine distance should be used because we do not want to favor longer or shorter documents, but to focus on what they contain. Hence we leverage the normalization that comes with it to consider how relevant the documents are to the query rather than how many times the words in the query are mentioned in a document.
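A bag-of-words ranking can be sketched in a few lines. The three documents and the query below are invented, and splitting on whitespace is a simplifying assumption in place of a real tokenizer.

```python
from collections import Counter
import numpy as np

# Invented documents; "long" mentions a query word more often than "short"
documents = {
    "short": "cosine distance compares vectors",
    "long": ("distance running is fun and long distance running over "
             "any distance needs training and rest and food and sleep"),
    "other": "a recipe for bread",
}
query = "cosine distance"

# Vocabulary: every distinct word seen in the documents or the query
all_texts = list(documents.values()) + [query]
vocab = sorted({word for text in all_texts for word in text.split()})

def bag_of_words(text):
    """Count how often each vocabulary word appears in the text."""
    counts = Counter(text.split())
    return np.array([counts[word] for word in vocab], dtype=float)

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

q = bag_of_words(query)
ranking = sorted(
    documents,
    key=lambda name: cosine_distance(q, bag_of_words(documents[name])),
)
print(ranking)
```

The long document mentions "distance" three times, so its raw dot-product with the query is larger than that of the short document. After normalization the short document still ranks first, because a larger share of its content matches the query.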

If we consider each word in a document as a feature and compute the cosine distance, it is the “hard” distance because we do not care about words with similar meanings (e.g., “document” and “passage” have similar meanings but not “distance”). Embedding vectors such as word2vec would allow us to consider the ontology. Computing the cosine distance with the meaning of words considered is the “soft cosine distance”. Libraries such as gensim provide a way to do this.
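To make the difference concrete, the sketch below implements the usual definition of soft cosine similarity, $a^\top S b / \sqrt{(a^\top S a)(b^\top S b)}$, directly in numpy. It does not use gensim's API, and the word-similarity matrix `S` is made up (in practice it would be derived from word embeddings).

```python
import numpy as np

vocab = ["document", "passage", "distance"]

# Hypothetical word-similarity matrix: "document" and "passage" are
# assumed to be 0.8 similar, "distance" is unrelated to both
S = np.array([
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

def hard_cosine(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def soft_cosine(a, b, S):
    """Cosine similarity where the matrix S relates different words."""
    return (a @ S @ b) / np.sqrt((a @ S @ a) * (b @ S @ b))

doc_a = np.array([1.0, 0.0, 0.0])   # contains only "document"
doc_b = np.array([0.0, 1.0, 0.0])   # contains only "passage"
doc_c = np.array([0.0, 0.0, 1.0])   # contains only "distance"

print(hard_cosine(doc_a, doc_b), soft_cosine(doc_a, doc_b, S))
print(hard_cosine(doc_a, doc_c), soft_cosine(doc_a, doc_c, S))
```

Under the hard measure the first pair looks unrelated because the two documents share no word, while the soft measure gives them credit for using words with similar meanings.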

Another use case of the cosine distance and vector space model is in computer vision. Imagine the task of recognizing hand gestures: we can make certain parts of the hand (e.g., the five fingers) the key points. Then, with the (x,y) coordinates of the key points laid out as a vector, we can compare with our existing database to see which cosine distance is the closest and determine which hand gesture it is. We need cosine distance because everyone's hand has a different size. We do not want that to affect our decision on what gesture it is showing.
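The effect of hand size can be illustrated with invented key points. The sketch assumes the coordinates are measured relative to a common origin such as the wrist (cosine distance ignores scale but not translation), and the gesture names and numbers are hypothetical.

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Hypothetical database: (x, y) of five fingertips relative to the wrist,
# flattened into one 10-dimensional vector per gesture
database = {
    "open_hand": np.array(
        [[-4, 3], [-2, 6], [0, 7], [2, 6], [4, 4]], dtype=float
    ).ravel(),
    "fist": np.array(
        [[-2, 2], [-1, 3], [0, 3], [1, 3], [2, 2]], dtype=float
    ).ravel(),
}

# The same open-hand gesture made by a hand half the size
observed = 0.5 * database["open_hand"]

by_cosine = min(database, key=lambda g: cosine_distance(observed, database[g]))
by_euclid = min(database, key=lambda g: np.linalg.norm(observed - database[g]))
print("Cosine distance picks:", by_cosine)
print("Euclidean distance picks:", by_euclid)
```

With these numbers the Euclidean distance is misled by the smaller hand, while the cosine distance still matches the gesture with the same shape.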

As you may imagine, there are many more examples where you can use this technique.

Summary

In this tutorial, you discovered the vector space model for measuring the similarities of vectors.

Specifically, you learned:

  • How to construct a vector space model
  • How to compute the cosine similarity and hence the cosine distance between two vectors in the vector space model
  • How to interpret the difference between cosine distance and other distance metrics such as Euclidean distance
  • What the uses of the vector space model are

 
