Internet search engines have evolved considerably since their beginnings, moving well beyond merely matching words. According to web application security course specialists from the IICS, these tools have reached the point where a user can ask Google “What is the height of the tower in Paris?” and the search engine will know that the question refers to the Eiffel Tower (which is over 320 meters tall). The user does not even need to know the name of the tower; a reference is enough, and the search engine does the rest.
The secret behind this improvement is machine learning: machine learning algorithms are used to build long sequences of numbers (vectors) that represent data such as the text on a web page or multimedia content. Tools like Microsoft’s search engine Bing store millions of these vectors for all the different kinds of content the search engine indexes.
According to the web application security course specialists, Microsoft developed an algorithm called “Space Partition Tree and Graph” (SPTAG) to search these vectors. It finds what the company calls “approximate nearest neighbors” (ANN), that is, the stored vectors most similar to the vector for what the user entered in the search engine.
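The idea of trading a little accuracy for speed can be illustrated with a toy partition-based search. The sketch below is not SPTAG's actual algorithm (which combines space-partition trees with a neighborhood graph); it only assumes a simple scheme in which vectors are grouped into cells around a few pivot vectors and a query scans only the cells nearest to it. All names (`build_partitions`, `approximate_nearest`, `probes`) are hypothetical and the data is random.

```python
import math
import random

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_partitions(vectors, pivots):
    """Assign each vector index to the cell of its nearest pivot."""
    cells = {i: [] for i in range(len(pivots))}
    for idx, vec in enumerate(vectors):
        nearest = min(range(len(pivots)), key=lambda p: distance(vec, pivots[p]))
        cells[nearest].append(idx)
    return cells

def approximate_nearest(query, vectors, pivots, cells, probes=1):
    """Scan only the `probes` cells whose pivots are closest to the query."""
    order = sorted(range(len(pivots)), key=lambda p: distance(query, pivots[p]))
    candidates = [idx for p in order[:probes] for idx in cells[p]]
    if not candidates:
        return None
    return min(candidates, key=lambda idx: distance(query, vectors[idx]))

def exact_nearest(query, vectors):
    """Brute-force search: compare the query against every vector."""
    return min(range(len(vectors)), key=lambda idx: distance(query, vectors[idx]))

# Random toy data (assumption): 2000 vectors with 8 dimensions, 16 pivots.
random.seed(0)
vectors = [[random.random() for _ in range(8)] for _ in range(2000)]
pivots = random.sample(vectors, 16)
cells = build_partitions(vectors, pivots)

query = [random.random() for _ in range(8)]
approx = approximate_nearest(query, vectors, pivots, cells, probes=4)
exact = exact_nearest(query, vectors)
print("approximate:", approx, "exact:", exact)
```

With `probes=4` the search inspects only the vectors in 4 of the 16 cells, so it may occasionally miss the true nearest neighbor; raising `probes` moves the result toward the exact answer at the cost of more comparisons.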
Going back to the example of the Eiffel Tower, this is how the user gets the desired answer without explicitly mentioning the name of the tower. The vectors closest to the search query correspond to pages about landmarks in Paris and, in particular, to content about the Eiffel Tower.
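The Eiffel Tower example can be sketched with a handful of hand-made vectors. The numbers below are invented for illustration only; real search engines use vectors learned by machine learning models, with many more dimensions. Here each of the four dimensions is assumed to stand loosely for "Paris", "tower", "height" and "food", and similarity is measured with cosine similarity, one common choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical page vectors; dimensions: [paris, tower, height, food].
pages = {
    "Eiffel Tower facts": [0.9, 0.9, 0.8, 0.0],
    "Paris restaurants": [0.9, 0.0, 0.0, 0.9],
    "Tower of London history": [0.0, 0.9, 0.3, 0.0],
}

# Hypothetical vector for "What is the height of the tower in Paris?"
query = [0.8, 0.8, 0.9, 0.0]

# Rank pages from most to least similar to the query.
ranked = sorted(pages, key=lambda title: cosine_similarity(query, pages[title]),
                reverse=True)
best = ranked[0]
print(best)
```

The query never contains the word "Eiffel", yet the page about the Eiffel Tower ranks first because its vector points in nearly the same direction as the query's.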
Microsoft has just released the SPTAG algorithm as open source under the MIT license through GitHub. Developers will thus be able to use the algorithm to search their own sets of vectors simply and quickly; according to the web application security course specialists, a single computer can work with 250 million vectors and respond to more than a thousand searches per second.
Experts from the International Institute of Cyber Security (IICS) say that Microsoft has been trying to put artificial intelligence within reach of all developers by offering a single tool that can be applied to a wide range of problems and tasks; as developers’ workloads grow, they will be able to use the SPTAG algorithm to improve their own projects and services.