Alternative Local Search Algorithm Designs
Our Search Engineering team has been researching alternative algorithms for delivering highly relevant local search results for web and mobile users, and we’re pretty excited about our progress.
The existing approaches, in simplified terms, match keywords and category names against only the name, keyword and category data associated with each available business listing. While we’ve implemented a fair number of mechanisms to identify near matches, related matches and possible disambiguation choices, we felt there was more we could do to search our data effectively.
This current round of development has yielded an approach based on a couple of interesting methods: Free Search and Geo Density. Free Search is our TF-IDF (term frequency-inverse document frequency) based algorithm that searches all available data in our listing index and applies weights to fields based on our internal search logic. This lets our searches reach much deeper into our data set than the existing search method, so retrieval is more responsive to how people might casually describe what they are searching for.
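As a rough illustration of field-weighted TF-IDF scoring (a minimal sketch, not the production implementation; the field names, weights, smoothing formula and sample listings below are all hypothetical):

```python
import math
from collections import Counter

# Hypothetical field weights; the real internal weights are not published.
FIELD_WEIGHTS = {"name": 3.0, "category": 2.0, "keywords": 1.5, "description": 1.0}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_idf(listings):
    """Smoothed inverse document frequency over every field of every listing."""
    n = len(listings)
    doc_freq = Counter()
    for listing in listings:
        terms = set()
        for text in listing.values():
            terms.update(tokenize(text))
        doc_freq.update(terms)
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}


def score(query, listing, idf):
    """Sum of weight * term frequency * IDF across all fields and query terms."""
    total = 0.0
    for term in tokenize(query):
        for field, weight in FIELD_WEIGHTS.items():
            tf = tokenize(listing.get(field, "")).count(term)
            total += weight * tf * idf.get(term, 0.0)
    return total


# Made-up sample listings, for illustration only.
LISTINGS = [
    {"name": "Blue Door Cafe", "category": "coffee shop",
     "keywords": "espresso wifi", "description": "quiet place to study"},
    {"name": "Pasadena Sofa Outlet", "category": "furniture",
     "keywords": "sofa couch", "description": "discount furniture"},
]

idf = build_idf(LISTINGS)
ranked = sorted(LISTINGS, key=lambda l: score("quiet place to study", l, idf), reverse=True)
print(ranked[0]["name"])
```

Because every field contributes to the score, a casual query such as "quiet place to study" can match a listing whose name and category never mention those words, while matches in heavily weighted fields such as the name still count for more.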
If Free Search represents our approach to delivering a better “what” component of our searches, Geo Density is our new approach to improving the “where” component. When we build our indexes of our listing data, we also calculate the relative density of various listing types in the named geographies we can search in. For this beta example, we are calculating these densities against the geography of California, but we will soon expand this to support a much more granular breakdown of density (e.g., the search model will be able to understand that the density of coffee shops in Santa Monica, CA is different from the density of coffee shops in Mojave, CA). This density value allows us to dynamically set the radius of our search and apply a weighting (not a hard filter) to results within that radius.
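One way to picture the density-to-radius step is sketched below. This is an assumption-laden illustration, not the actual formula: it assumes the radius is chosen so that a circle of that size would be expected to contain a target number of matching listings, and it uses a simple inverse-square falloff as the soft weighting. The function names, the target of 10 results and the radius limits are all hypothetical.

```python
import math


def search_radius_km(density_per_km2, target_results=10, min_km=1.0, max_km=100.0):
    """Radius at which a circle is expected to hold `target_results` listings.

    Solves target = density * pi * r^2 for r, then clamps to [min_km, max_km].
    """
    if density_per_km2 <= 0:
        return max_km  # nothing known nearby: search as widely as allowed
    radius = math.sqrt(target_results / (math.pi * density_per_km2))
    return max(min_km, min(max_km, radius))


def distance_weight(distance_km, radius_km):
    """Soft weighting: 1.0 at the centre, 0.5 at the radius, never exactly 0."""
    return 1.0 / (1.0 + (distance_km / radius_km) ** 2)


# Dense listing types get a tight radius, sparse ones a wide radius.
dense_radius = search_radius_km(2.0)    # e.g. a common listing type in a city
sparse_radius = search_radius_km(0.01)  # e.g. a rare listing type
print(round(dense_radius, 2), round(sparse_radius, 2))
```

Because the weighting decays smoothly instead of cutting off, a strong match just outside the radius can still outrank a weak match inside it.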
This sort of smart and dynamic distance determination is key to a friendly local search system, as it provides a more intuitive scaling of the results, based on the availability of the thing being searched for and the specificity of the search.
For example, a search for “furniture” in Pasadena, CA will return a focused and relevant list of results in and around Pasadena, while a search for “IKEA” in Pasadena, CA will automatically scale out to include more IKEA stores in the results. The search system can do this quickly and efficiently because it understands the relative density of stores of this type in California and can infer the specificity of the search from the search term used.
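To make the specificity idea concrete, here is a small sketch of how a per-term density could be derived from listing counts. It is only an illustration under stated assumptions: the matching rule (substring match on name or category), the sample listings and the area figure are hypothetical and not taken from the real index.

```python
def term_density(term, listings, area_km2):
    """Listings per square kilometre whose name or category mentions the term."""
    needle = term.lower()
    matches = sum(
        1
        for listing in listings
        if needle in listing["name"].lower() or needle in listing["category"].lower()
    )
    return matches / area_km2


# Toy data for illustration only; not real index contents.
SAMPLE = [
    {"name": "IKEA", "category": "furniture"},
    {"name": "Pasadena Sofa Outlet", "category": "furniture"},
    {"name": "Oak and Pine Furniture", "category": "furniture"},
    {"name": "Blue Door Cafe", "category": "coffee shop"},
]
AREA_KM2 = 100.0  # hypothetical size of the geography being indexed

print(term_density("furniture", SAMPLE, AREA_KM2))
print(term_density("IKEA", SAMPLE, AREA_KM2))
```

A brand name matches fewer listings than the category it belongs to, so its density is lower, which is the signal that lets a search for a specific store scale out further than a search for a general category.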
As another example, take this search for “coffee shop” in Pasadena, CA and compare it to a similar search for “Starbucks” in Pasadena, CA. Both searches remain well focused on the intended geography, though the area will change slightly due to the specificity of the search term used.
As the examples above show, businesses in related categories sometimes appear in the result sets. This is due to the way the system includes and weights the first few categories associated with the initial query. This expansion of the search domain is an experiment to determine whether we can provide a more intuitive scope for searches, and it will evolve as we update the system.
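The category expansion described above might look something like the following sketch. The details are assumptions made for illustration: it takes the most common categories among the initial hits, weights each one relative to the most common, and pulls in other listings from those categories at a reduced score. The `top_n` and `boost` parameters and the sample data are hypothetical.

```python
from collections import Counter


def expansion_categories(initial_hits, top_n=3):
    """Map the top categories among the initial hits to a weight in (0, 1]."""
    counts = Counter(hit["category"] for hit in initial_hits)
    if not counts:
        return {}
    top = counts.most_common(top_n)
    best = top[0][1]
    return {category: count / best for category, count in top}


def expand(initial_hits, all_listings, top_n=3, boost=0.5):
    """Return (score, name) pairs for related listings not already in the hits."""
    weights = expansion_categories(initial_hits, top_n)
    seen = {hit["name"] for hit in initial_hits}
    extra = []
    for listing in all_listings:
        if listing["name"] in seen:
            continue
        weight = weights.get(listing["category"])
        if weight is not None:
            extra.append((boost * weight, listing["name"]))
    return sorted(extra, reverse=True)


# Made-up listings, for illustration only.
hits = [
    {"name": "A", "category": "coffee shop"},
    {"name": "B", "category": "coffee shop"},
    {"name": "C", "category": "bakery"},
]
everything = hits + [
    {"name": "D", "category": "bakery"},
    {"name": "E", "category": "hardware"},
]
print(expand(hits, everything))
```

Keeping the expanded listings at a lower score than direct matches means related businesses can appear in the results without displacing the listings that matched the query itself.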
We will be updating the engine soon to increase its accuracy and sensitivity, so please leave us feedback and check back for an update.
Note: Only California listings are included in our testing at this time.