Sorting Search Results

How to use scoring functions to customize the order of search results

About Sorting Search Results

Are the most useful documents listed first? This one simple question sums up the field of search result ranking, a field to which considerable engineering effort has been devoted in recent years. Starting from the relatively simple matter of matching the search terms entered by the user, a wide variety of criteria can be brought to bear to calculate the best order in which to display the documents that contain the search terms.

Why Customize Sorting?

The Searchify server applies its own sorting techniques and returns results in order by age (most recent first). In many cases, this default behavior will be sufficient.

When the requirements of your application, user needs, or business logic demand it, you can customize Searchify's sorting technique to take control of the order in which search results are displayed. By fine-tuning the search output, you can improve the experience for users and also direct them towards the results you would prefer them to see.

How It Works

In a word: mathematics. With every search query, you can pass a scoring function, a mathematical formula that encapsulates your ranking preferences. This function overrides Searchify's default scoring formula. The formula can be as simple or as complex as you need it to be.

The implementation involves two parts:

  • Scoring Function: a custom sorting algorithm is expressed in a formula that is constructed using Searchify's built-in functions and values. If you have defined custom document variables, you can use them as well. For each document that matches the query's search terms, the scoring function is evaluated, substituting any needed values from that particular document. For details about scoring function syntax, see Scoring Function Formulas.
  • Query: the ID number of a scoring function can be passed as a parameter to any query. You can pass only one scoring function with each query. If needed, you can redefine scoring functions in real time between queries to reflect changing circumstances.
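The two parts above can be pictured with a small local simulation. This is only a sketch in plain Java, not the Searchify client: the RankingSketch class, its Doc type, and the sample values are hypothetical, and the units of age are arbitrary. It assumes that the document with the highest score is listed first, which is consistent with -age producing newest-first order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Local simulation only: none of these types belong to the Searchify client.
final class RankingSketch {

    // Hypothetical stand-in for a document that matched the query.
    static final class Doc {
        final String id;
        final double relevance;
        final double age;

        Doc(String id, double relevance, double age) {
            this.id = id;
            this.relevance = relevance;
            this.age = age;
        }
    }

    // Evaluate the formula once per matching document,
    // then list the highest-scoring document first.
    static List<String> rank(List<Doc> matches, ToDoubleFunction<Doc> formula) {
        List<Doc> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingDouble(formula).reversed());
        List<String> ids = new ArrayList<>();
        for (Doc d : sorted) {
            ids.add(d.id);
        }
        return ids;
    }

    // Made-up sample data for illustration.
    static List<Doc> sample() {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("old-but-relevant", 9.0, 86400.0));
        docs.add(new Doc("fresh-but-weak", 2.0, 60.0));
        docs.add(new Doc("middling", 5.0, 3600.0));
        return docs;
    }

    public static void main(String[] args) {
        // Same matches, two different formulas, two different orders.
        System.out.println(rank(sample(), d -> -d.age));
        System.out.println(rank(sample(), d -> d.relevance));
    }
}
```

The set of matching documents is the same in both calls; only the formula chosen for the query changes the order.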

Anatomy of a Scoring Function

At any given time, your application can have up to six scoring functions defined. The functions are named with the integers from 0 to 5. Function 0 is the default and will be applied if no other is specified; it starts out with an initial definition of -age, which sorts query results from most recently indexed to least recently indexed (newest to oldest).

Syntax Notes: Variable and function names are case sensitive. The following are all floats: expressions (except conditions), variable values,

A scoring function is an expression built up from some or all of the following components:

  • relevance: a numeric score indicating how closely the document matches the search terms. Calculated by the Searchify server based on how often each search term appears in the text, and whether the text contains all of the terms.
  • age: a number that tells how fresh the document is. Calculated based on the document's timestamp field, which contains either Searchify's automatic timestamp (indicating the time when the document was indexed) or a custom timestamp you have applied yourself.
  • Document variables: custom values that you have associated with documents in the index. For example, you might store a count of how many users commented on a document.
  • Query variables: values that are passed in with a query. For example, the user's location.
  • Functions: built-in mathematical and flow control functions provided by Searchify, including max(), min(), miles(), km(), if(), and more.
  • Operators: + - * /
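To see how these components combine arithmetically, here is a rough local calculation in plain Java. It is not Searchify code: the WeightingSketch class and the sample numbers are hypothetical, and it assumes that a logarithm in a formula behaves like the natural logarithm (Java's Math.log). The formula has the shape relevance * log(comment count), where the comment count plays the role of a document variable.

```java
// Local arithmetic only; this is plain Java, not the Searchify client.
final class WeightingSketch {

    // Mirrors the shape of a formula such as "relevance * log(doc.var[0])".
    // Assumption: log() is the natural logarithm, as Math.log is in Java.
    static double score(double relevance, double commentCount) {
        return relevance * Math.log(commentCount);
    }

    public static void main(String[] args) {
        double quiet = score(4.0, 10.0);   // document with 10 comments
        double busy = score(4.0, 1000.0);  // document with 1000 comments
        System.out.printf("quiet=%.2f busy=%.2f ratio=%.2f%n",
                quiet, busy, busy / quiet);
    }
}
```

With equal relevance, a hundredfold difference in comments becomes roughly a threefold difference in score, so a popular document gets a boost without drowning out textual relevance. Note that Math.log(1) is 0 and Math.log(0) is negative infinity, so under this assumption very small counts need care.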

Quick Start: Copy & Paste

Tweak the following code snippets for your needs, and you're ready to go.

Before You Start:
  • Download and instantiate the Java client, if you have not already done so.
  • Know your index's public URL. You'll need it to instantiate the client. Find the public URL on the Dashboard.

Define the Scoring Function

The possibilities are limitless, but here are a few ideas to get you started.

// Make function 0 sort most recent first
index.addFunction(0, "-age");

// Make function 1 sort by textual relevance
index.addFunction(1, "relevance");

// Make function 2 sort by distance between document and
// a geographic location passed in the query, sorting
// from nearest to farthest.
// doc.var[0] and doc.var[1] have the lat/long of the document.
// query.var[0] and query.var[1] have the latitude and
// longitude of an outside location (probably the user's locale).
index.addFunction(2, "-miles(query.var[0], query.var[1], doc.var[0], doc.var[1])");

// Make function 3 sort by a combination of textual
// relevance and two document variables, which are
// given different weights by the use of the log() function
index.addFunction(3, "relevance * log(doc.var[0]) * doc.var[1]");

// Make function 4 sort by the greater of document variable 0 or 1
index.addFunction(4, "max(doc.var[0], doc.var[1])");
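Function 2 negates the distance so that nearer documents receive higher scores. The sketch below illustrates that idea locally in plain Java. How Searchify computes miles() internally is not stated here, so the haversine formula, the Earth radius constant, the DistanceSketch class, and the sample coordinates are all assumptions made for illustration.

```java
// Local illustration only; this is not how Searchify is known to implement miles().
final class DistanceSketch {

    // Assumed mean Earth radius in miles.
    static final double EARTH_RADIUS_MILES = 3958.8;

    // Great-circle distance between two lat/long points (haversine formula).
    static double miles(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
    }

    // Same shape as function 2: a shorter distance gives a higher (less negative) score.
    static double score(double queryLat, double queryLon, double docLat, double docLon) {
        return -miles(queryLat, queryLon, docLat, docLon);
    }

    public static void main(String[] args) {
        // Query location: San Francisco. Documents: Oakland and Los Angeles.
        double near = score(37.7749, -122.4194, 37.8044, -122.2712);
        double far = score(37.7749, -122.4194, 34.0522, -118.2437);
        System.out.println("Oakland score:     " + near);
        System.out.println("Los Angeles score: " + far);
    }
}
```

Because the Oakland score is closer to zero than the Los Angeles score, a highest-score-first ordering lists the Oakland document ahead of the Los Angeles one.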

Pass the Scoring Function in a Query

To indicate which scoring function to use, pass its ID number as a parameter to the query. This query will use scoring function 2:

index.search(Query.forString(query).withScoringFunction(2));

More Information