What is a Candidate Answer Passage?
A Candidate Answer Passage is a text segment that a search engine, like Google, identifies as a potential answer to a user's search query.
How does it work?
- User Query: A user enters a question or query into the search bar.
- Scanning: Google scans millions of web pages to find relevant information.
- Extraction: It extracts candidate answer passages, text snippets that might contain the answer.
- Evaluation Criteria:
  - Relevance to the query
  - Authority of the webpage
  - Overall context and clarity
Why is this important?
- Candidate answer passages power Featured Snippets.
- They help search engines deliver quick, accurate answers.
Example:
Imagine you're searching on Google:
π "How to bake a chocolate cake?" π
Google scans millions of web pages looking for text segments that might answer your question. Each of these segments is a Candidate Answer Passage.
Examples of Candidate Answer Passages:
- A recipe blog might say: "To bake a chocolate cake, first preheat your oven to 350 degrees..."
- A cooking forum might say: "When I bake a chocolate cake, I always make sure to..."
How does Google choose the best one?
- Relevance: Does it directly answer the query?
- Source authority: Is it from a trusted website?
Reference: Google Patent No. US9940367B1

Search Engine Answer Scoring Process
When someone enters a question (query) into a search engine, the system needs to find the most relevant answers from multiple resources (web pages, documents, etc.). The challenge is determining which of the possible answer passages from those resources is the best match for the question.
To solve this, the system assigns scores to the candidate answer passages by evaluating several factors, which help it decide the best answer to show to the user.
Steps in the Process:
1. Receiving the Question (Query) and Identifying Resources:
The system first identifies that the user's input is a question (e.g., "What is the capital of France?").
It then finds resources (like web pages or documents) that seem relevant to this question.
Example: The user searches for "What is the capital of France?"
The search system finds several web pages that mention France, capitals, geography, etc.
2. Receiving Candidate Answer Passages:
For each relevant resource, the system extracts specific passages that might contain an answer.
Example: The system picks a sentence from a web page: "Paris is the capital of France."
3. Scoring the Answer Passages:
For each passage, the system computes two main scores:
- Query Term Match Score: How well the words in the question match the words in the passage.
- Answer Term Match Score: How well the potential answer (e.g., "Paris") matches what the system expects based on past similar questions.
Example: For the sentence "Paris is the capital of France," the system sees a strong match for the terms "capital" and "France" (from the question), so it gets a high Query Term Match Score.
It also recognizes "Paris" as a likely answer for capital cities, so it gets a high Answer Term Match Score.
4. Generating the Final Answer Score:
The system combines these scores to produce a final Answer Score. The higher the score, the more confident the system is that this passage contains the correct answer.
Example: The passage "Paris is the capital of France" might receive a very high score because the words match well, and the answer is known to be correct.
Once the system scores all the potential answer passages, it presents the one with the highest score to the user.
Example Continued: If the user searches for "What is the capital of France?", the system will likely return "Paris is the capital of France" as the top result because that passage had the highest score based on the query and answer term match.
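To make the two match scores concrete, here is a minimal Python sketch, assuming a simple set-overlap model. The stopword list and the expected-answer set are hypothetical illustrations; the patent does not prescribe these exact data structures.

```python
# Toy illustration of the two match scores; not the patent's actual formulas.

def query_term_match_score(query_terms, passage_terms):
    """Count how many query terms also appear in the passage."""
    return sum(1 for term in query_terms if term in passage_terms)

def answer_term_match_score(expected_answers, passage_terms):
    """Count how many expected answer terms appear in the passage."""
    return sum(1 for answer in expected_answers if answer in passage_terms)

query = "what is the capital of france"
passage = "paris is the capital of france"
expected = {"paris"}  # hypothetically learned from past similar questions

stopwords = {"what", "is", "the", "of"}
query_terms = set(query.split()) - stopwords  # {"capital", "france"}
passage_terms = set(passage.split())

print(query_term_match_score(query_terms, passage_terms))  # 2
print(answer_term_match_score(expected, passage_terms))    # 1
```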

Visual Representation of Key Components
Query Question Processor: Recognizes the user input as a question and sends it to the system to retrieve relevant resources.
Answer Passage Generator: Extracts possible answer passages from the resources.
Answer Passage Scorer: Scores the passages using methods like term matching.
Query Dependent Scorer and Query Independent Scorer: Evaluate the passage from different perspectives (e.g., how well it matches the query or general quality of the passage).
Score Combiner: Combines the different scores to produce the final answer score.
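As a rough illustration of how these components could fit together, here is a schematic Python sketch. Every helper below is a deliberately naive stand-in (the patent does not disclose these implementations), but the data flow mirrors the component list above.

```python
# Schematic pipeline; each helper is a toy stand-in for a patent component.

def is_question(query):                        # Query Question Processor
    return query.strip().endswith("?")

def extract_passages(resource):                # Answer Passage Generator
    return [s.strip() for s in resource.split(".") if s.strip()]

def query_dependent_score(query, passage):     # Query Dependent Scorer (toy)
    query_terms = set(query.lower().rstrip("?").split())
    return len(query_terms & set(passage.lower().split()))

def query_independent_score(passage):          # Query Independent Scorer (toy)
    return 1.0 if passage[:1].isupper() else 0.5

def combine(dependent, independent):           # Score Combiner
    return dependent * independent

def answer_pipeline(query, resources):
    if not is_question(query):
        return None
    passages = [p for r in resources for p in extract_passages(r)]
    scored = [(combine(query_dependent_score(query, p),
                       query_independent_score(p)), p) for p in passages]
    return max(scored) if scored else None

print(answer_pipeline("What is the capital of France?",
                      ["Paris is the capital of France. France is in Europe."]))
# (5.0, 'Paris is the capital of France')
```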

Search Result Interface Example
This figure illustrates a typical search result interface based on a user query, "How far away is the moon?" It shows a featured answer passage at the top, as well as links to other related web pages below.
User Query Example: "How far away is the moon?"
Relevant Passage: The answer passage shown at 208 gives the distance: "238,900 miles (384,400 km)".
This is the candidate answer passage that the system has selected as the most relevant based on several factors:
- Query Term Match Score: The query terms "how far" and "moon" match terms in the passage.
- Answer Term Match Score: Since the query is asking for a distance, the system looks for a passage containing a distance measurement, such as "238,900 miles". This match would result in a high Answer Term Match Score.
Finally, the system presents this passage at the top of the search results because it has the highest Answer Score.

About the Moon Section
This figure shows a section titled "About the Moon", providing different passages about the moon's orbit, distance from the Earth, and other facts. This example serves as an illustration of how a search system could handle a user query related to the moon.
User Query Example: "How long does it take for the moon to orbit Earth?"
Relevant Passage: The passage that answers this question is labeled 334: "It takes about 27 days (27 days, 7 hours, 43 minutes, and 11.6 seconds) for the Moon to orbit the Earth."
This passage would be identified as a candidate answer passage because it matches terms from the query ("moon," "orbit," "Earth"). The system would compute the following scores for it:
- Query Term Match Score: How well the words in the query ("how long," "moon," "orbit," "Earth") match the words in the passage.
- Answer Term Match Score: Since the query is asking for a time duration, the system checks whether the passage contains an expected type of answer (in this case, a time period of 27 days). If this matches the system's understanding of the query's expected answer, the Answer Term Match Score would be high.
The final Answer Score for this passage would be high because both the query and answer terms are closely aligned, meaning this passage is likely to be presented to the user.

Detailed Process Steps
1. Receive a query that seeks an answer (Step 802):
The system identifies that the input query is a question query. This means the system knows the user is asking a question that requires a specific answer.
The system also gathers resources (e.g., web pages, documents) that are determined to be potentially responsive to this query.
Example: The user asks, "What is the capital of France?"
The system identifies that this is a question and fetches resources (like articles, web pages) about France or capital cities.
2. Receive candidate answer passages from the resources (Step 804):
From the fetched resources, the system extracts passages that are likely to contain an answer. These are called candidate answer passages.
Example: From a web page about France, a candidate answer passage might be: "Paris is the capital of France."
3. Determine a query term match score (Step 806):
The system calculates a query term match score for each candidate passage. This score measures how well the words in the query match the words in the passage.
Example: In the passage "Paris is the capital of France," the terms "capital" and "France" match the query terms exactly, resulting in a high query term match score.
4. Determine an answer term match score (Step 808):
Next, the system determines an answer term match score. This score evaluates whether the passage contains a likely answer based on the nature of the question.
Example: The passage contains the term "Paris", which is known to be the answer to the question "What is the capital of France?" Therefore, the passage receives a high answer term match score.
5. Calculate a query dependent score (Step 810):
The system then combines the query term match score and the answer term match score to create a query dependent score for each passage. This score reflects how well the passage matches the specific query and its expected answer.
Example: Since the passage matches both the question and the expected type of answer, it gets a high query dependent score.
6. Determine a query independent score (Step 812):
The system also calculates a query independent score. This score is based on factors that don't depend directly on the query, such as the general quality or relevance of the passage (e.g., how authoritative the source is, or whether the passage is commonly used to answer similar questions).
Example: The system might consider how frequently the passage has been selected as an answer in the past, giving it an additional boost.
7. Generate the final answer score (Step 814):
Finally, the system combines the query dependent score and the query independent score to generate an overall answer score for each passage. This score determines how likely it is that the passage is the correct answer to the query.
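A minimal sketch of steps 5 through 7, assuming the simplest combinations: a sum for the query dependent score (Step 810) and a product for the final score (Step 814). The patent permits other combination functions, so these are placeholders, and the numbers below reuse the earlier toy example.

```python
# Placeholder combinations; the patent allows other functions.

def query_dependent_score(query_term_match, answer_term_match):  # Step 810
    return query_term_match + answer_term_match

def final_answer_score(dependent, independent):                  # Step 814
    return dependent * independent

qtm, atm = 2, 1    # from the "Paris is the capital of France" sketch above
independent = 1.5  # hypothetical boost, e.g. for a frequently chosen passage

print(final_answer_score(query_dependent_score(qtm, atm), independent))  # 4.5
```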

Query Independent Scoring
FIG. 9 illustrates a flow diagram of an example process 900 for scoring answer passages based on query-independent features. This process is implemented in a data processing system, such as one or more computers in a search system, that executes the operations of the answer passage scorer.
The features in FIG. 9 are illustrative, and different numbers of scoring features can be used when calculating a query-independent score.
Process Overview
1. Accessing Data (902)
The system accesses candidate answer passages, resources, and resource data. These passages are generated from the top N ranked resources for a search query. N can vary but is often the number of search results displayed on the first page.
2. Determining Passage Unit Position Score (904)
The position of the answer passage within a resource affects the score. Higher placement (e.g., at the top of the page) results in a higher score.
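A minimal sketch of a position-based score, assuming a simple linear decay from the top of the page to the bottom; the patent only states that higher placement yields a higher score, not the exact function.

```python
# Linear position decay; the exact scoring function is an assumption.

def passage_unit_position_score(position, total_units):
    """position: 1-based index of the passage unit within the resource."""
    return (total_units - position + 1) / total_units

print(passage_unit_position_score(1, 10))   # 1.0 (top of the page)
print(passage_unit_position_score(10, 10))  # 0.1 (bottom of the page)
```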
3. Determining Language Model Score (906)
This score evaluates whether the passage follows a language model (e.g., grammar and sentence structure). Complete sentences receive higher scores than partial sentences. Structured content (like tables) may not be subject to this scoring.
Additional Language Model Consideration:
- The query-independent scorer checks whether the passage's text resembles historical answer passages.
- It uses an n-gram model (often a tri-gram model) to compare phrases.
- More matches to historical answers mean a higher quality score (see the sketch below).
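A minimal sketch of the tri-gram comparison, assuming the historical answer passages are available as plain text and using raw tri-gram overlap; a production system would use a trained n-gram language model rather than this direct set intersection.

```python
# Raw tri-gram overlap as a stand-in for an n-gram language model score.

def trigrams(text):
    words = text.lower().split()
    return {tuple(words[i:i + 3]) for i in range(len(words) - 2)}

def language_model_score(passage, historical_answers):
    passage_grams = trigrams(passage)
    if not passage_grams:
        return 0.0
    history_grams = set().union(*(trigrams(h) for h in historical_answers))
    return len(passage_grams & history_grams) / len(passage_grams)

history = ["the moon is approximately 238,900 miles from earth"]
print(language_model_score("The moon is approximately 238,900 miles away",
                           history))  # 0.8
```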
4. Determining Section Boundary Score (908)
Passages that cross formatting boundaries (e.g., paragraphs or section breaks) receive a penalty.
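A minimal sketch of the boundary penalty, assuming each candidate passage records how many sections of the source document it spans; the penalty value is illustrative.

```python
# Penalize passages stitched together across section breaks.

def section_boundary_score(sections_spanned, penalty=0.5):
    return 1.0 if sections_spanned <= 1 else penalty

print(section_boundary_score(1))  # 1.0: passage sits inside one section
print(section_boundary_score(2))  # 0.5: passage crosses a section break
```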
5. Determining Interrogative Score (910)
If a passage contains a question (e.g., "How far is the moon from Earth?"), it is generally less helpful. Declarative statements (e.g., "The moon is approximately 238,900 miles from Earth.") get a higher score.
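A minimal sketch, assuming a question can be detected from the surface cue of a question mark; the actual scorer likely uses richer sentence-type analysis.

```python
# Question-mark heuristic as a stand-in for interrogative detection.

def interrogative_score(passage):
    return 0.5 if "?" in passage else 1.0

print(interrogative_score("How far is the moon from Earth?"))  # 0.5
print(interrogative_score("The moon is approximately 238,900 "
                          "miles from Earth."))                # 1.0
```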
6. Determining Discourse Boundary Term Position Score (912)
A passage starting with contradictory or modifying words (e.g., "However," "Conversely," "On the other hand") gets a lower score. If these terms appear within the passage but not at the beginning, the score is higher. Passages without such terms get the highest score.
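A minimal sketch of the three-tier scoring just described; the discourse term list and the tier values are illustrative, not from the patent.

```python
# Three score tiers keyed to where a discourse boundary term appears.

DISCOURSE_TERMS = ("however", "conversely", "on the other hand")

def discourse_boundary_score(passage):
    text = passage.lower()
    if text.startswith(DISCOURSE_TERMS):
        return 0.2  # starts with a contradictory/modifying term: lowest
    if any(term in text for term in DISCOURSE_TERMS):
        return 0.6  # the term appears mid-passage: middle tier
    return 1.0      # no such terms: highest

print(discourse_boundary_score("However, the orbit varies."))               # 0.2
print(discourse_boundary_score("The orbit varies; however, it evens out.")) # 0.6
print(discourse_boundary_score("The moon orbits Earth in about 27 days."))  # 1.0
```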
7. Determining Resource Scores (914)
The system scores the resource from which the passage originates using:
- Ranking Score: Based on the resource's position in search rankings.
- Reputation Score: Measures trustworthiness and expertise.
- Site Quality Score: Evaluates the overall website quality.
Higher resource scores lead to a higher answer passage ranking.
Combining Scores
The query-independent scores can be combined in different ways, such as:
- Summing the scores
- Multiplying the scores
- Using other combination methods
This process ensures that high-quality and reliable answer passages are prioritized for better search results. The sketch below shows both the sum and the product style.
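Here is a minimal sketch of that combination step, assuming each sub-score has been normalized to [0, 1]; the weights and the choice between sum and product are illustrative.

```python
# Sum and product combiners for the query-independent sub-scores.

def combine_query_independent(scores, method="sum", weights=None):
    weights = weights or [1.0] * len(scores)
    if method == "sum":
        return sum(w * s for w, s in zip(weights, scores))
    product = 1.0
    for score in scores:
        product *= score
    return product

# position, language model, boundary, interrogative, discourse, resource
sub_scores = [1.0, 0.8, 1.0, 1.0, 0.6, 0.9]
print(combine_query_independent(sub_scores))                    # approx. 5.3
print(combine_query_independent(sub_scores, method="product"))  # approx. 0.432
```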

FIG. 10: Answer Passage Scoring Process
This process is executed by a data processing system (e.g., one or more computers in a search system) that performs the operations of the answer passage scorer.
Understanding the Answer Term Match Score
The answer term match score measures the similarity between the answer terms and the candidate answer passage. Users don't explicitly state what they are seeking in their queries, so the query-dependent scorer 142 helps by:
- Finding a set of likely answer terms.
- Comparing these terms to the candidate answer passage to generate a score.
Process Breakdown
1. Generating a List of Terms (1002)
The system extracts terms from the top-ranked resources and creates a term vector.
2. Calculating Term Weights (1004)
Each term receives a weight based on:
- The number of resources in which it appears.
- Its inverse document frequency (IDF) value.
3. Counting Term Occurrences (1006)
The system counts how often each term appears in the candidate answer passage.
4. Multiplying Term Weights (1008)
Each term's weight is multiplied by its occurrence count.
5. Calculating the Answer Term Match Score (1010)
The final score is determined by summing or multiplying the weighted values, as in the sketch below.
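A minimal sketch of steps 1002 through 1010, assuming the top-ranked resources are plain strings and using a textbook IDF formula. The patent lists the two weighting factors but does not fix how they are combined, so the product used below is an assumption.

```python
import math
from collections import Counter

def term_weights(resources):                                     # 1002-1004
    """Weight each term by resource count times IDF (assumed combination)."""
    n = len(resources)
    doc_freq = Counter()
    for doc in resources:
        doc_freq.update(set(doc.lower().split()))
    return {term: df * math.log(n / df) for term, df in doc_freq.items()}

def answer_term_match_score(passage, weights):
    counts = Counter(passage.lower().split())                    # 1006
    return sum(weights.get(term, 0.0) * count                    # 1008-1010
               for term, count in counts.items())

resources = ["paris is the capital of france",
             "france borders spain",
             "the eiffel tower is in paris"]
weights = term_weights(resources)
print(answer_term_match_score("Paris is the capital of France", weights))
```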
Additional Answer Term Features
- Entity Type Identification
Example: For the query "Who is the fastest man?", the expected entity type is "man".
- Matching Entities in Answer Passages
If no matching entity type is found, the answer term match score is reduced.
- Scoring Techniques
The score can be binary (1 = match, 0 = no match) or probabilistic; the sketch below shows the binary variant.
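A minimal sketch of the binary variant, assuming a tiny hand-made entity-type lexicon; a real system would use a named-entity recognizer or a knowledge-graph lookup instead.

```python
# Hypothetical lexicon mapping known entities to their types.
ENTITY_TYPES = {"usain bolt": "man", "paris": "city"}

def entity_match_score(expected_type, passage):
    """Return 1 if the passage mentions an entity of the expected type."""
    text = passage.lower()
    found = any(etype == expected_type and entity in text
                for entity, etype in ENTITY_TYPES.items())
    return 1 if found else 0  # binary: 1 = match, 0 = no match

print(entity_match_score("man", "Usain Bolt is the fastest man alive."))   # 1
print(entity_match_score("man", "Cheetahs are the fastest land animal."))  # 0
```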