What is Query Expansion? 💡
Query Expansion is the process of enhancing a user’s original search query by adding synonyms or related terms. The expanded query covers more ways of phrasing the same need, so the search engine can also retrieve relevant documents that use different wording.
📌 In Our System:
- 🔹 Closely linked to Statistical Machine Translation.
- 🔹 Driven by a context map storing synonyms, usage contexts, and relevance scores. ✅
📊🗺️ The Role of the Context Map & Synonym Scoring
What Is the Context Map?
The context map (580) is a data structure that contains words from past queries along with potential synonyms. Each potential synonym is stored with:
- 🔹 Associated Context: Information about words appearing before and after the original word.
- 🔹 A Score: A numerical value representing the likelihood that a synonym is appropriate in a given context.
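One way to picture such a structure is the minimal sketch below. The class name `SynonymEntry`, the field names, and the score values are hypothetical and chosen only for illustration; the source does not specify how the context map is laid out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SynonymEntry:
    """One candidate synonym for a word (hypothetical layout)."""
    synonym: str
    left_context: tuple[str, ...]   # words seen before the original word
    right_context: tuple[str, ...]  # words seen after the original word
    score: float                    # likelihood the synonym fits this context


# Illustrative contents only; the scores are made up.
context_map: dict[str, list[SynonymEntry]] = {
    "tie": [
        SynonymEntry("knot", ("how", "to"), ("a",), 0.8),
        SynonymEntry("windsor", (), ("a", "windsor", "knot"), 0.3),
    ],
}

# Look up all candidates recorded for a word.
candidates = context_map.get("tie", [])
```

Keying the map by the original word keeps lookup cheap at query time; the contexts and scores are then only compared among that word's candidates.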
How Is the Score Determined?
The score is derived from the translation likelihood produced by the machine translation model, which includes:
- 🌍 Language Probability: How natural the output is in the target language.
- 🔀 Translation Probability: The likelihood that the output text is a valid translation of the input.
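As a rough sketch, the two components could be combined as a product, which is the usual noisy-channel formulation in statistical machine translation. The source does not say how they are combined, so the product, the function name `synonym_score`, and the use of log space are all assumptions.

```python
import math


def synonym_score(language_prob: float, translation_prob: float) -> float:
    """Return log(language_prob * translation_prob).

    Assumes the two probabilities are simply multiplied; working in log
    space avoids underflow when many small probabilities are involved.
    """
    if language_prob <= 0.0 or translation_prob <= 0.0:
        return float("-inf")  # an impossible candidate gets the lowest score
    return math.log(language_prob) + math.log(translation_prob)
```

Under this assumption, a candidate that reads naturally but is an unlikely translation (or the reverse) still ends up with a low combined score.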
Example:
For the word “tie”, the context map might list:
- 🪢 “knot” (high score, fits well in “how to tie a”)
- 👔 “windsor” (lower score, more relevant in “tie a windsor knot”)
🔎 The Query Expansion Process
Step 5: Match Contexts and Select the Best Synonym (Step 650)
- 🔹 Context Matching: The system compares the word’s context in the original query to those in the context map.
- 🔹 Left Context: Example: “how to tie a bow” → “how to tie a”
- 🔹 Right Context: Example: “how to tie a bow” → “a bow”
Among the synonyms whose stored context matches the query, the one with the highest score is selected.
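The matching and selection step might look like the sketch below. The matching rule used here (the stored left context must end the words before the target, the stored right context must begin the words after it) is an assumption, as are the function names and the example scores.

```python
def ends_with(words, ctx):
    """True if the tuple `words` ends with the tuple `ctx`."""
    return len(ctx) <= len(words) and words[len(words) - len(ctx):] == ctx


def starts_with(words, ctx):
    """True if the tuple `words` begins with the tuple `ctx`."""
    return words[:len(ctx)] == ctx


def select_synonym(query, word, entries):
    """Pick the highest-scoring synonym whose context matches the query.

    `entries` holds (synonym, left_context, right_context, score) tuples.
    Returns None when the word is absent or no stored context matches.
    """
    tokens = tuple(query.split())
    if word not in tokens:
        return None
    i = tokens.index(word)
    left, right = tokens[:i], tokens[i + 1:]
    matching = [e for e in entries
                if ends_with(left, e[1]) and starts_with(right, e[2])]
    if not matching:
        return None
    return max(matching, key=lambda e: e[3])[0]


# Hypothetical entries for "tie"; the scores are illustrative.
TIE_ENTRIES = [
    ("knot", ("how", "to"), ("a",), 0.8),
    ("windsor", (), ("a", "windsor", "knot"), 0.3),
]

best = select_synonym("how to tie a bow", "tie", TIE_ENTRIES)
```

Filtering by context before comparing scores means a high-scoring synonym is never chosen for a query in which its context does not occur.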
Step 6: Expand the Query with the Selected Synonym (Step 660)
- 🔹 Appending: Simply add the synonym to the query.
- 🔹 Reformulation: Use logical OR to include synonyms.
Example: “how to tie a bow” → “how to (tie OR knot) a bow”
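Both expansion styles can be sketched in a few lines. The function names are hypothetical, and the `(a OR b)` syntax simply mirrors the example above; a real search backend may use a different query syntax.

```python
def append_synonym(query, synonym):
    """Appending: add the synonym to the end of the query."""
    return f"{query} {synonym}"


def reformulate_with_or(query, word, synonym):
    """Reformulation: replace each occurrence of `word` with (word OR synonym)."""
    return " ".join(
        f"({token} OR {synonym})" if token == word else token
        for token in query.split()
    )


appended = append_synonym("how to tie a bow", "knot")
reformulated = reformulate_with_or("how to tie a bow", "tie", "knot")
```

Here `reformulated` is the string from the example, "how to (tie OR knot) a bow", while `appended` is "how to tie a bow knot". Reformulation keeps the synonym tied to the position of the original word, whereas appending adds it as an extra term.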
Step 7: Use the Expanded Query to Search the Corpus (Step 670)
The expanded query matches documents that contain either the original word or the selected synonym, so the results are more comprehensive than those for the original query alone.
📌 On-Line vs. Off-Line Translation for Query Expansion
🖥️ On-Line (Synchronous) Translation
The search query is translated on the fly.
- 🔹 How It Works: The system identifies synonyms by comparing original and translated queries.
- 🔹 Example: “how to become a mason” → “how to be a bricklayer”, which reveals “become”/“be” and “mason”/“bricklayer” as candidate synonym pairs.
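Comparing the original and translated queries can be approximated with a token-level diff. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever alignment the system actually performs; the function name `synonym_pairs` is hypothetical, and only replaced spans of equal length are paired, which is a simplification.

```python
from difflib import SequenceMatcher


def synonym_pairs(original, translated):
    """Return (original_word, translated_word) pairs where the queries differ."""
    src, dst = original.split(), translated.split()
    matcher = SequenceMatcher(a=src, b=dst, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Pair words only when a replaced span has the same length on both sides.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            pairs.extend(zip(src[i1:i2], dst[j1:j2]))
    return pairs


pairs = synonym_pairs("how to become a mason", "how to be a bricklayer")
```

For the example query this yields the pairs ("become", "be") and ("mason", "bricklayer"); unchanged words such as "how", "to", and "a" are skipped.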
📁 Off-Line (Asynchronous) Translation
Translations for batches of queries are pre-computed and stored.
- 🔹 How It Works: New queries use pre-existing translations for expansion.
- 🔹 Benefit: More efficient for resource-intensive translation tasks.
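The off-line variant can be sketched as a table built once from a batch of already translated queries and only read at query time. The function names, the word-by-word pairing of equal-length queries, and the sample batch are assumptions made for illustration.

```python
from collections import defaultdict


def build_synonym_table(translated_pairs):
    """Off-line step: collect word-level synonyms from (query, translation) pairs."""
    table = defaultdict(set)
    for original, translated in translated_pairs:
        src, dst = original.split(), translated.split()
        if len(src) != len(dst):
            continue  # simplification: only pair queries of equal length
        for s, d in zip(src, dst):
            if s != d:
                table[s].add(d)
    return dict(table)


def candidates_for(query, table):
    """Query-time step: look up stored synonyms without translating anything."""
    return {w: sorted(table[w]) for w in query.split() if w in table}


# Pre-computed batch (illustrative data).
stored = build_synonym_table([
    ("how to become a mason", "how to be a bricklayer"),
])

found = candidates_for("mason tools", stored)
```

The new query "mason tools" was never translated itself, yet it still picks up "bricklayer" from the stored table, which is where the efficiency gain comes from.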
🏆 Real-World Example: Expanding a Query
Example Query:
Original Query: “how to tie a bow”
- 🔹 Word Selected for Expansion: “tie”
- 🔹 Context Map Lookup: “knot” (high score), “windsor” (lower score).
- 🔹 Reformulated Query: “how to (tie OR knot) a bow”
- 🔹 Outcome: Improved search results including both terms.