Foreword - Relevance Ranking for Vertical Search Engines, FIRST EDITION (2014)

Foreword

As the information available on the Internet continues to grow, Web searchers increasingly run into a critical but not broadly discussed challenge: finding relevant, rich results for highly targeted and specialized queries.

General Web searching has been with us for a long time, has spawned international powerhouse companies like Yahoo! and Google, and is a staple of everyday life for hundreds of millions of people around the world. It is probable that literally billions of generic searches—for general information, celebrities, sports scores, common products, and other items of interest—are well satisfied by the commonly known major search engines whose names are so familiar as to have become part of the vernacular. But since the Internet has become the business- and daily-life critical worldwide resource it is today, increasingly diverse groups of people are relying on it to look for more and more diverse things—things not so easily found by generic Web search engines.

For example, searches focused on travel planning may often have very specific but implicit assumptions about results, such as the expectation of itineraries listed in order of departure or cost. They might also benefit from additional nonobvious information that could be critically helpful, such as changes in checked baggage policies, road construction near relevant airports, or State Department travel warnings. General search engines have no more clue about these things than they do about dosages of medications, celebrity divorces, slugging percentages, and other narrow, domain-specific information; they don’t have access to site-specific signals. As a result, the role of so-called “vertical” search engines, which focus on specific segments of online content and deep site-specific information, has quietly increased to become essential to most people looking for key items online.

Somewhat hidden from view but no less important than general Web search technology, vertical search algorithms have been key to helping users find household products they care about, movies they want to see, potential dating partners, their perfect automobiles, well-matched insurance policies, and thousands of other things for which generalized text processing algorithms are not well tuned and cannot make the right judgments of relevance for results.

In this context, the term vertical is usually taken to connote in-depth treatment of fairly narrow domains, e.g., medical information, rather than broad ranges of information that meet a very wide range of needs (as you will see, for the purposes of this book, vertical can also refer to a limited range of search result types, such as entities or measurements or dates, or specific types of information access modalities, such as mobile search). What sets vertical search apart from more general, broad-based search is the fact that relatively specific domain knowledge can be leveraged to find the right pieces of information. Further, an understanding of a more limited set of information-seeking tasks (for example, looking for specific kinds of football statistics for your fantasy team) can also play an important role in satisfying narrower information needs. With fewer but very common tasks carried out by users, it may be easier to infer a user’s intent for a particular vertical search, which could dramatically improve the quality and value of the results for the user.

In addition to the opportunity to provide highly relevant and richer results to information seekers, vertical search engines that focus on specific segments of online content have shown great potential to offer advertisers more contextually relevant, better-targeted audiences for their ads. Given the dependence of the Internet search industry on advertising, this makes vertical search an economically central part of the Internet’s future. There is no doubt that vertical search is starting to play a role whose significance was probably never imagined in the early days of Internet searching.

As with other forms of search, the heart of successful vertical search is relevance ranking. Specialized understanding of the domain and sophisticated ranking algorithms are critical. Algorithms that work hard to infer a user’s intention when doing a search are the ones that are successful. The ability to use the right signals and successfully compare various aspects of a query and potential retrieved results will make or break a search engine. And that is the focus of this book: introducing and evaluating the critical ranking technology needed to make vertical searching successful.

Although there exist many books on general Web search technology, this new volume is a unique resource, dedicated to vertical search technologies and the relevance ranking technology that makes them successful. The book takes a comprehensive view of this area and aims to become an authoritative source of information for search scientists, engineers, and other interested readers with a technical bent. Despite many years of research on algorithms and methods of general Web search, vertical search deserves its own dedicated study and in-depth treatment because of the unique nature of its structures and applications. This volume provides that focused treatment, covering key issues such as cross-vertical searching, vertical selection and aggregation, news searches, object searches, image searches, and medical domain searches.

The authors represented in this book are active researchers who cover many different aspects of vertical search technology and who have made tangible contributions to the progress of what is clearly a dynamic research frontier. This ensures that the book is authoritative and reflects the current state of the art. Nevertheless—and importantly—the book gives a balanced treatment of a wide spectrum of topics, well beyond the individual authors’ own methodologies and research specialties.

The book presents in-depth and systematic discussions of theories and practices for vertical search ranking. It covers the obvious major fields as well as recently emerging areas for vertical search, including news search ranking, local search ranking, object search ranking, image search ranking, medical domain search ranking, cross-vertical ranking, and vertical selection and aggregation. For each field, the book provides state-of-the-art algorithms with detailed discussions, including background, derivation, and comparisons. The book also presents extensive experimental results on various real application datasets to demonstrate the performance of various algorithms as well as guidelines for practical use of those algorithms. It introduces ranking algorithms for various vertical search ranking applications and teaches readers how to manipulate ranking algorithms to achieve better results in real applications. Finally, the book provides thorough theoretical analysis of various algorithms and problems to lay a solid foundation for future advances in the field.

Vertical search is still a fairly young and dynamic research field. This volume offers researchers and application developers a comprehensive overview of the general concepts, techniques, and applications of vertical search and helps them explore this exciting field and develop new methods and applications. It may also serve graduate students and other interested readers as a general introduction to the state of the art of this promising research area. It uses plain language with detailed examples, including case studies and real-world, hands-on examples, to explain the key concepts, models, and algorithms used in vertical search ranking. I think that overall you will find it quite readable and highly informative.

Although not widely known, vertical search is an essential part of our everyday lives on the Internet. It is increasingly critical to users’ satisfaction with, and growing reliance on, online data sources, and it provides extraordinary new opportunities for advertisers. Given recent growth in the application of vertical search and our increasing daily reliance on it, you hold in your hands the first guidebook to the next generation of information access on the Internet. I hope you enjoy it.

—Ron Brachman

Chief Scientist and Head, Yahoo! Labs