Second ACM International Conference
Abstract

Building and operating large-scale information retrieval systems used by hundreds of millions of people around the world provides a number of interesting challenges. Designing such systems requires making complex design tradeoffs in a number of dimensions, including (a) the number of user queries that must be handled per second and the response latency to these requests, (b) the number and size of various corpora that are searched, (c) the latency and frequency with which documents are updated or added to the corpora, and (d) the quality and cost of the ranking algorithms that are used for retrieval. In this talk I'll discuss the evolution of Google's hardware infrastructure and information retrieval systems and some of the design challenges that arise from ever-increasing demands in all of these dimensions. I'll also describe how we use various pieces of distributed systems infrastructure when building these retrieval systems. Finally, I'll describe some future challenges and open research problems in this area.

About the speaker
Jeff joined Google in 1999 and is currently a Google Fellow in Google's Systems Infrastructure Group. He has co-designed/implemented five generations of Google's crawling, indexing, and query serving systems, and co-designed/implemented major pieces of Google's initial advertising and AdSense for Content systems. He has also been heavily involved in the design and implementation of Google's distributed computing infrastructure, including the MapReduce and BigTable systems, dabbled with system software for statistical machine translation, and worked on a variety of internal and external developer tools.