Curriculum Vitae

BERKANT BARLA CAMBAZOGLU

Work address
  Yahoo Labs Barcelona
  Avda. Diagonal 177, 8th Floor, 08018 Barcelona, Catalunya, Spain
  phone: +34 93 183 8830 fax: +34 93 183 8901
Home address
  Atlantida, 18, 1st floor, no 4, 08003 Barcelona, Catalunya, Spain
email: barla@yahoo-inc.com web: http://labs.yahoo.com/author/barla/
Work Experience
  • Senior Manager Research (Web Retrieval Group) Yahoo Labs, April 2013 – April 2015.
  • Senior Research Scientist Yahoo Labs, March 2012 – April 2013.
  • Research Scientist Yahoo Labs, July 2010 – March 2012.
  • Postdoctoral Researcher Yahoo Labs, January 2008 – July 2010. Supervisor: Ricardo Baeza-Yates.
  • Postdoctoral Researcher
    Dept. of Biomedical Informatics, Ohio State University, September 2006 – December 2007.
    Supervisor: Joel H. Saltz.
  • Postdoctoral Researcher
    Dept. of Computer Engineering, Bilkent University, January 2006 – September 2006.
    Supervisor: Cevdet Aykanat.
Education
  • PhD in Computer Science
    Dept. of Computer Engineering, Bilkent University, February 2000 – January 2006.
    Thesis: Models and algorithms for parallel text retrieval.
    Supervisor: Cevdet Aykanat.
  • MSc in Computer Science
    Dept. of Computer Engineering, Bilkent University, December 1997 – February 2000.
    Thesis: A hypergraph-partitioning-based remapping model for image-space parallel volume rendering.
    Supervisor: Cevdet Aykanat.
  • BSc in Computer Science
    Dept. of Computer Engineering, Bilkent University, September 1992 – December 1997.
Funded Projects
  • “Mining and understanding of multilingual content for intelligent sentiment enriched context and social oriented interpretation”, funded by the European Union 7th Framework Program (FP7), Principal Investigator, November 2013 – October 2016.
  • “Content aware searching, retrieval and streaming”, funded by the European Union 7th Framework Program (FP7), Principal Investigator, January 2010 – December 2012.
  • “In vivo imaging core middleware” developed under the caBIG project, funded by the National Cancer Institute (NCI), Investigator, September 2006 – September 2007.
  • “Search Engine for South-East Europe (SE4SEE)” application developed under the SEE-GRID project, funded by the European Union 6th Framework Program (FP6), Investigator, May 2004 – April 2006.
  • “Efficient parallel crawling of Web content”, funded by The Scientific & Technological Research Council of Turkey (TÜBİTAK) under project EEAG-103E028, Investigator, April 2004 – March 2006.
  • “Task scheduling algorithms for PC clusters”, funded by The Scientific & Technological Research Council of Turkey (TÜBİTAK) under project EEAG-199E013, Investigator, September 1999 – March 2002.
Tutorials
  • B. B. Cambazoglu and R. Baeza-Yates, “Scalability and efficiency challenges in large-scale web search engines”, Proceedings of the 24th ACM International Conference on Information and Knowledge Management, in press, Melbourne, Australia, October 2015.
  • B. B. Cambazoglu, “Effectiveness and efficiency issues in web retrieval systems”, 10th European Summer School in Information Retrieval, Thessaloniki, Greece, August/September 2015.
  • B. B. Cambazoglu and R. Baeza-Yates, “Scalability and efficiency challenges in large-scale web search engines”, Proceedings of the 8th ACM International Conference on Web Search and Data Mining, pp. 411–412, Shanghai, China, February 2015.
  • B. B. Cambazoglu and R. Baeza-Yates, “Scalability and efficiency challenges in large-scale web search engines”, Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, p. 1285, Gold Coast, Australia, July 2014.
  • B. B. Cambazoglu and R. Baeza-Yates, “Scalability and efficiency challenges in large-scale web search engines”, 3rd MUMIA Training School on Information Retrieval, Heraklion, Crete, Greece, July 2014.
  • R. Baeza-Yates and B. B. Cambazoglu, “Scalability and efficiency challenges in large-scale web search engines”, Proceedings of the 23rd International World Wide Web Conference, pp. 185–186, Seoul, South Korea, April 2014.
  • B. B. Cambazoglu and R. Baeza-Yates, “Scalability and efficiency challenges in commercial web search engines”, Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval, p. 1124, Dublin, Ireland, July 2013.
  • B. B. Cambazoglu, “Algorithmic techniques for reducing energy consumption of commercial web search engines”, The 2nd COST IC804 Training School on Energy Efficiency in Large Scale Distributed Systems, Palma de Mallorca, Spain, April 2012.
  • R. Baeza-Yates and B. B. Cambazoglu, “Distributed web retrieval”, The 20th International Conference on Advanced Information Systems Engineering, Montpellier, France, June 2008.
Posters
  • F. B. Sazoglu, I. S. Altingovde, R. Ozcan, B. B. Cambazoglu, and O. Ulusoy, “Propagating expiration decisions in a search engine result cache”, Proceedings of the 24th International World Wide Web Conference (Companion Volume), in press, Florence, Italy, May 2015.
  • X. Bai, B. B. Cambazoglu, F. Gullo, A. Mantrach, and F. Silvestri, “Exploiting search history of users for news personalization”, TechPulse’14 (internal Yahoo conference), Santa Clara, CA, December 2014.
  • F. B. Sazoglu, B. B. Cambazoglu, R. Ozcan, I. S. Altingovde, and O. Ulusoy, “Strategies for setting time-to-live values in result caches”, Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, pp. 1881–1884, San Francisco, CA, October 2013.
  • H. Guler, B. B. Cambazoglu, and O. Ozkasap, “Task allocation in volunteer computing networks under monetary budget constraints”, The Joint Workshop on Pricing and Incentives in Networks and Systems, Pittsburgh, PA, June 2013.
  • I. K. Paparrizos, B. B. Cambazoglu, and A. Gionis, “Machine learned job recommendation”, Proceedings of the 2011 ACM Conference on Recommender Systems, pp. 325–328, Chicago, IL, October 2011.
  • S. Alici, I. S. Altingovde, R. Ozcan, B. B. Cambazoglu, and O. Ulusoy, “Timestamp-based cache invalidation for search engines”, Proceedings of the 20th International World Wide Web Conference (Companion Volume), pp. 3–4, Hyderabad, India, March/April 2011.
  • S. Tatikonda, F. Junqueira, B. B. Cambazoglu, and V. Plachouras, “On efficient posting list intersection with multicore processors”, Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 738–739, Boston, MA, July 2009.
  • J. Kong, O. Sertel, J. Prescott, A. Ruiz, H. Shimada, G. Lozanski, J. Wang, D. Zhang, N. Mayr, W. Yuh, A. Shanaah, F. Racke, B. B. Cambazoglu, M. Ujaldon, U. Catalyurek, K. L. Boyer, J. Saltz, and M. N. Gurcan, “Medical image understanding: applications to histopathology and radiology”, A One-Day Ohio Imaging Symposium, Columbus, OH, October 2007.
  • A. Sharma, T. Pan, B. B. Cambazoglu, J. Permar, S. Hastings, M. N. Gurcan, T. M. Kurc, and J. Saltz, “IVI toolkit: grid-enabling biomedical imaging applications and image archives in Cancer Biomedical Informatics Grid (caBIG)”, Joint NCRI Informatics – caBIG Conference, London, UK, July 2007.
  • M. N. Gurcan, T. Pan, A. Sharma, B. Rutt, B. B. Cambazoglu, and J. Saltz, “GridImage: a grid computing platform for radiological and pathological image analysis”, The Annual Meeting of the Society for Imaging Informatics in Medicine, Providence, RI, June 2007.
  • O. Sertel, J. Kong, B. B. Cambazoglu, H. Shimada, U. V. Catalyurek, J. Saltz, and M. N. Gurcan, “Computerized analysis of pathological images: detection of stroma regions for neuroblastoma classification”, The 6th Annual OSUMC Graduate and Postgraduate Research Day, Columbus, OH, March 2007.
  • J. Kong, O. Sertel, B. B. Cambazoglu, H. Shimada, K. Boyer, J. Saltz, and M. N. Gurcan, “Computerized analysis of pathological images: grading of neuroblastic differentiation”, The 6th Annual OSUMC Graduate and Postgraduate Research Day, Columbus, OH, March 2007.
Academic Services
  • Workshop co-organizer: LSDS-IR’10, LSDS-IR’11, LSDS-IR’14.
  • Proceedings chair: WSDM’09, ECIR’12.
  • Poster chair: ECIR’12.
  • Area chair: SIGIR’13, SIGIR’14.
  • Session chair: CIKM’11, CIKM’13.
  • Mentor: SIGIR’11, ECIR’12.
  • Program committee member: ISCIS’08, ISCIS’09, WSDM’09, WWW’11, AICCSA’11, CIKM’11, TPDL’11, ECIR’12, WSDM’12, SIGIR’12, WSDM’13, ECIR’13 (poster track), LSDS-IR’13, OAIR’13, CIKM’13 (demo track), SIGIR’13, KDD’13, ICDM’13 (PhD student forum), HPC4BD’14, ECIR’14 (tutorials), WSDM’14, WWW’14, SIGIR’14, KDD’14, KDD’14 (industry & government track), HPC4BD’15, WWW’15, SIGIR’15 (short papers track), SIGIR’15 (demo track), KDD’15, KDD’15 (industry & government track), ECML-PKDD’15 (industry, government & NGO track), EMNLP’15.
  • Journal paper reviewer: IEEE TPDS, IEEE TKDE, ACM TOIS, ACM TOIT, ACM TWEB, IP&M, KAIS, WWW, COSREV, KER, CS&I.
Talks, Demos and Invited Lectures
  • B. B. Cambazoglu, “Query processing optimizations for multi-site web search engines”, invited talk at the Ohio State University, Columbus, OH, March 4, 2014.
  • B. B. Cambazoglu, “Spark: An entity ranking system for web search”, invited talk at Norwegian University of Science and Technology, Trondheim, Norway, February 21, 2014.
  • B. B. Cambazoglu, “Introduction to supervised machine learning concepts”, invited lecture at Norwegian University of Science and Technology, Trondheim, Norway, February 21, 2014.
  • B. B. Cambazoglu, “Spark: Entity ranking system in web search”, talk at TechPulse’13 (internal Yahoo conference), Santa Clara, CA, December 17–19, 2013.
  • B. B. Cambazoglu, “Entity ranking in Spark: An architectural overview”, talk at the Search Science Workshop, Sunnyvale, CA, March 8, 2013.
  • B. B. Cambazoglu, “Yahoo Boxes”, demonstration at TechPulse’12 (internal Yahoo conference), Santa Clara, CA, December 11–13, 2012.
  • B. B. Cambazoglu, “Performance challenges in large-scale Web search engines”, invited talk at Jornadas Técnicas de RedIRIS 2010, Córdoba, Spain, November 19, 2010.
  • B. B. Cambazoglu, “Improving search”, talk at Yahoo Search Science Workshop, Santa Clara, CA, February 25, 2009.
  • B. B. Cambazoglu, “YMiss!”, talk at Yahoo Search Workshop, Santa Clara, CA, July 30, 2008.
  • B. B. Cambazoglu, “In vivo imaging middleware and its applications”, talk at the 93rd Scientific Assembly and Annual Meeting of the Radiological Society of North America, Chicago, IL, November 27, 2007.
  • B. B. Cambazoglu, A. Sharma, T. Pan, “Federated query processing infrastructure”, demonstration at the caBIG In Vivo Imaging Workspace Meeting, Saint Louis, MO, October 11, 2007.
  • B. B. Cambazoglu, A. Sharma, T. Pan, “caMicroscope: remote pathology image viewer”, demonstration at the caBIG In Vivo Imaging Workspace Meeting, Saint Louis, MO, October 11, 2007.
  • T. Pan, J. Permar, A. Sharma, and B. B. Cambazoglu, “In vivo imaging middleware security infrastructure”, demonstration at the caBIG In Vivo Imaging Workspace Meeting, Washington, D.C., April 13, 2007.
  • T. Pan, A. Sharma, and B. B. Cambazoglu, “GridImage: grid computing to support human markup and multiple CAD systems”, talk and demonstration at the Annual caBIG Meeting, Washington, D.C., February 6, 2007.
  • T. Pan, A. Sharma, B. B. Cambazoglu, M. N. Gurcan, and J. Saltz, “VirtualPACS federation toolkit”, demonstration at the 92nd Scientific Assembly and Annual Meeting of the Radiological Society of North America, Chicago, IL, Nov. 27–Dec. 1, 2006.
  • B. B. Cambazoglu and E. Karaca, “SEE-GRID project regional application: search engine for South-East Europe”, talk and demonstration at the SEE-GRID Project Final Review Meeting, Istanbul, Turkey, May 3, 2006.
  • B. B. Cambazoglu, “The SE4SEE grid application”, invited talk at the TUBITAK/ULAKBIM National Grid Workshop, Ankara, Turkey, September 21, 2005.
  • B. B. Cambazoglu, “SE4SEE: a grid-enabled search engine”, talk at the SEE-GRID Project Review Meeting, Opatija, Croatia, May 30, 2005.
Patents
  • X. Bai, B. B. Cambazoglu, F. Gullo, A. Mantrach, and F. Silvestri, “Exploiting Search History of Users for News Personalization”, Filed as a patent application to the US Patent and Trademark Office.
  • J. Seedorf, M. Stiemerling, S. Niccolini, F. Junqueira, B. B. Cambazoglu, I. Kelly, V. Leroy, and M. Serafini, “Search engine and method for performing a search for objects that correspond to a search request”, Filed as a patent application to the US Patent and Trademark Office.
  • F. Junqueira, B. B. Cambazoglu, V. Plachouras, and S. Tatikonda, “Low-latency posting list intersection on multi-core architectures”, Patent granted by the US Patent and Trademark Office.
Defensive Publications
  • X. Bai, B. B. Cambazoglu, R. Baeza-Yates, and G. F. Medina, “A machine learned query forwarder to improve efficiency of distributed search engines”.
  • R. Blanco, I. Kelly, B. B. Cambazoglu, F. P. Junqueira, and V. Leroy, “Using cache invalidation to assign documents to indexes in a distributed search engine”.
  • U. Brefeld, B. B. Cambazoglu, and F. P. Junqueira, “Document assignment in multi-site search engines”.
  • X. Bai, B. B. Cambazoglu, and F. P. Junqueira, “Passively crawling through user feedback”.
Contributed Software
  • Gemini Search, 2014.
    Sponsored search system of Yahoo.
    Status: Under continuous development.
  • Bucket X, 2013.
    A search product to answer list queries.
    Status: Under bucket test.
  • Spark, 2012.
    Entity ranking system for Yahoo web search.
    Status: Deployed in production (http://search.yahoo.com).
  • Yahoo Audio Visual Answers, 2011.
    Web service for generating audio-visual answers to questions posted in Yahoo Answers.
    Status: Research prototype (internally available).
  • Yahoo Boxes, 2011.
    Software platform for injecting content from Yahoo web properties into web sites.
    Status: Research prototype (internally available).
  • Yahoo Virtual Toolbar, 2011.
    Within-page toolbar for federating different Yahoo search verticals.
    Status: Research prototype (internally available).
  • SciRank, 2009.
    Scientific collaboration network.
    Status: Under development (internally available).
  • YMiss!, 2008.
    Search engine relevance debugging tool.
    Status: Deployed in production (internally available).
  • caMicroscope, 2007.
    Grid-enabled remote pathology image viewer and image processing infrastructure.
    Status: Internally available.
  • In vivo imaging middleware, 2007.
    Middleware for interoperability between DICOM and caGrid.
    Status: The latest release available at http://gforge.nci.nih.gov/projects/middleware/.
  • VirtualPACS, 2006.
    Federation toolkit for DICOM databases distributed over the grid.
    Status: The latest release available at http://www.virtualpacs.org/.
  • SE4SEE (Search Engine for South-East Europe) search engine, 2006.
    Grid-enabled search engine, running on the SEE-GRID infrastructure.
    Status: Currently inactive.
  • kPaToH, 2005.
    Efficient direct K-way hypergraph partitioning toolkit.
    Status: Executables are available upon request.
  • Harbinger, 2003.
    Machine learning toolkit supporting a wide range of classifiers.
    Status: Available at http://bmi.osu.edu/~barla/coding/HMLT/.
  • Skynet, 2003.
    Experimental search engine, running on a 48-node PC cluster.
    Status: Internally available.
  • Heaven bulletin board service, 1998.
    Early social networking site.
    Status: Currently inactive.