Monte Carlo Methods for Managing Uncertain Enterprise Data

Fei Xu
Microsoft Research

Date: May 24, 2010 (Monday)
Place: E & C S BUILDING, Second Floor, Room 2120
Time: 10:00 (DONUTS), 10:10 (TALK)

Modern enterprises must manage uncertain data for purposes of risk assessment and decision making under uncertainty. In this talk, I will first discuss MCDB (the Monte Carlo Database System), a system for managing uncertain data that is based on a Monte Carlo approach. The Monte Carlo approach is well suited for managing uncertain enterprise data. MCDB can support industrial-strength business-intelligence queries over uncertain warehouse data. Moreover, MCDB's extensible approach to specifying uncertainty can also capture complex stochastic prediction models, allowing sophisticated "what-if" analyses within the DBMS.

However, Monte Carlo computations can be highly CPU-intensive, although they offer the potential for massive parallelization. To realize this potential, we developed a new system, called MC3 (Monte Carlo Computation on a Cluster), that extends the MCDB approach to the map-reduce processing framework. MC3 can exploit the robustness and scalability of map-reduce, and can handle data stored in non-relational formats. In the second half of the talk, I will discuss MC3.

(Joint work with Peter Haas, Kevin Beyer, Vuk Ercegovac, Bo Shekita, Chris Jermaine, Ravi Jampani, Luis Perez, Mingxi Wu)

Biography: Fei Xu obtained his PhD degree from the University of Florida in 2009. Since then, he has been a research software development engineer on the Bing Infrastructure Team at Microsoft. His research interests are in large-scale data management. He is especially interested in managing data uncertainty, cloud computing, and managing biological data.