Academic Events
Statistical Aggregation in Massive Data Environments
2013-06-06
Source: Office of Science and Technology
Speaker: Prof. Lin Nan
(Department of Mathematics, College of Arts & Sciences; Division of Biostatistics, School of Medicine; Washington University in St. Louis)
Time: June 6 (Thursday), 16:00-17:00
Venue: Room 708, Liberal Arts Building, North Zone 1
Abstract: Due to their size and complexity, massive data sets pose many computational challenges for statistical analysis, such as overcoming memory limitations and improving the computational efficiency of traditional statistical methods. In this talk, I will discuss the statistical aggregation strategy for addressing these challenges. Statistical aggregation partitions the entire data set into smaller subsets, compresses each subset into low-dimensional summary statistics, and aggregates the summary statistics to approximate the desired computation on the entire data set. Results from statistical aggregation are required to be asymptotically equivalent to those computed from the entire data set. Statistical aggregation is particularly useful for supporting sophisticated statistical analyses in online analytical processing with data cubes. I will detail its application to two large families of statistical methods: estimating equation estimation and U-statistics.
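The partition, compress, and aggregate steps described in the abstract can be illustrated with a minimal sketch. It is not the speaker's method: it assumes the simplest estimating-equation case, a least-squares line y = a + b*x, where each block is reduced to five sums. The names `Summary`, `compress`, `solve`, and `aggregate_fit`, the block size, and the synthetic data are all hypothetical choices made for illustration. In this linear special case the aggregated fit agrees with the full-data fit up to floating-point rounding; for the more general methods in the talk, the abstract claims only asymptotic equivalence.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    """Low-dimensional summary of one data block for the fit y = a + b*x."""
    n: int = 0
    sx: float = 0.0   # sum of x
    sy: float = 0.0   # sum of y
    sxx: float = 0.0  # sum of x*x
    sxy: float = 0.0  # sum of x*y

    def merge(self, other):
        """Aggregate two summaries by adding their components."""
        return Summary(self.n + other.n, self.sx + other.sx,
                       self.sy + other.sy, self.sxx + other.sxx,
                       self.sxy + other.sxy)


def compress(block):
    """Compress one block of (x, y) pairs into its summary statistics."""
    s = Summary()
    for x, y in block:
        s.n += 1
        s.sx += x
        s.sy += y
        s.sxx += x * x
        s.sxy += x * y
    return s


def solve(s):
    """Solve the least-squares normal equations from a summary alone."""
    denom = s.n * s.sxx - s.sx ** 2
    slope = (s.n * s.sxy - s.sx * s.sy) / denom
    intercept = (s.sy - slope * s.sx) / s.n
    return intercept, slope


def aggregate_fit(data, block_size):
    """Partition the data, compress each block, then aggregate and solve."""
    total = Summary()
    for start in range(0, len(data), block_size):
        total = total.merge(compress(data[start:start + block_size]))
    return solve(total)


# Synthetic data (an assumption for illustration): y = 2 + 3x plus a small
# deterministic perturbation.
data = [(float(x), 2.0 + 3.0 * x + ((x * 7) % 5 - 2) * 0.1)
        for x in range(1000)]

full_fit = solve(compress(data))                # one pass over all the data
block_fit = aggregate_fit(data, block_size=64)  # only one block in memory
```

Only one block and one running `Summary` need to be held in memory at a time, which is the property the abstract highlights for data sets too large to load at once.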