- Yi Li (Nanyang Technological University)
- David P. Woodruff (Carnegie Mellon University)
The more data we have, the more data we need to process. Whether it is Internet traffic or biological data, the hardware is never fast enough. This workshop focuses on analysing data under new models: when the data cannot be stored (e.g., identifying viruses in Internet traffic), when multiple cores are used to analyse the data, and when short sketches of the data are generated to be sent to and analysed by someone else. We believe that a new set of algorithmic techniques, relying mostly on statistics and data structures, can be used in these models, and we aim to identify and apply such techniques.
This workshop aims to bring together researchers working on algorithmic and mathematical aspects of modern data science in order to identify a set of core techniques and principles that form a foundation for the subject. The emphasis will be on topics such as dimensionality reduction, randomized numerical linear algebra, probability in high dimensions, sparse recovery, and streaming and sublinear-time algorithms.