I am now a member of the Doctoral Faculty of The Graduate School and University Center’s PhD Program in Computer Science. If you are interested in doing research in Big Data Analytics or Data Mining (sampling and filtering massive datasets), please shoot me an email.
I will be teaching the “Information Retrieval” course in Fall 2017.
The paper titled “Introducing computational thinking through hands-on projects using R with applications to calculus, probability and data analysis” is now published. This work was done in collaboration with mathematics professors (Nadia Benakli, Boyan Kostadinov, Satyanand Singh) at CityTech.
Abstract: The goal of this paper is to promote computational thinking among mathematics, engineering, science and technology students, through hands-on computer experiments. These activities have the potential to empower students to learn, create and invent with technology, and they engage computational thinking through simulations, visualizations and data analysis. We present nine computer experiments and suggest a few more, with applications to calculus, probability and data analysis, which engage computational thinking through simulations, visualizations and data analysis. We are using the free (open-source) statistical programming language R. Our goal is to give a taste of what R offers rather than to present a comprehensive tutorial on the R language. In our experience, these kinds of interactive computer activities can be easily integrated into a smart classroom. Furthermore, these activities do tend to keep students motivated and actively engaged in the process of learning, problem solving and developing a better intuition for understanding complex mathematical concepts.
Congratulations to my student Fatima for getting into the Microsoft summer research program (DS3). It was an extremely competitive process, with hundreds of students applying for just eight positions. She will be doing research on Big Data and learning from experts in the field. DS3 includes both coursework in data science and group research projects. The summer school is taught by leading scientists at Microsoft Research and is held at the new Microsoft Research office in the heart of New York City. More details can be found at their site: https://ds3.research.microsoft.com
The paper titled “Performance modeling of CMOS inverters using support vector machines (SVM) and adaptive sampling” has been accepted by the Journal of Microprocessors and Microsystems (Elsevier).
Abstract: Integrated circuit designs are verified through the use of circuit simulators before being reproduced in real silicon. In order for any circuit simulation tool to accurately predict the performance of a CMOS design, it should generate models to predict the transistor’s electrical characteristics. The circuit simulation tools have access to massive amounts of data that are not only dynamic but generated at high speed in real time, hence making fast simulation a bottleneck in integrated circuit design. Using all the available data is prohibitive due to memory and time constraints. Accurate and fast sampling has been shown to enhance processing of large datasets without knowing all of the data. However, it is difficult to know in advance what size of the sample to choose in order to guarantee good performance. Thus, determining the smallest sufficient dataset size that obtains the same accurate model as the entire available dataset remains an important research question. This paper focuses on adaptively determining how many instances to present to the simulation tool for creating accurate models. We use Support Vector Machines (SVMs) with Chernoff inequality to come up with an efficient adaptive sampling technique, for scaling down the data. We then empirically show that the adaptive approach is faster and produces accurate models for circuit simulators as compared to other techniques such as progressive sampling and Artificial Neural Networks.
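For readers curious what "adaptive sampling with a Chernoff-style bound" can look like in practice, here is a minimal Python sketch. It is not the method from the paper: it trains no SVM, and the stopping rule (a Hoeffding bound with a relative-error test and a doubling schedule) is an assumption chosen for illustration. The names `hoeffding_half_width`, `adaptive_sample_size`, and the `draw` callback are hypothetical; `draw` stands in for anything that returns a value in [0, 1], such as whether the current model predicts a freshly sampled instance correctly.

```python
import math
import random


def hoeffding_half_width(n, delta):
    """Half-width of a two-sided Hoeffding interval for the mean of n values in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def adaptive_sample_size(draw, eps=0.05, delta=0.05, start=100, max_n=1_000_000):
    """Grow the sample by doubling until the interval is tight relative to the mean.

    draw  -- hypothetical callback returning one observation in [0, 1]
    eps   -- target relative error of the estimated mean
    delta -- overall allowed failure probability
    Returns (sample size used, estimated mean, interval half-width).
    """
    values = []
    n_target = start
    round_no = 0
    while True:
        round_no += 1
        # Draw only the additional instances needed for this round.
        while len(values) < n_target:
            values.append(draw())
        n = len(values)
        mean = sum(values) / n
        # Split delta across rounds (delta/2 + delta/4 + ...) so the total
        # failure probability over all rounds stays below delta.
        delta_round = delta / (2 ** round_no)
        half = hoeffding_half_width(n, delta_round)
        # Stop when the bound is within eps of the estimate, or the budget is spent.
        if half <= eps * mean or n >= max_n:
            return n, mean, half
        n_target = min(2 * n, max_n)


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated stream: each instance is "predicted correctly" with probability 0.9.
    n, mean, half = adaptive_sample_size(lambda: 1.0 if rng.random() < 0.9 else 0.0)
    print(f"stopped at n={n}, mean={mean:.3f}, half-width={half:.3f}")
```

The point of the sketch is that the stopping size depends on the data seen so far: a stream with a high mean satisfies the relative-error test sooner than one with a low mean, so the sample size does not have to be fixed in advance.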