A major obstacle to developing machine learning models on big data is the challenging and time-consuming process of identifying and training an adequate predictive model. Machine learning model building is a highly iterative, exploratory process in which scientists search for the model or algorithm that best meets their performance requirements. In practice, there is no one-model-fits-all solution: no single model or algorithm can handle all varieties of datasets and the changes in data that may occur over time. All machine learning algorithms require user-defined inputs (hyperparameters) to achieve a balance between accuracy and generalizability; choosing these values is referred to as hyperparameter optimization. The iterative and exploratory nature of this model-building process is prohibitively expensive with big datasets. In this project, we addressed several aspects of the hyperparameter optimization problem, including scalability and controllability.
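To make the notion of hyperparameter optimization concrete, the following is a minimal sketch of random search, one of the simplest strategies. It is not the method used in this project's frameworks; the toy 1-D ridge model, the function names, and the search range for the regularization strength are all assumptions chosen for illustration.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the prediction w * x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Hypothetical toy data: y = 2x plus Gaussian noise."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def random_search(train, valid, n_trials=50, seed=0):
    """Sample the hyperparameter lam log-uniformly and keep the value
    with the lowest validation error."""
    rng = random.Random(seed)
    best_lam, best_err = None, float("inf")
    for _ in range(n_trials):
        lam = 10 ** rng.uniform(-4, 2)      # candidate regularization strength
        w = fit_ridge_1d(*train, lam)       # train with this hyperparameter
        err = mse(w, *valid)                # score on held-out data
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam, best_err


data_rng = random.Random(42)
train = make_data(30, data_rng)
valid = make_data(30, data_rng)
best_lam, best_err = random_search(train, valid)
print(f"best lam = {best_lam:.5f}, validation MSE = {best_err:.5f}")
```

Each trial requires a full training run, which is why this loop becomes expensive on large datasets and why scalable search matters.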
Although automated feature engineering has recently attracted much interest in both academia and industry, the scalability and efficiency of existing systems and tools remain unsatisfactory in practice. This project focuses on scalable and interpretable automated feature engineering that optimizes the quality of input features to maximize predictive performance according to a user-defined metric.
Project Publications:
- S. Amashukeli, R. Elshawi, S. Sakr. iSmartML: An Interactive and User-Guided Framework for Automated Machine Learning. In HILDA 2020: Workshop on Human-In-the-Loop Data Analytics.
- A. Abd Elrahman, M. El Helw, R. Elshawi, S. Sakr. D-SmartML: A Distributed Automated Machine Learning Framework. In 2020 IEEE 40th International Conference on Distributed Computing Systems (ICDCS) 2020 Nov 1 (pp. 1215-1218).
- S. Dyrmishi, R. Elshawi, S. Sakr. A decision support framework for automl systems: A meta-learning approach. In 2019 International Conference on Data Mining Workshops (ICDMW) 2019 Nov 8 (pp. 97-106). IEEE.
- R. Elshawi, S. Sakr. Automated Machine Learning: Techniques and Frameworks. In European Big Data Management and Analytics Summer School 2019 Jun 30 (pp. 40-69). Springer, Cham.
- R. Elshawi, M. Maher, S. Sakr. Automated machine learning: State-of-the-art and open challenges. arXiv preprint arXiv:1906.02287. 2019 Jun 5.
- H. Eldeeb, S. Amashukeli, R. El Shawi. BigFeat: Scalable and Interpretable Automated Feature Engineering Framework. IEEE BigData 2022.
Contact Information
Radwa El Shawi
Radwa [dot] elshawi [at] ut [dot] ee