PAKDD Tutorial -- Harnessing the Power of Generative Adversarial Networks Style Learning for Tabular Data Generation
The Generative Adversarial Network (GAN) model and its variants have been shown to be effective in producing high-quality data in Computer Vision, Text Mining and Natural Language Processing. A GAN consists of two parts, a generator and a discriminator, trained end-to-end in a game-theoretic manner. The tremendous success of GANs in producing high-quality structured data has inspired many researchers to apply similar modelling to tabular data. Tabular data combines apparently unrelated columns of numeric, rank, and categorical types, which makes the direct application of GAN-based deep learning methods quite challenging. This tutorial discusses recent advances in tabular data generation with GAN-style learning.
In this tutorial, we will start with a brief review of the recent literature on GAN-based techniques for tabular data generation. We will discuss the characteristics of tabular data and highlight the challenges of generating it. We will also discuss the need for standard evaluation by proposing a centralized repository for comparing tabular data generation methods. We will conclude with a discussion of applications of tabular data generation in privacy-preserving analytics, robustness analysis (concept drift analysis, adversarial attack analysis) and anomaly detection.
Tutorial Presenters
Dr. Nayyar Zaidi
Yishuo Zhang
A/Prof. Gang Li
Biographical Sketch of the Presenters
Dr. Nayyar Zaidi
Dr. Zaidi is currently a Senior Lecturer at Deakin University. He received the B.S. degree in computer science and engineering from the University of Engineering and Technology, Lahore, Pakistan, in 2005, and the Ph.D. degree in Artificial Intelligence from Monash University, Melbourne, VIC, Australia, in 2011. At the Faculty of Information Technology, Monash University, he worked as a Research Fellow from 2011 to 2013, a Lecturer from 2013 to 2014, and a Research Fellow from 2014 to 2017. From 2017 to 2019, he worked as a Research Scientist at Credit AI (Trusting Social) Melbourne Lab. His research interests include effective feature engineering, explainable models, uncertainty prediction, and reinforcement learning. He is also interested in practical data science, machine learning engineering, and data science training. He was a recipient of the Gold Medal for graduating top of the class at the University of Engineering and Technology.
Yishuo Zhang
Yishuo Zhang received his B.S. degree in computer science from the University of Zhengzhou, China, in 2010, and the M.S. degree in information technology from Monash University, Melbourne, VIC, Australia, in 2013. He is currently a second-year Ph.D. student at the School of Information Technology, Deakin University. His research interests include big data feature engineering, tabular data generation, trustworthy and explainable models, and tourism demand forecasting.
A/Prof. Gang Li
A/Prof. Gang Li, an IEEE Senior Member, received his Ph.D. in computer science in 2005. He joined the School of Information Technology at Deakin University (Australia) as an associate lecturer (2004-2006), and went on to serve as lecturer (2007-2011) and senior lecturer (2012-2016). His research interests are in the areas of data mining, machine learning, and business intelligence. He serves on the IEEE Data Mining and Big Data Analytics Technical Committee (Vice Chair, 2017-2018), the IEEE Enterprise Information Systems Technical Committee, and the IEEE Enterprise Architecture and Engineering Technical Committee, and chairs the IEEE Task Force on Educational Data Mining (2020-2023). He acts as an associate editor for Decision Support Systems (Elsevier), IEEE Access (IEEE), Journal of Travel Research (Sage), Information Discovery & Delivery (Emerald), and Human-Centric Computing and Information Sciences (Springer), among others. He has been a guest editor for IEEE Access, the Chinese Journal of Computer, Journal of Networks, Future Generation Computer Systems (Elsevier), Concurrency and Computation: Practice and Experience (Wiley) and Enterprise Information Systems (Taylor & Francis). He has co-authored 8 papers that won best paper prizes, including the KSEM 2018 Best Paper award, the IFITT Journal Paper of the Year (2017, 1st prize), the IEEE Trustcom 2016 best student paper award, the IFITT Journal Paper of the Year (2015, 3rd award), the PAKDD2014 best student paper award, the ACM/IEEE ASONAM2012 best paper award, and the 2007 Nightingale Prize from the Springer journal Medical and Biological Engineering and Computing. He has also conducted research projects on tourism and hospitality management. He has served on the program committees of over 150 international conferences in artificial intelligence, data mining, machine learning, and tourism and hospitality management, and is a regular reviewer for international journals in the areas of data science, privacy protection, recommendation systems, and business intelligence.
Acknowledgements
Jiahui Zhou. Final-year Master's student at Xi'an Shiyou University; her research interests are in big data feature engineering and adversarial data defense.
Haiyang Xia. Second-year Ph.D. student at the Australian National University; his research interests are in tourism competitiveness analysis, causal inference, and big data feature engineering.