Please use this identifier to cite or link to this item: https://doi.org/10.1145/3563357.3566147
DC Field: Value
dc.title: Trimming outliers using trees: Winning solution of the Large-scale Energy Anomaly Detection (LEAD) competition
dc.contributor.author: Fu, C
dc.contributor.author: Arjunan, P
dc.contributor.author: Miller, C
dc.date.accessioned: 2023-01-30T01:11:26Z
dc.date.available: 2023-01-30T01:11:26Z
dc.date.issued: 2022-11-09
dc.identifier.citation: Fu, C, Arjunan, P, Miller, C (2022-11-09). Trimming outliers using trees: Winning solution of the Large-scale Energy Anomaly Detection (LEAD) competition. BuildSys '22: The 9th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation: 456-461. ScholarBank@NUS Repository. https://doi.org/10.1145/3563357.3566147
dc.identifier.isbn: 9781450398909
dc.identifier.uri: https://scholarbank.nus.edu.sg/handle/10635/236539
dc.description.abstract: Prediction of building energy consumption using machine learning models has been a focal point of research for decades. However, some causes of forecast errors, particularly data quality, have not been adequately addressed, which may affect the accuracy of forecasting models and subsequent energy management. To address the issue of data quality, researchers have long pursued a classifier that can automatically detect time-series anomalies. Large-scale Energy Anomaly Detection (LEAD), a community competition hosted on the Kaggle platform, was created for this purpose as well as to provide a foundation for benchmarking solutions. In this competition, 200 energy time series from around the world with labeled anomalies were provided to train a classification model to predict anomalies in another 206 unseen time series. The proposed winning solution is a tree-based supervised learning anomaly classifier with a ROC-AUC score as high as 0.9866 on the private leaderboard. This article describes and analyzes in depth a variety of commonly employed techniques for improving the classification model. Among these strategies, feature engineering requires the most effort and dominates all other techniques; value-changing features that can represent the level of time-series variation have a particularly positive impact. In addition, the classification accuracy of solutions in the competition can serve as a benchmark for future research on supervised learning for energy anomaly detection.
dc.publisher: ACM
dc.source: Elements
dc.type: Conference Paper
dc.date.updated: 2023-01-29T12:05:33Z
dc.contributor.department: BUILDING
dc.description.doi: 10.1145/3563357.3566147
dc.description.sourcetitle: BuildSys '22: The 9th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation
dc.description.page: 456-461
dc.published.state: Published
Appears in Collections: Staff Publications
Elements
Students Publications

Files in This Item:
File: benchsys22-final100.pdf
Description: Accepted version
Size: 1.14 MB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.