Limitations of machine learning for building energy prediction: ASHRAE Great Energy Predictor III Kaggle competition error analysis

Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/236540

DC Field	Value
dc.title	Limitations of machine learning for building energy prediction: ASHRAE Great Energy Predictor III Kaggle competition error analysis
dc.contributor.author	Miller, Clayton
dc.contributor.author	Picchetti, Bianca
dc.contributor.author	Fu, Chun
dc.contributor.author	Pantelic, Jovan
dc.date.accessioned	2023-01-30T01:25:35Z
dc.date.available	2023-01-30T01:25:35Z
dc.date.issued	2021-06-25
dc.identifier.citation	Miller, Clayton, Picchetti, Bianca, Fu, Chun, Pantelic, Jovan (2021-06-25). Limitations of machine learning for building energy prediction: ASHRAE Great Energy Predictor III Kaggle competition error analysis. Science and Technology for the Built Environment. Volume 28, 2022, Issue 5. 1-18. ScholarBank@NUS Repository.
dc.identifier.uri	https://scholarbank.nus.edu.sg/handle/10635/236540
dc.description.abstract	Research is needed to explore the limitations and potential for improvement of machine learning for building energy prediction. With this aim, the ASHRAE Great Energy Predictor III (GEPIII) Kaggle competition was launched in 2019. This effort was the largest building energy meter machine learning competition of its kind, with 4,370 participants who submitted 39,403 predictions. The test data set included two years of hourly whole building readings from 2,380 meters in 1,448 buildings at 16 locations. This paper analyzes the various sources and types of residual model error from an aggregation of the competition's top 50 solutions. This analysis reveals the limitations for machine learning using the standard model inputs of historical meter, weather, and basic building metadata. The errors are classified according to timeframe, behavior, magnitude, and incidence in single buildings or across a campus. The results show machine learning models have errors within a range of acceptability (RMSLE_scaled ≤ 0.1) on 79.1% of the test data. Lower magnitude (in-range) model errors (0.1 < RMSLE_scaled ≤ 0.3) occur in 16.1% of the test data. These errors could be remedied using innovative training data from onsite and web-based sources. Higher magnitude (out-of-range) errors (RMSLE_scaled > 0.3) occur in 4.8% of the test data and are unlikely to be accurately predicted.
dc.source	Elements
dc.subject	cs.LG
dc.subject	cs.CY
dc.type	Article
dc.date.updated	2023-01-29T12:15:33Z
dc.contributor.department	BUILDING
dc.description.sourcetitle	Science and Technology for the Built Environment. Volume 28, 2022, Issue 5. 1-18
dc.published.state	Unpublished
Appears in Collections:	Staff Publications; Elements; Students Publications
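
The three error bands quoted in the abstract can be illustrated with a minimal Python sketch. It assumes an RMSLE_scaled value is already available for each slice of test data; the abstract does not say how the scaling is done, so that step is not reproduced here. The `rmsle` function is the standard unscaled definition, and the function names and the band labels "acceptable", "in-range", and "out-of-range" are shorthand chosen for this illustration, not identifiers from the paper.

```python
import math
from collections import Counter

BANDS = ("acceptable", "in-range", "out-of-range")


def rmsle(actual, predicted):
    """Standard (unscaled) root mean squared logarithmic error.

    The scaling that turns this into RMSLE_scaled is not described
    in the abstract and is deliberately left out of this sketch.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    )
    return math.sqrt(total / len(actual))


def classify_error(rmsle_scaled):
    """Map one RMSLE_scaled value to the band thresholds in the abstract."""
    if rmsle_scaled <= 0.1:
        return "acceptable"    # within the range of acceptability
    if rmsle_scaled <= 0.3:
        return "in-range"      # lower magnitude error
    return "out-of-range"      # higher magnitude error


def error_shares(values):
    """Return the fraction of values that falls in each band."""
    if not values:
        raise ValueError("values must be non-empty")
    counts = Counter(classify_error(v) for v in values)
    return {band: counts[band] / len(values) for band in BANDS}
```

Calling `error_shares` on a list of per-slice RMSLE_scaled values returns the kind of breakdown the abstract reports as 79.1%, 16.1%, and 4.8%.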

Files in This Item:

File	Description	Size	Format	Access Settings	Version
2106.13475v3.pdf	Accepted version	4.45 MB	Adobe PDF	OPEN	Pre-print
