Today I am pleased to announce the release of a new public dataset called the Rainforest Automation Energy Dataset (RAE). There is an accompanying paper that describes the dataset in detail. Here is a brief summary:
Datasets are important for researchers to build models and test how well their machine learning algorithms perform. This paper presents the Rainforest Automation Energy (RAE) dataset to help smart grid researchers test algorithms that use smart meter data. RAE contains 72 days of 1 Hz data from a residential house’s mains and 24 sub-meters, resulting in 6.2 million samples for each sub-meter. In addition to power data, environmental and sensor data from the house’s thermostat are included. Sub-meter data includes heat pump and rental suite captures, which are of interest to power utilities. We also show (by example) how RAE can be used to test non-intrusive load monitoring (NILM) algorithms.
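As a quick sanity check on the sample count quoted above, here is a minimal Python sketch. It assumes 72 full days of gap-free logging at exactly one sample per second; the helper name `expected_samples` is made up for illustration and is not part of the dataset or the paper.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def expected_samples(days: int, rate_hz: float = 1.0) -> int:
    """Samples per channel, assuming gap-free logging at rate_hz for `days` full days."""
    return int(days * SECONDS_PER_DAY * rate_hz)


samples = expected_samples(72)
print(f"{samples:,} samples per sub-meter (~{samples / 1e6:.1f} million)")
```

Under those assumptions the result is 6,220,800 samples per sub-meter, which matches the roughly 6.2 million quoted in the summary. The actual files may differ slightly if the capture had gaps or partial days.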
Hopefully, the NILM research community will find it useful!
RAE can be freely downloaded from Harvard Dataverse: doi:10.7910/DVN/ZJW4LC.