(2016). Feature extraction and filtering for house-
hold classification based on smart electricity meter
data. Computer Science - Research and Development,
31:141–148.
Humeau, S., Wijaya, T. K., Vasirani, M., and Aberer, K.
(2013). Electricity load forecasting for residential cus-
tomers: Exploiting aggregation and correlation be-
tween households. In Sustainable Internet and ICT
for Sustainability, pages 1–6. IEEE.
Iftikhar, N., Liu, X., Danalachi, S., Nordbjerg, F. E., and
Vollesen, J. H. (2017). A scalable smart meter data
generator using Spark. In On the Move to Meaningful
Internet Systems. OTM 2017 Conferences, pages 21–
36. Springer.
Iftikhar, N., Liu, X., Nordbjerg, F. E., and Danalachi, S.
(2016). A prediction-based smart meter data gener-
ator. In 19th International Conference on Network-
Based Information Systems, pages 173–180. IEEE.
Kwac, J., Flora, J., and Rajagopal, R. (2014). Household
energy consumption segmentation using hourly data.
IEEE Transactions on Smart Grid, 5(1):420–430.
Lange, D., Ribalta, M., Echeverria, L., and Pocock, J.
(2023). Profiling urban water consumption using au-
toencoders and time-series clustering techniques. In
14th International Conference on Hydroinformatics,
volume 1136, page 012005. IOP Publishing.
Laouafi, A., Laouafi, F., and Boukelia, T. E. (2022). An
adaptive hybrid ensemble with pattern similarity anal-
ysis and error correction for short-term load forecast-
ing. Applied Energy, 322:119525.
Okereke, G. E., Bali, M. C., Okwueze, C. N., Ukekwe,
E. C., Echezona, S. C., and Ugwu, C. I. (2023). K-
means clustering of electricity consumers using time-
domain features from smart meter data. Journal
of Electrical Systems and Information Technology,
10(1):1–18.
Rahim, M. S., Nguyen, K. A., Stewart, R. A., Ahmed, T.,
Giurco, D., and Blumenstein, M. (2021). A clustering
solution for analyzing residential water consumption
patterns. Knowledge-Based Systems, 233:107522.
Tang, W., Wang, H., Lee, X. L., and Yang, H. T. (2022).
Machine learning approach to uncovering residential
energy consumption patterns based on socioeconomic
and smart meter data. Energy, 240:122500.
Trotta, G., Gram-Hanssen, K., and Jørgensen, P. L. (2020).
Heterogeneity of electricity consumption patterns in
vulnerable households. Energies, 13(18):4713.
Wang, Y., Jia, M., Gao, N., Krannichfeldt, L. V., Sun, M.,
and Hug, G. (2022). Federated clustering for elec-
tricity consumption pattern extraction. IEEE Transac-
tions on Smart Grid, 13(3):2425–2439.
Wen, L., Zhou, K., and Yang, S. (2019). A shape-based
clustering method for pattern recognition of residen-
tial electricity consumption. Journal of Cleaner Pro-
duction, 212:475–488.
Wu, J., Niu, Z., Li, X., Huang, L., Nielsen, P. S., and Liu,
X. (2023). Understanding multi-scale spatiotemporal
energy consumption data: A visual analysis approach.
Energy, 263:125939.
Yang, T., Ren, M., and Zhou, K. (2018). Identifying house-
hold electricity consumption patterns: A case study of
Kunshan, China. Renewable and Sustainable Energy
Reviews, 91:861–868.
APPENDIX
The modified brute-force algorithm outlined in
Section 4.2 is implemented in the following
Python code.
import heapq

# cosine_sim(a, b) is called below but is not defined in this listing;
# it is assumed to return the cosine similarity of two equal-length lists.

# Create the hash map: feature index -> list of [row id, value] pairs
def createhashMap(feature_set):
    hashmap = {}
    # for each row in the dataset
    for i, row in enumerate(feature_set):
        # for each feature in the row
        for j, feature in enumerate(row):
            # on the first row, create an empty list for every feature column
            if i == 0:
                hashmap[j] = []
            # store only the non-zero values
            if feature != 0:
                hashmap[j].append([i, feature])
    return hashmap

# Given a test query (xq), return the k most
# similar consumption patterns
def knnSearch(xq, k, hashmap):
    S = {}
    heap = []
    # loop through all the query features
    for j, feature in enumerate(xq):
        # loop through all the tuples for the
        # non-zero query feature in the hashmap
        if feature != 0:
            for tuples in hashmap[j]:
                # the first time a row id is seen,
                # create its pair of empty lists
                if tuples[0] not in S:
                    S[tuples[0]] = [[], []]
                # append the stored value and the query value to S[row id]
                S[tuples[0]][0].append(tuples[1])
                S[tuples[0]][1].append(feature)
    # keep the k best candidates in a min-heap of (similarity, row id)
    for counter, rowid in enumerate(S):
        sim = cosine_sim(S[rowid][0], S[rowid][1])
        if counter < k:
            heapq.heappush(heap, (sim, rowid))
        else:
            heapq.heappushpop(heap, (sim, rowid))
    # return (similarity, row id) pairs, most similar first
    return heapq.nlargest(k, heap)
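To illustrate how the listing above might be exercised, the following is a minimal, self-contained sketch on toy data. It is a condensed restatement, not the authors' code: the names build_index and knn_search are hypothetical, and the body of cosine_sim is an assumption, since the listing calls that helper without showing its definition.

```python
import heapq
import math


def cosine_sim(a, b):
    # Assumed helper (not shown in the paper's listing): cosine similarity
    # of two equal-length sequences of numbers.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(rows):
    # Inverted index: feature position -> list of (row id, non-zero value).
    index = {}
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            index.setdefault(j, [])
            if value != 0:
                index[j].append((i, value))
    return index


def knn_search(xq, k, index):
    # Collect, per candidate row, the values it shares with the query.
    shared = {}
    for j, q in enumerate(xq):
        if q != 0:
            for rowid, value in index.get(j, []):
                row_vals, query_vals = shared.setdefault(rowid, ([], []))
                row_vals.append(value)
                query_vals.append(q)
    # Score every candidate and keep the k most similar.
    scored = ((cosine_sim(r, q), rowid) for rowid, (r, q) in shared.items())
    return heapq.nlargest(k, scored)


# Toy data: three consumption patterns with four features each.
patterns = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 0.0, 0.0],
    [2.0, 0.0, 4.0, 0.0],
]
index = build_index(patterns)
result = knn_search([1.0, 0.0, 2.0, 0.0], 2, index)
neighbour_ids = sorted(rowid for _, rowid in result)
```

In this toy run the query shares non-zero features only with rows 0 and 2, so those are the two neighbours returned. Row 1 is never scored, which is the saving the hash map is meant to provide over a plain brute-force scan.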
KDIR 2023 - 15th International Conference on Knowledge Discovery and Information Retrieval