An intelligent nonintrusive load monitoring scheme based on 2D phase encoding of power signals

Nonintrusive load monitoring (NILM) is the de facto technique for extracting device‐level power consumption fingerprints at (almost) no cost from aggregated mains readings alone; there is no need to install an individual meter for each appliance. However, a robust NILM system should incorporate a precise appliance identification module that can effectively discriminate between various devices. In this context, this paper proposes a method to extract accurate power fingerprints for electrical appliance identification. Rather than relying solely on time‐domain (TD) analysis, the framework encodes the phase of the TD description of power signals using a two‐dimensional (2D) representation. Power trajectories are mapped to a novel 2D binary representation space, the binary codes are converted to decimal representations, and a histogramming process is then performed. This yields the final histogram of 2D phase encoding of power signals, namely, 2D‐PEP. An empirical performance evaluation conducted on three realistic power consumption databases collected at distinct resolutions indicates that the proposed 2D‐PEP descriptor outperforms other recent techniques for appliance identification. Identification accuracies of 99.54%, 98.78%, and 100% are attained on the GREEND, UK‐DALE, and WHITED data sets, respectively.


| INTRODUCTION
In the smart grid, energy efficiency relies mainly on adopting intelligent strategies to reduce wasted energy. Smart metering plays an essential role in developing a reliable smart grid and in promoting energy saving in buildings. [1][2][3] However, deploying a large number of smart meters to monitor electrical appliances can significantly increase the implementation cost of the smart grid, to the point where this cost far exceeds the value of the energy saved by installing the meters. 4,5 A solution to this issue is to adopt low-cost smart monitoring, called nonintrusive load monitoring (NILM) or energy disaggregation. 6 NILM is the process of identifying individual device consumption footprints in a given building from the aggregated power consumption measured at the main power entry, without the need to install a smart meter for each appliance. 7,8 Device-level consumption footprints inferred by NILM can (i) provide end-users with personalized consumption statistics for each appliance by exploiting the aggregated power signal, without the need to install a smart sensor for each device, and (ii) help end-users to comprehend their consumption behaviors and provide them with information on how to act to promote energy saving. 9,10 This in turn allows them to cut down their energy bills and constitutes a gain for the environment. 11,12 Nonintrusive appliance recognition is among the core steps in designing a robust NILM system. It is responsible for discriminating between the various appliances after segregating aggregated power signals. 13,14 To do so with adequate accuracy, the NILM system should employ a robust feature extraction module followed by an appropriate classifier.
Extensive efforts have been dedicated to designing robust feature extraction descriptors, and different NILM systems can be found in the literature based on the nature of the feature extraction schemes they use. 15 To design a powerful NILM system, this paper proposes a robust feature extraction descriptor that can derive accurate fingerprints from power consumption signals. In this regard, a set of contributions is introduced in this framework, which can be summarized as follows:
• Second, the obtained TD signal is then moved to a two-dimensional (2D) representation, which provides more possibilities to encode and extract power features than 1D representations because each signal sample in 2D space is surrounded by eight close neighbors.
• Third, a local phase encoding process is performed on the frequency representation of the obtained matrix using a block splitting process incorporating the neighborhood information. Then, each block is mapped to a new 1D binary representation using a specific encoding procedure. Next, the histogram of 2D phase encoding of power signals (2D-PEP) is generated through a histogramming process after converting the binary codes to decimal representations. The main advantage of the proposed encoding process is that it not only improves the discrimination ability of classifier models but also greatly reduces the dimensions of the appliance signatures before identifying them, by removing unnecessary information.
• Fourth, a nonintrusive load identification system is designed based on the proposed 2D-PEP descriptor, with reference to different machine learning (ML) classifiers, such as linear discriminant analysis (LDA), principal component analysis (PCA), deep neural networks (DNN), decision tree (DT), ensemble bagging tree (EBT), K-nearest neighbors (KNN), and support vector machine (SVM).
• Finally, the proposed solution is evaluated on three different data sets with distinct frequency resolutions, in contrast to most NILM frameworks, which are generally assessed on one data set. Moreover, the data sets considered in this framework are collected under two completely different scenarios. Consumption footprints in the GREEND 16 and UK-DALE 17 data sets have been gleaned from the same household appliances but on different days, whereas in the WHITED data set, 18 power consumption footprints are collected for a large number of appliances from distinct manufacturers (i.e., each appliance class includes an ensemble of appliances from different brands). This helps in evaluating whether our solution can recognize a specific appliance type even when it is trained using data of other appliances pertaining to the same class but from different manufacturers (brands). In this context, the evaluation studies conducted in this paper help in consolidating the credibility of the proposed work.
The remainder of this paper is structured as follows. Section 2 gives an overview of nonconventional NILM systems through a taxonomy of these frameworks based on feature extraction analysis and learning models, and identifies their limitations and drawbacks. Section 3 presents detailed explanations of the proposed appliance identification system, 2D-PEP feature extraction, and learning models. Section 4 reports empirical results on public realistic databases to evaluate the performance of the proposed solution and compare it with recent NILM systems. Finally, Section 5 draws the main findings and recommendations emerging from this framework.

| Feature extraction-based categorization
Nonconventional NILM features can be decomposed into the following subclasses:
Statistical models: this type of feature mainly addresses the collection of statistical parameters and the use of statistical models to represent appliances' footprints. By Guedes et al, 19 higher-order statistics drawn from current loads are deployed for modeling appliance load fingerprints. By Krull et al, 20 virtual stochastic sensors are utilized to derive domestic appliance consumption footprints, in which the accurate intervals of turning on/off each appliance are taken into consideration. By Kong et al, 21 the NILM issue is resolved by means of hidden Markov models (HMM). The latter are applied to segregate the main power consumption into various appliance-level data after generating an HMM model for each appliance class. By Liu et al, 22 individual appliance load fingerprints are determined by computing the probabilities of consumption events usually recorded during the operation of an ensemble of appliances. Accordingly, a mixture of Bernoulli distributions is deployed to measure those probabilities.
By Ji et al, 23 the authors use a maximum a posteriori rule to draw device-specific consumption signatures through the use of recursive fuzzy c-Means and HMM models.
Graph signal processing (GSP): an in-demand research field that aims at describing the stochastic properties of power consumption signals by means of graph theory. By He et al, 24 the authors introduce an event-based graph technique for extracting the fine-grained fingerprints of appliance loads, and further for minimizing the learning time and reducing the computation cost of conventional graph-based solutions. By Li and Dick, 25 various graph-based multilabel schemes are proposed to disaggregate the main consumption loads of households into individual appliance consumption footprints using a semi-automatic approach. By Zhao et al, 26 NILM performance is enhanced via the use of a generic GSP-based method, which depends on applying graph-based filters. The latter can help in capturing on/off events by suppressing the noise generated by electrical devices.
Sparse coding: in this case, energy disaggregation is treated as a blind source-separation problem, and recent sparse coding schemes are then applied to split an aggregated power consumption signal into specific appliance-based profiles. 27 By Singh and Majumdar, 28 co-sparse analysis dictionary learning is proposed to segregate the total energy consumption into device-level data and significantly shorten the training process. By Singh and Majumdar, 29 a deep learning architecture is used for designing a multilayer dictionary for each appliance rather than constructing a one-level codebook. The multilayer codebooks are then deployed as features for a source-separation algorithm to break down the aggregated energy signal. By Rahimpour et al, 30 an improved nonnegative matrix factorization is used to pick up perceptibly valuable appliance-level signatures from the aggregated mixture.
Binary encoding: most recently, binary descriptors have been investigated for the identification of power signals. By Du et al, 31 power fingerprints are derived by estimating the similarity of voltage-current (V-I) shapes, encoding it using a binary dictionary, and then extracting graphical footprints that can be used for appliance identification. By Gao et al, 32 a V-I binary representation is employed by converting the normalized V-I magnitude into binary matrices using a thresholding process. More specifically, this approach relies on binary coding of the V-I edges plotted in the new representation. These data are then fed into an ML classifier to identify each appliance class. By Liu et al, 33 a color encoder is proposed to draw V-I signatures that can also be translated to visual plots. These footprints are then fed into a deep learning classifier to identify each appliance. By Baets et al, 34 a Siamese neural network is used that aims at mapping the V-I trajectories into a novel characteristic representation plane.
Even though the aforementioned nonconventional feature extraction-based NILM systems have attracted significant attention in recent years as a way to resolve the NILM issue, most of them are based on synthesis dictionary learning/coding methodologies that require large-scale data sets to learn the electric loads of different kinds of appliances. This limits their applicability in real-world scenarios. Moreover, statistical schemes are generally constrained to a comparatively limited discrete state space. The implementation and computational complexity of learning the electric loads then become unmanageable because the state space usually grows exponentially. Consequently, the exponential space complexity presents a serious drawback when extending the HMM process in the context of time windows. In addition, statistical models generally rely on capturing transient states of appliances' consumption, which could decrease their performance if various devices are switching on/off simultaneously.

| Learning models
Recently, much attention has been devoted to the use of deep learning models in the NILM problem and other applications. Despite their demanding training procedure, deep learning models operate appropriately in most NILM scenarios, especially in comparison with conventional ML models. Nevertheless, it is still too early to employ existing deep learning models in real-world NILM systems, and even in other realistic applications, since they need further improvements in different aspects, such as optimization and acceleration of the learning process. In this regard, various works have been proposed to make them better suited to real-world applications.
Zheng et al 35 introduce a two-step learning procedure that aims at optimizing the feature boundary of deep convolutional neural networks (CNN), which mainly helps in overcoming the overfitting issue. By Chen et al, 36 a scale- and context-aware deep learning model is proposed to disaggregate power signals. In this line, multiscale features and contextual data are used to train the model after developing a multibranch structure, which includes various branchwise gates and receptive field sizes for connecting the branches in each subnetwork. By Zheng et al, 37 a simple yet effective training method is proposed, which relies on a layerwise and stochastic gradient descent scheme. This helps in implementing DNN models in a simple way with low computation complexity, and it further enables a gradient-based optimization of DNN objective functions. Moreover, aiming at accelerating the learning process, a pruning scheme is introduced in Reference [38], in which the model parameters of a 2D deep CNN are reduced. By Zheng et al, 39 the performance of CNN models is improved using a full-stage information augmentation strategy. This results in an implicit model ensemble that does not require extra model training costs. Accordingly, information augmentation over the training and test steps could help in optimizing the network in addition to improving its generalization capability.
On the other hand, since deep learning models require a large amount of data to learn the electric loads of several houses, it has been very difficult to gather sufficient labeled data and train the classifiers appropriately. To that end, several frameworks have been proposed recently to deal with this problem. Liu et al 33 propose a transfer learning method using V-I trajectories, which can reduce the number of observed households while maintaining acceptable performance. In this context, a DNN algorithm pretrained using a visual recognition repository is transferred for learning the V-I trajectories by exploiting the related knowledge between the two different fields. Although this approach could reduce the training computation complexity, its appliance identification performance still needs to be improved further, especially for some appliance classes, such as air conditioners, light lamps, and hairdryers. By Gaur and Majumdar, 40 instead of using synthesis dictionary learning schemes, the implemented model has been trained with their analysis equivalents. The latter can be trained on a moderate amount of data, in contrast to synthesis-based solutions, which require large-scale data sets in the learning phase. By D'Incecco et al, 9 two schemes based on appliance transfer learning are proposed, in which the latent characteristics learned by a complex device, such as a washing machine, are transferred to a simple device, for example, a kettle.

| Motivation of the proposed work
As described in the aforementioned subsections, although special attention has been given recently to nonconventional NILM architectures, the latter require additional enhancements to increase their identification accuracies. In addition, most of the reviewed NILM systems are only tested on one category of databases with a unique sampling frequency. Furthermore, most of the binary encoding feature-based frameworks extract binary representations of the power signals, which are then fed to deep learning classifiers to learn device-specific data. These classifiers are generally computationally intensive and hard to implement on low-cost computing platforms. By contrast, in this paper, we propose a nonconventional solution that is based on a novel descriptor. More specifically, power signals are converted to 2D representations to extract robust features based on a phase encoding using a block splitting process, followed by a binary mapping and a decimal re-encoding. Next, novel 1D histograms are extracted from the various blocks. Consequently, the final descriptors are represented as short 1D histograms, which can greatly reduce the quantity of features used to learn appliance loads with the EBT classifier. Explicitly, the proposed descriptor not only enhances the discrimination ability but also operates as a dimensionality reduction approach, in which it efficiently encodes the class-specific appliance characteristics by eliminating the unnecessary information.
To the best of our knowledge, this is the first work that extracts appliance features of power consumption signals after transforming them to 2D space. Moreover, the proposed feature descriptor results in considerably better accuracy and F1 score rates than state-of-the-art techniques when it is applied to the three well-known data sets considered in this framework, which have been collected using distinct frequency resolutions under completely different scenarios. This lends further credibility to our work.

| PROPOSED SYSTEM
The main objective of the proposed scheme is to identify individual appliances from the aggregated energy consumption using a new 2D feature extraction descriptor called 2D-PEP. The latter relies on encoding the phase of appliance power consumption signals after segregating the main signal using an event detection approach. The extracted features from the detected events are then fed into a classifier to identify each electrical device. The proposed appliance identification framework is represented in Figure 1. The rest of this section explains the principal modules of the proposed methodology.

| Event detection
An event detection scheme is applied to the aggregated power signal $s_A$ collected from the main supply to capture an individual event vector $e_i = [e_i(1), e_i(2), \ldots, e_i(L)]$ for each appliance, where $L$ is the length of the event vector. Since we focus mainly in this framework on proposing a novel feature descriptor, the event detection task has been conducted using the edge detector described in the NILM toolkit. 41
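To make the idea of an edge-based event detector concrete, the following is a minimal sketch that flags step changes in an aggregated power signal and slices a fixed-length event vector starting at each detected edge. It is not the NILM toolkit implementation; the function names `detect_events` and `slice_event`, the threshold value, and the toy signal are all assumptions introduced for illustration.

```python
from typing import List, Tuple


def detect_events(power: List[float], threshold: float = 50.0) -> List[Tuple[int, float]]:
    """Return (index, step) pairs where consecutive samples differ by more than `threshold`.

    Hypothetical helper: a plain step-change detector, not the NILM toolkit code.
    """
    events = []
    for t in range(1, len(power)):
        step = power[t] - power[t - 1]
        if abs(step) > threshold:
            events.append((t, step))
    return events


def slice_event(power: List[float], start: int, length: int) -> List[float]:
    """Cut an event vector of at most `length` samples starting at index `start`."""
    return power[start:start + length]


# Toy aggregated signal (watts): a 2 kW load switches on at t=3 and off at t=7.
aggregate = [60, 61, 60, 2060, 2061, 2059, 2060, 62, 60, 61]
events = detect_events(aggregate, threshold=500.0)
event_vector = slice_event(aggregate, events[0][0], 4)
```

In this toy case the detector reports one rising and one falling edge, and the sliced vector covers the samples during which the load is on.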

| Feature extraction
To collect a robust local description of each appliance, the proposed feature extraction algorithm, namely, 2D-PEP, is applied to each detected event vector $e_i$ to extract a feature histogram that is used in the classification step to recognize each appliance. The algorithm is summarized in the following steps, where $i \le N$ and $N$ is the total number of event vectors:
Step 1. Extract the TD waveform $S_F$ of the event vector $e_i$ using a windowing process with a window length $n$ based on Equation (1), where $k = 1, 2, \ldots, K$ and $K$ is the number of extracted windows. The collected $S_F(k)$ are then concatenated to form the whole feature vector $S_F$.
Step 2. Reshape $S_F$ to a 2D representation, denoted $f(y)$, where $y$ represents the 2D position in the novel matrix. This treats power signals as images, so that more encoding possibilities become available since each power sample is surrounded by eight neighbors.
Step 3. Apply a linear filtering using the short-time Fourier transform (STFT). This helps in a better analysis of the local frequencies since the STFT uses a windowing process. The linear filtering is defined by Equation (2), where $v$ describes the 2D frequency and $c: \mathbb{R}^2 \to \mathbb{C}$ stands for the complex-valued filter operator, which is defined by Equation (3), where $j = \sqrt{-1}$ and $w(y): \mathbb{R}^2 \to \mathbb{R}^2$ represents a window kernel of size $a \times a$. Also, $m = 1, 2, \ldots, M$, where $M$ represents the number of kernels that can be scanned over the overall matrix $f(y)$.
Step 4. Apply the binary encoding using the real (Re{·}) and imaginary (Im{·}) parts of the differences between the central coefficient and its neighbors (as explained in Figure 2) by adopting Equation (4), where sgn() is defined by Equation (5). This step aims at extracting and encoding phase information to construct efficient and robust features.
Step 5. The encoding outputs then lead to a 2-bit integer representation ($E \in \{0, 1, 2, 3\}$) for each frequency coefficient at each point $y$. If $J$ components are considered, the total number of bits per point is $2J$. The obtained data are then re-encoded to a decimal space using Equation (6), which yields values ranging in $[0, 2^{2J} - 1]$.
Step 6. After obtaining the feature vector $B_m = [b_1, b_2, \ldots, b_{a^2}]$ from all $a^2$ frequencies in the STFT block, a histogramming procedure is applied to extract an estimation of their sum. This is done by assigning a histogram bin (scalar) $b_{m'}$ to every $a^2$ samples using the summation described in Equation (7); this step also acts as a dimensionality reduction.
Step 7. Repeat Steps 3-6 by shifting the $a \times a$ kernel through the overall matrix $f(y)$ to construct the entire feature vector. A normalization procedure is finally conducted using Equation (8).
A block diagram explaining the proposed 2D-PEP algorithm is portrayed in Figure 2. It describes how the encoding process is performed after (i) detecting electrical events, (ii) capturing the TD feature, (iii) transforming the extracted features to 2D space, and (iv) conducting an STFT using a kernel of size $a \times a$. The central sample (red) is then compared with four of its neighbors (green) using a subtraction process, and the obtained real and imaginary values are encoded as binary bits using the sign function.
FIGURE 1 Architecture of the proposed appliance identification system. 2D, two dimensional
This encoding process is adopted because: (i) the phase information is robust to several manipulations, data loss, or noise that can affect the signals during the collection campaign, 42 as demonstrated by the promising accuracy and F1 score obtained in Section 4; and (ii) the binary encoding of real and imaginary values reduces the length of electric load signatures and further improves the discrimination ability of the proposed descriptor. Specifically, the latter operates not only as a feature extraction algorithm but also as a dimensionality reduction scheme, in which it efficiently encodes the appliance class-specific features by eliminating the unnecessary information. This helps in improving the correlation between appliances pertaining to the same class and, conversely, in enhancing the discrimination between appliances belonging to distinct classes. After that, a transformation to the decimal space, a histogramming process, and finally a normalization procedure are performed to extract the overall feature vector. The normalization brings the feature vectors extracted from the same appliance category or from different groups to a common range, which avoids the misclassification that features with different ranges can cause.
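The pipeline above (reshape to 2D, local STFT, sign-based phase encoding, decimal re-encoding, histogramming, normalization) can be sketched in a few lines. The sketch below is a standard local-phase-quantization-style variant, not a reproduction of Equations (1) to (8): it takes the signs of the real and imaginary parts of four low-frequency STFT coefficients of each block, rather than the center-versus-neighbor differences described in Step 4. The uniform window, the four frequencies, the row-major reshape, the bit ordering, and the sum-to-one normalization are all assumptions, as are the function names.

```python
import cmath
from typing import List, Tuple


def reshape_2d(signal: List[float], width: int) -> List[List[float]]:
    """Fold a 1D signal into rows of `width` samples (assumed row-major layout)."""
    rows = len(signal) // width
    return [signal[r * width:(r + 1) * width] for r in range(rows)]


def stft_coeff(f: List[List[float]], top: int, left: int, a: int,
               v: Tuple[float, float]) -> complex:
    """STFT coefficient of the a x a block at (top, left) for 2D frequency v.

    A uniform window is assumed; the paper's window kernel is not reproduced here.
    """
    total = 0j
    for r in range(a):
        for c in range(a):
            phase = -2j * cmath.pi * (v[0] * r + v[1] * c)
            total += f[top + r][left + c] * cmath.exp(phase)
    return total


def pep_histogram(signal: List[float], width: int, a: int = 3) -> List[float]:
    """Histogram of sign-encoded local phase codes, normalized to sum to one."""
    f = reshape_2d(signal, width)
    u = 1.0 / a
    freqs = [(u, 0.0), (0.0, u), (u, u), (u, -u)]  # J = 4 components (assumed)
    hist = [0.0] * (2 ** (2 * len(freqs)))         # 2J bits -> 2^(2J) bins
    for top in range(len(f) - a + 1):
        for left in range(width - a + 1):
            code, bit = 0, 0
            for v in freqs:
                coeff = stft_coeff(f, top, left, a, v)
                # One bit for the sign of the real part, one for the imaginary part.
                for part in (coeff.real, coeff.imag):
                    if part >= 0:
                        code |= 1 << bit
                    bit += 1
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy event waveform of 36 samples folded into a 6 x 6 matrix.
toy_signal = [float((t * 7) % 11) for t in range(36)]
descriptor = pep_histogram(toy_signal, width=6, a=3)
```

With four frequency components, each position yields an 8-bit code, so the descriptor has 256 bins regardless of the signal length, which illustrates the dimensionality reduction role of the histogram.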

| Classification
After extracting the feature histograms $D_{\text{2D-PEP}}$ of all the $N$ appliances in a data set, a classifier is deployed to identify each electrical device. In this context, various ML classifiers with different parameter settings are deployed to pinpoint the best one, including:
• LDA: is essentially defined as a dimensionality reduction technique; however, it also works as a linear classifier.
• PCA: computes the principal components of every appliance category before designing a classifier based on projecting the appliance features on the subspaces spanned by the principal components associated with the various categories in each data set.
• SVM: is built upon the theory of structural risk minimization. It tries to retrieve the optimum separating hyperplane, which maximizes the margin between the features pertaining to different appliance classes. If the feature patterns cannot be separated linearly in the initial space, the feature data are transformed into a new space with higher dimensions by making use of kernel functions.
• KNN: to classify appliance feature patterns, this model computes the distance from a candidate feature vector to the training vectors to detect its K closest neighbors. Their labels are then used to assign a class label to the candidate feature vector based on a majority vote, and thus define the class of the respective appliance.
• DT: consists of a root node and various internal nodes and leaf nodes. The main idea behind this classifier is to split a complex problem into various simple ones and then adopt a progressive approach to solve the classification issue step by step.
• DNN: encompasses multiple hidden layers, each of which is fully connected to the previous layer. Usually, rectified linear units (ReLUs) are deployed at the output of the hidden layers except the last layer to improve nonlinear discrimination. A Softmax function is also employed after the last hidden layer to predict the appliance category.
• EBT: is an ML architecture in which several weak learners are deployed in the training/testing steps to classify patterns in various subsets of the initial set, and their outputs are then fused to procure improved prediction performance through a bootstrap aggregation.
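As an illustration of the simplest classifier in the list, the majority-vote rule of KNN can be sketched as follows. This is a minimal example with Euclidean distance only; the function name `knn_predict`, the two-bin toy histograms, and the class labels are assumptions introduced here, not the configuration used in the experiments.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_predict(train_x: List[Sequence[float]], train_y: List[str],
                query: Sequence[float], k: int = 3) -> str:
    """Label `query` by majority vote among its k nearest training vectors."""
    # Sort training samples by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy 2-bin "histograms" for two hypothetical appliance classes.
train_x = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_y = ["kettle", "kettle", "fan", "fan"]
predicted = knn_predict(train_x, train_y, [0.85, 0.15], k=3)
```

Swapping `math.dist` for another distance function would give the other KNN variants; the voting step stays the same.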
The overall classification performance is then reported on each database using a 10-fold cross-validation process.
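A 10-fold cross-validation of this kind can be sketched as below: the samples are shuffled, split into ten disjoint folds, and each fold serves once as the test set while the remaining nine are used for training. The helper names `k_fold_indices` and `cross_validate`, the fixed shuffle seed, and the `fit_predict` callback signature are assumptions for illustration; the paper does not specify how the folds were drawn.

```python
import random
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k disjoint folds of a shuffled index range."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def cross_validate(features: Sequence, labels: Sequence,
                   fit_predict: Callable, k: int = 10) -> float:
    """Mean accuracy over k folds; fit_predict(train_x, train_y, test_x) returns labels."""
    scores = []
    for train, test in k_fold_indices(len(features), k):
        preds = fit_predict([features[j] for j in train],
                            [labels[j] for j in train],
                            [features[j] for j in test])
        correct = sum(p == labels[j] for p, j in zip(preds, test))
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


splits = list(k_fold_indices(25, k=10))
```

Any classifier can be plugged in through `fit_predict`, so the same loop can compare several models on identical folds.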

| Data sets description
Public databases, namely, the GREEND, 16 UK-DALE, 17 and WHITED, 18 are employed for assessing the performance of the appliance identification system using the proposed feature extraction based on the 2D-PEP procedure. The selected data sets are collected at different resolutions, that is, at both low and high frequencies to conduct a comprehensive study and check the efficiency of the proposed solution when the frequency is changed.
The GREEND data set contains the electricity consumption signatures of several domestic appliances collected at a sampling frequency of 1 Hz from eight households in Italy and Austria. Consumption profiles are recorded for time periods ranging from 6 to 12 months. To validate the proposed system, we use the energy usage footprints gathered from a typical house that includes six different appliances. The UK-DALE repository includes energy consumption fingerprints of five UK households, in which the current and voltage data are gathered at a sampling frequency of 16 kHz at the aggregated level and with a sampling interval of 6 s at the appliance level across three houses, as well as 1 s at the aggregated level and 6 s at the appliance level for two additional homes. In this framework, we focus on analyzing the consumption footprints collected at 16 kHz. The WHITED data set includes the power consumption signatures of the device start-ups collected from 110 electrical devices, categorized into up to 47 appliance groups. For each appliance category, a set of power consumption fingerprints is collected from various appliances from different manufacturers, and all of them are gathered at a sampling rate of 44,000 Hz. In this framework, we use 11 appliance categories to validate the proposed system, and each class includes 15 consumption observations. Table 1 outlines the observed devices and the number of observed days for both the GREEND and UK-DALE data sets, along with the observed devices and their number in each appliance category for the WHITED data set.
First of all, Figure 3 shows an example of six power signals from the WHITED data set, their 2D representations, and the final histograms collected using the proposed 2D-PEP descriptor. It is clear that by moving to a high-dimensional space, the power signals are considered as images and any image feature extraction technique can be applied accordingly.

| Effect of parameter settings on the classification
To assess the effect of different parameter settings on the final classification performance, we evaluate how the window length $n$ and the kernel size $a \times a$ affect the accuracy and F1 score of the proposed NILM system. Figure 4 portrays the results obtained when the window length $n$ and the kernel size $a \times a$ are varied. It can clearly be seen that a window length $n = 5$ and a kernel size of $5 \times 5$ provide the best accuracy and F1 score performance. Therefore, these parameters are adopted in the rest of this framework, which yields a histogram of $M = 256$ bins from each appliance event vector $e_i$.

| Correlation study
To clearly understand why the proposed feature extraction approach based on 2D-PEP can improve appliance identification, normalized correlation (NC) rates between power consumption signals belonging to the same class are measured and compared in this section. To that end, six signals extracted from the aggregated power signal in the WHITED data set are selected randomly from four appliance classes. The NCs between the various signals are then measured to illustrate how the 2D-PEP can help correlate signals pertaining to the same class and decrease the distance between them. Figure 5 portrays the NC estimated between different signals belonging to four appliance groups, namely (a) coffee machine, (b) LED lamp, (c) microwave, and (d) fan, first when only the raw signals are considered and then when the 2D-PEP features are exploited. It is easily noticed that the NC values obtained with the 2D-PEP features are much higher than those obtained from raw signals. This means that the 2D-PEP can significantly help in correlating and reducing the distance between the signals belonging to the same appliance class, thanks to the local phase encoding process that captures neighboring characteristics in the 2D representation and thus gives rise to a precise description.
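The text does not give the exact NC formula, so the sketch below assumes the common zero-lag form, that is, the inner product of two equal-length vectors divided by the product of their Euclidean norms. The function name `normalized_correlation` and the toy vectors are illustrative only.

```python
import math
from typing import Sequence


def normalized_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized correlation (assumed definition): <x, y> / (||x|| * ||y||)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0


# Two descriptors with the same shape but different scale correlate perfectly.
nc_same_shape = normalized_correlation([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
nc_orthogonal = normalized_correlation([1.0, 0.0], [0.0, 1.0])
```

Under this definition the measure is insensitive to amplitude scaling, which is why it is convenient for comparing normalized histograms with raw waveforms of different magnitudes.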

| Classification performance
This section investigates the performance of the proposed appliance identification using various classifiers with reference to the accuracy and F1 score metrics. In this line, the classifiers are deployed to identify appliance-level fingerprints using different classifier parameters because the latter can significantly influence the classification performance. For instance, KNN is implemented with different distance metrics that are used to measure the similarity of appliance features, that is, Euclidean distance, weighted Euclidean distance, and cosine distance. Furthermore, the K parameter representing the number of nearest neighbors used in the majority voting process is varied as well. For DT, three different scenarios are considered, defined as fine (100 splits), medium (20 splits), and coarse (4 splits), in which the number of splits (subsets) is varied and the resulting performance is assessed. For DNN, the number of hidden layers is set to 50, because the performance remains the same even if it is increased. The same holds for EBT, in which the number of learners is set to 30 and the number of splits (subsets) is fixed to 42k, since the best performance is obtained with these parameters and varying them produces only very slight changes. Finally, SVM is implemented using three different kernels, defined as linear, Gaussian, and quadratic, since the kernel can significantly impact the classification performance. Table 2 summarizes the results obtained on the GREEND, UK-DALE, and WHITED data sets using the aforementioned classifiers.
FIGURE 4 Effect of parameter settings on the classification performance: (left) accuracy and (right) F1 score
The results clearly show that SVM with the quadratic kernel provides the best results for the three data sets. For example, 99.54% accuracy and 99.53% F1 score are obtained on GREEND, while 100% is obtained for both the accuracy and F1 score on WHITED. This is because SVM belongs to a class of kernel-based learning models that can discriminate between feature vectors by analyzing the relations between feature patterns through the kernel trick. Moreover, the performance of the other classifiers drops significantly in the case of UK-DALE, and only SVM with the quadratic kernel maintains a good performance. Specifically, 98.78% accuracy and 98.66% F1 score are obtained, which are much higher than the results achieved with the other classifiers. All in all, the obtained results point to the good stability and robustness of SVM with the quadratic kernel across data sets with distinct sampling frequencies.
Furthermore, it is worth noting that both LDA and PCA perform well on GREEND and WHITED, while their performance drops on UK-DALE. On the other hand, LDA slightly outperforms PCA on the three data sets. This is mainly because LDA is a statistical learning approach that takes the class labels into consideration when it learns the appliance signatures, which allows it to accurately draw the boundaries around the clusters of appliance categories. By contrast, PCA operates on the eigenvalues and eigenvectors of the covariance matrix to find the directions of maximum variance in the data and ignores the class labels. Put differently, PCA could be considered as just a summarization of the appliance signatures. All in all, LDA could work better than PCA with large data sets that include various appliance classes (as is the case with the three data sets considered in this framework), while PCA may work better than LDA with small data sets having fewer appliance signatures per class, as is also demonstrated in Reference [43].

Figure 6 illustrates the confusion matrices of the appliance classification step obtained using the proposed 2D-PEP and the SVM classifier with a quadratic kernel for the three databases considered in the evaluation. From these plots, we can extract additional classification statistics for each appliance, such as true-positive, true-negative, false-positive, and false-negative rates. It can be observed that the proposed solution achieves high true-positive rates for all the appliances in the three data sets. However, the performance declines slightly in the case of UK-DALE since this database is highly unbalanced, with the number of instances differing considerably from one appliance category to another, as can be seen in Figure 6b.
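The per-appliance statistics mentioned above can be derived from a confusion matrix as in the following sketch. It assumes the convention that rows are true classes and columns are predicted classes; the layout of Figure 6 and the F1 averaging used in the paper are not stated in this section, so both the convention and the function names are assumptions.

```python
def per_class_stats(cm):
    """Per-class counts and rates from a square confusion matrix.

    Assumed convention: cm[i][j] is the number of class-i instances
    that were predicted as class j.
    """
    total = sum(sum(row) for row in cm)
    stats = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # class c predicted as another class
        fp = sum(row[c] for row in cm) - tp   # another class predicted as c
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0   # true-positive rate
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        stats.append({"tp": tp, "fn": fn, "fp": fp, "tn": tn,
                      "tpr": recall, "precision": precision, "f1": f1})
    return stats

def accuracy(cm):
    """Fraction of instances that lie on the diagonal of the matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total if total else 0.0
```

On an unbalanced data set, a class with few instances contributes little to `accuracy` but can show a visibly lower `tpr`, which is why the per-class view is useful alongside the overall rates.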

| Comparison with other feature extraction schemes
To investigate the proposed 2D-PEP descriptor further, its performance in terms of accuracy, F1 score, and computation time has been compared with that of other well-known feature extraction schemes usually deployed to extract pertinent characteristics of power signals. In this regard, various descriptors are considered in this comparative study, including root mean square (RMS), 44 S-Transform, 45 mean absolute deviation (MAD), 46 multiscale wavelet packet tree (MSWPT), 13 slope sign change (SSC), 47 discrete wavelet transform (DWT), 48 and autoregressive (AR). 47 Table 3 presents the obtained results, which show that the 2D-PEP outperforms the other descriptors in terms of accuracy and F1 score on the data sets used in this framework. For the computational complexity, the different descriptors have been implemented using Python 3.7 and run on a desktop with a 3.1 GHz Core i7-3770S processor and 16 GB of RAM. The obtained computation time results demonstrate that the execution time of 2D-PEP is significantly lower than those required to run the RMS, MAD, SSC, AR, and DWT descriptors, while it is comparable to those required for running the S-Transform and MSWPT. The performance of the proposed NILM system based on the 2D-PEP descriptor has also been compared with existing techniques. Table 4 summarizes the outcomes of the comparative study conducted on the UK-DALE data set with regard to various parameters, including the (i) deployed approach, (ii) learning model, (iii) number of appliance categories used in the validation process, and (iv) achieved accuracy and F1 score rates. It can clearly be seen that the proposed solution achieves higher accuracy and F1 score results than the compared techniques. For example, a gain of more than 3% in accuracy and 3.5% in F1 score has been reached in comparison with Reference [49], and a gain of more than 20% in F1 score has been achieved compared to Reference [50].
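For orientation, three of the simpler baseline descriptors named in the comparison can be sketched as below, together with a small timing helper. These follow common textbook definitions of RMS, MAD, and SSC and are not necessarily the exact variants used in the cited references; the helper `time_descriptor` and its `repeats` parameter are hypothetical.

```python
import math
import time

def rms(x):
    """Root mean square of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def mad(x):
    """Mean absolute deviation of the samples around the signal mean."""
    mean = sum(x) / len(x)
    return sum(abs(v - mean) for v in x) / len(x)

def ssc(x):
    """Slope sign changes: count of interior samples that are strict
    local maxima or minima (assumed definition, no amplitude threshold)."""
    return sum(
        1 for i in range(1, len(x) - 1)
        if (x[i] - x[i - 1]) * (x[i] - x[i + 1]) > 0
    )

def time_descriptor(fn, signal, repeats=100):
    """Average wall-clock seconds per call of a descriptor function."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(signal)
    return (time.perf_counter() - start) / repeats
```

Timings measured this way depend on the machine and the implementation, so they are only comparable when all descriptors are run under the same conditions.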

| CONCLUSION
In the present study, a powerful NILM system is introduced using a novel feature extraction technique based on extracting histograms of the phase encoding of power consumption signals using a 2D representation, denoted as 2D-PEP. The extracted feature signatures are robust: they minimize the distance between appliance features belonging to the same category and, conversely, maximize the margin between appliance features pertaining to different groups. This is achieved by representing the TD version of power signals in 2D space, conducting a novel local phase encoding process, and then generating histograms of binary feature vectors. To evaluate the performance of the proposed system, a comprehensive assessment study has been conducted through a series of tests. Promising appliance identification results have been obtained on three different data sets (with distinct frequency resolutions) using the SVM classifier with the quadratic kernel, in which 99.54%, 98.78%, and 100% accuracy rates and 99.53%, 98.66%, and 100% F1 scores have been achieved on the GREEND, UK-DALE, and WHITED data sets, respectively. Moreover, the proposed 2D-PEP has the lowest computation cost in comparison with other state-of-the-art descriptors.

Although the proposed solution based on 2D-PEP has presented promising performance, it should be noted that its minor limitation is related to the use of a supervised learning procedure, since the latter necessitates training data sets to learn the electric load signatures of different kinds of appliances. This limitation is not unique to our approach but applies to all supervised NILM solutions. Nevertheless, the effectiveness of the proposed technique on the different data sets considered in this article demonstrates that it can identify different appliances from distinct manufacturers, as is the case with the WHITED data set. Moreover, the results obtained from the GREEND and UK-DALE data sets indicate that end-users can collect their own data for a specific period of time and use them to train the proposed NILM scheme before applying it to online appliance identification tasks.
Finally, our future work will focus on implementing an energy-saving recommender system based on analyzing the appliance-level consumption data obtained through this framework. This will help increase the awareness of end-users by providing them with appliance-specific consumption footprints and promoting a change toward more sustainable and energy-saving behavior.