The power transformer is a valuable asset in transmission and distribution systems. Any fault in a transformer may lead to power outages and blackouts; the power transformer therefore has a direct influence on the safety and reliability of the power system. Furthermore, replacement of a power transformer is very costly and time consuming, so it is very important to diagnose incipient faults as early as possible in order to avert major problems in the transformer.
Dissolved gas analysis (DGA) is the most important test in determining the condition of a transformer. It is the first indicator of a problem and can identify deteriorating insulation and oil, overheating, hot spots, partial discharge, and arcing. DGA is performed on the basis of the IEC 60599 [1] and IEEE C57.104 [2] standards. A four-condition DGA guide to classify risks to transformers with no previous problems has been published in IEEE C57.104.
A number of techniques have been proposed to deal with the problem of transformer fault diagnosis. These include expert systems [3], fuzzy set models [4], multi-layer feedforward artificial neural networks (ANN) [5], [6], wavelet networks [7], hybrid fuzzy set/ANN models [8], radial basis function neural networks [9], support vector machines (SVM) [10], and self-organizing maps (SOM) and Kohonen neural networks [12].
In 1998, Wang used a combination of an ANN and an expert system. He took 210 data samples, of which 60 were used for testing; the percentage of correct diagnosis was 93.3%, but the partial discharge fault was not diagnosed [4]. In 2008, L. X. Dong et al. applied a rough set classifier and a fusion of 7 wavelet neural networks to diagnose transformer faults; the partial discharge fault was again not diagnosed, and the percentage of correct diagnosis was 88% [28]. In 2012, K. Bacha et al. applied SVM to a small number of samples, and the percentage of correct diagnosis with SVM was 90% [27].
In this paper a new diagnostic system based on autoassociative neural networks is introduced. Autoassociative neural networks are special feedforward neural networks designed and trained in such a way that the output reproduces the input. During training, the autoencoder learns the non-linear manifold where the data lie.
Once trained, the autoencoder may be used as a recognition machine: if a new data vector belongs to the manifold, the autoencoder will produce a small error; if the vector does not lie on the manifold (which should be the case if the new input vector is distinct from the global pattern of the data used for training), the autoencoder will return an output that does not match the input, and the error will be high. This property is used in the model proposed in this paper. First, for each fault type, a specific autoencoder is trained so that it learns that fault's characteristics. Then, when data for an unknown fault type are considered, the autoencoder for each fault type tries to match its output to its input, but only one should stay tuned while all the others display large errors. The fault is thus identified by recognizing which autoencoder presents the minimum error.
II. AUTOASSOCIATIVE NEURAL NETWORKS
Autoassociation is a means by which a neural network communicates that it recognizes the pattern presented to it. A neural network that supports autoassociation passes a pattern directly from its input neurons to its output neurons.
An autoassociative mapping should reproduce an input vector in the output vector with minimal error; the outputs produced by the mapping should be as close to the input as possible. Ideally, the output should be the same as the input. Autoencoders learn the autoassociative mapping by being trained in the autoassociative mode with training patterns. In the autoassociative mode of training, the input patterns and the target output patterns are the same. Let $Q = \{x_i : i = 1, 2, \dots, N\}$ be the set of training vectors.
Figure 1. Network architecture for autoencoders (σ = sigmoidal transfer function, l = linear transfer function).
Let $F$ denote the autoassociative mapping learnt by the network. If $\{y_1, y_2, \dots, y_N\}$ is the set of output vectors produced by the AANN when the training vector set $\{x_1, x_2, \dots, x_N\}$ is given as input, then $F$ minimizes the mean square error given by the equation
$$E = \frac{1}{N}\sum_{i=1}^{N} \left\| x_i - y_i \right\|^2 \qquad (1)$$
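For concreteness, Eq. (1) can be computed directly from the arrays of training vectors and network outputs; the following is a minimal numpy sketch (the array names are illustrative, not from the original work):

```python
import numpy as np

def mean_square_error(X, Y):
    """Error of Eq. (1): mean of the squared reconstruction errors.

    X -- (N, d) array of training vectors x_i
    Y -- (N, d) array of AANN outputs y_i = F(x_i)
    """
    return np.mean(np.sum((X - Y) ** 2, axis=1))
```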
For the network of our interest, the mapping function $F$ can be separated into $F_1$ and $F_2$, so that
$$F(\cdot) = F_2(F_1(\cdot))$$
where $F_1$ is the transformation in the network from the input layer up to the dimension compressing hidden layer, and $F_2$ is the transformation from the dimension compressing hidden layer up to the output layer.
Assuming that the number of units in the input layer is $n$ and the number of units in the dimension compressing hidden layer is $r$ (where $r < n$), $F_1$ transforms vectors in the space $\mathbb{R}^n$ onto the space $\mathbb{R}^r$; that is, $F_1 : \mathbb{R}^n \to \mathbb{R}^r$. Likewise, $F_2$ transforms vectors from the lower dimensional space $\mathbb{R}^r$ back to the space $\mathbb{R}^n$ at the output.
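As an illustration of this decomposition, the sketch below composes a compressing map $F_1 : \mathbb{R}^3 \to \mathbb{R}^1$ and an expanding map $F_2 : \mathbb{R}^1 \to \mathbb{R}^3$ with randomly initialized weights (illustrative only; in practice the weights are learned during training):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 3, 1                        # input dimension n and bottleneck dimension r < n
W1 = rng.normal(size=(r, n))       # weights of the compressing path F1
W2 = rng.normal(size=(n, r))       # weights of the expanding path F2

def F1(x):
    """Dimension reduction, R^n -> R^r (nonlinear tanh unit)."""
    return np.tanh(W1 @ x)

def F2(z):
    """Dimension expansion, R^r -> R^n (linear output units)."""
    return W2 @ z

x = np.array([0.1, 0.5, 0.2])      # an illustrative 3-dimensional ratio vector
y = F2(F1(x))                      # y = F(x) lives in the same space as x
print(F1(x).shape, y.shape)        # (1,) (3,)
```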
Since $r < n$, $F_1$ is basically a dimension reduction process and $F_2$ a dimension expansion process. Dimension reduction is achieved by projecting the vectors in the input space onto a subspace captured by the set of weights in the network path for $F_1$; the dimension of the subspace is equal to the number of units in the dimension compressing hidden layer. Dimension expansion is achieved by mapping the lower dimensional vectors onto a hypersurface in the higher dimensional output space; the hypersurface is captured by the set of weights in the network path for $F_2$. The subspace and hypersurface are in general nonlinear, because of the nonlinear units in the hidden layers.
Dimension expansion by mapping onto a hypersurface is necessary because a set of lower dimensional vectors cannot produce higher dimensional vectors of intrinsic dimensionality larger than the dimension of the lower dimensional space. Capturing the nonlinear subspace by the network part for $F_1$ can be explained as follows. Consider a network of structure $X_1L$-$X_2N$-$\dots$-$X_{d-1}N$-$X_dL$-$X_{d+1}N$-$\dots$-$X_oL$, where $L$ denotes linear units and $N$ nonlinear units, and $X_1$, $X_d$ and $X_o$ denote the number of units in the input layer, the dimension compressing hidden layer, and the output layer, respectively. If $X_1 < X_{d-1}$, the transformation $F_1'$, performed in the part of the network from the input layer up to the output of the layer just before the dimension compressing hidden layer, will map the input space onto a nonlinear hypersurface in the $X_{d-1}$-dimensional space. The dimension compressing hidden layer will then transform vectors in the $X_{d-1}$-dimensional space into the $X_d$-dimensional space, by projecting the higher dimensional vectors onto a linear subspace. The image of this subspace in the input space will be nonlinear. This can be generalized easily for networks of any structure.
The nonlinearity level of the subspace and hypersurface depends on the size of the network. By increasing the size of the network in the part for $F_1$, the level of nonlinearity of the subspace can be increased; similarly, by increasing the size of the network in the part for $F_2$, the level of nonlinearity of the hypersurface can be increased. As a result of dimension reduction and dimension expansion, the autoassociative mapping as a whole is nothing but the projection of vectors in the input space onto a hypersurface in the same space.

Training the network in the autoassociative mapping mode with the training set ensures that the subspace and hypersurface are captured along the surface of maximum variance, because the training error is minimum in that case. During training, the network adjusts the subspace and hypersurface so that they finally capture the surface of maximum variance.
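This maximum-variance behaviour can be checked numerically in the linear case: a linear autoencoder with a one-unit bottleneck, trained to minimize Eq. (1), converges to the first principal component of the data. The sketch below is a simplified check (tied weights and plain gradient descent, not part of the original work):

```python
import numpy as np

rng = np.random.default_rng(1)
# Zero-mean 2-D data with one dominant variance direction
X = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
X = X - X.mean(axis=0)

w = rng.normal(size=2)                    # tied encoder/decoder direction
lr = 1e-5
for _ in range(5000):
    z = X @ w                             # F1: project onto the 1-D subspace
    R = X - np.outer(z, w)                # residual after mapping back (F2)
    grad = -2.0 * (X.T @ (R @ w) + R.T @ z)   # gradient of Eq. (1) w.r.t. w
    w -= lr * grad

# Compare with the direction of maximum variance (first principal component)
evals, evecs = np.linalg.eigh(np.cov(X.T))
pc1 = evecs[:, np.argmax(evals)]
print(abs(np.dot(w / np.linalg.norm(w), pc1)))   # approaches 1.0
```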
III. FAULT DIAGNOSIS IN POWER TRANSFORMERS
When the thermal or electrical stresses affecting the insulating oil and cellulose material in a transformer are higher than the normal permissible values, certain combustible gases, referred to as fault gases, start to be produced inside the transformer. The most significant fault gases produced by oil decomposition are H2 (hydrogen), CH4 (methane), C2H6 (ethane), C2H4 (ethylene) and C2H2 (acetylene), as well as carbon monoxide (CO) and carbon dioxide (CO2), which are produced by the decomposition of the insulating paper (cellulose). The type of fault [corona and arcing discharge (both electrical faults), and overheating (a thermal fault)] as well as its severity play an important role in producing the different combustible gases.
Based on DGA, many interpretative methods have been introduced to diagnose the nature of the deterioration occurring in a transformer. Over the years, several techniques have been developed to facilitate the diagnosis of fault gases, such as the Doernenburg method [13], the Rogers ratio method [14], the key gases method [15, 16], and the Duval triangle method [15], as well as more recently developed techniques such as neural networks and fuzzy logic.
One of the well-known diagnostic methods is the one described in the standard IEC 60599 [17] and summarized in Table 1. These rules, when applied to the transformer data set IEC TC 10 [18], lead to a number of mistaken classifications plus a number of non-classified patterns (non-identified failures), as shown in Table 2.
Table 1. Diagnosis using the ratio method (IEC 60599)

Case | Fault type                         | C2H2/C2H4 | CH4/H2       | C2H4/C2H6
PD   | Partial discharge                  | NS        | < 0.1        | < 0.2
DL   | Low energy discharge               | > 1       | 0.1-0.5      | > 1
DH   | High energy discharge              | 0.6-2.5   | 0.1-1 but NS | > 2
T1   | Thermal fault, T < 300 °C          | < 0.1     | > 1          | < 1
T2   | Thermal fault, 300 °C < T < 700 °C | < 0.1     | > 1          | 1-4
T3   | Thermal fault, T > 700 °C          | < 0.2     | > 1          | > 4

(NS = not significant.)
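For reference, the rules of Table 1 can be encoded directly; the sketch below is a straightforward reading of the table (the rule ordering, boundary handling, and the treatment of the NS entries are simplifying assumptions, not the standard's full decision logic):

```python
def iec_ratio_diagnosis(r1, r2, r3):
    """Diagnose a fault from the three ratios of Table 1.

    r1 = C2H2/C2H4, r2 = CH4/H2, r3 = C2H4/C2H6.
    Returns the fault code, or 'NI' when no rule matches.
    """
    if r2 < 0.1 and r3 < 0.2:                           # PD (r1 not significant)
        return 'PD'
    if r1 > 1 and 0.1 <= r2 <= 0.5 and r3 > 1:          # DL: low energy discharge
        return 'DL'
    if 0.6 <= r1 <= 2.5 and 0.1 <= r2 <= 1 and r3 > 2:  # DH: high energy discharge
        return 'DH'
    if r1 < 0.1 and r2 > 1 and r3 < 1:                  # T1: thermal, T < 300 C
        return 'T1'
    if r1 < 0.1 and r2 > 1 and 1 <= r3 <= 4:            # T2: thermal, 300-700 C
        return 'T2'
    if r1 < 0.2 and r2 > 1 and r3 > 4:                  # T3: thermal, T > 700 C
        return 'T3'
    return 'NI'                                         # not identified

# Example: the third PD case of Table 3
print(iec_ratio_diagnosis(0.0001, 0.0476, 0.0001))      # -> 'PD'
```

Note that, consistently with Table 3 below, a vector such as (0.0001, 0.1102, 0.0001) falls through all the rules and is returned as 'NI'.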
In many studies it is usual to lump together the cases T2 and T3, because the number of such cases in the database is too small for adequate training. In the work reported in this paper, the database from [18] was used. Each sample in the database includes the dissolved gas concentrations of H2 (hydrogen), CH4 (methane), C2H6 (ethane), C2H4 (ethylene) and C2H2 (acetylene), as well as the verified condition of the transformer. In order to have sets with minimally meaningful sizes, the faults were organized into five fault types.
Table 2. Results of the IEC methodology

Fault type | Total cases | Correct diagnosis | Wrong diagnosis | Accuracy
PD         | 9           | 5                 | 4               | 55.5%
LD         | 25          | 20                | 5               | 80%
HD         | 47          | 38                | 9               | 80.8%
T1         | 15          | 11                | 4               | 73.33%
T2         | 17          | 13                | 4               | 76.47%
IV. POWER TRANSFORMER FAULT DIAGNOSIS USING AANN
An AANN trained with feature vectors derived from the IEC TC 10 database will capture a subspace and hypersurface along the surface of maximum variance of the feature vectors. Since the data characteristics are unique, the surface of maximum variance is unique, and hence the subspace and hypersurface captured are also unique. If a test feature vector is given to the network, it will produce a small error if the test vector is of the same kind as the vectors used for training (genuine testing); otherwise (imposter testing), the test error will be larger.
Most automatic diagnosis systems based on neural networks and similar approaches rely on a single system that performs the classification: when activated by an input sample, it produces an output signal indicating the proposed fault classification. In this paper, we describe a distinct and more successful approach. The new idea behind the diagnosis system is to tune an independent autoassociative network for each cluster of data and then, for an unclassified sample, have the tuned autoassociative networks compete for the identification of the fault.
This model requires 5 distinct autoencoders, one for each fault type. The input vectors were specified with exactly the same composition as used in the IEC 60599 standard, meaning that we used the concentration ratios C2H2/C2H4, CH4/H2 and C2H4/C2H6. Fig. 2 illustrates the competitive parallel architecture of the diagnosis system.
Fig. 2. The general architecture of the new diagnosis system, based on a set of autoassociative neural networks in parallel, each tuned for a specific fault type and generating competing outputs.
Each autoassociative neural network is trained to learn the manifold where the data for a specific fault lie, returning (almost) the same vector if a new case of the same fault is input, and returning a vector with a large deviation from the input if a case of a different fault is input. Thus, when a gas concentration ratio vector is input, it is expected that only one autoencoder will display a small error, recognizing that the vector is close to its particular learned manifold, while the other autoencoders will not be able to display, at their outputs, a close reproduction of the input. Therefore, when the competing autoencoders present their error values to the decision module, the winner is the one with the minimum error.
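In code, the decision module reduces to an argmin over the per-autoencoder reconstruction errors. A minimal sketch, assuming each trained autoencoder is available as a callable that returns its reconstruction of the input vector:

```python
import numpy as np

def diagnose(x, autoencoders):
    """Winner-take-all fault diagnosis.

    x            -- gas ratio vector (C2H2/C2H4, CH4/H2, C2H4/C2H6)
    autoencoders -- dict mapping fault label -> trained autoencoder callable
    Returns the winning fault label and the per-autoencoder errors.
    """
    errors = {label: float(np.sum((x - ae(x)) ** 2))
              for label, ae in autoencoders.items()}
    winner = min(errors, key=errors.get)   # the minimum-error autoencoder wins
    return winner, errors
```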
V. TRAINING AND TESTING
Building an AANN in MATLAB requires building a custom four layer artificial neural network. For the transformer diagnosis system, the error in each autoencoder was calculated as in Eq. (1), which for a single sample is equivalent to the squared Euclidean distance between the two vectors (input and output). Each autoencoder was designed with 3 neurons in the input and output layers, corresponding to the 3 gas ratios, a mapping layer of 2 neurons, a bottleneck layer of 1 neuron, and a demapping layer of 2 neurons, as shown below.
Figure 3. AANN network created in MATLAB.
Looking at the illustration above, notice that if the input vectors are used both as inputs and as target values, the network will be trained to perform an identity mapping.
The activation functions used in the input and hidden layers were hyperbolic tangents, while each neuron in the output layer had a linear activation function. The training procedure adopted the Levenberg-Marquardt algorithm.
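For readers without MATLAB, the same 3-2-1-2-3 architecture can be sketched in plain numpy. The version below uses tanh hidden layers, a linear output, and simple stochastic gradient descent in place of Levenberg-Marquardt (a simplification for brevity, not the training algorithm actually used in this work):

```python
import numpy as np

class AANN:
    """Autoassociative network of structure 3-2-1-2-3:
    tanh mapping, bottleneck and demapping layers, linear output."""

    def __init__(self, sizes=(3, 2, 1, 2, 3), seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = [self.rng.normal(scale=0.5, size=(m, n))
                  for n, m in zip(sizes[:-1], sizes[1:])]
        self.b = [np.zeros(m) for m in sizes[1:]]

    def forward(self, x):
        """Return the activations of all layers; the last entry is the output."""
        a = [np.asarray(x, dtype=float)]
        last = len(self.W) - 1
        for i, (W, b) in enumerate(zip(self.W, self.b)):
            z = W @ a[-1] + b
            a.append(z if i == last else np.tanh(z))   # linear output layer
        return a

    def train(self, X, epochs=20000, lr=0.005):
        """Stochastic gradient descent on the error of Eq. (1)."""
        X = np.asarray(X, dtype=float)
        for _ in range(epochs):
            x = X[self.rng.integers(len(X))]
            a = self.forward(x)
            delta = 2.0 * (a[-1] - x)                  # dE/dz at the linear output
            for i in reversed(range(len(self.W))):
                gW, gb = np.outer(delta, a[i]), delta
                if i > 0:                              # backpropagate through tanh
                    delta = (self.W[i].T @ delta) * (1.0 - a[i] ** 2)
                self.W[i] -= lr * gW
                self.b[i] -= lr * gb

    def error(self, x):
        """Squared reconstruction error for one gas ratio vector."""
        x = np.asarray(x, dtype=float)
        return float(np.sum((self.forward(x)[-1] - x) ** 2))
```

Each fault-specific autoencoder (e.g. Autoencoder 1 for PD) would then simply be an instance of such a network trained only on that fault's ratio vectors.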
Table 3 shows the results obtained with the new system, compared with those obtained when applying the IEC 60599 method to a few data samples for two faults. It is remarkable that no errors or misclassifications were produced by the new system. The training and test sets were not specially doctored, except that care was taken to have a training set covering the domain as evenly as possible.
Table 3. Performance comparison between IEC 60599 and the autoencoder diagnosis system for two fault cases (NI = not identified)

C2H2/C2H4 | CH4/H2 | C2H4/C2H6 | Fault | Autoencoder | IEC 60599
0.0001    | 0.1102 | 0.0001    | PD    | PD          | NI
1.1667    | 1.1065 | 0.1000    | PD    | PD          | NI
0.0001    | 0.0476 | 0.0001    | PD    | PD          | PD
1.0       | 0.1667 | 1.0       | DL    | DL          | NI
4.0       | 0.1607 | 4.0       | DL    | DL          | DL
2.2233    | 0.125  | 1.0605    | DL    | DL          | DL
VI. RESULTS AND DISCUSSION
Since an autoassociative network has the same input and output, the input and target data for the above network are the same. All the networks were trained using the Levenberg-Marquardt algorithm, and the performance of each network was measured using the mean square error.

The data for training the networks were taken from the IEC TC 10 database [18]; some of the samples used are shown in Table 4. Since two faults were taken up for analysis here, two autoassociative networks, namely Autoencoder 1 and Autoencoder 2, were created: Autoencoder 1 was trained with the PD fault data and Autoencoder 2 with the LD fault data.
One can therefore tune an autoencoder to a particular fault mode by training each autoencoder with the corresponding data. When activated by a new input vector, an autoencoder will reproduce it at its output with a very small error if the input corresponds to the fault for which it was trained; otherwise the output will display a large dissimilarity with the input. The results presented in Table 4 show that, in transformer fault diagnosis based on DGA, autoencoders do discriminate among the distinct fault modes.

As mentioned earlier, the autoencoder with the minimum error is the winner. For the full system five networks were created, each autoencoder trained with the data of a particular fault: Autoencoder 1 was trained with the PD fault, Autoencoder 2 with the LD fault, and so on. So when the test vector belongs to PD, the error in Autoencoder 1 should be minimum compared with the errors in the remaining autoencoders. The results of the tested data are shown in Table 4; from the table, it is clear that the correct fault can be diagnosed by finding the minimum error.
Table 4. Results of a few samples tested using the autoassociative networks

C2H2/C2H4 | CH4/H2 | C2H4/C2H6 | Actual fault | ε in net 1 | ε in net 2
0.0185    | 0.6847 | 0.0127    | PD           | 0.3228     | 5.336
0.0001    | 0.1102 | 0.0001    | PD           | 0.4752     | 5.486
1.1667    | 0.1065 | 0.1000    | PD           | 0.0184     | 5.029
4.0       | 0.1607 | 4.0       | DL           | 10.196     | 5.185
3.00      | 0.1558 | 3.3602    | DL           | 6.2926     | 1.281
3.3602    | 0.1631 | 3.3602    | DL           | 7.0571     | 2.046
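To tie the sketches together, a hypothetical end-to-end run could look as follows, reusing the AANN class sketched in Section V; the sample values are taken from Table 4, but this toy training setup will not reproduce the exact error values reported there:

```python
import numpy as np

# Gas ratio vectors (C2H2/C2H4, CH4/H2, C2H4/C2H6) taken from Table 4
pd_data = np.array([[0.0185, 0.6847, 0.0127],
                    [0.0001, 0.1102, 0.0001]])   # PD samples
dl_data = np.array([[4.0,    0.1607, 4.0],
                    [3.00,   0.1558, 3.3602]])   # DL samples

ae_pd, ae_dl = AANN(seed=1), AANN(seed=2)        # one autoencoder per fault
ae_pd.train(pd_data)
ae_dl.train(dl_data)

x = np.array([0.0001, 0.1102, 0.0001])           # a PD case
errors = {'PD': ae_pd.error(x), 'DL': ae_dl.error(x)}
print(min(errors, key=errors.get), errors)       # PD is expected to win
```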
In Table 3 the results of the autoassociative networks are compared against IEC 60599. From the table, it is clear that the autoencoders have good accuracy compared with the existing method, and that they are able to diagnose faults which are not identified by IEC 60599.
VII. CONCLUSIONS
The work reported in this paper is based on a novel application of autoassociative neural networks. The main idea behind the new system is to take advantage of the property of autoencoders that allows them to learn the manifold where data lie, by projecting inputs into a different space and reprojecting back to the input space. One can therefore tune an autoencoder to a particular fault mode. When activated by a new input vector, an autoencoder will reproduce it at its output with a very small error if the input corresponds to the fault for which it was trained; otherwise the output will display a large dissimilarity with the input.
The architecture proposed for fault diagnosis is completely general, and its application is not restricted to transformer fault diagnosis. It may be suitable for any problem with enough data to allow proper training of the autoassociative neural networks.