Figure 4. Structure comparison of FPN, PANet, and BiFPN.
BiFPN (Bidirectional Feature Pyramid Network), which uses weighted bidirectional feature fusion, adopts a node connection pattern different from that of PANet (Path Aggregation Network). Its optimizations of the cross-scale connections include:
(1) removing the nodes in PANet that have only a single input edge. Since a node with a single input performs no feature fusion, the corresponding nodes at P3 and P6 are eliminated, yielding a simplified bidirectional network.
(2) adding a skip connection between the input and output nodes at the same scale, so that features on the same level can be fused along more paths with little extra computation.
(3) unlike PANet (Path Aggregation Network), which has only a single top-down and a single bottom-up path, BiFPN (weighted bidirectional feature pyramid) treats each bidirectional path as one feature network layer and repeats this layer several times, thereby achieving higher-level feature fusion; a sketch of the weighted fusion performed at each node is shown below.
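For illustration only, the following PyTorch-style sketch shows the fast normalized (weighted) fusion at one BiFPN-style node. The class name WeightedFusionNode, the convolution block, and all hyperparameters are assumptions made for this sketch, not the exact configuration used in this work; the inputs are assumed to be feature maps already resized to a common resolution and channel count.

import torch
import torch.nn as nn

class WeightedFusionNode(nn.Module):
    """Fast normalized fusion at one BiFPN-style node (illustrative sketch).

    Each input feature map receives a learnable non-negative weight; the
    weighted average is then refined by a small convolution block.
    """
    def __init__(self, num_inputs, channels, eps=1e-4):
        super().__init__()
        self.fusion_weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.SiLU(),
        )

    def forward(self, inputs):
        # ReLU keeps the fusion weights non-negative, then normalize them.
        w = torch.relu(self.fusion_weights)
        w = w / (w.sum() + self.eps)
        # Weighted sum of the (already aligned) input feature maps.
        fused = sum(wi * x for wi, x in zip(w, inputs))
        return self.conv(fused)

Because the weights are normalized per node, each node learns how much each input scale should contribute, which is the essence of the weighted bidirectional fusion described above.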
The prediction head is improved on the basis of the Swin Transformer encoder. Swin Transformer computes self-attention within non-overlapping local windows instead of over the whole feature map, and uses shifted windows across successive layers to aggregate features from neighbouring windows.
In object detection, the Transformer must process high-resolution images, and the complexity of its global attention is roughly quadratic in the image size. On this basis, a sparse representation method based on multi-scale features is adopted.
Swin Transformer merges adjacent small image patches to build hierarchical feature maps for the deeper layers. Because the number of patches inside each window is fixed, the computational complexity grows linearly with the image size.
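A minimal PyTorch-style sketch of this patch merging step follows; the class name PatchMerging and the use of a linear reduction layer follow the common formulation of hierarchical vision transformers and are assumptions for illustration, not a description of this paper's exact implementation.

import torch
import torch.nn as nn

class PatchMerging(nn.Module):
    """Merge each 2x2 group of neighbouring patches into one token (sketch).

    Halves the spatial resolution and doubles the channel dimension,
    producing the hierarchical feature maps described above.
    """
    def __init__(self, dim):
        super().__init__()
        self.norm = nn.LayerNorm(4 * dim)
        self.reduction = nn.Linear(4 * dim, 2 * dim, bias=False)

    def forward(self, x):
        # x: (B, H, W, C) with H and W even.
        x0 = x[:, 0::2, 0::2, :]   # top-left patch of each 2x2 group
        x1 = x[:, 1::2, 0::2, :]   # bottom-left
        x2 = x[:, 0::2, 1::2, :]   # top-right
        x3 = x[:, 1::2, 1::2, :]   # bottom-right
        x = torch.cat([x0, x1, x2, x3], dim=-1)   # (B, H/2, W/2, 4C)
        return self.reduction(self.norm(x))       # (B, H/2, W/2, 2C)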
This method borrows the hierarchical construction commonly used in convolutional neural networks and the idea of local image regions to compute self-attention within non-overlapping image windows. The procedure is analogous to convolution in a CNN: where a CNN applies a convolution to each local window to obtain that window's features, Swin Transformer applies self-attention within each window to obtain new window features; the resulting windows are then merged, and the attention-and-merge step is repeated in the next stage.
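The window partition that makes this per-window attention possible can be sketched as follows; the helper name window_partition and the tensor layout are illustrative assumptions.

import torch

def window_partition(x, M):
    """Split a (B, H, W, C) feature map into non-overlapping M x M windows."""
    B, H, W, C = x.shape
    x = x.view(B, H // M, M, W // M, M, C)
    # One token sequence of length M*M per window: (B * num_windows, M*M, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, M * M, C)

Self-attention is then computed independently inside each window, so every token attends only to the other M*M - 1 tokens of its own window rather than to the whole image.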
In this model, traditional multi-head self-attention (MSA) is replaced by a shifted-window variant. A Swin Transformer block consists of a (shifted-)window-based MSA module followed in series by a two-layer multilayer perceptron (MLP).
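The block structure just described can be sketched as below, assuming PyTorch; the shift operation and the relative position bias of the real shifted-window attention are deliberately omitted here, and the class name SwinBlockSketch is hypothetical.

import torch.nn as nn

class SwinBlockSketch(nn.Module):
    """One (W-)MSA + two-layer MLP block with pre-norm and residuals (sketch)."""
    def __init__(self, dim, num_heads, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, mlp_ratio * dim),
            nn.GELU(),
            nn.Linear(mlp_ratio * dim, dim),
        )

    def forward(self, x):                 # x: (num_windows * B, M*M, dim)
        h = self.norm1(x)
        h, _ = self.attn(h, h, h)         # attention within each window
        x = x + h                         # residual around window attention
        x = x + self.mlp(self.norm2(x))   # residual around the MLP
        return x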
Unlike the Swin Transformer, the traditional Transformer framework performs global self-attention over the whole image, which consumes a large amount of computing resources. Swin Transformer instead divides the feature map into non-overlapping windows of M × M patches. For a feature map of h × w patches with channel dimension C, the computational complexities of global MSA and window-based W-MSA are:
Ω(MSA) = 4hwC² + 2(hw)²C        (4)

Ω(W-MSA) = 4hwC² + 2M²hwC       (5)
From formulas (4) and (5) we can see that the complexity of MSA is quadratic in the number of patches hw, whereas the complexity of window-based W-MSA is linear in the number of patches.
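As a worked numerical example of formulas (4) and (5), the short Python snippet below evaluates both costs for an assumed 56 × 56 patch grid with C = 96 channels and 7 × 7 windows; these values are chosen only for illustration.

def msa_cost(h, w, C):
    """Global MSA cost: 4*h*w*C^2 + 2*(h*w)^2*C, i.e. formula (4)."""
    return 4 * h * w * C**2 + 2 * (h * w) ** 2 * C

def wmsa_cost(h, w, C, M):
    """Window MSA cost: 4*h*w*C^2 + 2*M^2*h*w*C, i.e. formula (5)."""
    return 4 * h * w * C**2 + 2 * M**2 * h * w * C

print(msa_cost(56, 56, 96))       # ~2.0e9, dominated by the quadratic (hw)^2 term
print(wmsa_cost(56, 56, 96, 7))   # ~1.4e8, linear in the number of patches hw

For this configuration the window-based attention is roughly an order of magnitude cheaper, and the gap widens as the image (and hence hw) grows.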
3 SUMMARY
With the wide application of deep learning and machine vision, transmission line inspection is shifting from traditional manual inspection to intelligent inspection. This paper studies object detection and fault identification in transmission line inspection, with a focus on the detection of small objects. On this basis, the model is improved with the Transformer, the Swin Transformer, the weighted bidirectional feature pyramid (BiFPN), and a convolutional attention module; defective samples are augmented using saliency maps, and an enhanced feature pyramid with deep semantic embedding is adopted.
ACKNOWLEDGEMENTS
This work was supported by the National Key
Research and Development Program of China under
Grant 2020AAA0107500.