3.2 Keyword Trend
The consistent structure of a web page makes crawling
data possible. After capturing the keywords, the author
can start analyzing them. First, the Term
Frequency-Inverse Document Frequency (TF-IDF) method
can be used for keyword analysis. Some words are
classified as keywords because they appear more
frequently (Roelleke, 2013). The purpose of the TF-IDF
calculation is to quantify the text. Once the data are
quantified, their content can be classified by
keywords. News keywords can support emotional
classification and category tagging. The author can
then determine whether the emotion of a news report is
positive, neutral or negative by keyword clustering
(Topkev, 2016), and so judge the tendency of the
report. In addition to numerical data, the researcher
can also use pictures, tables and colors to express the
results of the news analysis.
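The TF-IDF weighting described above can be illustrated
with a minimal sketch. The paper does not state which
TF-IDF formula or implementation was used, so the
unsmoothed variant below (relative term frequency times
log(N/df)), the function name `tf_idf` and the three toy
documents are assumptions made for illustration only.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus, not the paper's data.
docs = [
    ["huawei", "ban", "network", "ban"],
    ["huawei", "sponsorship", "raiders"],
    ["huawei", "network", "security"],
]
scores = tf_idf(docs)
```

In this toy corpus a term that occurs in every document
("huawei") receives a weight of zero, while a term
concentrated in one document ("ban") receives the
highest weight in that document.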
For data analysis and visualization, the author chose
Python and Dychart. Dychart is software that makes it
easy to visualize data, and its built-in preset
templates greatly reduce the time needed to make
charts. Although the author captured the relevant news
headlines in the data collection phase, and analyzing
the headlines directly would be the fastest and easiest
approach, headlines alone may not fully reflect how
news reports on Huawei changed. Hence, in the end, the
author chose the news content as the source of
keywords.
After selecting the news text as the source of
keywords, the author needs to face the most obvious
problem: many recurring items, such as punctuation,
prepositions and conjunctions, have little practical
significance and play no decisive role in this
research. Hence, the author uses a Python data cleaning
function (data clean) to filter out punctuation. At the
same time, English words vary in tense and number, so
morphological normalization is needed: lemmatization
transforms words into their base forms while preserving
their meaning. This gives a first pass of cleaned
keywords. The author then manually screens out
unnecessary conjunctions such as “and”, “or” and
“then”, adds them to the stop list, and filters the
keywords again to select the 100 words with the highest
frequency. Subsequent analysis is based on these chosen
keywords.
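The cleaning steps above can be sketched as follows.
The paper's own data cleaning function is not shown, so
everything here is an assumption: the stop list is a
small hypothetical sample, and the hand-written `LEMMAS`
lookup merely stands in for a real lemmatizer (for
example one from NLTK or spaCy).

```python
import re
from collections import Counter

# Hypothetical stop list; the paper's actual list is not given.
STOP_WORDS = {"and", "or", "then", "the", "a", "of", "in", "on", "to", "from"}

# Hand-written lookup standing in for a real lemmatizer (assumption).
LEMMAS = {
    "banned": "ban",
    "bans": "ban",
    "networks": "network",
    "reports": "report",
    "reported": "report",
}


def tokenize(text):
    """Lowercase the text and keep runs of letters, which drops punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def top_keywords(texts, n=100):
    """Return the n most frequent lemmas after stop-word filtering."""
    counts = Counter()
    for text in texts:
        for token in tokenize(text):
            lemma = LEMMAS.get(token, token)
            if lemma not in STOP_WORDS:
                counts[lemma] += 1
    return counts.most_common(n)


# Hypothetical sample sentences, not the paper's data.
sample = [
    "Australia banned Huawei from its networks, and reports followed.",
    "The ban on Huawei was reported widely; then more bans followed.",
]
keywords = top_keywords(sample, n=3)
```

Here "banned", "ban" and "bans" are counted as one
keyword, so "ban" becomes the most frequent term in the
sample, and the stop words never enter the ranking.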
3.3 Word Cloud Map
Word cloud analysis is also a primary method of
visualizing news data. A word cloud conveys the
importance of each word through its size, and words can
be further distinguished by giving them different
colors. The word cloud map can thus show intuitively
how keywords differ in importance and display the key
information of the news text (Ponnambalam, 2019). By
forming a word cloud, the author can filter out most
meaningless textual information and retain words with
specific meanings. However, prepositions and other
meaningless function words still need to be removed in
advance.
The author chose word clouds to visualize keyword
changes over four years. A word cloud displays the
keywords in news reports visually through the size and
color of the text. The obtained keyword and word
frequency data are imported into Dychart; using the
word cloud template, the text size and color are
adjusted to generate the corresponding word cloud
picture. In this way, word clouds showing the keywords
of ABC News reports for 2018-2021 are obtained.
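The principle that a word's size reflects its frequency
can be illustrated with a short sketch. How Dychart
scales text internally is not described in the paper,
so the linear mapping, the function name `font_sizes`
and the size range below are assumptions, not the
tool's actual behaviour.

```python
def font_sizes(frequencies, min_size=12, max_size=72):
    """Map each word's frequency linearly onto [min_size, max_size]."""
    if not frequencies:
        return {}
    low = min(frequencies.values())
    high = max(frequencies.values())
    span = high - low
    sizes = {}
    for word, freq in frequencies.items():
        # If all frequencies are equal, give every word the largest size.
        share = (freq - low) / span if span else 1.0
        sizes[word] = round(min_size + share * (max_size - min_size))
    return sizes


# Hypothetical frequencies, not the paper's data.
freqs = {"huawei": 120, "network": 60, "security": 30, "ban": 30}
sizes = font_sizes(freqs)
```

The most frequent word receives the largest size and
the least frequent words the smallest, which is what
lets a reader pick out the dominant keywords at a
glance.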
3.4 Text Clustering Analysis
Based on the keywords and the TF-IDF analysis, the news
can be clustered to discuss the emotion of the reports.
Hence, KMeans was used to group the texts. The author
imported a KMeans library in Python. In the final
output, yellow indicates that an article has positive
emotions, and black indicates that it has negative
emotions.
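The clustering step can be sketched as follows. The
paper does not name the KMeans library it imported
(scikit-learn's `sklearn.cluster.KMeans` is one common
choice), so this sketch instead implements a minimal
k-means (Lloyd's algorithm) in plain Python to stay
self-contained. The function name `kmeans`, the
deterministic initialisation and the toy vectors are
assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Minimal Lloyd's algorithm; returns (labels, centroids).

    Assumption: the first k points are used as initial centroids so the
    result is deterministic. Real libraries use smarter initialisation.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical 2-D document vectors (e.g. TF-IDF weights of two keywords).
vectors = [(0.9, 0.1), (0.1, 0.8), (0.8, 0.2), (0.2, 0.9)]
labels, centroids = kmeans(vectors, k=2)
```

K-means only returns unlabelled groups. Which cluster
is then shown in yellow or black depends on how the
clusters are interpreted, which this sketch does not
attempt.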
4 RESULTS
The results are shown in Figure 2. In July 2018, the
Australian government took the lead in banning Huawei
from participating in Australia's 5G network on
national security grounds, and the number of news
reports grew rapidly. In mid-2019, with the release of
the Trump administration's Defense Authorization Act,
the monthly number of reports peaked at 96 and 94,
which can be regarded as a high level of attention for
a single media outlet. In 2020, the Australian
subsidiary of Huawei announced the termination of its
ten-year sponsorship of the Australian rugby team
Canberra Raiders, which ABC News also continued to
report on for a while (Overton, 2020). Thus, ABC News
reports on Huawei are positively related to the
continued attention of the Australian and American
governments.