1. INTRODUCTION

The data submitted by residents on the community platform is called community topic data. It is difficult for administrators to filter these submissions: they must first classify the topics and then select the events that residents urgently need solved. It is therefore especially important to design a classification method suited to community topics. Topic data is a kind of short text. With the development of deep learning[1], models based on convolutional neural networks (CNN) and recurrent neural networks (RNN) have been applied to text classification tasks. Kim[2] proposed using convolutional kernels of different sizes to obtain sentence features. Kalchbrenner[3] used wide convolution to extract long-distance text information. Liu[4] proposed three RNN text classification models for different tasks. Johnson and Zhang[5] proposed a low-complexity deep CNN architecture. Vaswani[6] proposed the Transformer model based on the attention mechanism and an encoder-decoder structure. All of the above methods alleviate the reliance on manually constructed features, but they cannot extract the original structured information when processing graph data, whereas graph representations of text have the advantage of capturing discontinuous and long-range semantics. In recent years, graph neural networks (GNN)[7-9] have attracted extensive academic attention. Defferrard[10] first applied GNN to the text classification task by implementing a spectral graph convolution operation. Yao[11] proposed the Text-GCN model, treating text classification as a node classification task. Building on Yao's work, Huang[12] proposed building an independent graph with shared parameters for each document to reduce storage consumption. Zhang[13] proposed TextING, which updates node information using a gated GNN[14]. Other work[15-16] introduces additional models to alleviate the sparsity problem, but at the cost of increased model complexity.
In addition, all of the above methods ignore the importance of key information when building graph models; they construct a single graph that treats all words equally, which increases the influence of irrelevant data. Community topic data has the following characteristics. The number of words is small and the features are sparse. Some of the category labels to which topics belong appear directly in the topic text, so local key information is especially important for these topics. Topics are posted directly by residents and contain a great deal of colloquial language, so global information is especially important for these topics. Considering these aspects, this paper proposes building a composite complex network[17] containing two kinds of nodes, keywords and topics, to remedy the shortcomings of graph construction in existing methods. Keywords are extracted by an integrated algorithm[18] that focuses on the influence of semantics and word frequency on the text, strengthening the role of key information. To obtain global information, a bi-directional long short-term memory network (BiLSTM) is added to enhance the features of each topic. Two graph structures are extracted from the established network; node information is then updated with a graph attention network (GAT), and semantic attention is added for feature fusion to complete the community topic classification task.

2. A COMMUNITY CLASSIFICATION METHOD BASED ON GRAPH MODEL

The proposed graph-model-based community topic classification method, DGAT, includes the following steps. First, the keywords of the topic data are extracted by the integration algorithm, and the features are enhanced using BiLSTM. Then, a keyword-topic composite complex network is built, two graph structures over the topic nodes are mapped from it, and the node features are updated and fused.
2.1 Graph model construction

Keywords are the core words that characterize a single topic. The established composite complex network model is shown in Figure 1; it contains two kinds of nodes, keywords and topics, and two kinds of edges, belonging relationships and similarity relationships. The two graph structures are generated by the keyword-topic affiliation mapping and the topic similarity relationship, respectively.

2.1.1 Model Description

Definition 1 Topic. All the topic data form a topic set, denoted T. Each sentence in T is called ti, where i = 1, 2, 3, ..., |T|, and |T| is the number of topics in the set.

Definition 2 Keywords. For ti ∈ T, multiple words can be extracted to characterize the topic; they are called the keywords of ti and denoted Ki. All keywords extracted from T are denoted K, where K = K1 ∪ K2 ∪ K3 ∪ ... ∪ K|T|.

Definition 3 TopicNet. Denoted TopicNet = <T, E, S>, where T is the topic set, E = {e | e = <ti, tj>, ti, tj ∈ T} is the set of undirected edges between similar topics, and S is the similarity between topics.

Definition 4 Key-TopicNet. Denoted Key-TopicNet = <N, E, W>, where N = {k1, k2, ..., kn} ∪ {t1, t2, ..., tm} is the set of nodes, E = {e | e = <ti, kj>, ti, kj ∈ N} is the set of edges, with e = <ti, kj> being the undirected edge indicating that keyword kj belongs to topic ti, and W indicates the probability of a word becoming a keyword.

2.1.2 Keyword-topic affiliation mapping generates a graph structure

We propose using an integrated algorithm to extract keywords. The integration operation H is defined in Equation (1):

H = u1 f1 + u2 f2 + ... + un fn    (1)

where the set of weights U = {u1, u2, ..., un}, the magnitude of each weight indicating the influence of the corresponding base algorithm on the result; F = {f1, f2, ..., fn} is the set of base algorithm results, n is their total number, and the weights are required to satisfy u1 + u2 + ... + un = 1.
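As a minimal illustration of this integration operation, the sketch below combines per-word scores from several base extractors with weights that sum to 1 and keeps the top-ranked words as keywords. The score dictionaries, word lists, and function name are invented for illustration and are not taken from the paper's data.

```python
# Hedged sketch of the integration operation H in Equation (1):
# each base algorithm contributes a score per candidate word, and the
# weighted sum (weights u_i summing to 1) ranks the keywords.

def integrate_keyword_scores(score_dicts, weights, top_n=2):
    """Combine per-algorithm keyword scores with weights u_i, sum(u_i) = 1."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    combined = {}
    for scores, u in zip(score_dicts, weights):
        for word, f in scores.items():
            combined[word] = combined.get(word, 0.0) + u * f
    # Rank candidates by combined score and keep the top_n as keywords.
    return sorted(combined, key=combined.get, reverse=True)[:top_n]

# Illustrative (invented) scores from three base extractors, integrated
# with a 1:1:2 ratio, i.e. normalized weights 0.25, 0.25, 0.5.
textrank = {"mask": 0.9, "garbage": 0.4}
ltp      = {"mask": 0.7, "volunteer": 0.6}
tfidf    = {"garbage": 0.8, "mask": 0.3}
print(integrate_keyword_scores([textrank, ltp, tfidf], [0.25, 0.25, 0.5]))
# -> ['mask', 'garbage']
```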
Considering that the TextRank algorithm[19] and the LTP technique can extract semantic relationships between words, while the TF-IDF algorithm[20] computes word frequency relationships that complement them, the three algorithms are integrated to extract keywords, with an integration ratio of 1:1:2. The relationship between keywords and topics is many-to-many. If there exist edges ei = <ti, ki> and ej = <tj, ki>, then both ti and tj are connected to keyword node ki, so ti and tj may belong to the same category. Therefore, an undirected graph between topic nodes is mapped according to the belonging relationship between topics and keywords. This undirected graph is represented as G = (A, X), where A ∈ Rn×n is a symmetric adjacency matrix, aij is an element of A, aij = 1 indicates an edge between ti and tj, n = |T|, X ∈ Rn×d is the feature matrix of the topic nodes, and d is the feature dimension.

2.1.3 Topic similarity generates a graph structure

To capture the influence of non-key information on the topic data, the feature similarity S between topic nodes is first calculated, as shown in Equation (2):

Sij = (xi · xj) / (|xi| |xj|)    (2)

where xi, xj ∈ Rd×1 are the features of nodes ti and tj, respectively, which are one-dimensional vectors, and |xi| and |xj| are their moduli. Then the K-nearest-neighbor idea is used to select the K nodes most similar to the current node for edge connection. Finally, the undirected graph structure Gk = (Ak, X) is extracted from the composite network, where Ak ∈ Rn×n is the symmetric adjacency matrix of the KNN graph. In addition, in order for the initial feature matrix X of the topic nodes to capture more comprehensive information, we use a BiLSTM model for feature enhancement. First, the word vectors of ti are initialized using word2vec, giving the initialized feature matrix xi of ti.
Then the features in the two directions are obtained by the forward and backward LSTMs, respectively, and the two are concatenated. On this basis, the initial feature matrix X = {x1, x2, ..., xn} of the whole topic set T is obtained.

2.2 A community topic classification model based on the graph attention mechanism

GAT can effectively filter noise information while preserving the global structure information of the graph, so GAT is used to update the node information. After the previous steps, two graph structures G and Gk are obtained, and different structures have different levels of importance for different topic data. Based on this, this paper proposes adding a semantic attention layer to learn the importance weights of the different structures for the current node. Figure 2 shows the model architecture of DGAT, which is divided into three parts: graph model construction, feature updating and fusion, and node classification.

2.2.1 Node Updates

Using the topics as nodes, the adjacency matrices A and Ak and the feature matrix X are input to the model. First the model calculates the attention score αij between each node pair <ti, tj>:

αij = softmax(LeakyReLU(β^T [γxi ∥ γxj]))    (3)

where γ is the shared weight matrix obtained from training and β is the attention parameter vector. The feature zi of ti after one nonlinear transformation is then obtained using a multi-head attention mechanism at the intermediate layer:

zi = ∥k σ( Σj∈Ni αij^(k) γ^(k) xj )    (4)

where αij^(k) and γ^(k) are the attention coefficients and shared parameter matrix obtained from the training of the kth attention head, respectively, and Ni is the neighborhood of ti.

2.2.2 Feature Fusion

The semantic attention added to learn the importance of the different structures is shown in Equation (5):

Z = θr Zr + θk Zk    (5)

where θr and θk are the importance weights of the two kinds of semantic features. Specifically, the attention coefficients describing the influence of the different semantic features on the current node's classification result are obtained by nonlinear transformation and normalization as follows.
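A minimal sketch of this fusion step: each node's features under the two structure-specific matrices are mapped to scalar scores, softmax-normalized across the two structures, and used to weight-sum the representations. The tanh transform, the projection vector q, and all numeric values below are illustrative assumptions, not the paper's trained parameters.

```python
import numpy as np

def semantic_attention_fuse(z_r, z_k, q):
    """z_r, z_k: (n, d) node features under the two graph structures;
    q: (d,) attention projection vector (an assumed, untrained stand-in)."""
    s_r = np.tanh(z_r) @ q            # per-node score under the keyword graph
    s_k = np.tanh(z_k) @ q            # per-node score under the KNN graph
    scores = np.stack([s_r, s_k])     # shape (2, n)
    theta = np.exp(scores) / np.exp(scores).sum(axis=0)  # softmax over the two structures
    # Weighted sum of the two representations with per-node coefficients.
    return theta[0, :, None] * z_r + theta[1, :, None] * z_k

rng = np.random.default_rng(0)
z_r, z_k = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
q = rng.normal(size=8)
z = semantic_attention_fuse(z_r, z_k, q)
print(z.shape)  # (4, 8)
```

Because the softmax coefficients for each node sum to 1, the fused feature is a convex combination of that node's two structure-specific features.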
The feature of node ti under Zr is mapped to a real weight by a nonlinear transformation, and similarly a weight is obtained under the feature matrix Zk. The two weights are then normalized into the attention coefficients θr and θk by softmax. Finally, the two feature matrices are weighted and summed with the semantic attention coefficients to obtain Z.

2.3 Loss function

The model loss is minimized using the cross-entropy function, with an L2 regularization term added to prevent overfitting:

Loss = − Σi Σj=1..C yij log(pij) + η‖υ‖2    (6)

where C is the number of topic categories, yij is the true label of the topic data, pij is the probability the model assigns to the predicted label, η is the regularization factor, and υ denotes the model parameters.

3. EXPERIMENTS AND RESULTS ANALYSIS

3.1 Dataset

Since there is no public dataset of community topic data, we use the Qingdao community topic dataset to verify the validity of the method. First, 4000 randomly selected items were labeled, and the data were divided into 10 categories with the labels epidemic, handling, mask, garbage, maintenance, disinfection, volunteer, virus, quarantine, and environment. The composite complex network built for the topic classification task is shown in Figure 3, where topic nodes are blue and keyword nodes are orange.

3.2 Comparison experiment

The dataset was randomly divided into training, validation, and test sets in the ratio 8:1:1, and four rounds of cross-validation were performed, observing the classification accuracy (acc) and macro-F1 (F1-score) of each experiment; higher acc and F1-score values indicate a better classification effect:

acc = T / (T + F)    (7)
F1-score = 2 · macro-precision · macro-recall / (macro-precision + macro-recall)    (8)

where T denotes the number of correctly predicted samples, F denotes the number of incorrectly predicted samples, macro-precision is the macro-averaged precision, and macro-recall is the macro-averaged recall. We use six benchmark models: TextCNN, BiLSTM, DPCNN, Transformer, TextGCN, and TLGNN.
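A minimal sketch of these two metrics as described above: accuracy as the fraction of correct predictions, and macro-F1 computed from macro-averaged precision and recall. The label lists are invented examples, not the paper's data.

```python
# Hedged sketch of the evaluation metrics: acc = T / (T + F), and
# F1-score = 2 * macro-P * macro-R / (macro-P + macro-R).

def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_f1(y_true, y_pred):
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls = [], []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        pred_c = sum(p == c for p in y_pred)   # predicted as class c
        true_c = sum(t == c for t in y_true)   # actually class c
        precisions.append(tp / pred_c if pred_c else 0.0)
        recalls.append(tp / true_c if true_c else 0.0)
    mp = sum(precisions) / len(labels)   # macro-precision
    mr = sum(recalls) / len(labels)      # macro-recall
    return 2 * mp * mr / (mp + mr) if (mp + mr) else 0.0

# Invented example with class labels from the dataset's label set.
y_true = ["mask", "mask", "garbage", "virus"]
y_pred = ["mask", "garbage", "garbage", "virus"]
print(accuracy(y_true, y_pred))  # 0.75
```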
The word vectors are initialized using word2vec with a dimension of 300. The results and average values of the cross-validation experiments are shown in Table 1.

Table 1. Comparison of classification performance of different models
As can be seen in Table 1, the average accuracy and average F1-score are highest for the DGAT model, at 0.9002 and 0.8933 respectively, indicating that DGAT is applicable to community topic classification and achieves better results. The TextCNN model achieves an average accuracy of 0.8732 and outperforms most of the baselines because its convolution operations can mine the key information in short texts. The Transformer model requires a large training set, so it does not perform well on this community topic data. The TextGCN, TLGNN, and DPCNN models are all designed for long texts and do not cope well with the data sparsity of short topic texts.

3.3 Ablation experiments

When building the graph model, a total of two edge structures were generated. This experiment is therefore designed to see which kind of edge contributes more to the final result and whether the design is reasonable. The experimental results are shown in Table 2.

Table 2. Ablation experiment
As can be seen from Table 2, the acc value is only 0.8058 when feature similarity alone is used to connect edges, while the keyword-generated edges alone reach an acc of 0.8893; the F1-score values increase in the same order. This indicates that the key information plays the greater role, but neither edge type alone exceeds the performance of using both simultaneously.

3.5 Effect of training set ratio

To understand the effect of the training set proportion on the classification performance of the model, 10%, 20%, 40%, 60%, and 80% of the samples from the dataset of Experiment 2 were randomly selected as the training set. Comparisons with TextCNN, BiLSTM, and DPCNN were performed on the acc values, and the experimental results are visualized as line graphs in Figure 4. It can be seen that DGAT achieves an accuracy of 0.8278 with only 10% of the dataset. DGAT establishes edges by combining keywords with information from the topic itself, computes the weights of node neighbors by attention, and can capture local information without needing the entire graph structure. This combination of information allows DGAT to perform well with a small training set.

4. CONCLUSION

In this paper, we propose a community topic classification method based on the graph attention mechanism.
For the characteristics of community topics, keywords are extracted using an integration algorithm to build a composite complex network containing two kinds of edges and two kinds of nodes, from which two graph models are extracted. The node features are then updated using the graph attention mechanism, semantic attention is added to learn the influence of the two graph structures on each node, and finally the feature representation of the topic nodes is obtained. Experiments show that the proposed method works well for the Chinese community topic classification task.

REFERENCES

[1] Zhao, A. T., Li, J. B. and Dong, J. Y., "Multimodal gait recognition for neurodegenerative diseases," IEEE Transactions on Cybernetics, (2021).
[2] Kim, Y., "Convolutional neural networks for sentence classification," arXiv preprint, (2014).
[3] Kalchbrenner, N., Grefenstette, E. and Blunsom, P., "A convolutional neural network for modelling sentences," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, (2014). https://doi.org/10.3115/v1/P14-1
[4] Liu, P. F., Qiu, X. P. and Huang, X. J., "Recurrent neural network for text classification with multi-task learning," arXiv preprint arXiv:1605.05101, (2016).
[5] Johnson, R. and Zhang, T., "Deep pyramid convolutional neural networks for text categorization," in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, (2017).
[6] Vaswani, A., Shazeer, N., Parmar, N. and Uszkoreit, J., "Attention is all you need," Advances in Neural Information Processing Systems 30, (2017).
[7] Kipf, T. N. and Welling, M., "Semi-supervised classification with graph convolutional networks," arXiv preprint arXiv:1609.02907, (2016).
[8] Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P. and Bengio, Y., "Graph attention networks," arXiv preprint arXiv:1710.10903, (2017).
[9] Gao, H., Yu, X., Sui, Y., Shao, F. J. and Sun, R. C., "Topological graph convolutional network based on complex network characteristics," IEEE Access 10, 64465–64472 (2022). https://doi.org/10.1109/ACCESS.2022.3183103
[10] Defferrard, M., Bresson, X. and Vandergheynst, P., "Convolutional neural networks on graphs with fast localized spectral filtering," Advances in Neural Information Processing Systems 29, (2016).
[11] Yao, L., Mao, C. S. and Luo, Y., "Graph convolutional networks for text classification," in Proceedings of the AAAI Conference on Artificial Intelligence, (2019).
[12] Huang, L., Ma, D. and Li, S., "Text level graph neural network for text classification," arXiv preprint arXiv:1910.02356, (2019).
[13] Zhang, Y. F., Yu, X. L. and Cui, Z. Y., "Every document owns its structure: Inductive text classification via graph neural networks," arXiv preprint arXiv:2004.13826, (2020).
[14] Li, Y. J., Tarlow, D. and Brockschmidt, M., "Gated graph sequence neural networks," arXiv preprint arXiv:1511.05493, (2015).
[15] Hu, L. M., Yang, T., Shi, C., Ji, H. and Li, X., "Heterogeneous graph attention networks for semi-supervised short text classification," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, (2019).
[16] Yuan, Z. Y., Gao, S. and Cao, J., "Small sample short text classification based on heterogeneous graph convolution network," Computer Engineering 47(12), 87–94 (2021).
[17] Sui, Y., "Research on multi subnet complex network model and its related properties," Qingdao University, (2012).
[18] Zhang, S. A., Wang, X., Dai, J. P., Sui, Y. and Sun, R. C., "Keywords extraction algorithm based on keyword co-occurrence network," Complex Systems and Complexity Science, (2022).
[19] Mihalcea, R. and Tarau, P., "TextRank: Bringing order into text," in Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, (2004).
[20] Li, J. Z., Fan, Q. N. and Zhang, K., "Keyword extraction based on tf/idf for Chinese news document," Wuhan University Journal of Natural Sciences 12(5), 917–921 (2007). https://doi.org/10.1007/s11859-007-0038-4