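Topic Mining and Evolution Analysis of Information Science Journals Based on LDA and word2vec Models
First published: 2023-05-12
Abstract: LDA models are often used to explore research trends in information science and the research hotspots of a given period. This study addresses two weaknesses of LDA-based topic mining: how to choose the number of topics, and the failure to consider the semantic meaning of topic words when constructing topic associations. The aim is to provide a reference for enriching and improving topic evolution analysis methods. Taking Journal of Intelligence (《情報雜志》) as an example, an LDA topic model is run on a corpus that combines the journal's abstracts, titles, and keywords. When setting the number of topics, perplexity and average topic similarity are used together to determine a preliminary number, and information entropy is then applied to further refine and filter the identified topics. For establishing associations in topic evolution, a method based on LDA and word2vec is proposed. By incorporating semantic representations, the method can effectively identify emergence, extinction, inheritance, splitting, and merging relationships among topics, which is valuable for researchers judging disciplinary trends and for decision makers identifying research priorities.
Keywords: Management Science and Engineering; topic mining; topic evolution; LDA; word2vec; Journal of Intelligence (《情報雜志》)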
LDA and word2vec Based Topic Mining and Evolution Analysis of Chinese Information Science Journal Paper
Abstract: The LDA model is widely used to investigate research trends and hotspots in information science. This study refines LDA-based topic mining in two respects: determining the number of topics, and taking the semantic meaning of topic words into account when constructing topic associations, so as to provide a reference for enriching and improving topic evolution analysis. Taking Journal of Intelligence as an example, the LDA model is applied to a corpus combining the journal's abstracts, titles and keywords. The number of topics is first determined from perplexity and average topic similarity, and information entropy is then used to filter the identified topics. To establish associations between topics, an LDA and word2vec based topic evolution analysis method is proposed. Using semantic representations, the proposed method effectively identifies key evolution patterns of topic content, namely emergence, extinction, inheritance, splitting and merging, which is of great significance for researchers investigating trends in the development of the discipline.
Keywords: Management Science and Engineering; topic mining; topic evolution; LDA; word2vec; Journal of Intelligence
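
The abstract names the ingredients of the method (information entropy for filtering topics, word2vec semantics for linking topics across periods) but gives no formulas. The following is therefore only a minimal sketch under stated assumptions, not the authors' implementation: it assumes each topic is a word-to-probability mapping, represents a topic as the probability-weighted mean of its word vectors, links topics in adjacent time windows when their cosine similarity reaches a hypothetical threshold of 0.8, and labels the resulting patterns. The function names, the toy two-dimensional embeddings, and the threshold are all illustrative.

```python
import math
from collections import defaultdict


def topic_entropy(word_probs):
    """Shannon entropy (in nats) of a topic's word distribution.
    How entropy is thresholded to filter topics is not given in the abstract."""
    return -sum(p * math.log(p) for p in word_probs.values() if p > 0)


def topic_vector(word_probs, embeddings):
    """Probability-weighted mean of the embeddings of a topic's words (assumed form)."""
    dim = len(next(iter(embeddings.values())))
    vec = [0.0] * dim
    total = 0.0
    for word, p in word_probs.items():
        if word in embeddings:  # skip words without an embedding
            total += p
            for i, x in enumerate(embeddings[word]):
                vec[i] += p * x
    return [x / total for x in vec] if total > 0 else vec


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def link_topics(prev_topics, curr_topics, embeddings, threshold=0.8):
    """Return (prev_id, curr_id) pairs whose topic vectors are similar enough.
    The 0.8 threshold is a hypothetical value, not taken from the paper."""
    pv = {k: topic_vector(t, embeddings) for k, t in prev_topics.items()}
    cv = {k: topic_vector(t, embeddings) for k, t in curr_topics.items()}
    return [(i, j) for i in pv for j in cv if cosine(pv[i], cv[j]) >= threshold]


def classify_events(prev_ids, curr_ids, links):
    """Label evolution patterns from the link structure between two windows."""
    successors = defaultdict(list)
    predecessors = defaultdict(list)
    for i, j in links:
        successors[i].append(j)
        predecessors[j].append(i)
    events = []
    for i in prev_ids:
        if not successors[i]:
            events.append(("extinction", i, None))
        elif len(successors[i]) > 1:
            events.append(("split", i, tuple(successors[i])))
    for j in curr_ids:
        if not predecessors[j]:
            events.append(("emergence", None, j))
        elif len(predecessors[j]) > 1:
            events.append(("merge", tuple(predecessors[j]), j))
    for i, j in links:
        if len(successors[i]) == 1 and len(predecessors[j]) == 1:
            events.append(("inheritance", i, j))
    return events


# Toy data: 2-D vectors stand in for trained word2vec embeddings.
embeddings = {
    "retrieval": [1.0, 0.0],
    "search": [0.9, 0.1],
    "privacy": [0.0, 1.0],
    "blockchain": [-1.0, 0.0],
}
prev_topics = {"A": {"retrieval": 0.6, "search": 0.4}, "B": {"privacy": 1.0}}
curr_topics = {"X": {"search": 0.7, "retrieval": 0.3}, "Y": {"blockchain": 1.0}}

links = link_topics(prev_topics, curr_topics, embeddings)
events = classify_events(prev_topics, curr_topics, links)
print(links)
print(events)
```

In practice the topic-word distributions would come from a trained LDA model and the vectors from a trained word2vec model; the plain dictionaries here only keep the sketch free of external dependencies.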