
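Title: A Financial Data Classification and Grading Method Based on Text Mining and Multi-Module Fusion (Inaugural Issue) (2022, Vol. 1, No. 1, 12)
Authors

Ye Qiang (叶强), Zhan Baoqiang (詹宝强), Ma Xiaochen (马笑晨), Li Yongli (李永立)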

Section:
Digital Economy and Management
Abstract: As the financial industry places growing emphasis on data security, better data classification and grading capability will help the industry advance its data security work. To address two problems in existing research, the difficulty of obtaining accurate representations of data fields and data imbalance, this study proposes a financial data classification and grading method based on text mining and multi-module fusion. Specifically, in the data input module, data fields undergo feature enhancement based on data structure and semantic enhancement based on a professional corpus, so that each field is represented accurately. In the model training and fusion module, a Stacking framework integrates AdaBoost, MLP, and LSTM neural networks to further improve the accuracy and generalizability of classification and grading. Using 27,694 data fields from the R&D center of China Guangfa Bank as the sample, the study carries out a series of model tests and performance comparisons. The results show that the fusion model reaches an accuracy of 0.822 and is both more accurate and more robust than any single method. These results indicate that the proposed method is accurate and effective and has practical value for data classification and grading in the financial sector, particularly for commercial banks.
Keywords:

Text Mining; Data Classification and Grading; Feature Enhancement; Stacking Fusion Framework
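
As a rough illustration of the Stacking idea named in the abstract, the sketch below combines AdaBoost and an MLP under a meta-learner using scikit-learn. It is not the authors' implementation: the LSTM base learner is omitted because scikit-learn has no LSTM, the logistic-regression meta-learner and all hyperparameters are assumptions, and the imbalanced three-class data is synthetic rather than the bank's data fields. The helper names `build_stacking_model` and `run_demo` are hypothetical.

```python
# Illustrative sketch only: synthetic data, scikit-learn stand-ins, no LSTM.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def build_stacking_model(random_state=0):
    """Stack AdaBoost and an MLP under a logistic-regression meta-learner."""
    base_learners = [
        ("adaboost", AdaBoostClassifier(n_estimators=50, random_state=random_state)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                              random_state=random_state)),
    ]
    # cv=5: the meta-learner is trained on out-of-fold predictions of the
    # base learners, which is what distinguishes stacking from simple voting.
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),  # assumed meta-learner
        cv=5,
    )


def run_demo(random_state=0):
    """Fit the stacked model on synthetic imbalanced data; return accuracy and predictions."""
    # Three imbalanced classes stand in for grading levels (an assumption).
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=8, n_classes=3,
        weights=[0.6, 0.3, 0.1], random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state,
    )
    model = build_stacking_model(random_state)
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    return accuracy_score(y_test, predictions), predictions


if __name__ == "__main__":
    accuracy, _ = run_demo()
    print(f"held-out accuracy on synthetic data: {accuracy:.3f}")
```

The accuracy printed here describes only the synthetic data and has no bearing on the 0.822 figure reported in the abstract.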

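About the authors:

Ye Qiang, Dean, Professor, and doctoral supervisor, Institute of Advanced Management (高级管理研究院), Harbin Institute of Technology, E-mail: yeqiang@hit.edu.cn;

Zhan Baoqiang, doctoral student, School of Economics and Management, Harbin Institute of Technology, E-mail: 21b910010@stu.hit.edu.cn;

Ma Xiaochen, doctoral student, School of Economics and Management, Harbin Institute of Technology, E-mail: xcm_0309@foxmail.com;

Li Yongli, Professor and doctoral supervisor, School of Economics and Management, Harbin Institute of Technology, E-mail: liyongli@hit.edu.cn.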




