Chinese Text Correction with Semantic Detection

Published at ICANN 2022.

ABSTRACT

Text correction, especially semantic correction in widely used scenarios, is in strong demand because it improves the fluency of text and the efficiency of writing. An adversarial multi-task learning method is proposed to enhance the modeling and detection of character polysemy in Chinese sentence context. Two models, a masked language model and a scoring language model, are introduced as a pair of learning tasks that are both coupled and adversarial. Moreover, a Monte Carlo tree search strategy and a policy network are introduced to carry out Chinese text correction with semantic detection efficiently. Experiments are conducted on three datasets against five comparable methods, and the results show that our method performs well on the Chinese text correction task and yields corrections with better semantic rationality.

METHODOLOGY
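
As a rough illustration of how a tree search guided by a scoring model can drive correction, the following is a minimal toy sketch, not the implementation from the paper. Everything in it is an assumption made for illustration: `CONFUSION` is a hypothetical confusion set standing in for the policy network's proposals, `score` is a bigram lookup standing in for the scoring language model, and `correct` is a simplified Monte Carlo tree search that evaluates nodes directly instead of running rollouts.

```python
import math

# Hypothetical confusion sets (homophones); a stand-in for policy-network proposals.
CONFUSION = {"名": ["明"], "明": ["名"], "在": ["再"], "再": ["在"]}

# Hypothetical set of fluent bigrams; a stand-in for a scoring language model.
GOOD_BIGRAMS = {"明天", "天再", "再见"}


def score(sentence):
    """Fraction of adjacent character pairs that appear in GOOD_BIGRAMS."""
    bigrams = [sentence[i:i + 2] for i in range(len(sentence) - 1)]
    if not bigrams:
        return 0.0
    return sum(b in GOOD_BIGRAMS for b in bigrams) / len(bigrams)


def propose(sentence):
    """All sentences reachable by substituting one character from CONFUSION."""
    candidates = []
    for i, ch in enumerate(sentence):
        for alt in CONFUSION.get(ch, []):
            candidates.append(sentence[:i] + alt + sentence[i + 1:])
    return candidates


class Node:
    def __init__(self, sentence, parent=None):
        self.sentence = sentence
        self.parent = parent
        self.depth = 0 if parent is None else parent.depth + 1
        self.children = []
        self.visits = 0
        self.total = 0.0


def ucb(child, parent_visits, c=1.4):
    """UCB1: mean value plus an exploration bonus; unvisited children go first."""
    if child.visits == 0:
        return math.inf
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def correct(sentence, iterations=50, max_depth=3):
    """Search over edit sequences and return the best-scoring sentence seen."""
    root = Node(sentence)
    best_value, best_sentence = score(sentence), sentence
    for _ in range(iterations):
        # Selection: descend by UCB1 until reaching a node with no children.
        node = root
        while node.children:
            parent = node
            node = max(parent.children, key=lambda ch: ucb(ch, parent.visits))
        # Expansion: add one child per proposed edit, up to max_depth edits.
        if node.depth < max_depth:
            node.children = [Node(s, node) for s in propose(node.sentence)]
        # Evaluation: score the node directly (no rollout in this sketch).
        value = score(node.sentence)
        if value > best_value:
            best_value, best_sentence = value, node.sentence
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.total += value
            node = node.parent
    return best_sentence


print(correct("名天在见"))
```

In this toy setting the input contains two homophone errors, so the search has to chain two edits; the scorer only rewards the fully corrected sentence with the maximum value. A real system would replace both stand-ins with learned models.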

Case Study

The example comprises the original correct sentence, a typical wrong sentence generated in our experiment, and five corrected results: four from comparable correction methods and one from our method. Among these results, the one produced by our method is arguably the most reasonable in terms of sentence semantics. Concretely, for the character string marked with wavy underlines, our method accurately captures the sentiment of the given wrong sentence and finds a similar word (争得/strive for). Similarly, for the character string marked with straight underlines, only our method produces a correction in which the character (中/in) is added. That said, our method still does not fully restore the wrong sentence to the ideal result (the original correct sentence), which reflects the complexity and difficulty of the Chinese text correction task.