An Automated Framework for Generating Large-Model Fine-Tuning Instructions for Polysemous-Word Example Sentence Corpus Generation

Abstract: First, a manual instruction set containing a body description set and a list of instruction examples is constructed as the initial input for the instruction pool. Then, the instructions in the instruction pool are fed into the large model to generate a number of machine-generated instructions and their corresponding corpora; the generated corpora are refined with text correction to obtain the desired polysemy example sentence corpus. Finally, the edit distance algorithm is used to deduplicate the machine instructions, and the spectral clustering algorithm is used to cluster the candidate machine instructions, thereby achieving automated generation of machine instructions. By updating the instruction pool, iterative generation of the polysemy example sentence corpus is realized. The results show that the constructed polysemy example sentence dataset and its corresponding large-model machine instruction set exhibit good linguistic diversity and content diversity. The constructed dataset meets the needs of second language learners in terms of sentence length, sentiment, vocabulary difficulty level, and topics.
Keywords: large language model; instruction generation; polysemy; example sentence generation; ChatGPT
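
The edit-distance filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the distance is thresholded, so the length-normalised ratio, the 0.3 cutoff, and the function names `edit_distance` and `deduplicate` are all assumptions made for illustration. The spectral clustering step is not shown.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def deduplicate(instructions, threshold=0.3):
    """Keep an instruction only if it is not too close to one already kept.

    Assumption: two instructions count as duplicates when their edit
    distance divided by the longer length is below `threshold`.
    """
    kept = []
    for candidate in instructions:
        is_duplicate = False
        for existing in kept:
            longest = max(len(candidate), len(existing)) or 1
            if edit_distance(candidate, existing) / longest < threshold:
                is_duplicate = True
                break
        if not is_duplicate:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    pool = ["make a sentence", "make a sentense", "list synonyms"]
    print(deduplicate(pool))
```

The filter is greedy and order-dependent: the first instruction seen in a group of near-duplicates is the one that survives, which is a simplification rather than a claim about the paper's procedure.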

Chinese is a complex language with abundant polysemy, in which a single character or word carries several different meanings.


Journal of Huaqiao University (Natural Science), 2025, Issue 3
