Human language can describe the external perceptual world. The association between language and perceptual information is called language grounding. Language grounding is a core problem in cognitive science, and solving it is also very important for the development of intelligent service robots. It is now widely acknowledged that children's language acquisition depends mainly on the highly complex brain and on communication with the external environment, especially dialogues with their caregivers. These two factors and their tight cooperation constitute the core mechanism of children's language acquisition. This project plans to model that mechanism and to make progress in language grounding. We first propose a hybrid deep network model. The model includes two deep networks and one counter-propagation network. The two deep networks are trained to analyze the visual and audio streams separately, and each forms internal representations through learning. The counter-propagation network is then used to build links between the two internal representations. We also model different stages of children's language acquisition within the same network framework: two models sharing this framework are proposed to model the internal mechanism of grounded language acquisition at the one-word and two-word sentence stages. Next, we model the dialogue between caregivers and children. The main contribution here is a dialogue system that uses both linguistic and visual information. A model for intention analysis that combines linguistic and visual information is proposed. A mechanism of intention-driven network learning and testing is developed to model the cooperation between the external environment and internal structures in children's language acquisition. Finally, we integrate the dialogue system and the hybrid network on a robot. The robot can talk with humans using information from both the visual and audio modalities, and gradually acquires grounded one-word and two-word language, as human children do.
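Language grounding connects language with perception. It is an important problem in cognitive science, and solving it is of great value for developing intelligent service robots. Children's language acquisition depends on the internal structure of the brain and on communication with the external environment; the cooperation of these two factors is the key to children acquiring grounded language gradually, stage by stage. This project aims to model this mechanism of children's language acquisition and to achieve breakthroughs in language grounding technology. First, we propose a hybrid deep network model that performs deep processing of visual and auditory information separately and builds associations between the resulting deep representations, and we develop its learning and evaluation methods. At the same time, we model the staged mechanism of children's language acquisition, so that the model adapts to the linguistic characteristics of different stages within a unified deep network framework. Second, we model caregiver-child dialogue based on visual and auditory information and propose a method for analyzing the caregiver's speaking intention. We then study a control mechanism for deep network learning and testing that is driven by the speaker's intention, modeling the cooperation between external communication and internal structure during children's language acquisition. Finally, these techniques are integrated and deployed on a robot, yielding a robot that can interact naturally with people through the visual and auditory channels and that acquires grounded single-word and two-word language gradually, stage by stage.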
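Human language acquisition is closely related to two factors: its association with perceptual information, and linguistic communication with the outside world. This research aims to build computational models of both and to validate them on a robot. The main results of the project are as follows.
We proposed the correspondence autoencoder (Corr-AE) architecture and a series of variants. These architectures integrate representation learning for each of the two modalities, text and image, with learning of the correlation between the modalities in a single joint model. We also extended this type of architecture to probabilistic models, constructing the correspondence restricted Boltzmann machine (Corr-RBM). We proposed schemes for building deep multimodal correlation models on top of these basic units, together with their learning algorithms. Experiments on three public datasets show that these models capture image-text cross-modal correlation better than existing models. We studied the convergence of deep learning algorithms typified by contrastive divergence (CD), gave new convergence conditions, and proposed the average contrastive divergence (ACD) algorithm; theoretical analysis and experimental results show that it has better convergence properties than CD.
We proposed a ranking deformable part model (RDPM). The model introduces a ranking-based objective function into DPM; we proved that the new problem is a generalized concave-convex programming problem and then proposed an optimization algorithm. Experiments on public datasets show that RDPM achieves better detection performance than DPM. On this basis we proposed a model that automatically generates language descriptions of images; it can generate multi-sentence descriptions as well as more complete descriptions of object position and object shape.
We proposed a hierarchical long short-term memory (HLSTM) model and its variants to jointly model the intent recognition and slot filling tasks in dialogue understanding. The model takes into account both the correlation constraints between the two tasks and their hierarchical nature, and experiments show that it achieves better dialogue understanding performance than several existing models.
We proposed a hierarchical MDP dialogue management model that effectively reduces the size of the dialogue state space. We also proposed a method that estimates POMDP observation probabilities from the N-best speech recognition results rather than from a single recognition result; experiments show that it effectively improves the reliability of human-machine dialogue and reduces the number of dialogue turns.
Based on the multimodal correlation and human-machine dialogue techniques above, we implemented a dialogue-based teaching system for concept learning, an initial prototype of a human-machine dialogue system capable of continual multimodal language learning. In addition, we implemented a human-machine dialogue system for a meeting-room booking service.
If the research above helps robots acquire language abilities similar to those of humans, it will be of significant scientific value for exploring the mechanisms of human language acquisition.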
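The joint objective behind Corr-AE, as summarized in the results above, can be illustrated with a small sketch. The summary only states that per-modality representation learning and cross-modal correlation learning are combined in one model; the specific form used here is an assumption made for illustration, not the project's confirmed formulation. The assumed pieces are squared-error reconstruction, a squared distance between the two hidden codes as the correlation term, and a weight `alpha` balancing the two. The class `TinyAutoencoder`, the function `corr_ae_loss`, the layer sizes, and the omission of bias terms are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class TinyAutoencoder:
    """One-hidden-layer autoencoder for a single modality (hypothetical; biases omitted)."""

    def __init__(self, n_in, n_hidden, rng):
        self.W_enc = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
        self.W_dec = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]

    @staticmethod
    def _layer(x, W):
        # Computes sigmoid(x @ W) for one input vector x.
        n_out = len(W[0])
        return [sigmoid(sum(x[i] * W[i][j] for i in range(len(x)))) for j in range(n_out)]

    def encode(self, x):
        return self._layer(x, self.W_enc)

    def decode(self, h):
        return self._layer(h, self.W_dec)


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def corr_ae_loss(img_ae, txt_ae, x_img, x_txt, alpha=0.2):
    """Assumed joint loss: (1 - alpha) * reconstruction + alpha * correlation distance."""
    h_img = img_ae.encode(x_img)
    h_txt = txt_ae.encode(x_txt)
    # Representation learning: each modality reconstructs its own input.
    rec = sq_dist(x_img, img_ae.decode(h_img)) + sq_dist(x_txt, txt_ae.decode(h_txt))
    # Correlation learning: hidden codes of a paired image and text are pulled together.
    corr = sq_dist(h_img, h_txt)
    return (1.0 - alpha) * rec + alpha * corr


rng = random.Random(0)
img_ae = TinyAutoencoder(6, 3, rng)   # 6-dim image feature, 3-dim code (made-up sizes)
txt_ae = TinyAutoencoder(4, 3, rng)   # 4-dim text feature, same code size
x_img = [rng.random() for _ in range(6)]
x_txt = [rng.random() for _ in range(4)]
loss = corr_ae_loss(img_ae, txt_ae, x_img, x_txt)
print(loss)
```

Both encoders map to codes of the same size, so the correlation term can compare them directly. Setting `alpha` to 0 reduces the sketch to two independent autoencoders, and setting it to 1 trains only for agreement between the codes. A real model would minimize this loss over many image-text pairs with gradient descent.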
Data last updated: 2023-05-31
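A Survey of User Alignment Techniques Across Social Networks
Anesthesia Nursing Cooperation in Posterior Spinal Correction Surgery for a Child with Spinal Muscular Atrophy and Scoliosis: A Case Report
Forecasting Urban Domestic Water Demand Based on a LASSO-SVMR Model
Direct Brain Control of Robot Direction and Speed Based on SSVEP
Evaluation of Passenger Evacuation Capacity in Urban Rail Transit Stations Under Fire Conditions
A Study of the Amount of Speech Input from the One-Word to the Two-Word Stage in Non-Macro Language Acquisition
Acquisition and Use of Morpheme Position Cues Facilitates Children's Novel Word Learning
Acquisition of Core Mandarin Chinese Grammar in Preschool Children with Autism Spectrum Disorder
Chinese Acquisition by Deaf People and Its Pragmatic Coordination Mechanism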