Sorry, the equations in the Nature article are not numbered; make do with the abstract.


New Threads (新语丝) Reading Forum

Posted by: 短江学者 on 2016-03-15, 22:24:05:

In reply to: "Also, the last sentence of my previous post mentioned mc" by 短江学者 on 2016-03-15, 22:13:28:

Quote:
The theory of reinforcement learning provides a normative account [1], deeply rooted in psychological [2] and neuroscientific [3] perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems [4,5], the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms [3]. While reinforcement learning agents have achieved some successes in a variety of domains [6–8], their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks [9–11] to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning.
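For anyone who wants something concrete behind the "temporal difference reinforcement learning" the abstract mentions, here is a minimal sketch of a one-step tabular Q-learning update. This is not the paper's deep Q-network: the table `q`, the function name `td_update`, and the values of `alpha` and `gamma` are all assumptions chosen for illustration, and a lookup table stands in for the neural network.

```python
def td_update(q, state, action, reward, next_state,
              alpha=0.1, gamma=0.99, done=False):
    """Apply one tabular Q-learning step in place and return the TD error.

    q maps each state to a list of action values (a hypothetical layout,
    not taken from the paper).
    """
    # Bootstrapped target: immediate reward plus discounted best next value.
    target = reward if done else reward + gamma * max(q[next_state])
    # Temporal-difference error: how far the current estimate is from the target.
    td_error = target - q[state][action]
    # Move the estimate a fraction alpha toward the target.
    q[state][action] += alpha * td_error
    return td_error


# Toy example with two states and two actions (made-up numbers).
q = {"s0": [0.0, 0.0], "s1": [1.0, 2.0]}
err = td_update(q, "s0", 1, reward=1.0, next_state="s1", alpha=0.5, gamma=0.9)
print(err, q["s0"][1])
```

With these toy numbers the target is 1.0 + 0.9 * 2.0 = 2.8, so the TD error is 2.8 and the updated entry is 0.5 * 2.8 = 1.4. As the abstract describes it, the deep Q-network learns from high-dimensional sensory inputs instead of a hand-built state table like this one.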



