  • Rollout, Policy Iteration, and Distributed Reinforcement Learning (ISBN 9787302599388)
  • Brand new, genuine copy
    • Author: Dimitri P. Bertsekas
    • Publisher: Tsinghua University Press
    • Publication date: 2022-04-01
    Product Details
    • Author: Dimitri P. Bertsekas
    • Publisher: Tsinghua University Press
    • Publication date: 2022-04-01
    • Edition: 1
    • Printing: 1
    • Word count: 698,000
    • Pages: 500
    • Format: 16-mo (16开)
    • ISBN: 9787302599388
    • Rights provider: Tsinghua University Press
    • Binding: paperback
    • List price: 139.00 CNY
    • Printing date: not listed
    • Language: not listed
    • External ID: 1202623400
    • Finished size: not listed

    1 Exact and Approximate Dynamic Programming Principles

    1.1 AlphaZero, Off-Line Training, and On-Line Play

    1.2 Deterministic Dynamic Programming

    1.2.1 Finite Horizon Problem Formulation

    1.2.2 The Dynamic Programming Algorithm

    1.2.3 Approximation in Value Space

    1.3 Stochastic Dynamic Programming

    1.3.1 Finite Horizon Problems

    1.3.2 Approximation in Value Space for Stochastic DP

    1.3.3 Infinite Horizon Problems-An Overview

    1.3.4 Infinite Horizon-Approximation in Value Space

    1.3.5 Infinite Horizon-Policy Iteration, Rollout, and Newton's Method

    1.4 Examples, Variations, and Simplifications

    1.4.1 A Few Words About Modeling

    1.4.2 Problems with a Termination State

    1.4.3 State Augmentation, Time Delays, Forecasts, and Uncontrollable State Components

    1.4.4 Partial State Information and Belief States

    1.4.5 Multiagent Problems and Multiagent Rollout

    1.4.6 Problems with Unknown Parameters-Adaptive Control

    1.4.7 Adaptive Control by Rollout and On-Line Replanning

    1.5 Reinforcement Learning and Optimal Control-Some Terminology

    1.6 Notes and Sources

    2 General Principles of Approximation in Value Space

    2.1 Approximation in Value and Policy Space

    2.1.1 Approximation in Value Space-One-Step and Multistep Lookahead

    2.1.2 Approximation in Policy Space

    2.1.3 Combined Approximation in Value and Policy Space

    2.2 Approaches for Value Space Approximation

    2.2.1 Off-Line and On-Line Implementations

    2.2.2 Model-Based and Model-Free Implementations

    2.2.3 Methods for Cost-to-Go Approximation

    2.2.4 Methods for Expediting the Lookahead Minimization

    2.3 Deterministic Rollout and the Policy Improvement Principle

    2.3.1 On-Line Rollout for Deterministic Discrete Optimization

    2.3.2 Using Multiple Base Heuristics-Parallel Rollout

    2.3.3 The Simplified Rollout Algorithm

    2.3.4 The Fortified Rollout Algorithm

    2.3.5 Rollout with Multistep Lookahead

    2.3.6 Rollout with an Expert

    2.3.7 Rollout with Small Stage Costs and Long Horizon-Continuous-Time Rollout

    2.4 Stochastic Rollout and Monte Carlo Tree Search

    2.4.1 Simulation-Based Implementation of the Rollout Algorithm

    2.4.2 Monte Carlo Tree Search

    2.4.3 Randomized Policy Improvement by Monte Carlo Tree Search

    2.4.4 The Effect of Errors in Rollout-Variance Reduction

    2.4.5 Rollout Parallelization

    2.5 Rollout for Infinite-Spaces Problems-Optimization Heuristics

    2.5.1 Rollout for Infinite-Spaces Deterministic Problems

    2.5.2 Rollout Based on Stochastic Programming

    2.6 Notes and Sources

    3 Specialized Rollout Algorithms

    3.1 Model Predictive Control

    3.1.1 Target Tubes and Constrained Controllability

    3.1.2 Model Predictive Control with Terminal Cost

    3.1.3 Variants of Model Predictive Control

    3.1.4 Target Tubes and State-Constrained Rollout

    3.2 Multiagent Rollout

    3.2.1 Asynchronous and Autonomous Multiagent Rollout

    3.2.2 Multiagent Coupling Through Constraints

    3.2.3 Multiagent Model Predictive Control

    3.2.4 Separable and Multiarmed Bandit Problems

    3.3 Constrained Rollout-Deterministic Optimal Control

    3.3.1 Sequential Consistency, Sequential Improvement, and the Cost Improvement Property

    3.3.2 The Fortified Rollout Algorithm and Other Variations

    3.4 Constrained Rollout-Discrete Optimization

    3.4.1 General Discrete Optimization Problems

    3.4.2 Multidimensional Assignment

    3.5 Rollout for Surrogate Dynamic Programming and Bayesian Optimization

    3.6 Rollout for Minimax Control

    3.7 Notes and Sources

    4 Learning Values and Policies

    4.1 Parametric Approximation Architectures

    4.1.1 Cost Function Approximation

    4.1.2 Feature-Based Architectures

    4.1.3 Training of Linear and Nonlinear Architectures

    4.2 Neural Networks

    4.2.1 Training of Neural Networks

    ……
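    The finite-horizon dynamic programming recursion that opens the book (Sections 1.2.1-1.2.2) can be illustrated with a minimal sketch; the toy stage-wise shortest-path instance and the dict-based encoding below are our own illustrative assumptions, not taken from the book.

    ```python
    # Finite-horizon deterministic DP: backward recursion
    #   J_k(x) = min_u [ g_k(x, u) + J_{k+1}(f_k(x, u)) ]
    # sketched on a tiny stage-wise shortest-path problem.

    def dp_backward(stages, terminal_cost):
        """stages[k][x] is a dict {control: (next_state, stage_cost)}.
        Returns cost-to-go tables J[k][x] and an optimal policy mu[k][x]."""
        N = len(stages)
        J = [dict() for _ in range(N + 1)]
        mu = [dict() for _ in range(N)]
        J[N] = dict(terminal_cost)          # terminal cost-to-go
        for k in range(N - 1, -1, -1):      # proceed backward in time
            for x, controls in stages[k].items():
                best_u, best_cost = None, float("inf")
                for u, (nx, g) in controls.items():
                    c = g + J[k + 1][nx]    # stage cost plus cost-to-go
                    if c < best_cost:
                        best_u, best_cost = u, c
                J[k][x] = best_cost
                mu[k][x] = best_u
        return J, mu
    ```

    On a two-stage instance with states `s -> {x1, x2} -> t`, the recursion fills `J[1]` first and then picks the control at `s` that minimizes stage cost plus cost-to-go.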

    Dimitri P. Bertsekas is a tenured professor at MIT, a member of the US National Academy of Engineering, and a guest professor at the Research Center for Complex and Networked Systems at Tsinghua University. An internationally renowned author in electrical engineering and computer science, he has written more than a dozen textbooks and monographs, including Nonlinear Programming, Network Optimization, Dynamic Programming, Convex Optimization, and Reinforcement Learning and Optimal Control.

    Through this book, readers can learn about recent advances in policy iteration in reinforcement learning, in particular rollout methods in distributed and multiagent settings, and their applications. It can serve as a one-semester textbook for senior undergraduate or graduate students in artificial intelligence, systems and control science, and related fields, and as a reference for practitioners working in these areas.
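    The rollout idea the blurb refers to (deterministic rollout over a base heuristic, cf. Section 2.3.1) can be sketched on a toy problem; the 0/1 knapsack instance and the greedy base heuristic below are illustrative assumptions, not taken from the book.

    ```python
    # Deterministic rollout: at each stage, score each candidate control by
    # completing the trajectory with a base heuristic, then commit to the best.

    def base_heuristic(i, cap, items):
        """Greedy completion: from item i onward, take each item that fits."""
        total = 0
        for w, v in items[i:]:
            if w <= cap:
                cap -= w
                total += v
        return total

    def rollout(items, capacity):
        """One rollout pass over a 0/1 knapsack, maximizing total value."""
        cap, value, picks = capacity, 0, []
        for i, (w, v) in enumerate(items):
            # Q-factor of "skip item i": heuristic completion from i+1.
            q_skip = base_heuristic(i + 1, cap, items)
            # Q-factor of "take item i", if it fits.
            q_take = (v + base_heuristic(i + 1, cap - w, items)
                      if w <= cap else float("-inf"))
            if q_take >= q_skip:
                cap -= w
                value += v
                picks.append(i)
        return value, picks
    ```

    On the instance `[(5, 10), (4, 8), (3, 7)]` with capacity 7, the in-order greedy heuristic alone scores 10, while rollout skips the first item and scores 15, illustrating the cost improvement property: rollout never does worse than its base heuristic.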
