[Playing Super Mario with Reinforcement Learning] 03 - Mario Environment Code Walkthrough

范仁义 · 2022-03-18 · via 博客园 - 范仁义

I. Code Analysis

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import time
from matplotlib import pyplot as plt

# Create the base environment, then restrict it to the SIMPLE_MOVEMENT action set.
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

II. Analyzing Actions

1. Using JoypadSpace

env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

env.action_space  # Discrete(7): one index per entry in SIMPLE_MOVEMENT

env.action_space.sample()  # a random action index from 0 to 6

2. Inspecting what each action is

SIMPLE_MOVEMENT  # the list of button combinations, indexed by action number

SIMPLE_MOVEMENT[1]  # ['right']

3. Without JoypadSpace

env = gym_super_mario_bros.make('SuperMarioBros-v0')

env.action_space  # Discrete(256): every combination of the 8 NES buttons
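The jump from 7 actions to 256 comes from the NES controller having 8 buttons, so the raw environment accepts any 8-bit combination. The sketch below shows how a button list such as ['right', 'A'] could collapse into a single byte. The bit values reflect my understanding of the mapping inside nes_py's JoypadSpace and should be treated as an assumption; `BUTTON_BITS` and `buttons_to_byte` are hypothetical names, and the action list is copied by hand from gym_super_mario_bros.actions so the example runs without the emulator.

```python
# Assumed bit layout for the 8 NES buttons (believed to match nes_py's
# JoypadSpace; verify against your installed version).
BUTTON_BITS = {
    'right':  0b10000000,
    'left':   0b01000000,
    'down':   0b00100000,
    'up':     0b00010000,
    'start':  0b00001000,
    'select': 0b00000100,
    'B':      0b00000010,
    'A':      0b00000001,
    'NOOP':   0b00000000,
}

# Hand-copied from gym_super_mario_bros.actions.SIMPLE_MOVEMENT.
SIMPLE_MOVEMENT = [
    ['NOOP'], ['right'], ['right', 'A'], ['right', 'B'],
    ['right', 'A', 'B'], ['A'], ['left'],
]


def buttons_to_byte(buttons):
    """OR together the bits of every pressed button."""
    byte = 0
    for button in buttons:
        byte |= BUTTON_BITS[button]
    return byte


if __name__ == "__main__":
    # Show each discrete action index next to its 8-bit button mask.
    for index, buttons in enumerate(SIMPLE_MOVEMENT):
        print(index, buttons, format(buttons_to_byte(buttons), '08b'))
```

Wrapping the environment in JoypadSpace therefore shrinks the search space from 256 raw button masks to the 7 combinations that are useful for moving through a level.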

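4. Effect of a fixed action

For example, make Mario do nothing but walk right:

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import time

env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        # Start a new episode at the beginning and whenever Mario dies.
        state = env.reset()
    # Action 1 is ['right'] in SIMPLE_MOVEMENT (action 6 would be ['left']).
    state, reward, done, info = env.step(1)
    time.sleep(0.01)  # slow the loop down so the rendering is watchable
    env.render()

env.close()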

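III. Analyzing the state

# Recreate the environment, since the previous example closed it.
env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

state = env.reset()
state.shape  # (240, 256, 3): height, width, RGB channels

plt.imshow(state)  # the first frame of the level

# Take one step with action 2 (['right', 'A']) and look at the next frame.
state, reward, done, info = env.step(2)
plt.imshow(state)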
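The state is a raw screen frame, so its shape reads as height, width, and color channels. As a small illustration, the helper below names those axes and counts how many values an agent would receive per frame. `describe_frame` is a hypothetical helper written for this note, not part of gym_super_mario_bros.

```python
def describe_frame(shape):
    """Name the axes of an image shape given as (height, width, channels)."""
    if len(shape) != 3:
        raise ValueError(f"expected a 3-axis shape, got {shape}")
    height, width, channels = shape
    return {
        "height": height,
        "width": width,
        "channels": channels,
        "values": height * width * channels,  # numbers per frame
    }


# With a live environment this would be describe_frame(state.shape).
print(describe_frame((240, 256, 3)))
```

A single frame carries 184,320 values, which is why later steps in a training pipeline commonly shrink or simplify the image before handing it to a model.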

IV. Viewing rewards

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import time

env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        state = env.reset()
    # Keep walking right and print the reward returned by every step.
    state, reward, done, info = env.step(1)
    print(reward)
    time.sleep(0.04)
    env.render()

env.close()
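To make sense of the printed numbers, it helps to know roughly how the reward is built. The project README describes it as the sum of horizontal progress, a clock penalty, and a death penalty, clipped to the range -15 to 15. The function below is a simplified sketch of that description, not the library's actual code; `mario_reward` and its parameter names are hypothetical.

```python
def mario_reward(x_before, x_after, clock_before, clock_after, died):
    """Simplified per-step reward: progress + clock penalty + death penalty."""
    velocity = x_after - x_before               # positive when moving right
    clock_penalty = clock_after - clock_before  # <= 0, the clock counts down
    death_penalty = -15 if died else 0
    total = velocity + clock_penalty + death_penalty
    # Clip so a single step never contributes more than 15 in magnitude.
    return max(-15, min(15, total))


print(mario_reward(40, 42, 400, 400, died=False))  # moved 2 px right
print(mario_reward(40, 40, 400, 399, died=False))  # stood still, clock ticked
print(mario_reward(40, 40, 400, 400, died=True))   # died this step
```

Under this scheme, standing still slowly loses reward as the clock runs down, which pushes an agent toward moving right quickly.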

V. Viewing info

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT

env = gym_super_mario_bros.make('SuperMarioBros-v0')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

state = env.reset()
# info is a dict of game facts such as position, time, lives, and score.
state, reward, done, info = env.step(1)
print(info)
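The printed dictionary is easier to follow once a few fields are pulled out by name. The sketch below formats a one-line summary from the keys listed in the project README (coins, flag_get, life, score, stage, status, time, world, x_pos, y_pos). `summarize_info` is a hypothetical helper, and the values in `sample_info` are made up for illustration rather than captured from a real run.

```python
def summarize_info(info):
    """Build a short, readable summary from an info dictionary."""
    flag = 'yes' if info['flag_get'] else 'no'
    return (
        f"world {info['world']}-{info['stage']} | x={info['x_pos']} | "
        f"time={info['time']} | lives={info['life']} | flag={flag}"
    )


# Made-up values with the documented keys; a real run will differ.
sample_info = {
    'coins': 0, 'flag_get': False, 'life': 2, 'score': 0, 'stage': 1,
    'status': 'small', 'time': 400, 'world': 1, 'x_pos': 40, 'y_pos': 79,
}

print(summarize_info(sample_info))
```

With a live environment, calling summarize_info(info) inside the step loop gives a compact progress log instead of the full dictionary.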

VI. Changing levels

from nes_py.wrappers import JoypadSpace
import gym_super_mario_bros
from gym_super_mario_bros.actions import SIMPLE_MOVEMENT
import time

# The environment ID selects the level: here world 4, stage 2, version v1.
env = gym_super_mario_bros.make('SuperMarioBros-4-2-v1')
env = JoypadSpace(env, SIMPLE_MOVEMENT)

done = True
for step in range(5000):
    if done:
        state = env.reset()
    # Take random actions just to see the level being played.
    state, reward, done, info = env.step(env.action_space.sample())
    time.sleep(0.01)
    env.render()

env.close()
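The level is chosen entirely through the environment ID, which the project README gives as SuperMarioBros-<world>-<stage>-v<version>, with worlds 1 to 8, stages 1 to 4, and versions 0 to 3. A small helper can build and validate that string before it is passed to gym_super_mario_bros.make. `make_env_id` is a hypothetical name, and the ranges are taken from the README as I recall it.

```python
def make_env_id(world, stage, version=0):
    """Build an ID such as 'SuperMarioBros-4-2-v1' for a single level."""
    if world not in range(1, 9):
        raise ValueError(f"world must be 1-8, got {world}")
    if stage not in range(1, 5):
        raise ValueError(f"stage must be 1-4, got {stage}")
    if version not in range(0, 4):
        raise ValueError(f"version must be 0-3, got {version}")
    return f"SuperMarioBros-{world}-{stage}-v{version}"


print(make_env_id(4, 2, 1))
```

The result can be used as gym_super_mario_bros.make(make_env_id(4, 2, 1)), which makes it easy to loop over several levels without hand-typing each ID.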
