mirror of
https://github.com/frankwxu/AI4DigitalForensics.git
synced 2026-02-20 13:40:40 +00:00
add reinforcement learning tutorial
@@ -147,7 +147,7 @@
 "\n",
 "---\n",
 "\n",
-"## The Safe Agent (No AI, only one hardcoded rule)\n",
+"# Trial 1: The Safe Agent (No AI, only one hardcoded rule)\n",
 "We're going to implement a simple agent 'The Safe Agent' who will thrust upward if and only if the lander's `y` position is less than 0.5.\n",
 "\n",
 "In theory this agent shouldn't hit the ground as we have unlimited fuel, but let's see."
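The Safe Agent rule quoted in the hunk above can be sketched as a single function. This is a minimal sketch, not the notebook's own code: the name `safe_agent_action` is hypothetical, and it assumes the standard LunarLander layout in which observation index 1 is the `y` position, action 0 means "do nothing", and action 2 means "fire main engine".

```python
# Assumed LunarLander discrete action IDs (not shown in this diff).
DO_NOTHING = 0
FIRE_MAIN_ENGINE = 2

# Threshold taken from the notebook text: thrust only when y < 0.5.
Y_THRESHOLD = 0.5


def safe_agent_action(observation):
    """Return the action for the one-rule Safe Agent.

    `observation` is assumed to be the 8-element LunarLander state,
    where observation[1] is the lander's vertical position.
    """
    y_position = observation[1]
    if y_position < Y_THRESHOLD:
        return FIRE_MAIN_ENGINE
    return DO_NOTHING


# Example observations: one below the threshold, one above it.
low_observation = [0.0, 0.3, 0.0, -0.2, 0.0, 0.0, 0.0, 0.0]
high_observation = [0.0, 1.2, 0.0, -0.2, 0.0, 0.0, 0.0, 0.0]
low_action = safe_agent_action(low_observation)
high_action = safe_agent_action(high_observation)
```

In an environment loop, a function like this would presumably be called once per step with the latest observation. Note that the rule ignores horizontal position and tilt entirely.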
@@ -215,7 +215,7 @@
 "\n",
 "---\n",
 "\n",
-"## The Stable Agent (No AI, with a set of hardcoded rules)\n",
+"# Trial 2: The Stable Agent (No AI, with a set of hardcoded rules)\n",
 "Let's try to define an agent that can remain stable in the air.\n",
 "\n",
 "It will operate via the following rules:\n",
@@ -312,7 +312,7 @@
 "\n",
 "---\n",
 "\n",
-"# The AI Agent (AI agent with Deep Reinforcement Learning)\n",
+"# Trial 3: The AI Agent (AI agent with Deep Reinforcement Learning)\n",
 "To address this challenge, we'll use deep reinforcement learning techniques to train an agent to land the spacecraft.\n",
 "\n",
 "Simpler tabular methods are limited to discrete observation spaces, meaning there are a finite number of possible states. In `LunarLander-v3` however, we're dealing with a continuous range of states across 8 different parameters, meaning there are a near-infinite number of possible states. We could try to bin similar values into groups, but due to the sensitive controls of the game, even slight errors can lead to significant missteps.\n",
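To illustrate the binning idea mentioned in the hunk above, here is a rough sketch of discretizing an 8-value observation. Everything specific in it is an assumption for illustration and not taken from the notebook: the helper names `make_edges` and `discretize`, the choice of 10 bins, and the `[-1, 1]` range used for the six continuous features.

```python
from bisect import bisect_right

# Hypothetical settings for illustration only.
N_BINS = 10


def make_edges(low, high, n_bins=N_BINS):
    """Return the n_bins - 1 interior edges that split [low, high] evenly."""
    step = (high - low) / n_bins
    return [low + step * i for i in range(1, n_bins)]


def discretize(observation, edges_per_dim):
    """Map a continuous observation to a tuple of integer bin indices."""
    return tuple(
        bisect_right(edges, value)
        for value, edges in zip(observation, edges_per_dim)
    )


# Six continuous features (assumed range [-1, 1]) plus two leg-contact
# flags, which only need a single edge at 0.5 to separate 0.0 from 1.0.
edges_per_dim = [make_edges(-1.0, 1.0)] * 6 + [[0.5]] * 2

# Number of distinct states a tabular method would have to cover.
table_size = N_BINS ** 6 * 2 ** 2

# Two observations that differ only slightly in vertical velocity
# collapse into the same discrete state, so the table cannot tell them apart.
obs_a = [0.05, 0.45, 0.01, -0.31, 0.02, 0.0, 0.0, 0.0]
obs_b = [0.05, 0.45, 0.01, -0.39, 0.02, 0.0, 0.0, 0.0]
state_a = discretize(obs_a, edges_per_dim)
state_b = discretize(obs_b, edges_per_dim)
```

Even with this coarse grid the table has four million entries, and finer bins grow it quickly, which is one way to see why the notebook moves to a function approximator instead.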
@@ -1075,7 +1075,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 45,
+   "execution_count": null,
    "metadata": {},
    "outputs": [
     {
@@ -1187,7 +1187,7 @@
 "\n",
 "    # Optional: Download the video\n",
 "    # from google.colab import files\n",
-"    # video_file = glob.glob('video/LunarLander-v3-rl-video-episode-*.mp4')[0]  # Match the generated file\n",
+"    # video_file = glob.glob('video/LunarLander-v3-episode-*.mp4')[0]  # Match the generated file\n",
 "    # files.download(video_file)"
 ]
 },
BIN
lab10_Reinforcement_Learning/video/LunarLander-v3-episode-0.mp4
Normal file
Binary file not shown.