1243 points by minecraftml 6 months ago | 18 comments
johntech 6 months ago
Fantastic success story! I wonder if the trained agent could be adapted to other games like Terraria or Stardew Valley.
ilovealgos 6 months ago
That's an interesting idea! I'll have to look into it. It would likely require a more general model or retraining the agent on the new environment.
bobthedev 6 months ago
Great work on the Minecraft project! I'm curious how long the training process took and what kind of hardware you used.
debatebot 6 months ago
Training took approximately three weeks on two high-end GPUs and a 32-core CPU.
alicesofty 6 months ago
Really cool project! How difficult was it to create the environment and write the reward function?
magicmike 6 months ago
Setting up the environment was fairly straightforward, but writing the reward function was a challenge. I ended up using a complex, hierarchical reward structure to encourage the desired behavior.
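
As a rough illustration of what a hierarchical reward can look like, here is a minimal sketch. It is not the reward used in this project: the milestone names, the bonus values, and the `hierarchical_reward` function are all assumptions made up for the example. The idea is that each tier pays out once, and only after every lower tier has been reached.

```python
# Hypothetical milestone ladder (names and bonuses are illustrative assumptions).
# Each tier is rewarded once, and only after all earlier tiers are achieved.
MILESTONES = [
    ("log", 1.0),
    ("planks", 2.0),
    ("crafting_table", 4.0),
    ("wooden_pickaxe", 8.0),
]


def hierarchical_reward(inventory, achieved):
    """Return (reward, updated_achieved) for the current inventory.

    inventory: dict mapping item name to count.
    achieved:  set of milestone names already rewarded.
    """
    reward = 0.0
    achieved = set(achieved)  # copy so the caller's set is not mutated
    for item, bonus in MILESTONES:
        if item in achieved:
            continue  # already paid out for this tier
        if inventory.get(item, 0) > 0:
            reward += bonus
            achieved.add(item)
        else:
            break  # higher tiers stay locked until this one is reached
    return reward, achieved
```

Calling it once per environment step with the agent's inventory and the running `achieved` set gives a sparse bonus that grows as the agent climbs the ladder, which is one simple way to encourage ordered sub-goals.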
samcompute 6 months ago
I'm curious how much of the Minecraft world the agent saw and how you dealt with partial observability.
gracefulninja 6 months ago
To deal with partial observability, I combined memory replay with a convolutional neural network architecture so the agent could 'remember' relevant parts of the world and base its decisions on past observations.
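
One common way to give a convolutional policy access to recent observations is frame stacking: the network sees the last k frames instead of only the current one. The sketch below shows that technique only; it is an assumption about what "remembering" could look like in practice, not the mechanism used in this project, and the `FrameStack` class is a hypothetical helper.

```python
from collections import deque


class FrameStack:
    """Keep the last k observations so a policy can condition on recent history.

    Illustrative helper (an assumption, not code from the project).
    """

    def __init__(self, k):
        self.k = k
        self.frames = deque(maxlen=k)  # oldest frame is dropped automatically

    def reset(self, first_frame):
        """Start an episode by filling the stack with copies of the first frame."""
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        """Add the newest frame and return the stack, ordered oldest to newest."""
        self.frames.append(frame)
        return list(self.frames)
```

Frame stacking only covers a short window. For longer horizons, a recurrent layer on top of the convolutional features is the usual alternative.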
qcoder 6 months ago
I think the real question is, can the agent be forced to craft and use a diamond sword?
curiousmike 6 months ago
Hah, good question! The agent can use a diamond sword if one is available, but I didn't specifically train it for that, so it's still sometimes hesitant to use it.
treasurehunter 6 months ago
Amazing work! I love seeing RL successes. What kind of codebase did you use, and can we see it?
studioai 6 months ago
I used the TensorFlow RL library for the training. The code is available on GitHub at github.com/studioai/minecraftrl. Don't forget to star the repo!
densematrix 6 months ago
Fantastic work! I'm curious whether you tried any variations of Deep Q-Networks or if this was a straight-up policy gradient approach.
pythonsage 6 months ago
I tried both DQN and actor-critic methods for this project. In the end, the actor-critic approach produced better results when training the agent to navigate and collect resources in Minecraft.
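
For anyone unfamiliar with the actor-critic idea, here is a minimal tabular sketch of a one-step update. It is a toy under stated assumptions, not the project's training code: the real agent would use neural networks rather than tables, and the function name, learning rates, and discount factor below are illustrative choices.

```python
import math


def softmax(prefs):
    """Convert action preferences to probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(prefs, values, s, a, r, s_next, done,
                      alpha=0.1, beta=0.1, gamma=0.99):
    """Apply a one-step actor-critic update in place and return the TD error.

    prefs:  list of per-state action-preference lists (the actor).
    values: list of per-state value estimates (the critic).
    """
    # Critic: the TD error measures how much better or worse the outcome
    # was than the critic's current estimate for state s.
    target = r + (0.0 if done else gamma * values[s_next])
    td_error = target - values[s]
    values[s] += beta * td_error

    # Actor: policy-gradient step for a softmax policy, using the TD error
    # as the advantage signal. The gradient of log pi(a|s) with respect to
    # preference b is 1[b == a] - pi(b|s).
    probs = softmax(prefs[s])
    for b in range(len(prefs[s])):
        grad = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += alpha * td_error * grad
    return td_error
```

The contrast with DQN is that the policy is updated directly, with the critic only supplying a lower-variance learning signal, rather than acting greedily with respect to learned Q-values.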
usernotfound 6 months ago
Impressive project. I'm currently working on a similar project; could I reach out to you for some guidance?
codeitbetter 6 months ago
Absolutely! I'd be happy to provide some guidance. Feel free to email me at john@studioai.com to discuss.
sarahcode 6 months ago
Fascinating reinforcement learning triumph! Are there plans to expand this project or use it for other purposes?
agent101 6 months ago
Definitely! I'm currently working on a project that uses this Minecraft agent in a multi-agent environment for cooperation and competition. Stay tuned for more updates here on Hacker News!