Logbook for September 21
Week 35 - September 21
Thursday 9/2
Reviewed the arXiv paper "Continuous Control with Deep Reinforcement Learning" (Lillicrap et al., 2015), arXiv:1509.02971. This is the DDPG paper. The original work is by David Silver et al., "Deterministic Policy Gradient Algorithms" (ICML 2014), but it is not easy to read. There is also a review on Towards Data Science that presents Deep Deterministic Policy Gradient (DDPG) and is written for people who want to understand the algorithm.
Week 36 - September 21
Monday 9/6
Installed Barrier to share keyboard/mouse between Linux and Windows. Nice combination with a KVM USB switch.
Moved WSL to another drive with move-wsl.
Wednesday 9/8
Created a custom gym environment and optimized it using DQN, then DDPG, with Stable Baselines3. It takes around 50,000 steps to optimize an ultra-simple grid problem… No success with DDPG, something missing?
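For reference, a minimal sketch of what such an ultra-simple grid environment could look like. Everything here is hypothetical: the `GridWorld` class, its 1-D layout, the reward values and the `run_episode` helper are illustrative assumptions, not the environment actually used. It only mimics the classic Gym interface (`reset()` returning an observation, `step()` returning `(obs, reward, done, info)`) with the standard library. To train with Stable Baselines3 it would also have to subclass `gym.Env` and declare `observation_space` and `action_space`. Note that DQN expects a discrete action space while DDPG expects a continuous one, so the same environment cannot be passed unchanged to both.

```python
class GridWorld:
    """Hypothetical 1-D grid: start in cell 0, goal is the last cell."""

    def __init__(self, size=5, max_steps=50):
        self.size = size            # number of cells (assumed value)
        self.max_steps = max_steps  # episode time limit (assumed value)
        self.pos = 0
        self.steps = 0

    def reset(self):
        # Put the agent back in the first cell and return the observation.
        self.pos = 0
        self.steps = 0
        return self.pos

    def step(self, action):
        # Discrete actions: 1 moves right, anything else moves left.
        move = 1 if action == 1 else -1
        # Clip the position so the agent stays inside the grid.
        self.pos = min(max(self.pos + move, 0), self.size - 1)
        self.steps += 1
        reached = self.pos == self.size - 1
        # Small step penalty, +1 on reaching the goal (assumed reward shaping).
        reward = 1.0 if reached else -0.01
        done = reached or self.steps >= self.max_steps
        return self.pos, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the total (undiscounted) reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total
```

A policy that always moves right reaches the goal in `size - 1` steps. A policy that always moves left runs until the time limit and only collects step penalties. This gives two quick sanity checks before plugging in a learning algorithm.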
Thursday 9/9
Still playing with gym and Stable Baselines3. A2C, PPO and SAC are working, but DDPG and TD3 are not (and I don’t know why).
Week 38 - September 21
Monday 9/20
Back to the Aniti RL virtual school. Looking for material I can use to explain RL to my colleagues, and for a way to properly describe the experiment I am running with gym.
Will most likely start with the lectures from DeepMind: 2021 DeepMind x UCL RL Lecture Series.
Thursday 9/23
Started the Plotly course on DataCamp, following my DataCamp learning process. I need basic interactivity and 3D plots to illustrate reward functions.