| Filename | Size |
| --- | --- |
| 001.Welcome/001. Why should you care.mp4 | 32.4 MB |
| 001.Welcome/001. Why should you care.srt | 15.4 KB |
| 001.Welcome/002. Reinforcement learning vs all.mp4 | 10.8 MB |
| 001.Welcome/002. Reinforcement learning vs all.srt | 4.9 KB |
| 002.Reinforcement Learning/003. Multi-armed bandit.mp4 | 17.9 MB |
| 002.Reinforcement Learning/003. Multi-armed bandit.srt | 7.3 KB |
| 002.Reinforcement Learning/004. Decision process & applications.mp4 | 23 MB |
| 002.Reinforcement Learning/004. Decision process & applications.srt | 9.7 KB |
| 003.Black box optimization/005. Markov Decision Process.mp4 | 18 MB |
| 003.Black box optimization/005. Markov Decision Process.srt | 8.3 KB |
| 003.Black box optimization/006. Crossentropy method.mp4 | 36 MB |
| 003.Black box optimization/006. Crossentropy method.srt | 15.5 KB |
| 003.Black box optimization/007. Approximate crossentropy method.mp4 | 19.3 MB |
| 003.Black box optimization/007. Approximate crossentropy method.srt | 8.2 KB |
| 003.Black box optimization/008. More on approximate crossentropy method.mp4 | 22.9 MB |
| 003.Black box optimization/008. More on approximate crossentropy method.srt | 10.5 KB |
| 004.All the cool stuff that isn't in the base track/009. Evolution strategies core idea.mp4 | 20.9 MB |
| 004.All the cool stuff that isn't in the base track/009. Evolution strategies core idea.srt | 7.3 KB |
| 004.All the cool stuff that isn't in the base track/010. Evolution strategies math problems.mp4 | 17.7 MB |
| 004.All the cool stuff that isn't in the base track/010. Evolution strategies math problems.srt | 8.6 KB |
| 004.All the cool stuff that isn't in the base track/011. Evolution strategies log-derivative trick.mp4 | 27.8 MB |
| 004.All the cool stuff that isn't in the base track/011. Evolution strategies log-derivative trick.srt | 12.6 KB |
| 004.All the cool stuff that isn't in the base track/012. Evolution strategies duct tape.mp4 | 21.2 MB |
| 004.All the cool stuff that isn't in the base track/012. Evolution strategies duct tape.srt | 9.7 KB |
| 004.All the cool stuff that isn't in the base track/013. Blackbox optimization drawbacks.mp4 | 15.2 MB |
| 004.All the cool stuff that isn't in the base track/013. Blackbox optimization drawbacks.srt | 7.3 KB |
| 005.Striving for reward/014. Reward design.mp4 | 49.7 MB |
| 005.Striving for reward/014. Reward design.srt | 23.2 KB |
| 006.Bellman equations/015. State and Action Value Functions.mp4 | 37.3 MB |
| 006.Bellman equations/015. State and Action Value Functions.srt | 18.2 KB |
| 006.Bellman equations/016. Measuring Policy Optimality.mp4 | 18.1 MB |
| 006.Bellman equations/016. Measuring Policy Optimality.srt | 8.5 KB |
| 007.Generalized Policy Iteration/017. Policy evaluation & improvement.mp4 | 31.9 MB |
| 007.Generalized Policy Iteration/017. Policy evaluation & improvement.srt | 14.5 KB |
| 007.Generalized Policy Iteration/018. Policy and value iteration.mp4 | 24.2 MB |
| 007.Generalized Policy Iteration/018. Policy and value iteration.srt | 12.1 KB |
| 008.Model-free learning/019. Model-based vs model-free.mp4 | 28.8 MB |
| 008.Model-free learning/019. Model-based vs model-free.srt | 14.1 KB |
| 008.Model-free learning/020. Monte-Carlo & Temporal Difference; Q-learning.mp4 | 30.1 MB |
| 008.Model-free learning/020. Monte-Carlo & Temporal Difference; Q-learning.srt | 14.5 KB |
| 008.Model-free learning/021. Exploration vs Exploitation.mp4 | 28.2 MB |
| 008.Model-free learning/021. Exploration vs Exploitation.srt | 14 KB |
| 008.Model-free learning/022. Footnote Monte-Carlo vs Temporal Difference.mp4 | 10.3 MB |
| 008.Model-free learning/022. Footnote Monte-Carlo vs Temporal Difference.srt | 4.8 KB |
| 009.On-policy vs off-policy/023. Accounting for exploration. Expected Value SARSA..mp4 | 37.7 MB |
| 009.On-policy vs off-policy/023. Accounting for exploration. Expected Value SARSA..srt | 17.3 KB |
| 010.Experience Replay/024. On-policy vs off-policy; Experience replay.mp4 | 26.7 MB |
| 010.Experience Replay/024. On-policy vs off-policy; Experience replay.srt | 11.2 KB |
| 011.Limitations of Tabular Methods/025. Supervised & Reinforcement Learning.mp4 | 50.6 MB |
| 011.Limitations of Tabular Methods/025. Supervised & Reinforcement Learning.srt | 25.4 KB |
| 011.Limitations of Tabular Methods/026. Loss functions in value based RL.mp4 | 33.8 MB |
| 011.Limitations of Tabular Methods/026. Loss functions in value based RL.srt | 15.2 KB |
| 011.Limitations of Tabular Methods/027. Difficulties with Approximate Methods.mp4 | 47 MB |
| 011.Limitations of Tabular Methods/027. Difficulties with Approximate Methods.srt | 21.9 KB |
| 012.Case Study Deep Q-Network/028. DQN bird's eye view.mp4 | 27.8 MB |
| 012.Case Study Deep Q-Network/028. DQN bird's eye view.srt | 11.4 KB |
| 012.Case Study Deep Q-Network/029. DQN the internals.mp4 | 29.6 MB |
| 012.Case Study Deep Q-Network/029. DQN the internals.srt | 12.3 KB |
| 013.Honor/030. DQN statistical issues.mp4 | 19.2 MB |
| 013.Honor/030. DQN statistical issues.srt | 9.2 KB |
| 013.Honor/031. Double Q-learning.mp4 | 20.5 MB |
| 013.Honor/031. Double Q-learning.srt | 9.4 KB |
| 013.Honor/032. More DQN tricks.mp4 | 33.9 MB |
| 013.Honor/032. More DQN tricks.srt | 16.4 KB |
| 013.Honor/033. Partial observability.mp4 | 57.2 MB |
| 013.Honor/033. Partial observability.srt | 27.7 KB |
| 014.Policy-based RL vs Value-based RL/034. Intuition.mp4 | 34.9 MB |
| 014.Policy-based RL vs Value-based RL/034. Intuition.srt | 15.6 KB |
| 014.Policy-based RL vs Value-based RL/035. All Kinds of Policies.mp4 | 16 MB |
| 014.Policy-based RL vs Value-based RL/035. All Kinds of Policies.srt | 7.4 KB |
| 014.Policy-based RL vs Value-based RL/036. Policy gradient formalism.mp4 | 31.6 MB |
| 014.Policy-based RL vs Value-based RL/036. Policy gradient formalism.srt | 13.3 KB |
| 014.Policy-based RL vs Value-based RL/037. The log-derivative trick.mp4 | 13.3 MB |
| 014.Policy-based RL vs Value-based RL/037. The log-derivative trick.srt | 5.9 KB |
| 015.REINFORCE/038. REINFORCE.mp4 | 31.4 MB |
| 015.REINFORCE/038. REINFORCE.srt | 14 KB |
| 016.Actor-critic/039. Advantage actor-critic.mp4 | 24.6 MB |
| 016.Actor-critic/039. Advantage actor-critic.srt | 11.8 KB |
| 016.Actor-critic/040. Duct tape zone.mp4 | 17.5 MB |
| 016.Actor-critic/040. Duct tape zone.srt | 7.8 KB |
| 016.Actor-critic/041. Policy-based vs Value-based.mp4 | 16.8 MB |
| 016.Actor-critic/041. Policy-based vs Value-based.srt | 7.1 KB |
| 016.Actor-critic/042. Case study A3C.mp4 | 26.1 MB |
| 016.Actor-critic/042. Case study A3C.srt | 11.1 KB |
| 016.Actor-critic/043. A3C case study (2 2).mp4 | 15 MB |
| 016.Actor-critic/043. A3C case study (2 2).srt | 6 KB |
| 016.Actor-critic/044. Combining supervised & reinforcement learning.mp4 | 24 MB |
| 016.Actor-critic/044. Combining supervised & reinforcement learning.srt | 11.9 KB |
| 017.Measuting exploration/045. Recap bandits.mp4 | 24.7 MB |
| 017.Measuting exploration/045. Recap bandits.srt | 11.9 KB |
| 017.Measuting exploration/046. Regret measuring the quality of exploration.mp4 | 21.3 MB |
| 017.Measuting exploration/046. Regret measuring the quality of exploration.srt | 10.2 KB |
| 017.Measuting exploration/047. The message just repeats. 'Regret, Regret, Regret.'.mp4 | 18.4 MB |
| 017.Measuting exploration/047. The message just repeats. 'Regret, Regret, Regret.'.srt | 8.7 KB |
| 018.Uncertainty-based exploration/048. Intuitive explanation.mp4 | 22.3 MB |
| 018.Uncertainty-based exploration/048. Intuitive explanation.srt | 10.9 KB |
| 018.Uncertainty-based exploration/049. Thompson Sampling.mp4 | 17.1 MB |
| 018.Uncertainty-based exploration/049. Thompson Sampling.srt | 7.9 KB |
| 018.Uncertainty-based exploration/050. Optimism in face of uncertainty.mp4 | 16.5 MB |