The documentation (mainly solved exercices) of the book Reinforcement Learning by Richard S. Sutton and Andrew G. Barto
I crossed my answers and took inspiration from https://github.com/vojtamolda/reinforcement-learning-an-introduction/tree/main and https://github.com/habanoz/reinforcement-learning-an-introduction/tree/master
Thanks to samas on gargle discord for the help