This book describes how factors such as the learning rate, the discount factor, and epsilon affect the drone's ability to learn to navigate from source to destination. A higher learning rate speeds up learning but carries a risk of oscillating rather than converging, whereas a lower learning rate makes the drone learn slowly but converge steadily. The work focuses mainly on implementing RL algorithms for small areas. For larger, more complex areas these algorithms are less efficient, so Deep Reinforcement Learning could be used in future work to make the UAV more efficient for real-world implementation.
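The trade-off between a high and a low learning rate can be illustrated with a minimal sketch of the tabular Q-learning update. The function names (`q_update`, `epsilon_greedy`, `track_noisy_target`) are hypothetical, and the alternating reward is only an assumed stand-in for noisy feedback, not the drone environment used in this book.

```python
import random


def q_update(q_sa, reward, max_q_next, alpha, gamma):
    """One tabular Q-learning update for a single state-action value.

    alpha is the learning rate and gamma the discount factor.
    """
    td_target = reward + gamma * max_q_next
    return q_sa + alpha * (td_target - q_sa)


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def track_noisy_target(alpha, steps=200):
    """Follow a reward that alternates between 0 and 2 (mean 1).

    Returns the swing (max - min) of the estimate over the last 20 steps.
    The next-state value is fixed at 0 so that only alpha matters here.
    """
    q = 0.0
    history = []
    for t in range(steps):
        reward = 0.0 if t % 2 == 0 else 2.0
        q = q_update(q, reward, max_q_next=0.0, alpha=alpha, gamma=0.9)
        history.append(q)
    tail = history[-20:]
    return max(tail) - min(tail)


# Assumed illustrative values, not the settings used in the experiments.
swing_high = track_noisy_target(alpha=0.9)  # fast but jumpy estimate
swing_low = track_noisy_target(alpha=0.1)   # slow but steady estimate

rng = random.Random(0)
greedy_action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
```

Under these assumptions the estimate settles into a two-step cycle whose swing is 2α/(2 − α): about 1.64 for α = 0.9 and about 0.11 for α = 0.1. The high learning rate keeps chasing the latest reward, while the low one stays close to the mean. With `epsilon=0.0` the selection is purely greedy, and raising epsilon makes exploratory (random) actions more frequent.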