Q-learning is a model-free reinforcement learning technique: it estimates the utility of actions without needing a model of the environment. It keeps a value for every state-action pair and updates that value from the reward received for the current action plus the expected future reward. Through this process the agent learns to move around the environment, and the optimal policy follows from choosing, in each state, the action with the highest estimated value. Let us look at an example of this technique.
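As a minimal sketch of the idea, the Python code below runs tabular Q-learning on a made-up five-state corridor. The environment, the function names `train_q_table` and `greedy_policy`, and the parameter values (`alpha`, `gamma`, `epsilon`) are all assumptions chosen for illustration, not something taken from the original article. The agent starts in state 0, can move left or right, and receives a reward of 1 only when it reaches the last state.

```python
import random


def train_q_table(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor (hypothetical environment).

    Actions: 0 = move left, 1 = move right. The last state is terminal.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    # One row per state, one column per action; all estimates start at 0.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy: explore occasionally, otherwise take the
            # best-known action (ties are broken at random).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(2) if q[state][a] == best]
                )

            # Deterministic transition; moving left from state 0 stays put.
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-learning update: move the estimate toward
            # reward + gamma * (best value available in the next state).
            # The goal row is never updated, so its value stays 0.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state

    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = train_q_table()
    for s, row in enumerate(q_table):
        print(s, [round(v, 3) for v in row])
    print(greedy_policy(q_table))
```

Note that the update never consults a transition model: it uses only the observed next state and reward, which is what "model-free" means here. In this toy setup the learned values for moving right grow as the agent gets closer to the goal, so the greedy policy in every non-terminal state is to move right.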
To read more, click here.