Programming by Demonstration has recently been proposed as a way for robots to learn tasks from human demonstrations, and action recognition is a crucial step in this procedure. Based on this concept, Aksoy et al. [1] proposed a model-free approach for object manipulation. Specifically, the approach classifies actions by observing changes in object interactions, which are obtained from video segmentation. However, the segmentation suffers from various difficulties, such as motion blur, complex environments, and over- and under-segmentation. For this reason, we simulate and evaluate the method of Aksoy et al. Additionally, we incorporate a kernel-based representation into their method. The experiments show that the new method significantly improves the action recognition rate.