With the advancement of artificial intelligence and deep learning techniques, neural style transfer has become a popular method for applying complex styles and effects to images, a task that would have been nearly impossible to perform manually a few years ago. Following the success of styling one image with another, research has turned to transferring styles from still images to real-time video. The major problems in doing so, which do not arise when styling still images, are "popping" (inconsistent stylization from frame to frame) and temporal inconsistency (flickering between consecutive stylized frames). Many video style transfer models have succeeded in improving temporal consistency but have failed to guarantee fast processing speed and high perceptual style quality at the same time. In this study, we incorporate multiple styles into a video within a defined region of interest, since many existing models can transfer a style to a whole frame but not to a specific region. For this, we use a convolutional neural network with the pre-trained VGG-16 model in a transfer learning approach. In our results, we were able to transfer style to a specific region of interest of each video frame. In addition, different tests were conducted to validate the model (pre-trained VGG-16) and to show for which domains it is suitable.
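
As a rough illustration of the approach summarized above, the following sketch (our own illustration, not the paper's actual code; the chosen VGG-16 layer indices, the `roi_mask` tensor, and all function names are assumptions) uses a frozen pre-trained VGG-16 from torchvision to build a Gram-matrix style representation and composites a stylized frame into a binary-masked region of interest:

```python
# Minimal sketch, assuming a PyTorch/torchvision setup: VGG-16 features for a
# Gram-matrix style loss, plus compositing stylized pixels into an ROI mask.
import torch
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

# Pre-trained VGG-16 feature extractor (transfer learning: weights stay frozen).
vgg = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features.eval()
for p in vgg.parameters():
    p.requires_grad_(False)

# Assumed style layers: relu1_2, relu2_2, relu3_3, relu4_3.
STYLE_LAYERS = {3, 8, 15, 22}

def vgg_features(x):
    """Collect activations from the selected VGG-16 layers."""
    feats = []
    for i, layer in enumerate(vgg):
        x = layer(x)
        if i in STYLE_LAYERS:
            feats.append(x)
    return feats

def gram(feat):
    """Gram matrix of a feature map, used as the style representation."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(stylized, style_img):
    """Sum of Gram-matrix differences across the chosen VGG-16 layers."""
    return sum(F.mse_loss(gram(a), gram(b))
               for a, b in zip(vgg_features(stylized), vgg_features(style_img)))

def composite_roi(frame, stylized, roi_mask):
    """Keep the original frame outside the ROI, stylized pixels inside it.

    frame, stylized: (1, 3, H, W) tensors; roi_mask: (1, 1, H, W) in {0, 1}.
    """
    return roi_mask * stylized + (1.0 - roi_mask) * frame
```

In this sketch, per-frame stylization and the ROI mask are assumed to come from elsewhere in the pipeline; the snippet only shows how a style loss based on pre-trained VGG-16 features and a mask-based composite could be combined to restrict the effect to a chosen region.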