Science is what we understand well enough to explain to a computer. Art is everything else we do.
The holy grail of Computer Science and Artificial Intelligence research is to develop programmes that can combine knowledge from multiple domains to perform tasks that, at present, only humans are good at. In this spirit, Image Captioning is a great test-bed for AI algorithms, since it involves building an understanding of an image and then generating meaningful sentences on top of it. Formally, Image Captioning is defined as the process of automatically generating a description of the scene shown in an image. The aim of this post is not to provide a full tutorial on Image Captioning. For that, I would encourage you to go through Andrej Karpathy’s presentation and Google’s Show and Tell paper.
Since 10th September, I have been working as a research intern at the MIDAS lab at IIIT-Delhi. MIDAS is an acronym that stands for Multimodal Digital Media Analysis. Headed by Dr. Rajiv Ratn Shah, the group currently conducts research across a broad spectrum of domains such as speech recognition, lip reading, music generation, social media analysis etc. (these are the ones I know 😁). MIDAS is a young research group that started out in early 2018. Despite the short time since its inception, MIDAS has distinguished itself at top-tier conferences like EMNLP and ACM Multimedia.
Can something visually beautiful come out of recursion? Industry stalwarts hate it (recursion is a big NO in production code!) and undergrads (like me) sweat while doing time complexity analysis for algorithms that employ recursion. But whichever side of the spectrum you are on in the software world, you would agree that recursion has a charm of its own. During my undergraduate years, as I became familiar with the Computer Science world around me, I began to appreciate the inherent idea behind recursion. Moreover, I understood that it is not just a fancy technique used by clever programmers, but something that is intrinsic to nature itself. This post is a presentation of art that I learned and then coded during my leisure time.
Recurrent Neural Networks are amusing. They are amongst the finest examples of how simple mathematical models can achieve exciting and useful results. Recurrent Neural Networks (or RNNs for short) are a type of Neural Network architecture used for sequential data, that is, inputs that arrive as a sequence, such as language, audio or video frames. These simple (I would prefer the term ‘cute’) models are used to generate new language sequences, compose music in the style of Mozart, translate a sentence or document from one language to another, classify video and even play games!
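To give a feel for what "sequential" means here, the sketch below runs a vanilla RNN over a short sequence in plain Python. It assumes the textbook update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h). The names `rnn_step` and `run_rnn` and the random, untrained weights are my own placeholders for illustration, not code from the post.

```python
import math
import random


def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One vanilla RNN update: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h_new = []
    for i in range(len(h_prev)):
        total = b_h[i]
        # Contribution of the current input element.
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        # Contribution of the previous hidden state (the "memory").
        total += sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(inputs, hidden_size, seed=0):
    """Feed a sequence through the RNN and return every hidden state.

    Weights are random and untrained (an assumption made purely for
    illustration); a real model would learn them from data.
    """
    rng = random.Random(seed)
    input_size = len(inputs[0])
    W_xh = [[rng.uniform(-0.5, 0.5) for _ in range(input_size)]
            for _ in range(hidden_size)]
    W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
            for _ in range(hidden_size)]
    b_h = [0.0] * hidden_size

    h = [0.0] * hidden_size  # initial hidden state
    states = []
    for x in inputs:  # one step per element of the sequence
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        states.append(h)
    return states


if __name__ == "__main__":
    sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for t, h in enumerate(run_rnn(sequence, hidden_size=4)):
        print(t, [round(v, 3) for v in h])
```

The same weights are reused at every step, and the hidden state `h` is the only thing carried from one element to the next, which is what lets the model handle sequences of any length.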