About me
I currently work for Tiger Analytics as a data scientist at their Chennai office, and am broadly interested in Data Mining and Machine Learning.
I graduated from the Advanced Master in AI Programme at KU Leuven in Sept 2016. I have a Master's degree in Computer Systems from the Indian Institute of Science, Bangalore (Jul 2013) and a Bachelor's degree in Mechanical Engineering from NIT Trichy (May 2010).
I have two years of industry experience as a research engineer at SAP Labs India (Jul 2013-Jul 2016), where I worked on projects in Recommender Systems.
Research and Projects
Deep Kernel Architectures for Unsupervised Learning
For my master's thesis at KU Leuven, I am exploring stacked kernel architectures for clustering, denoising and dimensionality reduction. More details on this soon.
Programming for Big Data
This is an ongoing set of implementations of machine learning algorithms for large and streaming datasets, written as part of the H00Y4AE Programming for Big Data course at KU Leuven. This course is based on the popular Mining of Massive Datasets course. It currently includes stream classifiers such as Hoeffding trees (Domingos KDD00), Stochastic Gradient Descent and Naive Bayes, the Locality Sensitive Hashing algorithm (Gionis VLDB99) for similarity search, and various approximate counting and sampling techniques for large datasets. The code will be put up on GitHub soon.
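As a rough illustration of the kind of sampling technique used on streams (this is not the course code), here is a minimal sketch of reservoir sampling, which keeps a uniform random sample of fixed size from a stream of unknown length. Reservoir sampling is my choice of example here; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k items drawn uniformly at random from a stream.

    Illustrative sketch only: the stream is read once and only k items
    are held in memory at any time.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a stored item with probability k / (i + 1).
            j = rng.randint(0, i)  # both endpoints inclusive
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    print(reservoir_sample(range(1_000_000), k=5, seed=42))
```

The sample stays uniform after every item, so the stream can be cut off at any point without biasing the result.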
Large Scale Botnet Detection
During my time as a Master's student at IISc, I worked on the problem of detecting botnets at the Internet infrastructure level, where data velocity is very high. I developed an algorithm to efficiently detect structured P2P command and control flows. The approach relied on detecting nearly regular subgraphs of a large IP-IP graph (10^8 edges), and the algorithm scaled linearly with the number of edges in the graph. This was published in the Journal of Computer Virology and Hacking Techniques (see Publications). My dissertation has more details, and also addresses the modification of efficient community detection algorithms to detect botnet command and control topologies in graph data. During the course of this work I developed graffy, a graph processing library in C++, which can be found on GitHub.
Community Detection Algorithms
While at IISc, I also worked on community detection algorithms, particularly modifications of label propagation and local community detection algorithms.
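For readers unfamiliar with label propagation, the following is a minimal sketch of the basic, unmodified algorithm, not the modified variants I worked on. It assumes an undirected graph given as an adjacency dictionary in which every node appears as a key; the function name `label_propagation` and its parameters are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, Hashable, List, Optional

Graph = Dict[Hashable, List[Hashable]]


def label_propagation(graph: Graph, max_iters: int = 100,
                      seed: Optional[int] = None) -> Dict[Hashable, Hashable]:
    """Assign a community label to each node by asynchronous label propagation."""
    rng = random.Random(seed)
    # Every node starts in its own community.
    labels = {node: node for node in graph}
    nodes = list(graph)
    for _ in range(max_iters):
        rng.shuffle(nodes)  # visit nodes in a fresh random order each pass
        changed = False
        for node in nodes:
            neighbours = graph[node]
            if not neighbours:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[n] for n in neighbours)
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Keep the current label if it is already among the most frequent.
            if labels[node] not in candidates:
                labels[node] = rng.choice(candidates)
                changed = True
        if not changed:
            break  # no node changed its label in this pass
    return labels


if __name__ == "__main__":
    two_triangles = {
        1: [2, 3], 2: [1, 3], 3: [1, 2],
        4: [5, 6], 5: [4, 6], 6: [4, 5],
    }
    print(label_propagation(two_triangles, seed=0))
```

Each pass looks at every edge a constant number of times, which is part of what makes this family of algorithms attractive for large graphs.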
Online Social Network Data Analysis
While at IISc, I also set up infrastructure to capture Twitter data for use by the rest of the lab. I analyzed this data to uncover behavioural patterns of Twitter users and how these differ across events.
Publications
- Patil, S., Venkatesh, B., Singh, R., "From Differentiated Genes to Affected Pathways" (2016) preprint/bioRxiv [PDF] [bioRxiv]
- Venkatesh, B., Choudhury, S. H., Nagaraja, S., & Balakrishnan, N. (2015). BotSpot: fast graph based identification of structured P2P bots. Journal of Computer Virology and Hacking Techniques, 11(4), 247-261. [PDF] [View at Publisher]
- Ravi, S., Balakrishnan, N. & Venkatesh, B. "Behavior-based Malware analysis using profile hidden Markov models," Security and Cryptography (SECRYPT), International Conference on, Reykjavik, Iceland, 2013, pp. 1-12. [PDF] [View at Publisher]
Graduate Level Courses
- H02C6AE Data Mining
- H02C1AE Machine Learning and Inductive Inference
- H02D2AE Uncertainty in Artificial Intelligence
- H00Y4AE Programming for Big Data
- H02H6BE Bioinformatics
- H02A5AE Computer Vision
- H02C4AE Artificial Neural Networks
- H02D3AE Support Vector Machines and Applications
- H02A0AE Fundamentals of Artificial Intelligence