About me

I currently work for Tiger Analytics as a data scientist at their Chennai office, and am broadly interested in Data Mining and Machine Learning.
I graduated from the Advanced Master in AI Programme at KU Leuven in Sept 2016. I have a Master's degree in Computer Systems from the Indian Institute of Science, Bangalore (Jul 2013) and a Bachelor's degree in Mechanical Engineering from NIT Trichy (May 2010).
I have two years of industry experience as a research engineer at SAP Labs India (Jul 2013-Jul 2016), where I worked on projects in Recommender Systems.

Research and Projects

Deep Kernel Architectures for Unsupervised Learning

For my master's thesis at KU Leuven, I am exploring stacked kernel architectures for clustering, denoising and dimensionality reduction. More details on this soon.

Programming for Big Data

This is an ongoing set of implementations of machine learning algorithms for large and streaming datasets, written as part of the H00Y4AE Programming for Big Data course at KU Leuven. The course is based on the popular Mining of Massive Datasets course. The set currently includes stream classifiers such as Hoeffding trees (Domingos KDD00), Stochastic Gradient Descent and Naive Bayes, the Locality Sensitive Hashing algorithm (Gionis VLDB99) for similarity search, and various approximate counting and sampling techniques for large datasets. The code will be put up on GitHub soon.

Large Scale Botnet Detection

During my time as a Master's student at IISc, I worked on the problem of detecting botnets at the Internet infrastructure level, where data velocity is very high. I developed an algorithm to efficiently detect structured P2P command and control flows. The approach relied on detecting nearly regular subgraphs of a large IP-IP graph (10^8 edges), and the algorithm scaled linearly with the number of edges in the graph. This was published in the Journal of Computer Virology and Hacking Techniques (see Publications). My dissertation has more details, and also addresses the modification of efficient community detection algorithms to detect botnet command and control topologies in graph data. During the course of this work I developed graffy, a graph processing library in C++, which can be found on GitHub.

Community Detection Algorithms

While at IISc, I also worked on community detection algorithms, particularly modifications of label propagation and local community detection algorithms.

Online Social Network Data Analysis

While at IISc, I also set up infrastructure to capture Twitter data for use by the rest of the lab. I analyzed this data to uncover behavioural patterns of Twitter users and how those patterns differ across events.

Publications

Working Papers/Preprints

  1. Patil, S., Venkatesh, B., Singh, R., "From Differentiated Genes to Affected Pathways" (2016) preprint/bioRxiv [PDF] [bioRxiv]

Journal

  1. Venkatesh, B., Choudhury, S. H., Nagaraja, S., & Balakrishnan, N. (2015). BotSpot: fast graph based identification of structured P2P bots. Journal of Computer Virology and Hacking Techniques, 11(4), 247-261. [PDF] [View at Publisher]

Conference

  1. Ravi, S., Balakrishnan, N. & Venkatesh, B. "Behavior-based Malware analysis using profile hidden Markov models," Security and Cryptography (SECRYPT), International Conference on, Reykjavik, Iceland, 2013, pp. 1-12. [PDF] [View at Publisher]

Dissertations

  1. Venkatesh, B., "Fast Identification of Structured P2P Botnets using Community Detection Algorithms," 2013, M.Sc(Engg) Thesis, Indian Institute of Science, Bangalore [PDF] [Abstract]

Graduate Level Courses

KU Leuven

IISc Bangalore