Biomedical Big Data: Neuroscience and Provenance

Dr. Satya Sahoo aims to characterize the role of brain connectivity in neurological disorders such as epilepsy using integrative graph models and highly scalable computational techniques. Brain connectivity measures are derived from multi-modal data, for example electrophysiological signal data and diffusion MRI data, which are difficult to process, model, and analyze because of their volume, variety, and rate of generation (also called velocity), the three defining dimensions of “big data”. Dr. Sahoo’s highly collaborative research lies at the intersection of neuroscience and computational science, with research objectives driven by clinical goals such as the assessment and treatment of patients.


A Big Data Workflow for Neuroscience Research

Dr. Sahoo has developed a novel Epilepsy and Seizure ontology as a formal knowledge model for integrating the heterogeneous data collected as part of clinical activities for epilepsy treatment, for processing clinical text (using Natural Language Processing), and for analyzing electroencephalogram (EEG) signal data using Hadoop technologies (like MapReduce, Pig, and HDFS). He has recently released a highly scalable tool called NeuroPigPen that can process more than 750 GB of signal data using new data partitioning algorithms within a Hadoop cluster. Dr. Sahoo’s research in biomedical Big Data has also led to the development of ProvCaRe, a new framework for scientific reproducibility and data quality using provenance metadata (part of the Big Data to Knowledge (BD2K) data provenance initiative). Additional details about Dr. Sahoo’s projects are available at his website: BMHI Research