Satya Sahoo, PhD

Associate Professor | 216.368.3286 | Wolstein Research Building, 6126


PhD, Computer Science and Engineering, Wright State University

Research Interests

Development of computational techniques that leverage multi-modal biomedical big data to study human diseases, with a particular focus on neurological disorders such as epilepsy. In addition, development of a semantic provenance metadata platform to ensure scientific reproducibility and data quality in the era of big data.

Keywords: Epilepsy seizure networks; Brain connectivity; Network analysis; Graph Theory; Provenance metadata; Ontology engineering; Data integration; High performance computing

Group webpage:


2017 – Finalist for Distinguished Paper Award, American Medical Informatics Association (AMIA) annual conference

2015 – 2016 – UCITE Nord Grant Award

2015 – Best Paper Award by International Medical Informatics Association (IMIA)

2012 – 2013 – UCITE Glennan Fellowship Award


1. Valdez J, Rueschman M, Kim M, Arabyarmohammadi S, Redline S, Sahoo SS. An Extensible Ontology Modeling Approach Using Post Coordinated Expressions for Semantic Provenance in Biomedical Research. The 16th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE), Rhodes, Greece, 2017. pp. 337-352

2. Valdez J, Kim M, Rueschman M, Socrates V, Redline S, Sahoo SS, ProvCaRe Semantic Provenance Knowledgebase: Evaluating Scientific Reproducibility of Research Studies, American Medical Informatics Association (AMIA) Annual Symposium, 2017, pp. 1688 – 1697 (Finalist for Distinguished Paper Award)

3. Sajatovic M, Tatsuoka C, Welter E, Friedman D, Spruill TM, Stoll S, Sahoo SS, Bukach A, Bamps YA, Valdez J, Jobst BC. Correlates of quality of life among individuals with epilepsy enrolled in self-management research: From the US Centers for Disease Control and Prevention Managing Epilepsy Well Network. Epilepsy & Behavior. 2017 Jan 27. pii: S1525-5050(16)30742-9. PMID: 28139451

4. Gershon AL, Lhatoo SD, Tatsuoka C, Ghosh K, Loparo K, Sahoo SS. Scalable Signal Data Processing for Measuring Functional Connectivity in Epilepsy Neurological Disorder. Biomedical Signal Processing in Big Data, Ervin Sejdic, Tiago Falk (Eds.), 2017 (in press)

5. Valdez J, Rueschman M, Kim M, Redline S, Sahoo SS. An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text. 15th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE) 2016: 699-708.

6. Sahoo SS, Ramesh P, Welter E, Bukach A, Valdez J, Tatsuoka C, Bamps Y, Stoll S, Jobst BC, Sajatovic M. Insight: An Ontology-based Integrated Database and Analysis Platform for Epilepsy Self-Management Research, International Journal of Medical Informatics, 2016. PMID: 27573308

7. Sahoo SS, Wei A, Valdez J, Wang L, Zonjy B, Tatsuoka C, Loparo KA, Lhatoo SD. NeuroPigPen: A Scalable Toolkit for Processing Electrophysiological Signal Data in Neuroscience Applications using Apache Pig, Frontiers in Neuroinformatics, 10:18. 2016. PMID: 27375472

8. Sahoo SS, Wei A, Tatsuoka C, Ghosh K, Lhatoo SD. Processing Neurology Clinical Data for Knowledge Discovery: Scalable Data Flows Using Distributed Computing, Book Chapter

9. Sahoo SS, Valdez J, Rueschman M. Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description, American Medical Informatics Association (AMIA) Annual Symposium, 2016:1070-1079 PMID: 28269904

10. Dean DA, Goldberger AL, Mueller R, Kim M, Rueschman M, Mobley D, Sahoo SS, Jayapandian C, Cui L, Morrical MG, Surovec S, Zhang GQ, Redline S. Scaling up Scientific Discovery in Sleep Medicine: The National Sleep Research Resource. Sleep, 2016. 39(5): 1151-64. PMID: 27070134

11. Yang S, Tatsuoka C, Ghosh K, Lacuey-Lecumberri N, Lhatoo SD, Sahoo SS. Comparative Evaluation for Brain Structural Connectivity Approaches: Towards Integrative Neuroinformatics Tool for Epilepsy Clinical Research. AMIA 2016 Joint Summits on Translational Science. pp. 446-54. PMID: 27570685 (Finalist for the Best Student Paper Award)

12. Ramesh P, Wei A, Sams J, Welter E, Lhatoo S, Sajatovic M, Sahoo SS. Insight: Semantic Provenance and Analysis Platform for Multi-Center Neurology Healthcare Research. Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2015:731-736. PMID: 27069752

13. Sahoo SS, Rao P. Provenance Analysis and RDF Query Processing: W3C PROV for Data Quality and Trust. In the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, 2015. (Tutorial)

14. Jayapandian C, Wei A, Ramesh P, Zonjy B, Lhatoo SD, Loparo K, Zhang GQ, Sahoo SS. A Scalable Neuroinformatics Data Flow for Electrophysiological Signals using MapReduce. Frontiers in Neuroinformatics. 2015 9:4. PMID: 25852536

15. Sahoo SS, Zhang GQ, Bamps Y, Fraser R, Stoll S, Lhatoo SD, Tatsuoka C, Welter E, Sajatovic M. Managing Information Well: Toward an Ontology-driven Informatics Platform for Data Sharing and Secondary Use in Epilepsy Self-Management Research Centers. Health Informatics Journal, 2015. 22(3):548-61. PMID: 25769938

16. LaFrance Jr. WC, Ranieri R, Bamps Y, Stoll S, Sahoo SS, Welter E, Sams J, Tatsuoka C, Sajatovic M. Comparison of common data elements from the Managing Epilepsy Well (MEW) Network integrated database and a well-characterized sample with nonepileptic seizures. Epilepsy & Behavior. 2015. 45:136. PMID: 25825372

17. Jayapandian CP, Chen CH, Dabir A, Zhang GQ, Lhatoo SD, Sahoo SS. Domain Ontology As Conceptual Model for Big Data Management: Application in Biomedical Informatics, Proceedings of the 33rd International Conference on Conceptual Modeling (ER 2014) 2014. pp. 144-157

18. Zhang GQ, Cui L, Lhatoo SD, Schuele SU, Sahoo SS. MEDCIS: Multi-Modality Epilepsy Data Capture and Integration System. American Medical Informatics Association (AMIA) Annual Symposium, 2014:1248-57. PMID: 25954436

19. Sahoo SS, Tao S, Parchman A, Luo Z, Cui L, Mergler P, Lanese R, Barnholtz-Sloan JS, Meropol NJ, Zhang GQ. Trial Prospector: Matching Patients with Cancer Research Studies using an Automated and Scalable Approach. Journal of Cancer Informatics 2014. Dec 4;13:157-66. PMID: 25506198

20. Cui L, Sahoo SS, Lhatoo SD, Garg G, Rai P, Bozorgi A, Zhang GQ. Complex Epilepsy Phenotype Extraction from Narrative Clinical Discharge Summaries. Journal of Biomedical Informatics 2014 Oct;51:272-9. PMID: 24973735

21. Sahoo SS, Jayapandian C, Garg G, Kaffashi F, Chung S, Bozorgi A, Chen CH, Loparo K, Lhatoo SD, Zhang GQ. Heartbeats in the Cloud: Distributed Analysis of Electrophysiological “Big Data” using Cloud Computing for Epilepsy Clinical Research. Journal of the American Medical Informatics Association (JAMIA), special issue on Big Data in Healthcare and Biomedical Research, 2013. 21(2):263-71. PMID: 24326538 (Editor’s Choice Article Special Issue)

22. Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Cloudwave: Distributed Processing of “Big Data” from Electrophysiological Recordings for Epilepsy Clinical Research Using Hadoop. American Medical Informatics Association (AMIA) Annual Symposium, 2013. pp. 691-700 PMID: 24551370

23. Sahoo SS, Lhatoo SD, Gupta DK, Cui L, Zhao M, Jayapandian C, Bozorgi A, Zhang GQ. Epilepsy and Seizure Ontology: Towards an Epilepsy Informatics Infrastructure for Clinical Research and Patient Care. Journal of the American Medical Informatics Association (JAMIA), 2013. EPub doi:10.1136/amiajnl-2013-001696 PMID: 23686934 (Best Paper Award by International Medical Informatics Association)

24. Bozorgi A, Chung S, Kaffashi F, Loparo KA, Sahoo SS, Zhang GQ, Kaiboriboon K, Lhatoo SD. Significant postictal hypotension: expanding the spectrum of seizure-induced autonomic dysregulation. Epilepsia. 2013 Sep;54(9):e127-30. doi: 10.1111/epi.12251. Epub 2013 Jun 12. PMID: 23758665

25. Cui L, Mueller R, Sahoo SS, Zhang GQ. Querying Complex Federated Clinical Data Using Ontological Mapping and Subsumption Reasoning. IEEE International Conference on Healthcare Informatics 2013 (ICHI 2013) pp. 351-360.

26. Lebo T, Sahoo SS, McGuinness D. (eds.) PROV-O: The PROV Ontology. 30 April 2013, W3C Recommendation.

27. Sahoo SS, Zhang GQ, Lhatoo SD. Epilepsy Informatics and an Ontology-driven Infrastructure for Large Database Research and Patient Care in Epilepsy. Review Paper, Epilepsia, 2013. 54(8). pp. 1335-41. PMID: 23647220 (Editor’s Choice Article: September 2013)

28. Jayapandian CP, Chen CH, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. Electrophysiological Signal Analysis and Visualization using Cloudwave for Epilepsy Clinical Research. The 14th World Congress on Medical and Health Informatics (MedInfo), Stud Health Technol Inform. 2013. Vol. 192. pp. 817-21. PMID: 23920671

29. Asiaee AH, Doshi P, Minning T, Sahoo SS, Parikh P, Sheth A, Tarleton RL. From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data. The 9th International Conference on Data Integration in the Life Sciences (DILS), 2013. pp. 38-45.

30. Parchman AJ, Zhang GQ, Mergler P, Barnholtz-Sloan J, Lanese R, Miller DW, Opper C, Sahoo SS, Tao S, Teagno J, Warfe J, Meropol NJ. Trial prospector: An automated clinical trials eligibility matching program. Proceedings of the American Society of Clinical Oncology (ASCO) Annual Meeting. 2013.

31. Jayapandian C, Zhao M, Ewing R, Zhang GQ, Sahoo SS. A Semantic Proteomics Dashboard (SemPoD) for Data Management in Translational Research. BMC Systems Biology, Vol. 6(Suppl 3):S20, 2012. PMID: 23282161

32. Parikh PP, Zheng J, Logan-Klumper F, Stoeckert Jr. CJ, Louis C, Topalis P, Protasio AV, Sheth AP, Carrington M, Berriman M, Sahoo SS. The Ontology for Parasite Lifecycle (OPL): Towards a Consistent Vocabulary of Lifecycle Stages in Parasitic Organisms. Journal of Biomedical Semantics (JBMS), 2012. Vol. 23; 3(1): 5. PMID: 22621763

33. Zhang GQ, Sahoo SS, Lhatoo SD. From Classification to Epilepsy Ontology and Informatics. Epilepsia, 2012. Vol. 53(Suppl. 2). pp. 28-32. PMID: 22765502

34. Parikh PP, Minning TA, Nguyen V, Lalithsena S, Asiaee AH, Sahoo SS, Doshi P, Tarleton R, Sheth AP. A Semantic Problem Solving Environment for Integrative Parasite Research: Identification of Intervention Targets for Trypanosoma cruzi. PLoS Neglected Tropical Diseases, 2012. Vol. 6(1): e1458. PMID: 22272365

35. Sahoo SS, Zhao M, Luo L, Bozorgi A, Gupta D, Lhatoo SD, Zhang GQ. OPIC: Ontology-driven Patient Information Capturing System for Epilepsy. Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, Chicago, IL, Nov 2012. pp. 799-808. PMID: 23304354

36. Cui L, Bozorgi A, Lhatoo SD, Zhang GQ, Sahoo SS. EpiDEA: Extracting Structured Epilepsy and Seizure Information from Patient Discharge Summaries for Cohort Identification. American Medical Informatics Association (AMIA) Annual Symposium, 2012. pp. 1191-1200. PMID: 23304396

37. Zhang GQ, Luo L, Ogbuji C, Joslyn C, Mejino J, Sahoo SS. An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs. American Medical Informatics Association (AMIA) Annual Symposium, 2012. pp. 1060-1069. PMID: 23304382

38. Teagno J, Kiefer RC, Pathak J, Zhang GQ, Sahoo SS. A Distributed Semantic Web Approach for Cohort Identification. Proceedings of the American Medical Informatics Association (AMIA) Annual Symposium, 2012; pp. 1969

39. Jayapandian C, Ewing R, Zhang GQ, Sahoo SS. A Semantic Proteomics Dashboard (SemPoD) for Proteomics Data Management in Translational Research. AMIA Clinical Research Informatics Summit (CRI), 2012. PMID: 23282161

41. Sahoo SS, Nguyen V, Bodenreider O, Parikh PP, Minning T, Sheth AP. A unified framework for managing provenance information in translational research. BMC Bioinformatics, 2011. Vol. 12:461. PMID: 22126369

42. Zhao J, Sahoo SS, Missier P, Sheth AP, Goble C. Extending Semantic Provenance into the Web of Data. IEEE Internet Computing, 2011. Vol. 15(1). pp. 40-48.

43. Sahoo SS, Ogbuji C, Luo L, Dong X, Cui L, Redline SS, Zhang GQ. MiDas: Automatic Extraction of a Common Domain of Discourse in Sleep Medicine for Multi-Center Data Integration. American Medical Informatics Association (AMIA) Annual Symposium, 2011. pp. 1196-1205. PMID: 22195180

44. Sahoo SS. Towards Desiderata for Provenance Ontologies in Biomedicine, International Conference on Biomedical Ontologies (ICBO), 2011. pp. 269-272.

45. Barga R, Simmhan Y, Chinthaka-Withana E, Sahoo SS, Jackson J, Araujo N. Provenance for Scientific Workflows Towards Reproducible Research. IEEE Data Engineering Bulletin, 2010. Vol. 33(3). pp. 50-58.

46. Sahoo SS, Bodenreider O, Hitzler P, Sheth AP, Thirunarayan K. Provenance Context Entity (PaCE): Scalable provenance tracking for scientific RDF data. The 22nd International Conference on Scientific and Statistical Database Management (SSDBM), 2010. pp. 461-470. PMID: 25621321

47. Missier P, Sahoo SS, Zhao J, Goble C, Sheth A. Janus: from workflows to semantic provenance and linked open data. The 3rd International Provenance and Annotation Workshop (IPAW), Lecture Notes in Computer Science, Vol. 6378/2010, 2010. pp. 129-141.

48. Deus H, Zhao J, Sahoo SS, Samwald M, Prud’hommeaux E, Miller M, Marshall MS, Cheung K. Provenance of Microarray Experiments for a Better Understanding of Experiment Results. The 2nd International Workshop on Role of Semantic Web in Provenance Management (SWPM 2010), co-located with ISWC, 2010.

49. Patni H, Sahoo SS, Henson C, Sheth A. Provenance Aware Linked Sensor Data, The 2nd International Workshop on Trust and Privacy on the Social and Semantic Web, co-located with ESWC, 2010.

50. Sahoo SS, Groth P, Hartig O, Miles S, Coppens S, Myers J, Gil Y, Moreau L, Zhao J, Panzer M, Garijo D. Provenance Vocabulary Mappings. W3C Provenance Incubator Group Report, 2010.

51. Sahoo SS, Weatherly DB, Mutharaju R, Anantharam P, Sheth AP, Tarleton RL. Ontology-driven Provenance Management in eScience: an Application in Parasite Research. The 8th International Conference on Ontologies, DataBases, and Applications of Semantics, (ODBASE), 2009. pp. 992-1009.

52. Sahoo SS, Sheth A. Provenir ontology: Towards a Framework for eScience Provenance Management. Microsoft eScience Workshop, 2009.

53. Sahoo SS, Halb W, Hellmann S, Idehen K, Thibodeau Jr. T, Auer S, Sequeda J, Ezzat A. A Survey of Current Approaches for Mapping of Relational Databases to RDF. W3C RDB2RDF Incubator Group Report, 2009.

54. Sahoo SS, Sheth AP, Henson C. Semantic Provenance for eScience: ‘Meaningful’ Metadata to Manage the Deluge of Scientific Data. IEEE Internet Computing, Web-Scale Workflow Track, M.B. Blake and M. Huhns (Eds.), 2008. Vol. 12(4). pp. 46-54. (Featured in Association for Computing Machinery (ACM) TechNews 2008)

55. Sahoo SS, Bodenreider O, Rutter JL, Skinner KJ, Sheth AP. An ontology-driven semantic mash-up of gene and biological pathway information: Application to the domain of nicotine dependence. Journal of Biomedical Informatics (Special Issue: Semantic Mashup of Biomedical Data), 2008. Vol. 41(5). pp. 752-65. PMID: 18395495

56. Sheth A, Henson C, Sahoo SS. Semantic Sensor Web. IEEE Internet Computing, 2008. Vol. 12(4). pp. 78-83.

57. Valerio MD, Sahoo SS, Barga RS, Jackson JJ. Capturing Workflow Event Data for Monitoring, Performance Analysis, and Management of Scientific Workflows. SWBES08, co-located with the 4th IEEE International Conference on eScience, 2008. pp. 626-33.

58. Sahoo SS, Zeng K, Bodenreider O, Sheth AP. From ‘glycosyltransferase’ to ‘congenital muscular dystrophy’: Integrating knowledge from NCBI Entrez Gene and the Gene Ontology. The 12th World Congress on Health (Medical) Informatics (Medinfo), 2007. pp. 1260–64. PMID: 17911917.

59. Sahoo SS, Bodenreider O, Zeng K, Sheth AP. An experiment in integrating large biomedical knowledge resources with RDF: Application to associating genotype and phenotype information. International Workshop on Health Care and Life Sciences Data Integration for the Semantic Web, co-located with WWW2007, 2007.

60. Sahoo SS, Sheth A, Hunter B, York WS. SemBOWSER–Adding Semantics to biological Web services registry. Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences. Baker CJO, Cheung KO (Eds.), Springer, 2007. pp. 317–40.

61. Sahoo SS, Thomas C, Sheth AP, York WS, Tartir S. Knowledge Modeling and Its Application in Life Sciences: A Tale of Two Ontologies. The 15th International World Wide Web (WWW) Conference, 2006. pp. 317-26

62. Sahoo SS, Sheth A. Bioinformatics applications of Web Services, Web Processes and role of Semantics. Semantic Web Processes and Their Applications. Cardoso J, Sheth A (Eds.), Springer, 2006. pp. 305–22.

63. Sahoo SS, Thomas C, Sheth AP, Henson C, York WS. GLYDE-An expressive XML standard for the representation of glycan structure. Carbohydrate Research, 2005. Vol. 340(18). pp. 2802-7. PMID: 16242678

64. Atwood III J, Sahoo SS, Alvarez-Manilla G, Weatherly DB, Kolli K, Orlando R, York WS. Simple modification of a protein database for mass spectral identification of N-linked glycopeptides. Rapid Communications in Mass Spectrometry, 2005. Vol. 19(21). pp. 3002-6. PMID: 16196021

65. Alvarez-Manilla G, Atwood III J, Sahoo SS, Guo Y, Warren NL, York WS, Orlando R, Pierce M. Tools for glycoproteomic analysis: size-exclusion chromatography facilitates identification of tryptic glycopeptides with N-linked glycosylation site. Glycobiology 15(1208), 2005. PMID: 16512686

66. Aleman-Meza A, Halaschek-Wiener C, Sahoo SS, Sheth A, Arpinar B. Template Based Semantic Similarity for Security Applications. The IEEE Intl. Conference on Intelligence and Security Informatics (ISI-2005), 2005. pp. 621-622.

67. Sahoo SS, Sheth AP, York WS, Miller JA. Semantic Web Services for N-glycosylation Process. International Symposium on Web Services for Computational Biology and Bioinformatics, 2005.

68. Sheth A, York WS, Thomas C, Nagarajan M, Miller JA, Kochut K, Sahoo SS, Yi X. Semantic Web technology in support of Bioinformatics for Glycan Expression. W3C Workshop on Semantic Web for Life Sciences, 2004.