
My CV

Education, Work, and Training

Generative AI Engineer, Pivotal Life Sciences

June 2022- June 2023

I plan, manage, track, and report my team's progress with Jira, Confluence, and GitHub. I open and review pull requests on company repositories. I write Airflow DAGs built around crawlers that extract, transform, and load data into AWS Athena. I program in Python using LLM tooling including LangChain, Kor, OpenAI, and spaCy. I present company-wide quarterly updates on project and team progress as well as advancements in Generative AI. I create educational material on Generative AI best practices. I lead daily standups and weekly retrospectives.

Accomplishments of Note:

  1. Led an interdisciplinary team of 5 (biologists, investors, and data scientists) in building a clinical trial forecasting and research platform using generative AI tools such as LangChain, Kor, and OpenAI, along with Athena and open-source APIs

  2. Created a financial screener tool for our investment team to automate regular reporting and analysis tasks using open-source data and Streamlit

  3. Deployed a state-of-the-art biology knowledge database (KEGG) as a Neo4j knowledge graph that biologists use to research new companies

  4. Interviewed dozens of candidates, then hired, onboarded, and trained our team of 6, including computational biologists, data engineers, full-stack engineers, and data scientists

  5. Instituted an agile framework within the team resulting in greater team collaboration, transparency, productivity, and scalability

  6. Created an alerting application that notifies investors when major changes occur in the clinical trials of companies they manage

  7. Created a clinical power analysis tool to help investors make data-informed decisions rather than take company reports at face value

  8. Created an LLM-based extraction application for analyzing documents of arbitrary size

Data Scientist, SymphonyAI

September 2021- June 2022

I create and maintain tools and features for SymphonyAI’s TDA platform. I also create, present, and explain data analyses and models for classification, prediction, segmentation, and unsupervised learning for clients. I'm a superuser in Python, R, SQL, and Eureka, SymphonyAI's proprietary TDA platform.

I communicate with clients to better understand their unique data problems and the business questions that come with them. I act as the lead scientist in exploratory data solutions for the client and handle the process from exploration to implementation.

I hire, mentor, and coach data scientists as the team expands.

Like all data scientists, I perform complex queries and data cleaning tasks in Python, R, and SQL.

Accomplishments of Note:

  1. Created state-of-the-art textual decomposition pipelines to filter extremist language from social media sites for an OSINT proof-of-concept demo

  2. Designed and presented a company-wide educational demo on time series analysis (what it is, how it works, what it can do, and where the field is headed)

  3. Created competitor comparisons for internal use, outlining how SymphonyAI stacks up against other data science service providers

  4. Created marketing and educational materials for the sales and marketing team demonstrating how SymphonyAI’s TDA platform stands apart in the field and what makes our platform uniquely powerful, exciting, and novel to the data science space

  5. Ran customer sales demos for clients that included a full analysis of their supplied data as well as presentation material

Data Scientist, Sentient Energy

October 2019- November 2021

I created and maintained internal tools for fast diagnosis of electrical signals. I created, presented, maintained, and implemented data pipelines and models for classification and prediction in Python and R.
I analyzed data from mesh networks of thousands of devices reporting simultaneously. As the lead data miner, I drew insights from millions of unlabeled data points through unsupervised learning.
Like all data scientists, I performed complex queries and data cleaning tasks in Python, R, and SQL.

Accomplishments of Note:

  1. Used classical time series methods and previous outage data to predict faults on the line with as much as 30 days' advance notice. This design would allow utilities to prevent blackouts before they occur and reduce key reliability metrics (SAIDI and SAIFI).

  2. Helped devise improvements in signal processing and analysis methods that yield faster, more accurate classification results and are more robust to outliers.

  3. Developed and maintained internal tools for fast data retrieval and manipulation. These tools are used daily by my team of six and reduce the work needed to pull, clean, visualize, save, cluster, classify, and forecast data from dozens of scripts to three lines of code. These tools save hundreds of developer hours per year.

  4. Assumed a leadership role in the design of the patented conductor break algorithm.

  5. Developed image-based clustering pipelines that shortened the time to value for customers.

Instructor, San Jose State University

May 2017- August 2019

I taught college-level math classes to students at San Jose State University. This included creating a safe and encouraging classroom atmosphere and preparing course materials such as midterms, finals, study guides, syllabi, and homework. I managed classroom time and scheduled and held office hours, both remote and in person, to give students additional instruction.
