Causal Inference & Machine Learning
Speaker: Vinod Bakthavachalam, Data Scientist at Coursera
Many data science problems, especially those that inform business and product strategy, involve understanding causal relationships. The standard way to measure these is A/B testing, but that is often infeasible, which calls for alternative techniques from causal inference that are an essential part of any data scientist's toolkit. The talk will walk through these techniques, some applications, and recent work at the intersection of causal inference and machine learning to handle large data sets.
Real-ish Time Predictive Analytics with Spark Structured Streaming
Speaker: Scott J Haines, Principal Software Engineer at Twilio
In 20 short minutes, learn what becomes possible when you add Spark to your analytics pipeline. Learn how to effectively solve common data engineering problems with compile-time guarantees, such as how to ingest, normalize, transform, and join datasets in real time. Learn how to add insights on top of your streaming data with simple filters and pre-trained models.
Visualizing Complexity: Dimensionality Reduction and Network Science
Speaker: Jane Adams, Data Visualization Artist at University of Vermont Complex Systems Center
Working with mathematicians, data scientists, and domain experts at the University of Vermont Complex Systems Center, data visualization artist Jane Adams has developed strategies for prototyping exploratory graphs of high-dimensional data. In this 90-minute workshop, Adams shares some of these methods for data discovery and interaction, navigating a creative workflow from paper prototypes of visual hypotheses through web-based interactive slices, offering critical insight for clustering, interpolation, and feature engineering.
Healthcare NLP with a doctor's bag of notes
Speaker: Andrew Long, PhD, Data Scientist at Fresenius Medical Care
Nausea, vomiting, and diarrhea are words you would not frequently find in a natural language processing (NLP) project for tweets or product reviews. However, these words are common in healthcare. In fact, many clinical signs and patient symptoms (e.g., shortness of breath, fever, or chest pain) are only present in free-text notes and are not captured in structured numerical data. As a result, it is important for healthcare data scientists to be able to extract insight from unstructured clinical notes in electronic medical records. In this 20-minute warm-up, the audience will have the opportunity to learn about an NLP concept known as bag-of-words. The audience will also get a preview of the outline for the 90-minute workshop held at the upcoming ODSC West 2019.
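For readers unfamiliar with bag-of-words, here is a minimal sketch of the idea, not the speaker's material: each note is reduced to word counts, ignoring word order, and the counts are laid out over a shared vocabulary. The two sample notes, the simple letters-only tokenizer, and the function name `bag_of_words` are all assumptions made for illustration.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count lowercase alphabetic tokens in a note, ignoring word order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(tokens)


# Hypothetical clinical notes, invented for this example.
notes = [
    "Patient reports nausea and vomiting.",
    "No fever. Patient denies chest pain or nausea.",
]

# One bag (word -> count) per note.
bags = [bag_of_words(note) for note in notes]

# Shared vocabulary across all notes, sorted for a stable column order.
vocabulary = sorted(set().union(*bags))

# One count vector per note; a Counter returns 0 for words it has not seen.
vectors = [[bag[word] for word in vocabulary] for bag in bags]

for note, vector in zip(notes, vectors):
    print(note, vector)
```

Each vector has one entry per vocabulary word, so notes of different lengths become fixed-length numeric rows that a downstream model can consume. Note that this simple representation treats "denies ... nausea" the same as "reports nausea", which is one reason clinical text often needs more than raw counts.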