Joy Payton is a cloud engineer, data scientist, and adjunct professor who specializes in helping biomedical professionals conduct reproducible computational research. In addition to moving medicine forward through principles of open science and reproducibility, Joy also enjoys teaching citizen scientists how to use public data repositories to understand their own communities better and advocate for change from a data-centric perspective. Her various roles allow Joy to lead efforts to teach people how to write their first line of code and help anyone who's interested climb the data science learning curve. Currently employed by the Children's Hospital of Philadelphia and Yeshiva University, Joy is always open to hearing about open-source, data-centric volunteer opportunities for herself and her students.
PATRICK BUEHLER, PHD
Patrick Buehler is a principal data scientist at Microsoft's Cloud AI Group. He obtained his PhD in Computer Vision from Oxford's Visual Geometry Group (VGG) under Prof. Andrew Zisserman. He has over fifteen years of work experience in academic settings and with external customers, spanning a wide range of Computer Vision problems.
Tom Goldenberg is a Junior Principal Data Engineer at QuantumBlack, a McKinsey company. Prior to consulting, Tom was CTO and co-founder of Commandiv, a wealth management startup.
1 - DATA SCIENCE AND MACHINE LEARNING IN THE CLOUD FOR CLOUD NOVICES
Speaker: Joy Payton, Supervisor, Data Education, Children's Hospital of Philadelphia
In this hands-on training at the conference, we will use free-tier resources in the Google Cloud Platform (GCP) to introduce learners to the practical use of cloud computing resources in data science and machine learning. This training will be useful for those considering cloud adoption, interested in data engineering, or interested in working with public data as citizen scientists. Topics covered will include: Cloud computing concepts and vocabulary; Cloud providers; Free tier and cost considerations; Public datasets and citizen science; Redundancy, security, and privacy; Continuum of management levels; Cloud data storage and analytics; Machine learning in the cloud.
2 - HOW TO SOLVE REAL-WORLD COMPUTER VISION PROBLEMS USING OPEN-SOURCE
Speaker: Patrick Buehler, PhD, Principal Data Scientist, Microsoft
At the conference, the workshop will begin with an overview of common real-world tasks in the CV domain, including examples of problems our customers have faced in recent years. We will then give a brief introduction to deep learning models for CV. The main part of this session will demonstrate how to train and evaluate CV models by executing notebooks based on the Fast.ai and Torchvision libraries, both built on PyTorch. We will start with image classification, showing how to fine-tune a pre-trained ImageNet model on a custom dataset and how to deploy the model to the cloud. Next, we will train an object detection model and extend it to predict segmentation masks and keypoints. Finally, we will build an image similarity system and demo a fast image retrieval solution that can handle large numbers of images.
3 - KEDRO + MLFLOW – REPRODUCIBLE AND VERSIONED DATA PIPELINES AT SCALE
Speaker: Tom Goldenberg, Junior Principal Data Engineer, QuantumBlack
The aim of this tutorial, which we will host at ODSC Boston, is to demonstrate how Kedro (a development workflow tool open-sourced by QuantumBlack, a McKinsey company) and MLflow fit together in a scalable AI architecture. To start, we will give an overview of Kedro and of MLflow: what they are used for, what functionality they provide, and how they compare as tools. Next, we will walk through a demo of a Kedro project that has MLflow integrated into it. Finally, we will go over deployment options. At the webinar you will learn more about this tutorial.
You are welcome to take advantage of a limited-time offer of an additional 10% off conference passes for ODSC Virtual Conference 2020 with promo code WEBINAR2020, or simply by registering here.