The Open Data Science Conference has established itself as the leading conference in the field of applied data science. Each ODSC event offers a unique opportunity to learn directly from the core contributors, experts, academics, and renowned instructors helping shape the field of data science and artificial intelligence.
Our conferences are organized around focus areas to ensure our attendees are at the forefront of this fast-emerging field and current with the latest data science languages, tools, and models.
Contact us at email@example.com.
Gradient Boosting - How it works and why it matters
Speaker: Alberto Danese, Head of Data Science at Nexi (Kaggle Grandmaster)
Despite the huge success of deep learning, Gradient Boosting Trees have seen major improvements in the last few years and usually still have an edge over deep learning when it comes to structured tabular data.
The presentation will focus on various reasons to implement GBT-based solutions: from pure performance to interpretability of the predictions, from the availability of stable libraries to the implementations available within the AI platforms of the major cloud providers.
A simple example will be used to give a grasp of how GBT works, and the main parameters and optimization techniques will be covered.
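As a rough illustration of the idea behind GBT (not the speaker's example), the sketch below boosts one-split regression trees on a single feature with squared-error loss. The function names, the toy data, and the defaults for `n_rounds` and `learning_rate` are assumptions made for this sketch; production libraries add regularization, deeper trees, and far faster split finding.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (a "stump") to the residuals."""
    best = None
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_gbt(xs, ys, n_rounds=50, learning_rate=0.1):
    """Boosting loop: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gbt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((predict_gbt(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Toy data (hypothetical): a noisy step between x = 4 and x = 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
model = fit_gbt(xs, ys, n_rounds=50, learning_rate=0.1)
```

The two parameters shown here, the number of boosting rounds and the learning rate, trade off against each other: a smaller learning rate usually needs more rounds to reach the same training error.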
NLP models, options and best practice
Speaker: Alessandro Maserati, AI Manager at Logol
NLP is a way for computers to analyze, understand, and derive meaning from human language in a smart and useful way. By utilizing NLP, developers can organize and structure knowledge to perform tasks such as automatic summarization, translation, named entity recognition, relationship extraction, sentiment analysis, and topic segmentation. The talk will give an overview of the state of the art in NLP models and a framework for understanding which model is best suited to each context.
Machine Learning & Data Fusion in three implementations
Speaker: Francesco Tarasconi, Senior Data Scientist at CELI
As of today, we have no doubt that Machine Learning can add value to operations in all industries. What makes the difference is how we apply Machine Learning techniques depending on context, and most importantly how we define the interaction between Machine and Human, assigning correct roles to each of them.
We present three use cases where the correct definition of people-machine interaction helped solve concrete problems:
(1) ML-based short-term forecast (Retail)
(2) ML-assisted forecast and assortment (Luxury Fashion)
(3) Predictive AI for Health, Safety and Environment (Oil & Gas)
Pay Attention - Frontiers in NLP (... and some limits)
Speaker: Cristiano De Nobili, PhD, Senior Deep Learning Scientist at Harman International
It is commonly believed that words and numbers belong to different worlds. On the one hand, there are men of letters, on the other scientists, engineers, and technicians. However, there is a special area where these two sides of our knowledge are tightly connected. A universe where words become numbers. Welcome to the artificial intelligence of language, also known as Natural Language Processing. We are going to dive into some details of this interaction, showing its recent successes. In particular, we are going to analyze in full detail the self-attention mechanism, the core idea behind Transformers and architectures such as BERT and GPT-2.
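For readers who have not met self-attention before, the following is a minimal single-head sketch of scaled dot-product self-attention in plain Python. It is an illustration written for this page, not material from the talk: the helper names, the tiny token vectors, and the identity projection matrices are assumptions, and real Transformer implementations add multiple heads, masking, and learned projections.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Return (outputs, attention_weights) for one attention head.

    x holds one row per token; w_q, w_k, w_v project tokens to
    queries, keys, and values.
    """
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of the keys
    scores = matmul(q, k_t)               # one score per (query, key) pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights    # weighted average of the values


# Three hypothetical 2-dimensional token embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much each token attends to every other token; each row of `outputs` is the corresponding weighted average of the value vectors.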
Guided Labeling: Interactive Label Generation with Active Learning
Speaker: Paolo Tamagnini, Data Scientist, KNIME
We are in the age of data. In recent years, many companies have already started collecting large amounts of data about their business. Many other companies are starting now. However, before you can train any decent supervised model you need ground truth data. And this is the ugly truth: before proceeding, you need a sufficiently large set of correctly labeled data records to describe your problem. And data labeling - especially in a sufficiently large amount - is … expensive. In this presentation we explain the main parts of an active learning procedure and we show a blueprint web application, based on active learning and uncertainty sampling, to interactively label any document set while investing only a fraction of the time in manual labeling. The idea of active learning is to train a machine learning model well enough to be able to delegate to it the boring and expensive task of data labeling.
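The loop behind active learning with uncertainty sampling can be sketched in a few lines. This is not the KNIME blueprint described in the talk: it is a toy version that assumes a one-dimensional logistic regression model and a hypothetical `oracle` function standing in for the human labeler, and all names and settings are chosen only for illustration.

```python
import math


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(xs, ys, steps=500, lr=0.5):
    """Fit a 1-D logistic regression with plain gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(xs)
        b -= lr * grad_b / len(xs)
    return w, b


def uncertainty(p):
    """1.0 when the model is undecided (p = 0.5), 0.0 when it is certain."""
    return 1.0 - 2.0 * abs(p - 0.5)


def active_learning(pool, oracle, seed_indices, n_queries):
    """Repeatedly ask the oracle to label the most uncertain pool item."""
    labeled = {i: oracle(pool[i]) for i in seed_indices}
    for _ in range(n_queries):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        w, b = train_logreg([pool[i] for i in labeled],
                            [labeled[i] for i in labeled])
        query = max(unlabeled,
                    key=lambda i: uncertainty(sigmoid(w * pool[i] + b)))
        labeled[query] = oracle(pool[query])  # the only manual labeling step
    model = train_logreg([pool[i] for i in labeled],
                         [labeled[i] for i in labeled])
    return model, labeled


# Hypothetical pool and labeler: the true class boundary sits at x = 0.62.
pool = [i / 20 for i in range(21)]
oracle = lambda x: 1 if x > 0.62 else 0
model, labeled = active_learning(pool, oracle, seed_indices=[0, 20], n_queries=5)
```

Only the seed items and the queried items are ever sent to the labeler, which is where the saving in manual effort comes from.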
Machine Learning under attack!
Speaker: Federico Ungolo, AI Software Developer at NTT Data
The topic of security applied to Machine Learning has gained popularity only in recent years, as the effects of attacks on such models could no longer be ignored. The presentation will provide the audience with a short introduction to the problem of security for ML models, especially DL ones. I will explain how the training process is crucial to making the model robust to certain types of attacks and how, in specific situations, there is a tradeoff between the overall performance of the model and its robustness against attacks. I will include a list of types of attacks, how they are deployed, and how to prevent them, followed by examples.
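The abstract does not name specific attacks, so as one well-known example of an evasion attack, here is a sketch of the Fast Gradient Sign Method (FGSM) against a tiny logistic regression classifier. The weights, the input, and the value of `eps` are made up for illustration; the same idea applies to deep networks, where the input gradient comes from backpropagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(value):
    return (value > 0) - (value < 0)


def fgsm(w, b, x, y, eps):
    """Shift every input feature by eps in the direction that raises the loss.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input x is (p - y) * w.
    """
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input, correctly classified as class 1.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.5)

p_clean = predict(w, b, x)     # above 0.5: classified as class 1
p_adv = predict(w, b, x_adv)   # below 0.5: the perturbation flips the class
```

Every feature moves by at most `eps`, yet the prediction changes class. Training on such perturbed inputs (adversarial training) is one common defense, typically at some cost in accuracy on clean data.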