Overview of Pre-Conference Training

BUNDLE A: Conference + 3 Trainings

| Topic | Instructor | Title | Difficulty Level | Time Length | Schedule on Oct 26, 2021 | Price |
| --- | --- | --- | --- | --- | --- | --- |
| Virtual Data Science Learnathon with KNIME | Data Scientists, KNIME Team | Data Scientist at KNIME | Beginner | 2 hrs | 9:00 am - 11:00 am (PST) | $29 |
| Current data science and machine learning applications in the medical and aerospace industry | Dr. Kyongsik Yun | Technologist, NASA/Jet Propulsion Lab | Beginner - Intermediate | 2 hrs | 12:00 pm - 2:00 pm (PST) | $39 |
| Social Media Analysis for COVID Impacts | Anuj Saini | Manager Data Science at Sapient Global Markets | Beginner - Intermediate | 2 hrs | 3:00 pm - 5:00 pm (PST) | $39 |

BUNDLE B: Conference + 3 Trainings

| Topic | Instructor | Title | Difficulty Level | Time Length | Schedule on Oct 26, 2021 | Price |
| --- | --- | --- | --- | --- | --- | --- |
| Virtual Data Science Learnathon with KNIME | Data Scientists, KNIME Team | Data Scientist at KNIME | Beginner | 2 hrs | 9:00 am - 11:00 am (PST) | $29 |
| Current data science and machine learning applications in the medical and aerospace industry | Dr. Kyongsik Yun | Technologist, NASA/Jet Propulsion Lab | Beginner - Intermediate | 2 hrs | 12:00 pm - 2:00 pm (PST) | $39 |
| Getting insights from text data (collecting, cleaning, and analyzing text data from the web) | Adriana Summerow | Founder & Data Scientist at Opening Data | Intermediate - Advanced | 2 hrs | 3:00 pm - 5:00 pm (PST) | $49 |

Date: Oct 26th, 2021

Time: 9:00 am – 11:00 am (PST)

Location: Online

Tools: KNIME

Difficulty Level: Beginner

Prerequisites: None

Name: Dr. Satoru Hayasaka – Data Scientist, KNIME; Wali Khan – Solution Engineer, KNIME; Corey Weisinger – Data Scientist, KNIME

Presented by KNIME

This learnathon is a mix between a hackathon and a workshop. It’s like a workshop because we’ll learn more about the data science cycle: data access, data blending, data preparation, model training, optimization, testing, and deployment. It’s like a hackathon because we’ll work in groups to hack a workflow-based solution to guided exercises.

The tool of choice is the open-source, GUI-driven KNIME Analytics Platform. Because KNIME is open, it offers great integrations with an IDE environment for R, Python, SQL, and Spark.

We’ll start with an introduction to KNIME Analytics Platform, followed by a short presentation about the data science cycle. After this presentation, we split into three groups. Each group focuses on one of the three aspects of the data science cycle. Three Zoom breakout rooms will be activated for this purpose. You go into the room for the group you sign up for (below) to attend the specific tutorial and exercises. There will be a KNIME data scientist in each breakout room to help you while you work on the exercises.

Choose which group (Group 1, 2, or 3) you want to join.
 

Group 1 – Working on the raw data. Data access and data preparation.

Group 2 – Machine Learning. Which model shall I use? Which parameters?

Group 3 – I have a great model. Now what? The model deployment phase.
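
For readers who think in code rather than in workflows, the three group topics can be sketched in a few lines of plain Python. This is only an illustration and not part of the learnathon material, which uses KNIME workflows: the data is synthetic, and the function names (`prepare`, `knn_predict`, `accuracy`) and the choice of a k-nearest-neighbour model are assumptions made for the example.

```python
import json
import random
from collections import Counter


def prepare(raw_rows):
    """Group 1, data preparation: drop incomplete rows, convert fields to floats."""
    clean = []
    for row in raw_rows:
        if row["x"] is None or row["y"] is None:
            continue
        clean.append(((float(row["x"]), float(row["y"])), row["label"]))
    return clean


def knn_predict(train, point, k):
    """Predict a label by majority vote among the k nearest training points."""
    by_distance = sorted(
        train,
        key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


def accuracy(train, evaluation, k):
    """Fraction of evaluation points whose predicted label matches the true label."""
    hits = sum(knn_predict(train, features, k) == label for features, label in evaluation)
    return hits / len(evaluation)


# Synthetic data (invented for this sketch): two clusters plus one incomplete record.
rng = random.Random(0)
raw = [{"x": rng.gauss(0, 1), "y": rng.gauss(0, 1), "label": "a"} for _ in range(60)]
raw += [{"x": rng.gauss(6, 1), "y": rng.gauss(6, 1), "label": "b"} for _ in range(60)]
raw.append({"x": None, "y": 1.0, "label": "a"})

data = prepare(raw)
rng.shuffle(data)
train, validation, test = data[:80], data[80:100], data[100:]

# Group 2, model selection: pick the parameter k that scores best on validation data.
best_k = max([1, 3, 5, 7], key=lambda k: accuracy(train, validation, k))
test_accuracy = accuracy(train, test, best_k)

# Group 3, deployment: serialize the model, restore it elsewhere, and score a new point.
deployed = json.dumps({"k": best_k, "train": train})
restored = json.loads(deployed)
restored_train = [(tuple(features), label) for features, label in restored["train"]]
prediction = knn_predict(restored_train, (5.5, 6.2), restored["k"])
```

The validation set is kept separate from the test set so that choosing `k` does not leak into the final accuracy estimate.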

Dr. Satoru Hayasaka was trained in statistical analysis of various types of biomedical data. Since his doctoral training, he has taught several courses on data analysis geared toward non-experts and beginners. In recent years, he taught introductory machine learning courses to graduate students from different disciplines. Recently he joined KNIME as part of the evangelism team, and he continues teaching machine learning and data mining using KNIME Analytics Platform.

Wali Khan is a Solution Engineer at KNIME based out of Austin, Texas. His main focus is to help people operationalize their machine learning models and analytics pipelines. Before KNIME, Wali worked as a consultant at Oracle. He holds a master's degree in Biomedical Engineering from the University of Texas at Arlington and a chemistry degree from Texas A&M University.

Corey Weisinger is a Data Scientist with KNIME in Austin, Texas. He studied Mathematics at Michigan State University, focusing on Actuarial Techniques and Functional Analysis. Before coming to work for KNIME, he worked as an Analytics Consultant for the auto industry in Detroit, Michigan. He currently focuses on Signal Processing and Numeric Prediction techniques and is the author of the Alteryx to KNIME guidebook.

Learning Outcomes

  • the data science cycle
  • how to hack a workflow-based solution
  • data access and data preparation
  • model selection
  • model deployment

Participants Description

  • Professionals who are looking for more practice with data preparation, model selection, and model deployment
  • Students and beginners who are interested in learning about the data science cycle

Highlights
 
  • Expert instruction on data access, data blending, data preparation, model training, optimization, testing, and deployment
  • Hands-on data science work
  • Guided exercises
  • More practice with the GUI-driven KNIME Analytics Platform

Date: Oct 26th, 2021

Time: 3:00 pm – 5:00 pm (PST)

Location: Online

Tools: NLP, Sklearn, Pandas, Matplotlib, Python, Spacy

Difficulty Level: Beginner – Intermediate

Prerequisites: Python

Name: Anuj Saini – Manager Data Science at Sapient Global Markets

Anuj Saini

My name is Anuj Saini, and I work full time with Publicis Sapient as an AI/NLP specialist in Los Angeles. I have 12+ years of industry experience in NLP and ML/DL. I have extensive hands-on experience in knowledge graphs, ontologies, and social media mining of multilingual text, covering problems such as paraphrasing, summarization, fake news detection, and hate speech detection.

I am a subject matter expert in natural language processing, search technologies, statistics, analytics, modelling, data science, data mining, and machine learning. I have more than 12 years of industry experience in developing systems based on search and machine learning. I have done extensive NLP work, using linguistics as well as machine learning, across domains such as e-commerce, investment banking, and insurance. I am skilled in chatbots, recommender systems, sentiment analysis, semantic technologies, and natural language processing using Python, Java, and R.

Overview

Beyond measuring the global impact of the COVID-19 pandemic by the number of cases and mortalities, there are various other measures of how COVID has impacted society, e.g. socially, mentally, and in how it significantly disrupted our normal way of living. Many of these effects have been documented on social media, as many of us have used these platforms to share our emotions and experiences throughout the pandemic.

While such impacts can’t be measured with typical numerical or statistical methods, Natural Language Processing gives us the power to analyze social media content and measure the effects reflected in what we write there. Such analysis provides a detailed perspective on what people are most worried about and how sentiment has changed over time. What were the major concerns or most-discussed topics during the pandemic, and what do they tell us about the overall mood of the community?

In this detailed hands-on case study, we will analyze social media content from platforms such as Facebook and Twitter and try to understand the impact of COVID on human life in social, financial, and mental terms. We will learn how to collect specific data from Twitter or Facebook comments. Participants will learn how to clean and process text and prepare social media posts for analysis. We will take a deep dive into some of the most common and interesting topics, such as sentiment analysis, topic detection, and classification. Finally, we will use the power of visualization to generate charts that show the impact of COVID from various aspects.

We will cover the following modules:

  • Data collection, such as Facebook comments and Twitter data for a specific topic such as COVID
  • Data cleanup and analysis, since text is always messy
  • EDA: exploratory data analysis, generating various charts that help make sense of the data
  • Sentiment analysis to classify emotions of people
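
The cleanup, EDA, and sentiment modules above can be illustrated with a minimal sketch. It uses only the Python standard library so that it runs anywhere; the workshop itself lists tools such as Pandas, Sklearn, and Spacy, which are not used here. The sample posts, the tiny sentiment lexicon, and the stopword list are all invented for illustration and are not the instructor's data or method.

```python
import re
from collections import Counter

# Hypothetical sample posts, invented for illustration (not real social media data).
posts = [
    "Feeling anxious about COVID again... https://example.com/news #lockdown",
    "@friend So happy to be vaccinated! Great news for my family :)",
    "Lost my job because of the pandemic. Worried about rent.",
    "Working from home is great, love the extra time with my kids",
]

# Toy lexicon and stopword list (assumptions; real projects use curated resources
# or trained models).
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"anxious", "lost", "worried", "sad"}
STOPWORDS = {"a", "about", "again", "be", "because", "for", "from", "is",
             "my", "of", "so", "the", "to", "with"}


def clean(text):
    """Lowercase, strip URLs, mentions, and punctuation, then drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    text = text.replace("#", " ")              # keep the hashtag word itself
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation, emoticons
    return [token for token in text.split() if token not in STOPWORDS]


def sentiment(tokens):
    """Label a post by counting positive and negative lexicon words."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


cleaned = [clean(post) for post in posts]
labels = [sentiment(tokens) for tokens in cleaned]

# Simple EDA: how many posts per sentiment, and which terms occur most often.
label_counts = Counter(labels)
top_terms = Counter(t for tokens in cleaned for t in tokens).most_common(3)
```

A lexicon count like this ignores negation and sarcasm, which is one reason trained classifiers are usually preferred once labeled data is available.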

Learning Outcomes

  • Hands-on project understanding
  • Data collection, such as web scraping
  • Data analysis
  • Business insights, such as making decisions based on the observed impacts

Participants Description

  • Academics – Learn data collection and development
  • Graduate Students – Good project for resume
  • Undergraduate Students – Good project for resume
  • Data Science Professionals – Learn from a real-life project
  • Subject Matter Professionals (Health Data Science, Marketing Professionals, etc.) – COVID impact from a societal perspective

Highlights

  • Work on a real-life problem with a real-life dataset
  • Learn how to collect a dataset from social media platforms
  • Basics of Natural Language Processing
  • Hands-on exposure to developing an end-to-end NLP project, including deployment, API development, UI integration, etc.
  • Find out the impact of COVID on real people's lives using real-life data

Date: Oct 26th, 2021

Time: 12:00 pm – 2:00 pm (PST)

Location: Online

Tools: KNIME

Difficulty Level: Beginner

Prerequisites: Basic Statistics

Name: Kyongsik Yun, Ph.D. – Technologist, NASA/Jet Propulsion Lab

Kyongsik Yun is a technologist at the Jet Propulsion Laboratory, California Institute of Technology. His research focuses on building brain-inspired technologies and systems, including deep learning computer vision, natural language processing, brain-computer interfaces, and noninvasive remote neuromodulation. He received the JPL Explorer Award (2019) for scientific and technical excellence in machine learning applications. In addition to his research, Kyongsik co-founded two biotechnology companies, Ybrain and BBB Technologies, that have raised $25 million in investment funding.

Overview

What data science and machine learning topics and techniques are actually used in industry? How do you apply the specific deep learning skills you are learning now to solving real-world problems? What are some recent topics people are interested in addressing in your industry? If you have any of these questions, this workshop is for you. This workshop covers specific use cases of data science and machine learning technologies in the medical and aerospace industries. Topics include computationally efficient, physically constrained neural networks; combined convolutional and recurrent neural networks for explainable AI; multivariate data fusion and time series prediction. These technologies can be applied to a variety of use cases in medicine, aerospace, earth science, and financial forecasting.
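
As a small illustration of the time series prediction topic, sequence models are commonly trained on fixed-length windows of past observations paired with a future value to predict. The sketch below shows only that data-framing step in plain Python, plus a naive "repeat the last value" baseline for comparison. The function name `make_windows`, its parameters, and the synthetic two-variable series are assumptions made for this example and are not taken from the workshop material.

```python
def make_windows(series, window, horizon=1, target_index=0):
    """Turn a multivariate series into (input window, future target) pairs.

    series: list of tuples, one tuple of variables per time step.
    window: number of past time steps given to the model.
    horizon: how many steps ahead the target lies.
    target_index: which variable in each tuple is being predicted.
    """
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1][target_index]
        samples.append((inputs, target))
    return samples


# Synthetic two-variable series, invented for this sketch.
series = [(float(t), float(2 * t)) for t in range(10)]
samples = make_windows(series, window=3)

# Naive persistence baseline: predict that the target keeps its last observed value.
errors = [abs(target - inputs[-1][0]) for inputs, target in samples]
baseline_mae = sum(errors) / len(errors)
```

Any learned model should beat the persistence baseline on held-out data before it is worth the extra complexity.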

Learning Outcomes
 
  • Computationally-efficient, physically-constrained neural networks (transforming nonlinear physical/mathematical problems into data-driven deep learning models)
  • Combining convolutional and recurrent neural networks for explainable and trustable machine learning solutions
  • Multivariate time series prediction using LSTM and Transformer models
  • 3D convolutional neural networks for medical image classification and segmentation

Participants Description

  • Beginner and intermediate software developers, research fellows, and students in data science and machine learning

Highlights
 
  • Learn which deep learning techniques are being used to solve real problems
  • Understand the essentials of computational efficiency and explainability in deep learning
  • Gain industry insights through practical examples

Date: Oct 26th, 2021

Time: 3:00 pm – 5:00 pm (PST)

Location: Online

Tools: R or Python

Difficulty Level: Intermediate – Advanced

Prerequisites: Basic Statistics, Python for beginners

Name: Dr. Adriana Summerow – Founder & Data Scientist at Opening Data

Adriana Summerow is the founder and data scientist at Opening Data. She has 7+ years of professional experience applying predictive modeling, data pre-processing, and Natural Language Processing (NLP) algorithms. She has worked for companies such as Deloitte and Lockheed Martin as a senior consultant specialist and industrial engineer, solving challenging business problems. Her business acumen includes the application of data engineering, machine learning, and data visualization solutions for enterprises located in North and South America.

Overview

This is a hands-on workshop focusing on collecting text data and processing it so that it is ready for analysis and visualization. We will implement machine learning classifiers and hyperparameter tuning to predict sentiment and categorize entities using content generated on the web, which has become increasingly crucial to running a business successfully.
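
To make the idea of a text classifier with hyperparameter tuning concrete, here is a minimal sketch: a bag-of-words multinomial Naive Bayes model whose smoothing parameter `alpha` is chosen by a small grid search on a validation set. It is written with the standard library only so it runs as-is; libraries such as scikit-learn offer tested implementations of the same ideas. The choice of Naive Bayes, the toy reviews, and the names `train_nb` and `predict` are assumptions for illustration, not the workshop's actual code.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Very simple preprocessing: lowercase and split on whitespace."""
    return text.lower().split()


def train_nb(docs, alpha):
    """Count words per class; alpha is the additive smoothing hyperparameter."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocab.add(token)
    return {"class_docs": class_docs, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha, "n_docs": len(docs)}


def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_docs"].items():
        total = sum(model["word_counts"][label].values())
        score = math.log(n_docs / model["n_docs"])  # class prior
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            count = model["word_counts"][label][token]
            score += math.log((count + model["alpha"]) / (total + model["alpha"] * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy labeled reviews, invented for this sketch.
train = [
    ("great product love it", "pos"),
    ("excellent service very happy", "pos"),
    ("love the fast delivery", "pos"),
    ("terrible product broke quickly", "neg"),
    ("very bad service", "neg"),
    ("awful delivery never again", "neg"),
]
validation = [("happy with the product", "pos"), ("bad product", "neg")]


def validation_accuracy(alpha):
    model = train_nb(train, alpha)
    return sum(predict(model, text) == label for text, label in validation) / len(validation)


# Hyperparameter tuning: grid search over alpha, scored on the validation set.
alphas = [0.1, 0.5, 1.0]
best_alpha = max(alphas, key=validation_accuracy)
model = train_nb(train, best_alpha)
```

With the tuned model, `predict(model, "love this great service")` returns the positive class for this toy data. On real data the validation set should be much larger, or replaced by cross-validation.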

Learning Outcomes
 
  • Text preprocessing & lemmatization
  • Word vectorization
  • Implementation of machine learning classifiers
  • Evaluation of graphs
  • Hyperparameter tuning
  • Industry best practices & insights

Participants Description

  • Beginner to intermediate data analysts, data scientists, data engineers, software developers, and students of data analytics

Highlights
 
  • Make sense of text data and improve data-driven decision making by integrating Natural Language Processing into the analysis of documents, social media, online reviews, and more.
  • Streamline processes and reduce cost by automating the analysis of text data with scalable machine learning models.
  • Understand the language of your customer base, learn to perform market segmentation, and get the tools to impact performance in Finance, Healthcare, or Marketing.