Overview of Pre-Conference Training
| Title | Instructor | Instructor Title | Difficulty Level | Length | Schedule (Nov 1, 2020) | Price |
|---|---|---|---|---|---|---|
| How Data Analytics Can Be Misleading | Dr. Richard Tang | Assistant Professor of Marketing, LMU | Low | 2 hrs | 9:00 am - 11:00 am (PST) | $29 |
| Virtual Data Science Learnathon | Dr. Satoru Hayasaka, Corey Weisinger, Scott Fincher | Data Scientists, KNIME | Low | 2 hrs | 12:00 pm - 2:00 pm (PST) | $29 |
| Introduction to Transactional Data Analysis Using Tableau Software | Dr. Katherine Goff Inglis | Principal Consultant, KGI Analytics | Medium | 2 hrs | 9:00 am - 11:00 am (PST) | $39 |
| SEIR Model for Predicting COVID-19 | Team LMU | Winning team, 2020 Data Science Competition for COVID-19 | Medium | 2 hrs | 3:00 pm - 5:00 pm (PST) | $39 |
| Transforming Unstructured Voice and Text Data into Insight Using Natural Language Processing | Dr. Kyongsik Yun | Technologist, NASA/Jet Propulsion Lab | High | 2 hrs | 12:00 pm - 2:00 pm (PST) | $49 |
| Application of Machine Learning in Scientific Data Analysis | Dr. Jonathan Jiang | Group Supervisor, Principal Scientist, NASA/Jet Propulsion Lab | High | 2 hrs | 3:00 pm - 5:00 pm (PST) | $49 |
Date: Nov 1, 2020
Time: 9:00 am – 11:00 am (PST)
Location: Online
Tools: R or Python
Difficulty Level: Beginner to Intermediate
Prerequisites: Very Basic Statistics
Name: Dr. Richard Tang – Assistant Professor of Marketing, LMU
Dr. Richard Tang
is originally from Huaiyuan, a beautiful town of one million people in eastern China. Though Richard taught himself BASIC programming and wanted to be a software engineer in high school, he graduated from East China University of Science and Technology with a B.A. and an M.S. in Business Administration. Richard then earned his Ph.D. in Marketing, with a minor in Economics, from the University of Arizona. He now serves as an assistant professor of marketing at Loyola Marymount University (LMU), where he teaches marketing analytics and natural language processing and mentors students in various data science competitions.
Richard’s training on quantitative research methods consists of econometrics and machine learning (with a focus on natural language processing). He is interested in applying those quantitative methods to generate constructive insights for businesses and society. Topics of his current research include quantifying business environments with geographical location information, extracting consumer insights from user-generated-content, assessing the effectiveness of AI-based service robots, and redesigning organizational structure to unleash the power of business analytics.
Overview
This workshop will give participants a macro-level view of evaluating the quality of data analytics from the perspectives of context, method, and validity. Rather than just demonstrating the power of data analytics, it offers participants a unique perspective on how data analytics can be conducted erroneously. Through a series of intriguing examples, participants will experience how easily data analytics can deliver inconclusive and even misleading findings. The examples include:
- Common stereotypes, such as the ideas that students with excellent academic performance are not good at sports, and that leaders of organizations are less intelligent than their subordinates.
- Controversial debates, such as whether women are discriminated against in university admissions, and whether the death penalty curbs crime.
- Technical discussions, such as when adding more data to your model can be harmful, and when applying theoretical predictions from one context to another can be problematic. The examples come with both R and Python code so participants can simulate the data-generating process and “see” how and why the subsequent analytics go wrong.
In sum, the workshop aims to help participants establish a system for critically evaluating data analytics procedures and outcomes, and in turn, make data analytics a force for good.
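The workshop's own R/Python materials are not reproduced here, but a minimal Python sketch of one such misleading data-generating process — Simpson's paradox, with made-up departments, applicant shares, and admission rates — might look like this:

```python
# Illustrative sketch only (not the workshop's actual materials): simulate
# admission data in which every department admits both groups at the SAME
# rate, yet the aggregate rates differ (Simpson's paradox).
import random

random.seed(42)

# Hypothetical departments: (admission_rate, share_of_group_A_applicants)
depts = {"Engineering": (0.6, 0.8), "Humanities": (0.2, 0.2)}

applicants = []
for dept, (rate, share_a) in depts.items():
    for _ in range(1000):
        group = "A" if random.random() < share_a else "B"
        admitted = random.random() < rate  # identical rate for both groups
        applicants.append((dept, group, admitted))

def admit_rate(rows):
    return sum(adm for _, _, adm in rows) / len(rows)

for g in ("A", "B"):
    rows = [r for r in applicants if r[1] == g]
    print(g, round(admit_rate(rows), 2))
# Group A applies mostly to the high-admission department, so its overall
# rate is higher -- even though each department treats the groups identically.
```

Aggregating over departments hides the confounder (department choice), which is exactly the kind of contextual trap the workshop examines.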
Learning Outcomes
- Appreciate the importance of obtaining contextual knowledge
- Identify common threats to the validity of data analytics
- Understand the limits of analytics methods (using regression as an example)
Participants Description
- Beginners who are looking for a comprehensive view of valid data analytics practices
- “Consumers” of business analytics who need to use business analytics outcomes in their work and business
Highlights
- Understand how data analytics can be misleading
- Learn to simulate the data generation process to illustrate misleading data analytics outcomes
- Establish a system to evaluate data analytics procedures and results critically
Date: Nov 1, 2020
Time: 12:00 pm – 2:00 pm (PST)
Location: Online
Tools: KNIME
Difficulty Level: Beginner to Intermediate
Prerequisites: None
Name: Scott Fincher – Data Scientist, KNIME
Dr. Satoru Hayasaka – Data Scientist, KNIME
Corey Weisinger – Data Scientist, KNIME
Presented by KNIME
At KNIME, we build software for fast, easy and intuitive access to advanced data science, helping organizations drive innovation.
Our KNIME Analytics Platform is the leading open solution for data-driven innovation, designed for discovering the potential hidden in data, mining for fresh insights, or predicting new futures. Organizations can take their collaboration, productivity and performance to the next level with a robust range of commercial extensions to our open source platform.
Participants will learn more about the data science cycle: data access, data blending, data preparation, model training, optimization, testing, and deployment. They will also work in groups to hack a workflow-based solution to guided exercises. There are three hands-on exercises to choose from:
- Group 1 – Working on the raw data. Data access and data preparation.
- Group 2 – Machine Learning. Which model shall I use? Which parameters?
- Group 3 – I have a great model. Now what? The model deployment phase.
The tool of choice is the open-source, GUI-driven KNIME Analytics Platform. Because KNIME is open, it integrates well with R, Python, SQL, and Spark.
Dr. Satoru Hayasaka
was trained in statistical analysis of various types of biomedical data. Since his doctoral training, he has taught several courses on data analysis geared toward non-experts and beginners. In recent years, he taught introductory machine learning courses to graduate students from different disciplines. Recently he joined KNIME as part of the evangelism team, and he continues teaching machine learning and data mining using KNIME Analytics Platform.
Corey Weisinger
is a data scientist on the Evangelism team at KNIME. He currently focuses on signal processing and numeric prediction techniques for time series analysis. He is the author of the Alteryx to KNIME guidebook and a regular contributor to the KNIME blog.
Scott Fincher
routinely teaches, presents, and leads group workshops covering topics such as the KNIME Analytics Platform, Machine Learning, and the broad Data Science umbrella. He enjoys assisting other data scientists with general best practices and model optimization. For Scott, this is not just an academic exercise. Prior to his work at KNIME, he worked for almost 20 years as an environmental consultant, with a focus on numerical modeling of atmospheric pollutants.
Learning Outcomes
Participants of this workshop will learn:
- the data science cycle
- how to hack a workflow-based solution
- data access and data preparation
- model selection
- model deployment
Participants Description
- Professionals who are looking for more practice with data preparation, model selection, and model deployment
- Students and beginners who are interested in learning about the data science cycle
Highlights
- Expert instruction on data access, data blending, data preparation, model training, optimization, testing, and deployment
- Hands-on data science work
- Guided exercises
- More practice with the GUI-driven KNIME Analytics Platform
Date: Nov 1, 2020
Time: 9:00 am – 11:00 am (PST)
Location: Online
Tools: Tableau Desktop, Tableau Prep, Tableau Public
Difficulty Level: Intermediate
Prerequisites: Install a free trial version of Tableau Desktop & Prep; complete the pre-workshop survey
Name: Dr. Katherine Goff Inglis – Principal Consultant, KGI Analytics
Dr. Katherine Goff Inglis
has spent a 15-year career in operations research and analytics across the Entertainment, Retail, Advertising, and Residential Services industries. Katherine earned her doctorate in Industrial Engineering in 2017 from Ryerson University in Toronto, with a focus on demand-driven operations management. For a decade, she led analytics and data science for Cineplex, a top-tier Canadian brand operating in Motion Picture Exhibition and Media Advertising. She currently teaches in higher education, conducts scientific research, and provides consulting services to individuals and organizations focused on enterprise analytics and data strategy. Katherine loves getting hands-on with technology and has worked with many types of data sources, such as mobile, spatial, camera, sensor, weather, and customer loyalty data.
Overview
This workshop is for those interested in using Tableau software to create data visualizations and tell stories from transactional data that highlight customer insights and business implications. No prior experience with Tableau software is necessary; you will be introduced to Tableau’s suite of software products and learn how to start using Tableau Prep and Tableau Desktop. In this introductory workshop, you will create data visualizations and dashboards in Tableau Desktop using spatial and aspatial transactional data. Successful examples of Tableau software being used to tell stories about customer behaviour will be reviewed, and best practices for preparing transactional data and identifying customer insights will be discussed.
The workshop will be hands-on and consist of a case study project, a software demo, a group discussion, and a handout that will be provided in advance and referenced throughout the session. For the project component, you will create a personalized case study by choosing from a provided list of markets (cities) and business problems to work on. You will also have the opportunity to optionally share your work and get feedback from other participants.
Learning Outcomes
Participants of this workshop will learn to:
- Use data visualization to tell stories about customer behaviour
- Analyse and visualize transactional customer data
- Identify insights and business implications from transactional customer data
- Utilize Tableau desktop software to visualize spatial and aspatial transactional customer data
- Utilize Tableau prep software to explore, integrate and manipulate transactional customer data
- Summarize appropriate and best use cases for the Tableau suite of products compared to other popular analytics tools (SQL, R, Python, Excel, etc.)
Participants Description
- Professionals who are looking for an introduction to transactional data analysis in Tableau software
- Students and beginners who are interested in creating maps and visualizations using Tableau software
Highlights
- Learn how to apply Tableau software to solve real world business problems in specific cities and markets
- See examples of how data visualization is used to tell stories about customer behavior with transactional data
- Cover industry best practices for interpreting insights from transactional customer data
- Create and share dashboards and data visualizations using Tableau software
Date: Nov 1, 2020
Time: 3:00 pm – 5:00 pm (PST)
Location: Online
Tools: Python
Difficulty Level: Intermediate
Prerequisites: Basic Python and Modeling Skills
Name: Eric Wu – MSBA Graduate Student
Kayla Tanli – Data Analyst Intern at Speridian
Rongxing (Vincent) Chen – Business Analyst Intern at ARC
Eric, Kayla, and Vincent
(“Team LMU”) took part in RMDS Lab’s 2020 data science competition for COVID-19 and are now working on the implementation stage. They also excelled in a previous data science competition held by RMDS Lab in 2019, placing fourth out of 25 teams globally. They are currently collaborating with RPA, Fandango, and RMDS on separate projects, including modeling customer-value changes with BERT and neural networks, theater occupancy with big data, and retail agglomeration with Wasserstein distance.
Overview
This workshop focuses on building a simple SEIR model and using its predictions, together with city-level attributes, to generate a risk score for each city. With the help of APIs from RMDS Lab and the City of Los Angeles, we can visualize the different risk levels within LA County.
In the first part of the workshop, we will review the concept of the SEIR model and discuss why it was chosen for this project. From there, we will walk through the Python code used to build the model. We may also run an RNN model to compare performance.
The second part of the workshop will consist of taking the model’s results and using them to make recommendations to the city/county on how to handle COVID-19. We will also discuss broader applications of the SEIR model.
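As background, the SEIR dynamics the workshop builds on can be sketched with a simple forward-Euler simulation. The parameter values below are generic illustrative assumptions, not Team LMU's fitted values:

```python
# Minimal SEIR sketch (illustrative parameters, not the team's calibration).
# S: susceptible, E: exposed, I: infectious, R: recovered.
def seir(beta=0.3, sigma=1 / 5.2, gamma=1 / 10, days=160, N=10_000_000, dt=1.0):
    S, E, I, R = N - 1, 0.0, 1.0, 0.0
    history = []
    for _ in range(int(days / dt)):
        new_exposed = beta * S * I / N * dt  # S -> E via contact
        new_infectious = sigma * E * dt      # E -> I after incubation
        new_recovered = gamma * I * dt       # I -> R
        S -= new_exposed
        E += new_exposed - new_infectious
        I += new_infectious - new_recovered
        R += new_recovered
        history.append((S, E, I, R))
    return history

hist = seir()
peak_infectious = max(i for _, _, i, _ in hist)  # peak of the I curve
```

Risk scoring in the project then combines predictions like `peak_infectious` with city-level attributes; those attributes and weights are outside the scope of this sketch.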
Participants Description
- Beginners who hope to learn to build a new model and get familiar with the modeling and data application process
- Professionals who are interested in the models used in COVID pandemic prediction.
Highlights
- Learn how to apply data science and machine learning to solve real-world business problems
- Preprocessing, exploratory data analysis and visualization
- Learn how the virus spreads from a statistical perspective
- Learn how to create a solution that is ready to be operationalized to help the community navigate this pandemic
Date: Nov 1, 2020
Time: 12:00 pm – 2:00 pm (PST)
Location: Online
Tools: Python
Difficulty Level: High
Prerequisites: Basic Statistics
Name: Kyongsik Yun, Ph.D. – Technologist, NASA/Jet Propulsion Lab
Kyongsik Yun
is a technologist at the Jet Propulsion Laboratory, California Institute of Technology. His research focuses on building brain-inspired technologies and systems, including deep learning computer vision, natural language processing, brain-computer interfaces, and noninvasive remote neuromodulation. He received the JPL Explorer Award (2019) for scientific and technical excellence in machine learning applications. In addition to his research, Kyongsik co-founded two biotechnology companies, Ybrain and BBB Technologies, that have raised $25 million in investment funding.
Overview
This workshop covers how to transform unstructured voice and text data into insights using natural language processing (NLP). In real data science problems, the amount of data to be processed keeps growing, and there is increasing demand to extract specific information from unstructured text data. It is therefore very important to understand how to organize, process, and analyze voice and text data. This workshop provides the knowledge needed to solve complex NLP problems using Python-based machine learning. The instructor explains the basic concepts of NLP, covers vectorization techniques, and shows how to build a machine learning classifier. He will also introduce text summarization and custom speech recognition problems, and help you gain a deeper understanding of NLP through practical examples.
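To make the tokenization and vectorization steps concrete, here is a minimal pure-Python TF-IDF sketch (the documents are made up for illustration; the workshop itself may use different tooling):

```python
# Tiny TF-IDF sketch: naive whitespace tokenization, then weight each term
# by (term frequency in the document) * log(N / document frequency).
import math
from collections import Counter

docs = [
    "the spacecraft sent telemetry data",
    "telemetry data helps diagnose the spacecraft",
    "natural language processing extracts insight from text",
]
tokenized = [d.split() for d in docs]  # naive tokenization
df = Counter(t for doc in tokenized for t in set(doc))  # document frequency
N = len(docs)

def tfidf(doc):
    tf = Counter(doc)
    return {t: (tf[t] / len(doc)) * math.log(N / df[t]) for t in tf}

vec = tfidf(tokenized[0])
# Terms shared across documents (e.g. "the") get lower weights than terms
# unique to this document (e.g. "sent").
```

Real pipelines add lemmatization, stop-word handling, and sparse matrices, but the weighting idea is the same.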
Learning Outcomes
- Concept of natural language processing
- Process of tokenization, vectorization, and lemmatization
- TF-IDF (term frequency-inverse document frequency)
- Text summarization using deep convolutional neural networks
- Speech recognition and custom language model using recurrent neural networks
- Real world applications of natural language processing
Participants Description
- Intermediate software developers, research fellows, and students in data science and machine learning
Highlights
- Learn how to transform unstructured text data into insights
- Understand the essential elements of natural language processing in the era of deep learning
- Gain industry insights through practical examples
Date: Nov 1, 2020
Time: 3:00 pm – 5:00 pm (PST)
Location: Online
Tools: Python
Difficulty Level: High
Prerequisites: Basic Statistics and Programming Skills
Name: Dr. Jonathan H. Jiang – Principal Scientist
Dr. Jonathan H. Jiang
is a Principal Scientist in the Engineering and Science Directorate at NASA’s Jet Propulsion Laboratory (JPL), California Institute of Technology, and the Supervisor of JPL’s Aerosol and Cloud Research Group, managing a team of more than 30 scientists, engineers, and postdoctoral scholars. Dr. Jiang also serves as an Editor of the American Geophysical Union’s journal Earth and Space Science, and he is the elected Chair of the American Meteorological Society’s Committee on Atmospheric Chemistry. He has authored or co-authored nearly 200 research publications in satellite remote sensing, atmospheric and climate sciences, and astronomy and astrophysics. Dr. Jiang was twice awarded the NASA Exceptional Achievement Medal, in 2010 and 2013, for his leadership and innovation in scientific research using NASA satellite observations. Most recently, he received the 2019 Ed Stone Award for outstanding research publications related to NASA missions, as well as the 2019 NASA Exceptional Scientific Achievement Medal.
Overview
Machine learning has become increasingly popular in scientific data analysis. This workshop will introduce several commonly used machine learning techniques and their applications in physics, astronomy, and the earth sciences. We will first walk through several simple examples as hands-on exercises. Then a more detailed machine learning application in exoplanet research will be presented as a practical example of scientific data analysis.
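To give a flavor of the kind of problem involved — this is purely a hypothetical sketch on fabricated data, not the workshop's actual exoplanet pipeline — a toy nearest-centroid classifier on synthetic transit-like features might look like:

```python
# Toy classifier on synthetic "light curve" features (dip depth, noise level),
# hinting at how ML can flag exoplanet transit candidates. All numbers here
# are fabricated for demonstration.
import random

random.seed(0)

def make_sample(has_transit):
    depth = random.gauss(0.01 if has_transit else 0.0, 0.002)  # transit dip
    noise = random.gauss(0.005, 0.001)  # same distribution for both classes
    return (depth, noise), has_transit

data = [make_sample(i % 2 == 0) for i in range(200)]
train, test = data[:150], data[150:]

# Nearest-centroid classifier: average feature vector per class.
def centroid(samples):
    xs = [f for f, _ in samples]
    return tuple(sum(v[i] for v in xs) / len(xs) for i in range(2))

c_pos = centroid([s for s in train if s[1]])
c_neg = centroid([s for s in train if not s[1]])

def predict(f):
    d = lambda c: sum((a - b) ** 2 for a, b in zip(f, c))
    return d(c_pos) < d(c_neg)

accuracy = sum(predict(f) == y for f, y in test) / len(test)
```

Real survey data requires far more careful feature extraction and validation, which is where the workshop's detailed example comes in.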
Learning Outcomes
- Gain a deep understanding of machine learning in scientific studies (e.g., physics, astronomy, and earth sciences)
- Learn fundamental machine learning concepts
- Solve scientific questions using machine learning techniques
Participants Description
This workshop is designed for those who are interested in applying machine learning to solve problems, and who want to develop and promote a strategy for implementing machine learning to analyze scientific data.
Highlights
- A cutting-edge application of machine learning techniques to exoplanet studies
- A comprehensive overview of machine learning applications in scientific studies
- Hands-on exercises to put the knowledge into practice
Instructors