Senior Data Scientist, Saturn Cloud
Aaron Richter is a software developer turned data engineer and data scientist. He has pioneered the development and implementation of large-scale data science infrastructure in both business and research environments. He has become an expert in finding efficient ways to clean data, run pipelines, and tune models. Aaron is a Senior Data Scientist at Saturn Cloud, where he works to make data scientists faster and happier. He holds a PhD in machine learning from Florida Atlantic University.
WATCH LIVE: November 2nd at 1:00 pm
What happens when you run out of processing power on your laptop? You could scale up (get more powerful hardware) or scale out (add more machines). Whichever you choose, there are great tools for scaling within the Python data science ecosystem. Dask is a parallel computing framework that scales from your laptop to a cluster of thousands of machines. RAPIDS is a GPU-computing framework that moves traditional CPU workloads to the GPU. Together, Dask and RAPIDS let you accelerate your code on clusters of GPU machines. This talk will help you navigate this exciting new world and show what it takes to convert “traditional” single-node Python workflows to run in parallel and GPU-accelerated environments. We will also discuss ways to obtain the hardware needed to run these tools in the cloud, along with some considerations for infrastructure and management.