5 Day Boot Camp
Build a foundation for Data Science
5 Day Technical Boot Camp
The Boot Camp provides a foundation for all of the core topics in Data Science.
You will learn how to:
Design, evaluate, and tune machine learning models
Read and interpret programming code in Python
Address a broad range of real-world problems and solutions
This course offers a comprehensive balance between theory and practice, with visualizations, demonstrations, exercises, case studies and projects. You’ll learn the most important concepts and models in the morning, and then apply them in hands-on practical sessions throughout the afternoon. Along the way, you will learn to read code so that you can collaborate with – and manage – technical teams and real-world projects.
The course is limited to 30 participants to ensure time for one-on-one assistance.
During the COVID-19 pandemic, we are securing rooms sized for 100+, so we can keep density down to 25%-40% to allow sufficient space for social distancing.
Foundational concepts:
Data sampling, measurement, and wrangling
Exploratory data analysis
Data description, visualization, and graphing
Bias, variance, and the bias-variance tradeoff
Model validation and model cross-validation
Hyperparameter tuning and information leakage
Model evaluation and comparison
Model weighting of costs and benefits
Ensemble learning and meta-learning
Predictive labeling and data augmentation
Data-driven business models
Big Data, Map-Reduce, and Spark
Virtual Machines and Cloud Computing
Strategic Planning for a Digital Transformation
The management of talent and strategic Human Capital
Methods and models:
Normalizing and standardizing data
Linear and Log-Linear models
Non-parametric models, splines and locally-linear models
Nearest neighbor and similarity models
Agglomerative clustering and K-means clustering
Decision trees, bagging, boosting, and random forests
Dimension reduction, PCA, t-SNE, and manifold projections
Support Vector Machines
Text as Data and Natural Language Processing (NLP)
Word Embeddings and Latent Topic Modeling
Feed-Forward Neural Networks
Convolutional Neural Networks
Recurrent Neural Networks, LSTMs, Bidirectional LSTMs
Generative Adversarial Networks
Reinforcement Learning
Course Outline
After reading books on Data Science and watching tutorials on the web, I was left with the impression of never really getting started. I eventually found DSFM on the EPFL website and decided to give it a try, instead of going for online courses over several months in small sessions.
In good courses you learn new concepts, in exceptional ones you not only learn, but you just admire how well it was designed, organized and executed. The take home materials are just great, with theory, demonstrations, and extremely well documented Jupyter notebooks ready to execute and review any time after the course.
Prof. Younge has created a true pearl of pedagogy in a field still unexplored by most managers and decision takers. He opens wide the black box of machine learning and AI mystique, showing us what's inside, with rigor, foresight, clarity, pragmatism and humor, and gives us the so-needed impulse to roll up our sleeves and just start doing it !
I accomplished in one week what would have taken me months to figure out - how to discriminate the hype from reality, assimilate complex concepts and learn how they get applied in real life.
A great week !
Stevan Klaas, Lead Advisor to the CEO
Venue
The DSFM Boot Camp is held on campus at EPFL (the École Polytechnique Fédérale de Lausanne), one of the two Swiss Federal Institutes of Technology.
DSFM is aimed at those wanting to return to campus for the 'EPFL experience.' The venue adds to the intensity and ambition of the course by motivating participants to move quickly through a great amount of material in a relatively short amount of time. The five-day boot camp covers much of the same material as a challenging, semester-long master's course at EPFL.
EPFL is also home to over 350 laboratories and research groups, each working at the forefront of science and technology – with a diverse, committed and stimulating research community that is active over a wide spectrum of quantitative and design-focused disciplines.
Preparation
Most managers have forgotten their mathematics, so we emphasize visualizations of mathematical concepts instead of complicated proofs. Moreover, most boot camp participants are not professional programmers. We therefore present basic programming concepts and build from there to complete solutions.
Novice programmers will learn how to read programming code provided in solutions; more advanced students will learn to build those solutions from the bottom up using scikit-learn APIs. Teaching assistants are available throughout the day to provide one-on-one assistance with practical problems. All participants will leave the course able to build, evaluate, and work with real data and real models!
No prior training in Data Science is required to take DSFM, and the course is limited to 40 participants to ensure time for one-on-one assistance.
However, to get the most out of the Boot Camp, you should:
Be familiar with linear algebra (although we use very little math)
Be familiar with statistics (although we will review the basics)
Be conversant in English (the course will be given in English)
Bring a laptop (Mac, Windows, Linux, or Chromebook)
Python
You don't need to be a programmer, or to program solutions from scratch in the course, but we will look at real coding examples to see what they do. (And why!)
We recommend that you dedicate 5 to 10 hours of online study with the Python programming language before the start of the Boot Camp course. If you are new to Python, a little bit of preparation will help you to get much more out of the class.
We will email you several suggestions for online preparation in Python when you register.
Do not fear! We will have several highly qualified EPFL graduate students on hand to work with you one-on-one to answer questions about the programming code.
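To give a sense of the kind of short Python snippet worth being able to read before the Boot Camp, here is a minimal sketch that standardizes a list of numbers (one of the methods listed above). It is an illustration only, not taken from the course materials: the function name `standardize` and the sample values are assumptions, and it uses only Python's standard library.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to have mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data, chosen only for illustration.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
z_scores = standardize(heights_cm)
print([round(z, 2) for z in z_scores])
```

If you can follow what each line does (computing the mean, computing the spread, then rescaling every value), you are well prepared to read the solutions discussed in class.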