Data Science 102 - Full Stack

Hortonworks Data Science 102 - Full Stack - Introduction Workshop

This workshop offers a full-stack experience: you get a little Operations work by deploying a cluster and loading data, and a little Data Science by running models.


This workshop will have you working with some Data Science concepts and basic Machine Learning tools in Hortonworks Data Platform. Get hands-on experience with Hortonworks CloudBreak Deployer, dive into a provisioned Hortonworks Data Platform + Apache Spark cluster, and learn some of the fundamental concepts and key technological components of Data Analytics and Data Science!

Who should attend

What you will learn


Welcome to the Data Science 102 Workshop brought to you by Hortonworks and Fierce Software!

If you are on-site with us, we will progress through this workshop in one of the following ways:

In either case, the environment has been set up, and the instructors will provide everything you need to get started and move through each exercise.

Otherwise, if you're running this on your own, just go through the list of labs below in order. Keep in mind that if you're not part of a coordinated workshop, the environment will most likely not be provisioned, but it's easy enough to follow along.

Here we’ll be demoing some of the tools in the Apache Hadoop ecosystem: we’ll use Hortonworks CloudBreak Deployer to deploy a Hortonworks Data Platform and Apache Spark cluster, then work through some Zeppelin notebooks on it.

As is everyone’s favorite way to get things going, it’ll be Death by PowerPoint: we’ll start with some high-level concepts of the Hadoop ecosystem with Hortonworks Data Platform. Then the focus shifts to the specific architecture of our lab environments and the components in them, with a little more about the ones we’ll be using.

After the slideshow torture, we’ll be diving straight into the lab. There’s a small Setup section as a preliminary step to get you familiar with the lab environment and the tools at your disposal.

We’ll be spending most of our time focused on the lab content. You’ll be working through this by yourself, or ideally with the people next to you as a team! There are (mostly) clear instructions along the way, and even some pictures to help. Even then, don’t be shy: ask questions and challenge the way we do things. If you need some help, we’ll be happy to guide you along.


These labs have been tailored for HDP v2.6.5. CloudBreak has been deployed to AWS via CloudFormation Quickstart in a new VPC as the starting point for this lab.


Many thanks to Satish Bomma and Ian Brooks of Hortonworks for their help in putting this together.

Workshop Details

Domain
Student ID