Exercise 2 - Ambari & adding data to HDFS

Ambari Interface

Now that we're in Ambari, let's take a look at a few key areas of the interface:
  1. Services in the Cluster - A list of all the active components in this cluster that Ambari manages. Also accessible as a drop-down menu.
  2. Alerts - If something is wrong with a service or host, you'll find it here.
  3. Metrics - Very important, especially since many of these components and services are Java based and need their heap sizes and other performance settings tuned closely in production environments.
  4. Extras menu - Functions that support the cluster as a whole, including the Files View we'll use next.

Extras Menu - Files View

Let's go ahead and select Files View from that Extras menu.

Navigating HDFS & Uploading Data

Once the Files View loads, click into the /tmp directory. Next, download the Poisonous Mushroom Database in CSV format from the link below:

Poisonous Mushroom Database CSV

With the file downloaded, make sure you're still in the /tmp directory, then click the Upload button in the upper-right portion of the Files View and drag the file into the drop box to upload it.

This is one way to load files into HDFS. Another is to download the file to the server itself via a shell and then add it to HDFS with the hadoop fs command; we'll use variations of both approaches in the next few steps when we get into Zeppelin.
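If you'd like to try the shell route, a minimal sketch is shown below. The file name mushrooms.csv and the download URL are placeholders standing in for the link above, and the commands assume you have a shell on a cluster node as a user with write access to /tmp in HDFS.

    # Download the CSV to the local filesystem (substitute the real link above)
    wget -O mushrooms.csv https://example.com/poisonous-mushroom-database.csv

    # Copy the local file into HDFS under /tmp
    hadoop fs -put mushrooms.csv /tmp/

    # Verify that the file is now in HDFS
    hadoop fs -ls /tmp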

Quick Links - Launch Zeppelin

We've got the data we need in HDFS and everything should still be running nominally, so let's select the Zeppelin service.
Next, near the top of the service's main page, you should see a Quick Links drop-down; click it and select Zeppelin UI.
A new tab should open with the Zeppelin web application.

