= Big Data course on Beocat =
This course is now available here: http://people.beocat.ksu.edu/~dan/education/bigdata/

The Pittsburgh Supercomputing Center hosts 2-day remote Big Data workshops
several times each year. The information provided here will allow individual
users to go through the videos at their own pace and perform the exercises
on our local Beocat supercomputer. Each exercise will have data and results
tailored to each individual to allow instructors to measure the progress of
students assigned to take this course interactively.

Use the Agenda website below to access the slides, starting with the Welcome slides,
which have no associated video. The '>' sign at the start of lines below
represents the command line prompt on Beocat, and '>>>' represents the prompt
you'll get when you start pyspark or python.

Agenda: https://www.psc.edu/hpc-workshop-series/big-data

Videos: https://www.youtube.com/watch?v=NpapUmGHXyw&list=PLdkRteUOw2X-YKqommnuGWqNfEEUG6P2E

== Welcome ==

Use ssh to log in to Beocat from your computer, then copy the workshop data to your
home directory.

 > cp -rp ~daveturner/workshops/bigdata_workshop .
 > cd bigdata_workshop

PDF versions of the slides are available for each section,
as are directories containing the data for each set of exercises.
You'll need to copy the PDF files to your local computer for viewing.

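One way to copy the slides is with scp, run from a terminal on your local computer rather than on Beocat. The sketch below is only an illustration. The username your_username, the local directory bigdata_slides, the login host headnode.beocat.ksu.edu, and the assumption that the PDF files sit at the top level of bigdata_workshop are not taken from this page, so adjust them to match your account. It prints the scp command so you can check it before running it.

```shell
# Run on your LOCAL computer, not on Beocat.
# Every value below is an assumption; edit before use.
BEOCAT_USER="your_username"              # hypothetical placeholder for your Beocat login
BEOCAT_HOST="headnode.beocat.ksu.edu"    # assumed Beocat login host
REMOTE_PDFS="bigdata_workshop/*.pdf"     # assumes the PDFs are at the top level of the copied directory
LOCAL_DIR="./bigdata_slides"             # hypothetical local folder for the slides

# Create the local folder if it does not exist yet.
mkdir -p "$LOCAL_DIR"

# Dry run: print the command instead of executing it.
# The quotes keep the local shell from expanding *.pdf; the remote side expands it.
echo scp "${BEOCAT_USER}@${BEOCAT_HOST}:${REMOTE_PDFS}" "${LOCAL_DIR}/"
```

Once the printed command looks right, remove the leading echo to perform the copy. scp will ask you to authenticate the same way ssh does.
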
Go through the Welcome slides from the Agenda website link or the PDF file
Big_Data_Welcome.pdf. Much of this information is specific to
the Bridges supercomputer at PSC, so just skim these slides.

== Intro to Big Data ==

[web link is bad]
Watch the video 'Intro to Big Data - Big Data Video 1'
(slides are A_Brief_History_of_Big_Data.pdf).

== Hadoop ==

Watch the video 'Hadoop - Big Data Video 2' (slides are <B>Hadoop2019.pdf</B>).
We do not have Hadoop on Beocat, so the commands covered in the video will not work locally.