EECE.5540 Data Intensive Computing
Id: 041900
Credits: 3-3
Description
This course covers topics in data-intensive computing, addressing the challenges of managing large-scale data and methods for extracting value from big data. Specifically, we explore state-of-the-art techniques for building parallel systems and applications that perform scalable analysis of massive and complex datasets, such as those arising from scientific and engineering problems. Topics include: 1) storage requirements of big data; 2) parallel and distributed computing systems in both high-performance computing (HPC) and commercial domains; 3) data-parallel frameworks such as MapReduce/Hadoop/Spark; 4) parallel file systems such as HDFS/Lustre; 5) NoSQL data models such as Dynamo/BigTable/Cassandra; and 6) time-series data models such as InfluxDB/Prometheus.
Prerequisites
EECE.4520 Microprocessor Systems II & Embedded Systems, or EECE.4811 Operating Systems, or EECE.4821 Computer Architecture & Design, or Permission of Instructor.
Course prerequisites/corequisites are determined by the faculty and approved by the curriculum committees. Students are required to fulfill these requirements prior to enrollment. For courses offered through online or GPS delivery, students are responsible for confirming with the instructor or department that all enrollment requirements have been satisfied before registering.