This project investigates the 3Vs of Big Data (i.e., Volume, Variety, and Velocity).
Volume: Because data volume is growing exponentially, parallelism techniques are needed to achieve reasonable query response times. The main focus will be on parallel query processing, which is the main driver for Big Data processing.
Variety: Data comes in a variety of formats, not only in the traditional tabular form used in relational database systems. We will investigate the richness of data formats, including graph data and hierarchical data formats.
Velocity: Data arrives at a faster rate than we can absorb. Common data sources include mobile data and GPS data, which carry location-based information such as mobile movements, mobile activities, and mobile navigation. The focus will be on IoT data processing.
References: High Performance Parallel Database Processing and Grid Databases (Wiley, 2008)
- Be familiar with Big Data technologies (e.g. Apache Spark, Kafka)
- Have prior knowledge of or experience in parallel processing or data stream/IoT processing
- Have strong programming skills
- Must have done FIT5148 or FIT5202 or similar units (Monash students only)