Conduct analysis of several dozen gigabytes of transaction data using Amazon Web Services (AWS).
Project Goal and Objectives
- Keep costs to a minimum by exploiting AWS's transparent pay-per-use pricing and the flexibility to scale server capacity up or down
- Transfer the data securely to Amazon Redshift in the cloud
- Carry out an effective, high-quality data analysis
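The cost advantage of pay-per-use capacity can be illustrated with a minimal sketch. The node count, hourly rate, and duration below are hypothetical placeholders, not figures from the project:

```python
# Sketch: estimating the on-demand cost of a temporary Redshift cluster.
# Rate and cluster size are hypothetical, not the project's actual values.

def cluster_cost(nodes: int, hourly_rate_usd: float, hours: int) -> float:
    """On-demand cost: nodes * rate * hours; no charge once the cluster is shut down."""
    return nodes * hourly_rate_usd * hours

# Example: 4 nodes at a hypothetical $0.25 per node-hour, running
# around the clock for the 10-week combined project duration.
total = cluster_cost(nodes=4, hourly_rate_usd=0.25, hours=10 * 7 * 24)
print(f"${total:,.2f}")  # → $1,680.00
```

Because the cluster is deleted after the analysis (see the implementation steps below), costs stop accruing the moment the project ends.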
Requirements, Constraints and Framework
As in all data-intensive projects, data cleansing proved to be the most time-consuming part of the project. Given the large data volumes, each phase took anywhere from a few minutes to a couple of hours, so a structured approach was essential.
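A typical cleansing pass on transaction records can be sketched as follows. The record layout (`tx_id`, `amount`, `currency`) is a hypothetical example, not the project's actual schema:

```python
import csv
import io

# Minimal cleansing sketch over a hypothetical transaction layout:
# drop rows with unparseable amounts, normalize whitespace and currency
# casing, and de-duplicate on the transaction id.
raw = io.StringIO(
    "tx_id,amount,currency\n"
    "1001,19.99,EUR\n"
    "1001,19.99,EUR\n"         # exact duplicate
    "1002,not_a_number,EUR\n"  # malformed amount
    "1003, 5.00 ,eur\n"        # stray whitespace, lowercase currency
)

seen, clean = set(), []
for row in csv.DictReader(raw):
    try:
        amount = float(row["amount"].strip())
    except ValueError:
        continue  # skip rows whose amount cannot be parsed
    if row["tx_id"] in seen:
        continue  # skip duplicate transaction ids
    seen.add(row["tx_id"])
    clean.append({
        "tx_id": row["tx_id"],
        "amount": amount,
        "currency": row["currency"].strip().upper(),
    })

print(clean)  # two valid rows survive: 1001 and 1003
```

In practice such rules were applied phase by phase, which is why timing each phase and structuring the work mattered.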
Key Implementation Steps
- Transfer using a direct dedicated connection to the AWS backbone
- Design and setup of the data warehouse (DWH) cluster
- Data cleansing
- Data analysis using standard SQL clients and BI tools (Tableau, Alteryx)
- Presentation of the analysis, archiving of the data, and shutdown of the Redshift cluster
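The ingest step above typically loads compressed files from S3 into Redshift with a COPY statement. A minimal sketch of composing such a statement follows; the bucket, table, and IAM role names are hypothetical placeholders:

```python
# Sketch of the ingest step: composing a Redshift COPY statement that
# loads compressed CSV files from S3 into a staging table. All names
# below are hypothetical placeholders.

def copy_statement(table: str, s3_prefix: str, iam_role: str) -> str:
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "GZIP\n"
        "TIMEFORMAT 'auto';"
    )

sql = copy_statement(
    table="staging.transactions",
    s3_prefix="s3://example-bucket/transactions/",
    iam_role="arn:aws:iam::123456789012:role/redshift-load",
)
print(sql)
```

Running COPY from S3 lets Redshift parallelize the load across all cluster nodes, which is why the transfer lands the data in S3 first rather than inserting row by row.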
For data preparation, the setup of the Redshift cluster and transfer to the cloud:
AWS Solutions Architect: 2 weeks (50-100%)
For performing the analysis, creating the visualizations and documentation:
IT Analyst: 8 weeks (100%)