Due to time constraints during the internship and the priority given to core infrastructure stability, this section is presented as an advanced concepts. The objective is to upgrade the data flow from Batch Processing to Real-time Streaming Processing to optimize system responsiveness.
2. Description
The proposed system utilizes Amazon Kinesis Data Firehose to stream logs directly from CloudWatch Logs to the S3 Data Lake instead of waiting for data batching.
This reduces data latency from minutes to seconds, providing critical support for monitoring sensitive systems.
3. Architecture
Kinesis Data Stream: Acts as an input for high volumes of logs from multiple resources without causing system bottlenecks.
Kinesis Data Analytics: Executes SQL queries on-the-fly to detect anomalies or system errors instantly.
Kinesis Data Firehose: Automatically converts data formats to Parquet before storing in S3 to optimize costs and performance for Amazon Athena.
4. Benefits
Immediate Response: Error alerts via SNS will be triggered almost instantly when error logs are detected through Kinesis Analytics.
Powerful Scalability: The system can process terabytes of logs daily without the need for manual server management or maintenance.