Comparison of Amazon Athena, Amazon Redshift, Redshift Spectrum, and Redshift Serverless
This post compares Amazon Athena with Amazon Redshift, Amazon Redshift Spectrum, and Amazon Redshift Serverless.
Amazon Athena is a serverless, interactive query service to query data and analyze big data in Amazon S3 using standard SQL. For more information on Amazon Athena, refer to our post – http://www.cloudinfonow.com/amazon-athena/
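As a rough illustration of what querying S3 data with standard SQL looks like from code, the sketch below builds the arguments for the `start_query_execution` call in boto3's Athena client. The database name `web_logs`, the table `access_logs`, and the bucket `example-bucket` are hypothetical placeholders, and actually submitting the query assumes boto3 is installed and AWS credentials are configured.

```python
def build_athena_request(sql: str, database: str, output_s3_uri: str) -> dict:
    """Build keyword arguments for boto3's Athena start_query_execution call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output_s3_uri must be an s3:// URI")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(sql: str, database: str, output_s3_uri: str) -> str:
    """Submit the query and return its execution ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("athena")
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_s3_uri)
    )
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = build_athena_request(
    "SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="web_logs",
    output_s3_uri="s3://example-bucket/athena-results/",
)
```

The query runs asynchronously: `run_query` only returns an execution ID, and results are read later from the output location or through the client's result APIs.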
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools. For more information on Amazon Redshift, refer to our post – http://www.cloudinfonow.com/amazon-redshift-serverless/
Feature | Amazon Athena | Amazon Redshift |
Overview | Serverless, interactive query service to query data and analyze big data in Amazon S3 using standard SQL. | Fast, fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to efficiently analyze all your data using your existing business intelligence tools |
Compute & Scaling | Serverless. Athena scales automatically as your datasets and number of users grow. | Amazon Redshift starts with a configured compute capacity, but can add and remove capacity automatically through the Concurrency Scaling feature. Amazon Redshift Serverless, introduced in Dec 2021, provides auto scaling capabilities. |
Storage | Athena stores no data. Queries run against data already stored in S3 buckets through external tables. | Amazon Redshift stores data in Redshift clusters, backed by Redshift Managed Storage in S3. Amazon Redshift Spectrum queries data already stored in S3 buckets through external tables. |
Performance | Highly performant. Varies by use case. Amazon Athena provides the easiest way to run ad hoc queries for data in S3 without the need to set up or manage any servers. | Highly performant. Varies by use case. Fastest query performance for enterprise reporting and business intelligence workloads, particularly those involving extremely complex SQL with multiple joins and sub-queries. |
Ideal Use Cases | Athena is great if you just need to run a quick query on some web logs to troubleshoot a performance issue on your site. | Amazon Redshift is your best choice when you need to pull together data from many different sources (such as inventory systems, financial systems, and retail sales systems) into a common format, and store it for long periods of time, to build sophisticated business reports from historical data. |
ACID Transactions | ACID compliant through AWS Lake Formation governed tables and Apache Iceberg tables | ACID compliant |
Cost | Ideal for ad hoc queries because pricing is per TB of data scanned | Ideal for consistent data loads and steady usage, which can be discounted through Reserved Instance purchases |
Security | Amazon Athena allows you to control access to your data by using IAM policies, ACLs, and Amazon S3 bucket policies. Supports row-level, cell-level, and column-level permissions. | Redshift security is built in at no extra cost. Redshift has extensive options for securing data at rest and in transit. Supports column-level permissions. |
Federation | Athena enables you to run SQL queries across data stored in relational, non-relational, object, and custom data sources. | Supports Federation. Federated queries can work with external databases in Amazon RDS for PostgreSQL, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, and Amazon Aurora MySQL-Compatible Edition. |
Machine Learning | Integrated with SageMaker. You can invoke your SageMaker machine learning models in an Athena SQL query to run inference. Model training and building are not supported. | Integrated with SageMaker. Use SQL statements to create and train Amazon SageMaker machine learning models using your Redshift data, and then use these models to make predictions. |
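
The Cost row above can be made concrete with a small back-of-the-envelope calculation. The sketch below estimates a per-query Athena charge from the bytes scanned and finds how many queries per month it would take to match a fixed monthly warehouse cost. The default price of 5.00 USD per TB, the 10 MB minimum billed per query, the binary definition of a terabyte, and the 500 USD monthly figure are all assumptions for illustration; check the current AWS pricing pages for your region before relying on any of them.

```python
import math

BYTES_PER_TB = 1024 ** 4           # assumption: binary terabyte
MIN_BILLED_BYTES = 10 * 1024 ** 2  # assumption: 10 MB minimum billed per query


def athena_scan_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimated cost in USD of one query that scans `bytes_scanned` bytes."""
    billed = max(bytes_scanned, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * price_per_tb


def monthly_athena_cost(queries: int, avg_bytes_scanned: int,
                        price_per_tb: float = 5.0) -> float:
    """Estimated monthly cost for `queries` queries of similar size."""
    return queries * athena_scan_cost(avg_bytes_scanned, price_per_tb)


def breakeven_queries(fixed_monthly_cost: float, avg_bytes_scanned: int,
                      price_per_tb: float = 5.0) -> int:
    """Queries per month at which per-scan pricing reaches a fixed monthly cost."""
    per_query = athena_scan_cost(avg_bytes_scanned, price_per_tb)
    return math.ceil(fixed_monthly_cost / per_query)


# Hypothetical workload: each query scans half a terabyte, and the
# provisioned alternative costs a flat 500 USD per month.
half_tb = BYTES_PER_TB // 2
print(breakeven_queries(500.0, half_tb))
```

Below the break-even count, per-scan pricing is the cheaper option in this simplified model; above it, the fixed-cost warehouse wins. The model ignores storage, data transfer, compression, and partition pruning, all of which change the bytes scanned in practice.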