Comparison of Kinesis Data Streams vs Kinesis Data Firehose vs AWS DMS vs AWS Glue
Characteristic | Kinesis Data Streams | Kinesis Data Firehose | AWS DMS | AWS Glue |
Scalability | Each shard supports writes of up to 1,000 records per second and 1 MB/sec of data input, and reads of up to 2 MB/sec of data output. You can add shards to raise a stream's capacity; default account quotas apply but can be increased. | Kinesis Data Firehose automatically scales to match the throughput of your data, without manual intervention or developer overhead. | AWS DMS runs on an Amazon EC2 replication instance. You can scale the replication instance up or down, depending on utilization. | AWS Glue uses a scale-out Apache Spark environment to load your data into its destination. To scale out, you specify the number of DPUs (data processing units) to allocate to your ETL jobs. Auto Scaling for AWS Glue was introduced at re:Invent 2021. |
Fault tolerance | Amazon Kinesis Data Streams synchronously replicates data across three Availability Zones, providing high availability and data durability. | Amazon Kinesis Data Firehose synchronously replicates data across three Availability Zones, providing high availability and data durability. | You can enable Multi-AZ, which provides a fault-tolerant replication stream through redundant replication servers. | AWS Glue connects to the data source of your choice, whether that is a file in Amazon S3, an Amazon RDS table, or another data set. As a result, the durability and availability of your data are those of the underlying data store. By default, AWS Glue retries a failed job three times before sending an error notification. To be informed of job failures or completions, you can set up Amazon SNS notifications via Amazon CloudWatch actions. |
Pricing | You pay per shard hour and per PUT payload unit. Optionally, there are fees associated with extended data retention and enhanced fan-out, if you choose to use those features. | You pay for the volume of data you ingest using the service and for any data format conversions. | You pay for compute resources (depending on instance type) used during the migration process and any additional log storage. | With AWS Glue, you pay an hourly rate, billed by the second, for crawlers (discovering data) and ETL jobs (processing and loading data). |
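
The per-shard limits in the Scalability row can be turned into a rough sizing estimate for a Kinesis Data Streams stream. The following is a minimal sketch, not an official sizing tool: the function name `shards_needed` and its parameters are hypothetical, and it assumes every consumer reads the full stream from the shared 2 MB/sec per-shard output (that is, without enhanced fan-out).

```python
import math

# Per-shard limits as quoted in the Scalability row of the table.
RECORDS_PER_SEC_PER_SHARD = 1_000
INPUT_MB_PER_SEC_PER_SHARD = 1.0
OUTPUT_MB_PER_SEC_PER_SHARD = 2.0


def shards_needed(records_per_sec: float, avg_record_kb: float, consumers: int = 1) -> int:
    """Estimate the shard count for a stream (hypothetical helper).

    Assumes each consumer reads all the data and shares the per-shard
    output limit, i.e. no enhanced fan-out.
    """
    input_mb_per_sec = records_per_sec * avg_record_kb / 1024
    output_mb_per_sec = input_mb_per_sec * consumers

    # The stream must satisfy all three limits at once, so take the largest.
    by_record_count = math.ceil(records_per_sec / RECORDS_PER_SEC_PER_SHARD)
    by_input = math.ceil(input_mb_per_sec / INPUT_MB_PER_SEC_PER_SHARD)
    by_output = math.ceil(output_mb_per_sec / OUTPUT_MB_PER_SEC_PER_SHARD)

    # A stream always has at least one shard.
    return max(1, by_record_count, by_input, by_output)


if __name__ == "__main__":
    # Many small records: the 1,000 records/sec limit dominates.
    print(shards_needed(records_per_sec=5_000, avg_record_kb=0.5))
    # Fewer, larger records with three consumers: the output limit dominates.
    print(shards_needed(records_per_sec=500, avg_record_kb=4, consumers=3))
```

The estimate ignores traffic spikes and uneven partition-key distribution, so a real deployment would normally add headroom on top of the returned value.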