Amazon Redshift

Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and existing Business Intelligence (BI) tools.

Redshift is a SQL-based data warehouse used for analytics applications.

Redshift is a relational database used for Online Analytical Processing (OLAP) use cases.

Redshift runs complex analytic queries against petabytes of structured data, using sophisticated query optimization, columnar storage on high-performance local disks, and massively parallel query execution.

Redshift is ideal for processing large amounts of data for business intelligence.

Redshift can run analytic queries up to 10x faster than a traditional row-oriented SQL database.

Redshift uses columnar data storage:

Data is stored sequentially in columns instead of rows.
A column-based database is ideal for data warehousing and analytics.
Queries read only the columns they need, so they require fewer I/O operations, which greatly improves performance.
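
To make the I/O point concrete, here is a minimal Python sketch that lays the same small table out by row and by column, then counts how many values must be read to sum one column. The table contents, the column names, and the helper functions are all hypothetical and only model the idea; this is not how Redshift stores blocks on disk.

```python
# Toy model: the same table stored row-wise and column-wise.
# All names and values here are illustrative assumptions.
rows = [
    (1, "alice", "AU", 120.0),
    (2, "bob", "US", 80.5),
    (3, "carol", "AU", 42.0),
]
COLUMNS = ("id", "name", "country", "amount")


def to_columnar(rows, columns):
    """Pivot row tuples into one contiguous list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_row_store(rows, col_index):
    """Sum one column in a row layout: every field of every row is read."""
    values_read = 0
    total = 0.0
    for row in rows:
        values_read += len(row)  # the whole row comes off disk together
        total += row[col_index]
    return total, values_read


def sum_column_store(table, name):
    """Sum one column in a column layout: only that column is read."""
    column = table[name]
    return sum(column), len(column)


table = to_columnar(rows, COLUMNS)
row_total, row_reads = sum_row_store(rows, COLUMNS.index("amount"))
col_total, col_reads = sum_column_store(table, "amount")

print(row_total, row_reads)  # same total, 12 values touched
print(col_total, col_reads)  # same total, 3 values touched
```

Both layouts give the same answer, but the column layout touches only a quarter of the values in this four-column table. The gap widens as tables get wider, which is typical in analytics.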

Redshift provides advanced compression:

Because data is stored sequentially in columns, similar values sit next to each other and compress well, which improves performance and reduces storage space.
Redshift automatically selects the compression scheme.
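
As an illustration of why column-ordered data compresses well, the sketch below applies run-length encoding to a single column in Python. Run-length encoding is only one possible scheme, chosen here because it is easy to read; the `country` column and both helper functions are hypothetical, and this is not Redshift's implementation.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in runs for _ in range(count)]


# A hypothetical low-cardinality column, stored contiguously.
country = ["AU"] * 4 + ["US"] * 3 + ["NZ"] * 2

encoded = rle_encode(country)
print(encoded)                  # 3 runs instead of 9 values
print(rle_decode(encoded) == country)
```

In a row layout these values would be interleaved with ids, names, and amounts, so long runs like this would rarely appear.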

Redshift uses replication and continuous backups to enhance availability and improve durability, and it can automatically recover from component and node failures.

Redshift always keeps three copies of your data:

The original.
A replica on compute nodes (within the cluster).
A backup copy on S3.

Redshift provides continuous/incremental backups:

Multiple copies within a cluster.
Continuous and incremental backups to S3.
Continuous and incremental backups across regions.
Streaming restore.
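
The idea behind an incremental backup can be sketched in a few lines of Python: each run copies only the blocks whose content changed since the previous run. This is a toy model under stated assumptions. The block ids, the in-memory `backup` dictionary standing in for S3, and the `incremental_backup` helper are all hypothetical and do not reflect Redshift's internal snapshot format.

```python
import hashlib


def block_digest(data: bytes) -> str:
    """Fingerprint a block so changes can be detected cheaply."""
    return hashlib.sha256(data).hexdigest()


def incremental_backup(blocks, backup):
    """Copy only new or changed blocks into `backup`.

    `blocks` maps block id -> bytes; `backup` maps block id -> (digest, bytes).
    Returns the ids of the blocks that were copied in this run.
    """
    copied = []
    for block_id, data in blocks.items():
        digest = block_digest(data)
        stored_digest = backup.get(block_id, (None, None))[0]
        if stored_digest != digest:
            backup[block_id] = (digest, data)
            copied.append(block_id)
    return copied


# Hypothetical cluster blocks and an empty backup store.
cluster = {0: b"orders-2023", 1: b"orders-2024", 2: b"customers"}
backup = {}

first = incremental_backup(cluster, backup)   # first run copies everything
cluster[1] = b"orders-2024-updated"           # one block changes
second = incremental_backup(cluster, backup)  # second run copies only that block

print(first, second)
```

After the first full copy, each later run transfers only the changed blocks, which is what keeps continuous backups cheap.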

Redshift provides fault tolerance for the following failures:

Disk failures.
Node failures.
Network failures.
AZ/Region-level disasters.

Boyang Yan
