Understand Redshift Cluster Storage Space

The amount of disk storage space allocated to two tables that are in different Amazon Redshift clusters can vary significantly, even if the tables are created using the same data definition language (DDL) statements and contain the same number of rows. In the following scenario, the difference in disk storage space consumed by each table is determined by:

  • The number of populated slices on each Amazon Redshift cluster
  • The number of table segments used by each table

The minimum disk space is the smallest data footprint that a table can have on an Amazon Redshift cluster. You can check the minimum table size when analyzing cluster storage use or when resizing an Amazon Redshift cluster. You can calculate the minimum disk space using the following formulas:

  • For tables created using the KEY or EVEN distribution style:
    Minimum table size = block_size (1 MB) * (number_of_user_columns + 3 system columns) * number_of_populated_slices * number_of_table_segments.
  • For tables created using the ALL distribution style:
    Minimum table size = block_size (1 MB) * (number_of_user_columns + 3 system columns) * number_of_cluster_nodes * number_of_table_segments.
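The two formulas above can be expressed as a small helper. This is only a sketch for checking the arithmetic: the function and parameter names are illustrative and are not part of any Amazon Redshift API, and the sample values (10 user columns, 4 populated slices, 2 nodes) are made up for the example.

```python
BLOCK_SIZE_MB = 1    # block size used in the formulas above
SYSTEM_COLUMNS = 3   # system columns counted in addition to the user columns


def min_table_size_mb(user_columns, segments, dist_style,
                      populated_slices=0, nodes=0):
    """Return the minimum table footprint in MB (hypothetical helper).

    dist_style is "KEY", "EVEN", or "ALL". KEY and EVEN tables scale with
    the number of populated slices; ALL tables scale with the node count.
    """
    style = dist_style.upper()
    if style in ("KEY", "EVEN"):
        multiplier = populated_slices
    elif style == "ALL":
        multiplier = nodes
    else:
        raise ValueError(f"unknown distribution style: {dist_style}")
    return BLOCK_SIZE_MB * (user_columns + SYSTEM_COLUMNS) * multiplier * segments


# Illustrative values only: 10 user columns, table with a sort key (2 segments).
print(min_table_size_mb(10, 2, "EVEN", populated_slices=4))  # 1 * 13 * 4 * 2 = 104
print(min_table_size_mb(10, 2, "ALL", nodes=2))              # 1 * 13 * 2 * 2 = 52
```

Passing the multiplier as a keyword argument keeps the difference between the two distribution-style formulas visible at the call site.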

If two Amazon Redshift tables share the following attributes:

  • Created with identical DDL statements
  • Contain the same number of rows
  • Haven't been manually modified

Then the disk storage space allocated to each table can still vary depending on:

  • The number of cluster slices populated by the table, for the KEY and EVEN distribution styles
  • The number of nodes in the cluster, for the ALL distribution style
  • The number of segments in the table

If an Amazon Redshift table has a sort key, the table has two segments—one sorted segment and one unsorted segment. If an Amazon Redshift table has no sort key, all data is unsorted, and therefore the table has one unsorted segment.

When data is added to an existing table with a sort key, the new data is kept in a separate segment that contains unsorted data. The data is not merged into the original sorted segment until a VACUUM operation is performed. For more information, see Managing the Volume of Merged Rows.

Note: The VACUUM operation merges the unsorted data into the sorted segment. However, the table still keeps an unsorted segment for future loads.

The variable number_of_table_segments is one of three values that represent the number of table segments to allocate for Amazon Redshift tables:

    0: A table has never been loaded; allocate disk space for zero table segments.

    1: A table without a sort key has been loaded one or more times.

    2: A table with a sort key has been loaded one or more times.
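The three cases above map onto a simple decision. The function below is a minimal sketch with hypothetical names; it only encodes the rule as stated here and does not query a cluster.

```python
def number_of_table_segments(has_been_loaded, has_sort_key):
    """Return the segment count used in the minimum table size formula.

    0: the table has never been loaded
    1: loaded at least once, no sort key (one unsorted segment)
    2: loaded at least once, with a sort key (sorted + unsorted segments)
    """
    if not has_been_loaded:
        return 0
    return 2 if has_sort_key else 1


print(number_of_table_segments(False, True))   # 0
print(number_of_table_segments(True, False))   # 1
print(number_of_table_segments(True, True))    # 2
```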

Example minimum table size calculations:

If a table with a sort key has 125 user columns and resides on a cluster with 16 slices, then the smallest size the table can have when populating all 16 slices is calculated as follows:

1 MB * (125 + 3) * 16 * 2 = 4096 MB

If a table is created with an identical DDL statement and resides on a two-slice cluster with both slices populated, then the minimum table size calculation shows that the table uses significantly less disk storage:

1 MB * (125 + 3) * 2 * 2 = 512 MB

If a table is created with an identical DDL statement and the table resides on a cluster with 64 populated slices, then the following minimum table size calculation dictates that the table uses significantly more disk storage:

1 MB * (125 + 3) * 64 * 2 = 16384 MB

As these examples show, the minimum table size grows or shrinks with the number of slices populated on the cluster.
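The three calculations can be reproduced side by side to show the linear relationship between populated slices and minimum size. This is a sketch for checking the arithmetic only; `min_size_mb` is a hypothetical helper name, and it covers the KEY and EVEN formula, not ALL.

```python
def min_size_mb(user_columns, populated_slices, segments):
    # 1 MB block * (user columns + 3 system columns) * slices * segments
    return 1 * (user_columns + 3) * populated_slices * segments


# Same table definition (125 user columns, sort key => 2 segments),
# placed on clusters with different numbers of populated slices.
for slices in (2, 16, 64):
    print(f"{slices:>2} slices -> {min_size_mb(125, slices, 2)} MB")
# Prints:
#  2 slices -> 512 MB
# 16 slices -> 4096 MB
# 64 slices -> 16384 MB
```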
