
Using Amazon EFS to Persist Data from Amazon ECS Containers

My colleagues Jeremy Cowan and Drew Dennis sent a nice guest post that shows how to use Amazon Elastic File System with Amazon ECS.

Docker containers are ideal for building microservices because they’re quick to provision, easily portable, and provide process isolation. While these services are generally ephemeral and stateless, there are times when you want to persist data to disk or share it among multiple containers; for example, when you run MySQL in a Docker container, capture application logs, or need temporary scratch space for processing data.

In this post, I’ll discuss how to persist data from Docker containers to Amazon Elastic File System (Amazon EFS), a storage service for Amazon EC2 instances based on the NFSv4 protocol.

Note: AWS offers a number of options to store data on Amazon EC2 instances, including Amazon EBS General Purpose, Amazon EBS Provisioned IOPS, and Amazon EFS. Review the I/O characteristics of your workload to select the most appropriate storage.

Amazon EC2 Container Service (Amazon ECS) is a highly scalable, high-performance container management service that supports Docker containers and allows you to run applications easily on a managed cluster of EC2 instances. The ECS service scheduler places tasks—groups of containers used for your application—onto container instances in the cluster, monitors their performance and health, and restarts failed tasks as needed.

Using task definitions, you can define the properties of the containers you want to run together and configure containers to store data on the underlying ECS container instance that your task is running on. Because tasks can be restarted on any ECS container instance in the cluster, you need to consider whether the data is temporary or needs to persist. If your container needs access to the original data each time it starts, you require a file system that your containers can connect to regardless of which instance they’re running on. That’s where EFS comes in.

EFS allows you to persist data onto a durable shared file system that all of the ECS container instances in the ECS cluster can use. Moreover, with EFS you don’t need to monitor available disk space on your ECS cluster instances because the file system grows automatically as the amount of data increases. You pay only for the amount of data stored in the file system. Lastly, data management becomes much simpler because all your data can be stored on a single EFS volume.

Provisioning an ECS cluster

For this post, I used an AWS CloudFormation template, which is available for download. I’ll walk through what the template does for you.

Note: This template requires you to be enrolled in the Amazon EFS preview.

Networking and security

The first thing the template does is create a VPC with subnets, an Internet gateway, and associated routes. After the network infrastructure is in place, it creates two security groups: one for the EFS file system mount targets and another for the ECS container instances. Two inbound rules and one outbound rule are then added to the ECS security group. The inbound rules allow SSH (22) and MySQL (3306) inbound from anywhere (0.0.0.0/0).

Note: This is for demonstration purposes only. We do not recommend creating rules that allow unfettered access to resources in your VPC.

These rules allow you to connect to the ECS container instances and containers themselves using an SSH and MySQL client which I’ll demonstrate later. The outbound rule allows all traffic outbound to anywhere, and is primarily there to allow the ECS container instances to connect to the EFS file system via mount targets in your VPC. Next, the template adds an inbound and outbound rule to the EFS security group. The inbound rule allows EFS (2049) traffic inbound from the VPC CIDR range. The outbound rule allows all traffic from anywhere. Together, these rules allow your ECS container instances to connect to the EFS mount points. If you’re unfamiliar with how to create security group rules, see Adding Rules to a Security Group.
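If you’re building this environment by hand rather than with the template, the rules described above can be expressed as AWS CLI calls. The sketch below only prints the calls (via a small `run` helper) rather than executing them, and the group IDs and CIDR range are hypothetical placeholders—substitute your own:

```shell
run() { echo "+ $*"; }   # print each call instead of executing it
ECS_SG=sg-11111111       # hypothetical ECS container instance security group
EFS_SG=sg-22222222       # hypothetical EFS mount target security group
VPC_CIDR=10.0.0.0/16     # hypothetical VPC CIDR range

# ECS group: SSH and MySQL inbound from anywhere (demonstration only)
run aws ec2 authorize-security-group-ingress --group-id "$ECS_SG" \
    --protocol tcp --port 22 --cidr 0.0.0.0/0
run aws ec2 authorize-security-group-ingress --group-id "$ECS_SG" \
    --protocol tcp --port 3306 --cidr 0.0.0.0/0

# EFS group: NFS (2049) inbound from the VPC CIDR range
run aws ec2 authorize-security-group-ingress --group-id "$EFS_SG" \
    --protocol tcp --port 2049 --cidr "$VPC_CIDR"
```

Remove the `run` prefix to execute the calls against your own account.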

IAM roles

In addition to creating security group rules, the template creates an IAM instance role with a managed policy that allows the EC2 container instances to register with and deregister from the ECS cluster, create an ECS cluster, and perform a handful of other actions. The policy looks like the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:Submit*"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

For more information about instance roles, see IAM Roles for Amazon EC2.
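If you prefer to create the instance role yourself, the sketch below shows one way to do it with the AWS CLI, using the policy document shown above. The role, policy, and instance profile names are hypothetical, and the calls are only printed, not executed:

```shell
run() { echo "+ $*"; }   # print each call instead of executing it

# Save the policy shown above to a local file
cat > ecs-instance-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecs:CreateCluster",
        "ecs:DeregisterContainerInstance",
        "ecs:DiscoverPollEndpoint",
        "ecs:Poll",
        "ecs:RegisterContainerInstance",
        "ecs:Submit*"
      ],
      "Resource": ["*"]
    }
  ]
}
EOF

# The trust policy file (allowing ec2.amazonaws.com to assume the role)
# is assumed to exist and is not shown here
run aws iam create-role --role-name ecsInstanceRole \
    --assume-role-policy-document file://ec2-trust-policy.json
run aws iam put-role-policy --role-name ecsInstanceRole \
    --policy-name ecsInstancePolicy --policy-document file://ecs-instance-policy.json
run aws iam create-instance-profile --instance-profile-name ecsInstanceProfile
run aws iam add-role-to-instance-profile --instance-profile-name ecsInstanceProfile \
    --role-name ecsInstanceRole
```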

EFS file systems

After the security infrastructure is in place, the template creates an EFS file system and associated mount points in each of the VPC subnets. As part of the EFS provisioning process, the template adds the EFS security group you created earlier to each mount target. This allows EC2 instances within the VPC to connect to the EFS file system.

Note: When using EFS, we recommend connecting to a mount target in the same Availability Zone as your EC2 instance.

Finally, the template assigns a key-value pair to the volume. For more information about creating EFS file systems, see Getting Started with Amazon Elastic File System.
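The same provisioning steps can be sketched with the AWS CLI. The subnet and security group IDs below are hypothetical, the file system ID is illustrative, and the calls are printed rather than executed:

```shell
run() { echo "+ $*"; }                        # print each call instead of executing it
SUBNET_IDS="subnet-aaaa1111 subnet-bbbb2222"  # hypothetical VPC subnets
EFS_SG=sg-22222222                            # hypothetical EFS security group

# Create the file system; the real call returns a FileSystemId like the one below
run aws efs create-file-system --creation-token efs-docker-demo
FS_ID=fs-12345678   # illustrative

# One mount target per subnet, each with the EFS security group attached
for subnet in $SUBNET_IDS; do
  run aws efs create-mount-target --file-system-id "$FS_ID" \
      --subnet-id "$subnet" --security-groups "$EFS_SG"
done

# Tag the file system so the bootstrap script can find it by Name
run aws efs create-tags --file-system-id "$FS_ID" \
    --tags Key=Name,Value=efs-docker
```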

Load balancer (optional)

The template also provisions an ELB load balancer, which is subsequently added to the ELB security group created earlier. The load balancer is used to distribute traffic across ECS tasks that run on separate and distinct ECS container instances. An ECS service is a set of ECS tasks that run on the cluster. During the configuration of a service, you specify how many instances of a particular task definition to run on the cluster. I’ll discuss creating a service later in this post.

Auto Scaling group and launch configuration

The template creates an Auto Scaling group and launch configuration as well. The Auto Scaling group is used to set the minimum, maximum, and initial size of the ECS cluster and the launch configuration specifies which AMI, instance type, IAM role, user data, and other EC2 instance properties to use when bootstrapping a new instance. While not part of this CloudFormation template, you could create an Amazon CloudWatch alarm to trigger an Auto Scaling event that adds ECS container instances to the cluster automatically when the cluster’s capacity drops below a particular threshold.
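The CloudWatch alarm mentioned above could be sketched as follows. All names are hypothetical, the policy ARN is illustrative (the real ARN is returned by `put-scaling-policy`), and the calls are printed rather than executed:

```shell
run() { echo "+ $*"; }   # print each call instead of executing it

# Add one instance to the cluster's Auto Scaling group when triggered
run aws autoscaling put-scaling-policy --auto-scaling-group-name ecs-demo-asg \
    --policy-name add-one-instance --scaling-adjustment 1 \
    --adjustment-type ChangeInCapacity
POLICY_ARN="arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy/example"  # illustrative

# Fire when the cluster's average CPU reservation stays above 75%,
# i.e., spare capacity has dropped below 25%
run aws cloudwatch put-metric-alarm --alarm-name ecs-low-capacity \
    --namespace AWS/ECS --metric-name CPUReservation \
    --dimensions Name=ClusterName,Value=default \
    --statistic Average --period 300 --evaluation-periods 1 \
    --threshold 75 --comparison-operator GreaterThanThreshold \
    --alarm-actions "$POLICY_ARN"
```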

Bootstrapping ECS cluster instances

The CloudFormation template creates an Auto Scaling launch configuration based on the user data script below to bootstrap instances automatically and add them to your cluster. However, if you’ve chosen to create your own environment using an alternate method, you can use the script below as a reference.

Note: This script has multiple dependencies. Carefully review the list below before running the script in your environment:

  • A cluster named ‘default’
  • An ECS-optimized AMI from Amazon
  • An EC2 instance on a public subnet
  • An EC2 security group that allows all traffic outbound
  • An EFS security group that allows 2049 inbound and outbound
  • An instance role with an attached ECS inline policy and an attached AmazonElasticFileSystemReadOnly managed policy
  • Read and write access to the EFS file system
  • An EFS file system with the key-value pair, Name:efs-docker. The tag is used by the script to identify the EFS file system to which to connect

For a general overview of Auto Scaling groups and launch configurations, see What Is Auto Scaling?

Content-Type: multipart/mixed; boundary="===============BOUNDARY=="
MIME-Version: 1.0

--===============BOUNDARY==
MIME-Version: 1.0
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/bash
#Put your standard user data here
echo "extra standard user data"

--===============BOUNDARY==
MIME-Version: 1.0
Content-Type: text/cloud-boothook; charset="us-ascii"

#cloud-boothook
#Join the default ECS cluster
echo ECS_CLUSTER=default >> /etc/ecs/ecs.config
PATH=$PATH:/usr/local/bin
#Instance should be added to a security group that allows HTTP outbound
yum -y update
#Install jq, a JSON parser
yum -y install jq
#Install NFS client
if ! rpm -qa | grep -qw nfs-utils; then
    yum -y install nfs-utils
fi
if ! rpm -qa | grep -qw python27; then
	yum -y install python27
fi
#Install pip
yum -y install python27-pip
#Install awscli
pip install awscli
#Upgrade to the latest version of the awscli
#pip install --upgrade awscli
#Add support for EFS to the CLI configuration
aws configure set preview.efs true
#Get region of EC2 from instance metadata
EC2_AVAIL_ZONE=`curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone`
EC2_REGION="`echo \"$EC2_AVAIL_ZONE\" | sed -e 's:\([0-9][0-9]*\)[a-z]*\$:\\1:'`"
#Create mount point
mkdir /mnt/efs
#Get EFS FileSystemID attribute
#Instance needs to be added to an EC2 role that gives the instance at least read access to EFS
EFS_FILE_SYSTEM_ID=`/usr/local/bin/aws efs describe-file-systems --region $EC2_REGION | jq '.FileSystems[]' | jq 'select(.Name=="efs-docker")' | jq -r '.FileSystemId'`
#Check to see if the variable is set. If not, log an error and exit.
if [ -z "$EFS_FILE_SYSTEM_ID" ]; then
	echo "ERROR: EFS_FILE_SYSTEM_ID is not set" 1> /etc/efssetup.log
	exit 1
fi
#Instance needs to be a member of security group that allows 2049 inbound/outbound
#The security group that the instance belongs to has to be added to EFS file system configuration
#Create variables for source and target
DIR_SRC=$EC2_AVAIL_ZONE.$EFS_FILE_SYSTEM_ID.efs.$EC2_REGION.amazonaws.com
DIR_TGT=/mnt/efs 
#Mount EFS file system
mount -t nfs4 $DIR_SRC:/ $DIR_TGT
#Backup fstab
cp -p /etc/fstab /etc/fstab.back-$(date +%F)
#Append line to fstab
echo -e "$DIR_SRC:/ \t\t $DIR_TGT \t\t nfs4 \t\t defaults \t\t 0 \t\t 0" | tee -a /etc/fstab
--===============BOUNDARY==--
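To confirm that the boothook actually mounted the file system, a quick check of the `mount` output does the trick. The helper below is a minimal sketch; the sample line it parses is illustrative of how an EFS mount appears on an instance (the DNS name is made up):

```shell
# Check whether an NFSv4 mount exists at the expected mount point
check_efs_mount() {
  # $1: output of `mount`; $2: expected mount point
  if echo "$1" | grep -q "on $2 type nfs4"; then
    echo "EFS mounted at $2"
  else
    echo "EFS NOT mounted at $2"
  fi
}

# On a live instance you would call: check_efs_mount "$(mount)" /mnt/efs
# Sample mount line for demonstration (hostname is illustrative):
SAMPLE="us-west-2a.fs-12345678.efs.us-west-2.amazonaws.com:/ on /mnt/efs type nfs4 (rw,vers=4.0)"
check_efs_mount "$SAMPLE" /mnt/efs
```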

ECS cluster

After all the previous steps are completed, an ECS cluster is created. With the cluster in place, the template’s Auto Scaling group launches two instances, which automatically join the cluster as they’re bootstrapped. When the template is finished running, the ECS console should look similar to the following screenshot.

If you prefer to create a cluster from scratch instead, you can follow the directions in the documentation Setting Up with Amazon ECS.

Creating an ECS task definition

Now that the infrastructure is ready, you can create a task that persists data on the EFS file system. For this example I use MySQL, as you generally want to persist data that’s stored in a database.

  1. Select the cluster created by the CloudFormation template.
  2. Choose Create new task definition.
  3. Give your container a name, e.g., MySQL.
  4. Choose Add volume.
  5. In the Name field, type efs. You use this value to reference the EFS mount point later.
  6. In the Source path field, type /mnt/efs/mysql.
    Note: This is the path to your EFS file system mounted on the EC2 container instance.
  7. Choose Add container definition.
  8. Enter a name for the container, e.g., MySQL.
  9. In the Image field, type the name of the image in the Docker registry, e.g., mysql, which retrieves the official MySQL image from Docker Hub.
  10. Assign the appropriate amount of memory and CPU units.
  11. Assign port mappings, if necessary.
    Note: The default port for MySQL is 3306.
  12. In the Source volume field, type the name you gave to the file system, e.g., efs.
  13. In the Container path field, type the name of the directory you want to persist on to the EFS volume, e.g., /var/lib/mysql.
  14. In the Environment variables field, type “MYSQL_ROOT_PASSWORD” for the key and “password” for the value.

Note: This is for demonstration purposes only. We do not recommend using plaintext environment variables for sensitive values.

Alternatively, you can copy and paste the following text into the JSON tab:

{
  "family": "MySQL",
  "containerDefinitions": [
    {
      "name": "MySQL",
      "image": "mysql",
      "cpu": 10,
      "memory": 500,
      "entryPoint": [],
      "environment": [
        {
          "name": "MYSQL_ROOT_PASSWORD",
          "value": "password"
        }
      ],
      "command": [],
      "portMappings": [
        {
          "hostPort": 3306,
          "containerPort": 3306,
          "protocol": "tcp"
        }
      ],
      "volumesFrom": [],
      "links": [],
      "mountPoints": [
        {
          "sourceVolume": "efs",
          "containerPath": "/var/lib/mysql",
          "readOnly": false
        }
      ],
      "essential": true
    }
  ],
  "volumes": [
    {
      "name": "efs",
      "host": {
        "sourcePath": "/mnt/efs/mysql"
      }
    }
  ]
}

When you’re finished entering all the parameters, choose Add. Upon returning to the task definitions page, select the task you created from the list and choose Run task from the Actions menu. This starts your newly-created task on an EC2 container instance in the cluster.
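If you prefer the CLI to the console, the same registration and launch can be sketched as follows, assuming you saved the JSON above locally as mysql-task.json. The calls are printed rather than executed:

```shell
run() { echo "+ $*"; }   # print each call instead of executing it

# Register the task definition from the JSON shown above
run aws ecs register-task-definition --cli-input-json file://mysql-task.json

# Start one copy of the task on the default cluster
run aws ecs run-task --cluster default --task-definition MySQL --count 1
```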

If you SSH into the container instance where your task is running and use the docker exec command to connect to the container, you can see the files in the /var/lib/mysql directory. Now exit the exec session and list the files in the /mnt/efs/mysql directory; you’ll see they’re the same, proving that the files are stored on the EFS file system.
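The verification steps above look roughly like the following on the container instance. The container ID is illustrative, and the commands are printed rather than executed since they require a live instance:

```shell
run() { echo "+ $*"; }   # print each command instead of executing it

# Find the running MySQL container
run docker ps --filter ancestor=mysql --format '{{.ID}}'
CONTAINER_ID=abc123def456   # illustrative ID returned by the command above

# List the data directory inside the container...
run docker exec "$CONTAINER_ID" ls /var/lib/mysql
# ...and the same directory on the host, backed by EFS
run ls /mnt/efs/mysql
```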

You can also connect to the container using MySQL Workbench.
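The stock mysql command-line client works too, pointed at the container instance and the mapped port. The hostname below is hypothetical, the password matches the MYSQL_ROOT_PASSWORD from the task definition, and the call is printed rather than executed:

```shell
run() { echo "+ $*"; }   # print the command instead of executing it

# Hypothetical public DNS name of the container instance running the task
run mysql -h ec2-203-0-113-10.compute-1.amazonaws.com -P 3306 -u root -p
```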

Creating a service

While a task can have a finite lifespan, the ECS service scheduler ensures that the specified number of tasks is constantly running and reschedules tasks when a task fails (for example, if the underlying container instance fails for some reason). The service scheduler can also optionally register tasks with an Elastic Load Balancing load balancer. To create a service from the task you created earlier, follow these directions:

  1. In the navigation pane, select Task Definitions.
  2. On the Task Definitions page, choose the name of the task definition you created earlier, e.g., MySQL.
  3. On the Task Definition name page, choose revision 1.
  4. Review the task definition, and choose Create Service.
  5. On the Create Service page, enter a unique name for your service in the Service name field, e.g., MySQL_Service.
  6. In the Number of tasks field, enter 1.

Note: You should only run one instance of a stateful task in a cluster. Running multiple instances of a stateful task may cause instability, as each task writes to the same directory on the EFS file system.
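The console steps above map onto a single CLI call, using the names from the example. As before, the call is printed rather than executed:

```shell
run() { echo "+ $*"; }   # print the call instead of executing it

# One desired task, per the note above about stateful workloads
run aws ecs create-service --cluster default --service-name MySQL_Service \
    --task-definition MySQL:1 --desired-count 1
```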

Conclusion

In this post, we looked at how you can use EFS to persist data from ECS containers. This allows you to run stateful containers, like MySQL, on ECS without worrying about what happens when your container is restarted on another instance in the cluster. That’s because each instance maintains a connection to the shared EFS file system where your MySQL data is stored.

In fact, if you put the ECS service we created behind an ELB load balancer, your clients don’t have to be reconfigured when your container moves. You can see an example of this in action by creating an ECS cluster with more than two EC2 instances and terminating the instance that the MySQL task is running on. Not only does the MySQL task start on another instance in the cluster, but the terminated instance is replaced with another instance.

The other advantages to using EFS are that you only pay for the amount of storage that’s being consumed, and the file system grows automatically as you add files, eliminating the need for you to monitor disk space and helping you avoid paying for unused capacity.

We hope you found this post useful and look forward to your comments about where you plan to implement this in your current and future projects.
