If your company hasn't ventured an Amazon cloud deployment already, the day may be fast approaching. Amazon's pay-as-you-go cloud is no longer "just" a popular playground for developers, a magnet for technology startups, and the clandestine home of "shadow IT" projects. It's also increasingly a component of official IT operations.
Working with the Amazon EC2 cloud isn't especially difficult, but it is different. This quick guide will get you up and running and on your way to cloud mastery. When your company finally embarks on that Amazon deployment or the next stop in your career requires cloud skills, you'll be ready to answer the call.
Learning your way around Amazon
A first look at the Amazon Web Services dashboard confronts you with a bewildering array of services. Where to start? The truth is that a few of these resources will do almost everything you need. Others you may use little or not at all. The following services are the ones that will loom largest on your radar.
EC2 (Elastic Compute Cloud). EC2 instances are the servers on which you run your workload. Although you use a Web interface or API call to provision the servers and bring them into your collection, ultimately they are real computers with CPUs, memory, and access to physical storage.
S3 (Simple Storage Service). So-called simple storage, S3 is used for persistent and very cheap storage. S3 integrates with CloudFront, Amazon's content delivery solution. If you have website content such as graphics images and CSS, these files would typically be stored in S3 and fetched by your Web server at delivery time.
EBS (Elastic Block Store). EBS is essentially a virtualized storage area network or SAN solution that all of your servers can draw on. Slice out chunks of storage for use by your instances as root or alternate volumes. You can then take snapshots of them to use for backups -- just as you would with Linux's LVM (Logical Volume Manager).
RDS (Relational Database Service). Amazon RDS is Amazon's managed relational database solution based on MySQL, Oracle, or SQL Server under the hood. When you launch a database instance, you choose the database engine you want.
ElastiCache. This is an Amazon-managed Memcached solution. You can add and remove nodes easily, and with CloudWatch monitoring, you can have Amazon replace nodes for you if they fail.
Route 53. Route 53 is an Amazon-hosted DNS solution that allows you to associate names to your provisioned computing resources. Because instances in Amazon change their IP addresses whenever they are stopped and started again, reaching those boxes via names can be much more convenient and easier to support than relying on IP addresses.
VPC (Virtual Private Cloud). VPC is a superb addition to the Amazon portfolio of services, one that may very well benefit your enterprise. VPC essentially allows you to dynamically scale your existing data center using Amazon resources. Connecting the Amazon cloud with your data center via VPN, VPC allows your existing network to route Amazon instances privately, as though they were physical machines in your data center. Get all the benefits of the cloud with none of the security headaches.
There are of course many other Amazon services available, including email sending, message queueing, workflow, search, NoSQL, MapReduce, and alternative authentication solutions. But the above are the main services to understand.
In addition to these core services, you're sure to encounter a number of Amazon vocabulary terms again and again. Before you get started, it will pay to be familiar with the following concepts.
EC2 Instance. An instance is a unit of computing power, with CPUs, memory, and attached storage.
Amazon Machine Images. An Amazon Machine Image (AMI) is essentially a snapshot of a root volume. It may initially be difficult to wrap your head around this idea, but imagine the Linux Logical Volume Manager. Like LVM, an AMI allows you to snapshot your root volume and create a block-by-block copy of everything stored on the disk. That includes the master boot record, the kernel image, and so forth. The hypervisor layer in EC2 allows you to boot from these images on generic commodity servers in the Amazon data centers.
EBS Volumes. Volumes are the block storage devices you attach to and mount on your server instances, and you can snapshot them for backups. EBS volumes persist independently of the instances themselves.
Security Groups. Amazon doesn't go with traditional perimeter security unless you're using the Virtual Private Cloud services. That means each server is its own universe, governed by security roles enforced by the hypervisor layer. This is real security, though the new paradigm may take some getting used to. Think of putting servers in groups by role, such as a database tier group, a Web server tier group, and so forth. You might even spin up a t1.micro instance and use it as a jump box. Make this instance the only machine in your environment with SSH access allowed, then grant access to all your servers' port 22 (for SSH) only from this jump box.
Load balancers. A load balancer in AWS becomes another facility that you can configure in a completely virtual way. Here's where you start to see the real power of the AWS environment. You can associate your instances to the load balancer by instance ID even if they are in different availability zones. You can configure the listener and cookie stickiness policies as well.
Availability Zones. Availability Zones are distinct data centers within an Amazon region, but deployment is nevertheless transparent. Resources can be deployed just as easily in a zone on the East Coast, the West Coast, or the other side of the world.
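The jump-box arrangement described under Security Groups can be sketched with the EC2 API tools roughly as follows. This is a minimal illustration, not a hardened setup: the group names jump and db, the administrative network 203.0.113.0/24, and the account number 111122223333 are all placeholders, and the run helper is a hypothetical wrapper that only prints each command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run helper: print each command instead of executing it unless DRY_RUN=0.
# Note that the printed form drops the quoting around multi-word arguments.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Placeholder values, replace with your own.
ADMIN_CIDR="203.0.113.0/24"   # network you administer from (documentation range)
ACCOUNT_ID="111122223333"     # AWS account number that owns the groups

setup_jump_box_groups() {
    # One group per role.
    run ec2-create-group jump -d "SSH jump box"
    run ec2-create-group db -d "Database tier"
    # Only the admin network may reach the jump box on port 22.
    run ec2-authorize jump -P tcp -p 22 -s "$ADMIN_CIDR"
    # The database tier accepts SSH only from members of the jump group.
    run ec2-authorize db -P tcp -p 22 -o jump -u "$ACCOUNT_ID"
}

setup_jump_box_groups
```

Run as shown, the script only lists the four commands it would issue, which makes it easy to review the rules before applying them for real.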
Install the Amazon EC2 API Tools
Now that you're familiar with the core offerings and vocabulary, let's try out some of the services. You'll need to create an AWS account before we can go any further. Note that a free usage tier is available for new users.
First, we'll want to install the API tools. These Java-based tools allow you to issue Amazon commands from any terminal window, whether it be your local laptop, another server, or even an instance hosted in Amazon itself. Bootstrapping indeed!
The first step is to download the tools from Amazon. Next you'll set up a couple of environment variables:
export JAVA_HOME=/usr
export EC2_HOME=/home/sean/api-tools
These are examples of the commands for Linux and Unix. For more detail on these and for the corresponding commands on Windows, see Amazon's documentation.
Create your access keys
The Amazon dashboard provides an easy way to set up your keys.
export EC2_PRIVATE_KEY=/home/sean/keys/pk-A5X4ZTZRLDEMYVHGXCQHU2HW3HALFS3T.pem
export EC2_CERT=/home/sean/keys/cert-A5X4ZTZRLDEMYVHGXCQHU2HW3HALFS3T.pem
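A quick sanity check can save a confusing first run. The following is a minimal sketch rather than part of Amazon's tooling: check_ec2_env is a hypothetical helper name, and appending $EC2_HOME/bin to PATH assumes the downloaded tools were unpacked with their usual bin directory.

```shell
#!/bin/sh
# Report which of the variables the EC2 API tools rely on are unset.
# Prints one line per missing variable and returns non-zero if any are missing.
check_ec2_env() {
    missing=0
    for name in JAVA_HOME EC2_HOME EC2_PRIVATE_KEY EC2_CERT; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return "$missing"
}

# Paths as used in this article; adjust them to your own layout.
export JAVA_HOME=/usr
export EC2_HOME=/home/sean/api-tools
export EC2_PRIVATE_KEY=/home/sean/keys/pk-A5X4ZTZRLDEMYVHGXCQHU2HW3HALFS3T.pem
export EC2_CERT=/home/sean/keys/cert-A5X4ZTZRLDEMYVHGXCQHU2HW3HALFS3T.pem
# Assumption: the command-line tools live under $EC2_HOME/bin.
export PATH="$PATH:$EC2_HOME/bin"

check_ec2_env && echo "EC2 environment looks complete"
```

Placing the exports in a shell profile keeps them available in every new terminal session. The check only confirms that the variables are set, not that the key files exist.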
Choose an Availability Zone and Region
Availability Zones are distinct data centers. It is incredible that we can distill a data center down to a short identifier such as us-east-1a or us-west-1c, but that is the beauty of cloud computing and Amazon Web Services. As you build more complex applications with more resilient architecture, you'll pay more attention to which Availability Zone you deploy components in. For now, pick the one that's physically closest to your location.
You'll find the menu for selecting your region, which determines the Availability Zones on offer, right next to your account name in the upper-right corner of the EC2 dashboard.
Choose an Amazon Machine Image
Next stop on your Amazon tour is to decide which AMI to use. There are nearly 1,000 AMIs to choose from, and you can easily browse or search for what you need.
At this stage I wouldn't spend an inordinate amount of time deciding. Go with an Ubuntu image as a default. Also be sure to pick an EBS root AMI. There are very few use cases for Instance Store now that EBS is mature. I'm personally partial to Eric Hammond's images, which are well maintained, well supported, and well respected in the community.
A note on 32-bit versus 64-bit images: Only micro, small, and medium instances are available in 32-bit. As a general rule, it's best to go with 64-bit for everything unless you have a particular and compelling reason to require 32-bit. With 64-bit, your images will work on all instance types, and you can vertically scale easily.
Spin up your EC2 instance
You have your tools installed, you have your keys, you've picked an AMI and availability zone. Now you're finally ready to create a real Amazon instance. At the command line, enter:
$ ec2-run-instances ami-31814f58 -k my-keypair -t t1.micro -z us-east-1a
Notice I chose a micro instance. Micro instances are covered by the free usage tier for new accounts, so they're a great option for trying out the tools.
Connect to your instance
Now that you have a running instance in EC2, you'll want to connect. Let's find out its name from the output of ec2-describe-instances:
RESERVATION r-d1a71cc1 046997127105 default
INSTANCE i-17086273 ami-31814f58 ec2-64-21-210-168.compute-1.amazonaws.com ip-10-44-61-104.ec2.internal running my-keypair 0 t1.micro 2012-06-15T13:11:05+0000 us-east-1a aki-417d2539 monitoring-disabled 220.127.116.11 10.46.63.204 ebs paravirtual xen sg-65f4ec0a default
BLOCKDEVICE /dev/sda1 vol-3f1ac253 2012-06-15T13:11:32.000Z
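Reading that output by eye gets tedious once you have more than one instance. As a rough sketch, a small awk filter can pull out the fields that matter for connecting. The function name instance_summary is hypothetical, and the field positions assume whitespace-separated output in the order shown above for an instance that already has a public hostname.

```shell
#!/bin/sh
# Print "instance-id state public-hostname" for every INSTANCE line on stdin.
# Assumes fields appear in the order shown in the sample output:
#   $2 = instance ID, $4 = public hostname, $6 = state.
instance_summary() {
    awk '$1 == "INSTANCE" { print $2, $6, $4 }'
}

# Sample input based on the output above, trimmed to the leading fields.
instance_summary <<'EOF'
RESERVATION r-d1a71cc1 046997127105 default
INSTANCE i-17086273 ami-31814f58 ec2-64-21-210-168.compute-1.amazonaws.com ip-10-44-61-104.ec2.internal running my-keypair 0 t1.micro
BLOCKDEVICE /dev/sda1 vol-3f1ac253 2012-06-15T13:11:32.000Z
EOF
```

Against a live account, the same filter would be fed from the tool itself, as in ec2-describe-instances | instance_summary. An instance that is still pending has no public hostname yet, so the positions would shift and the filter should not be trusted for it.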
Once you know the public hostname or IP address of the box, go ahead and connect:
$ ssh -i my-keypair email@example.com
A few routine tasks
Folks familiar with Linux's Logical Volume Manager know that you can easily snapshot a disk volume. In Amazon, snapshots are a powerful facility for creating backups, protecting you from instance failure, and even creating new AMIs from your custom server setups. Look at the BLOCKDEVICE line above. You'll see the volume ID. That's all you need:
$ ec2-create-snapshot vol-3f1ac253
A few details to keep in mind: Although you can snapshot a running server, some tools will stop your instance in order to snapshot the root volume. This is for extra protection against corruption of the file system. If you're using a journaling file system such as ext3, ext4, or xfs, snapshotting a running system will leave your volume in a state similar to a crashed server. Upon startup, incomplete blocks will be repaired. In the case of a database mount such as MySQL, however, you should issue these additional commands from the MySQL shell:
mysql> flush tables with read lock;
mysql> system xfs_freeze -f /data
For an in-depth explanation of how to do this, see my article, "Autoscaling MySQL on Amazon EC2."
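The lock and freeze shown above need to be released again once the snapshot has been requested, and the read lock lasts only as long as the MySQL session that took it. One way to keep the steps together is to generate them as a single script for the mysql client. This is a sketch under assumptions: consistent_snapshot_sql is a hypothetical helper, vol-0a1b2c3d is a placeholder ID for the volume mounted at /data, and the function only prints the statements rather than running them.

```shell
#!/bin/sh
# Emit statements for one mysql client session: take the read lock, freeze
# the filesystem, request the snapshot, then thaw and unlock.
# Usage: consistent_snapshot_sql MOUNT_POINT VOLUME_ID
consistent_snapshot_sql() {
    mount_point=$1
    volume_id=$2
    cat <<EOF
FLUSH TABLES WITH READ LOCK;
system xfs_freeze -f $mount_point
system ec2-create-snapshot $volume_id
system xfs_freeze -u $mount_point
UNLOCK TABLES;
EOF
}

# Placeholder volume ID for the data volume; substitute your own.
consistent_snapshot_sql /data vol-0a1b2c3d
```

Keeping every step inside one session matters because the read lock is dropped as soon as the client that holds it disconnects. Review the printed statements before running them against a production database.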
When instances are started, Amazon automatically assigns a new IP address to them. Dynamic addresses are fine for playing around, but you'll undoubtedly want static, global IP addresses for some machines eventually. That's where elastic IP addresses enter the picture; your AWS account comes with a number of these. You can set your new instance with one of these static IPs using a simple command-line call:
$ ec2-associate-address 10.20.30.40 -i i-17086273
You're all set.
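If you have not reserved an elastic IP yet, ec2-allocate-address hands one out, and its output can be turned straight into the association call. The snippet below is a sketch only: associate_command is a hypothetical helper, the address 203.0.113.25 comes from a documentation range, and it assumes the allocation output is a line starting with ADDRESS followed by the IP.

```shell
#!/bin/sh
# Read ec2-allocate-address output on stdin and print the matching
# ec2-associate-address command for the given instance ID.
associate_command() {
    instance_id=$1
    awk -v id="$instance_id" \
        '$1 == "ADDRESS" { print "ec2-associate-address", $2, "-i", id }'
}

# Simulated allocation output with a placeholder address.
printf 'ADDRESS\t203.0.113.25\n' | associate_command i-17086273
```

Printing the command instead of executing it leaves a chance to confirm the address and instance ID first. Elastic IPs that are allocated but left unattached may incur charges, so release any you end up not using.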
Now that you've had a taste of Amazon, you'll want to explore more. With the command-line tools installed and your security keys set up, you have everything you need to go further -- and get comfortable with different instance types, various AMIs, the Availability Zones your instances and volumes are stored in, how load balancers work, and beyond. The further you go, the more you'll appreciate that Amazon's documentation is as copious as its services.
This article, "How-to: Get started with Amazon EC2," originally appeared at InfoWorld.com.