UC Berkeley

Evaluating Amazon’s EC2 As A Research Platform
Michael Armbrust and Gunho Lee

RAD Lab Overview

(Architecture diagram: a Director turns high-level specs into low-level specs for Drivers; an Automatic Workload Engine supplies offered load, resource utilization, etc.; Log Mining produces training data for performance & cost models; a Compiler handles new apps, equipment, and global policies (e.g. SLAs); an Instrumentation Backplane collects traces from Web 2.0 apps, web service APIs, a Ruby on Rails environment, SCADS / Berkeley DB, policy-aware switching, and local OS / VM-monitor functions.)

Overview of EC2
• Elastic virtual machine capacity
• Programmatically controlled through a web service API
• ~5 minutes to allocate new machines
• Charges for machine time and for data transfer in/out of the datacenter

  Instance type                 Platform  Units  Memory  Disk
  Small          - $0.10/hour   32-bit    1      1.7GB   160GB
  Large          - $0.40/hour   64-bit    4      7.5GB   850GB – 2 spindles
  X Large        - $0.80/hour   64-bit    8      15GB    1690GB – 4 spindles
  High CPU Med   - $0.20/hour   64-bit    5      1.7GB   350GB
  High CPU Large - $0.80/hour   64-bit    20     7GB     1690GB

One EC2 Compute Unit provides the equivalent CPU capacity of a 1.0–1.2 GHz 2007 Opteron or 2007 Xeon processor.

Appeal
• Automatically and dynamically scale your computing resources as demand changes
• For researchers:
  – Run experiments for $100 per 1000 machine hours!
  – AMIs provide simple containment of the experimental setup and allow for easy verification
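The "$100 per 1000 machine hours" figure falls straight out of the price table above. A back-of-the-envelope cost helper might look like the following sketch; the rate table and function names are illustrative, not part of the talk's benchmark suite, and data-transfer charges are ignored.

```ruby
# Hourly rates from the instance table above (2008-era prices, USD).
HOURLY_RATE = {
  small: 0.10, large: 0.40, xlarge: 0.80,
  high_cpu_medium: 0.20, high_cpu_large: 0.80
}

# machine-hours * hourly rate; ignores data-transfer charges,
# which EC2 also bills for.
def experiment_cost(instance_type, machines, hours)
  HOURLY_RATE.fetch(instance_type) * machines * hours
end

# 50 small instances for 20 hours = 1000 machine hours, i.e. about $100.
puts experiment_cost(:small, 50, 20)
```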

Rapid Changes
• Changing machines and processor types
• New features
  – Availability Zones
  – Elastic IP Addresses
  – Persistent Storage

Reverse Engineering
• How much hardware do they have?
  – Not as much as we expected
  – Seems to be changing
• What have we seen in our tests?
  – 402 VMs
  – 379 physical machines
  – Overlap of as many as 7 VMs per machine
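Counting co-residency from observations like these reduces to grouping VMs by the physical machine they appear to share. A minimal sketch, assuming each VM has already been fingerprinted to a host identifier (the slides do not say how the mapping was obtained):

```ruby
# Given a map of VM id => apparent physical host, report how many
# distinct physical machines were seen and the worst-case number of
# co-resident VMs on any one of them.
def coresidency(vm_to_host)
  by_host = vm_to_host.values.tally   # host => VM count
  {
    physical_machines: by_host.size,
    max_vms_per_machine: by_host.values.max
  }
end
```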

Goal
• Characterize the performance and variance of different aspects of EC2
  – Clock accuracy
  – CPU
  – Memory bandwidth
  – Disk
  – Network throughput / latency
• Provide recommendations to researchers hoping to use EC2

Benchmark Architecture
• Suite of benchmarks controlled from the R Cluster
  – SSH to machines
  – Install required binaries
  – Collect and record results to MySQL

Usage: util/runTests.rb [options]
    -r, --reservation RESERVATION_ID   Include the specified reservation in tests
    -d, --disk SIZE                    Execute a disk test of the specified number of GB
    -c, --cpu_info                     Capture CPU info
    -t, --trace_path FILE              Run traceroute and capture to file (for graphing)
    -n, --network N                    Perform a network throughput test between random pairs of nodes N times
    -s, --skewserver SERVER            Install and start the skew monitor in the background, reporting to SERVER
    -m, --memorytest                   Perform the STREAM memory benchmark
    -x, --detectcap                    Attempt to detect Xen CPU capping
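The control flow above (SSH into each node, run a benchmark, collect the output) can be sketched as follows. The host list, remote command, and use of plain `ssh` are assumptions for illustration; the actual runTests.rb may differ.

```ruby
require 'open3'

# Build the ssh invocation for one node. BatchMode avoids interactive
# password prompts; the remote command string is whatever benchmark
# binary we want run on that node.
def ssh_command(host, remote_cmd)
  ['ssh', '-o', 'BatchMode=yes', host, remote_cmd]
end

# Run a command on every host and collect {host => stdout}; a real
# harness would then insert these results into MySQL.
def run_on_cluster(hosts, remote_cmd)
  hosts.each_with_object({}) do |host, results|
    out, _status = Open3.capture2(*ssh_command(host, remote_cmd))
    results[host] = out
  end
end
```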

Clock Accuracy
• Time difference between EC2 and a local reference server
• Frequency (1–3 times / hour) and magnitude (~200ms) of clock skews seen by many machines over the measurement period
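The core of such a skew monitor is comparing a local timestamp against a reference timestamp taken at (nominally) the same instant, and flagging large jumps. This sketch is illustrative; the one-shot comparison simplifies what a repeated NTP-style exchange would do, and the 100ms threshold is an assumption.

```ruby
# Skew in milliseconds between a local clock reading and a reference
# clock reading taken at the same instant.
def skew_ms(local_time, reference_time)
  ((local_time - reference_time) * 1000.0).round
end

# Flag samples whose skew magnitude exceeds a threshold; the talk saw
# jumps of roughly 200ms, 1-3 times per hour.
def skew_events(samples_ms, threshold_ms = 100)
  samples_ms.select { |s| s.abs >= threshold_ms }
end
```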

CPU Performance
• EC2 uses Xen CPU caps to provide fairly consistent performance across different processor types
• Programs may experience scheduling artifacts with small or latency-sensitive computations
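One way to see those scheduling artifacts is to time a short fixed busy-loop many times: on a capped VM, some iterations take far longer than the median because the hypervisor deschedules the domain once it exhausts its CPU credit. This detector and its 10x threshold are illustrative, not the suite's actual `--detectcap` code.

```ruby
# Time a fixed amount of busy work repeatedly, returning per-iteration
# wall-clock durations in seconds.
def busy_loop_times(iterations: 200, work: 50_000)
  iterations.times.map do
    t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    x = 0
    work.times { x += 1 }
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
  end
end

# Capping is suspected when some samples are dramatically slower than
# the typical (median) sample.
def capped?(samples, factor: 10)
  median = samples.sort[samples.size / 2]
  samples.any? { |s| s > median * factor }
end
```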

CPU Performance

Memory Bandwidth
• Used the STREAM memory benchmark
• Low variation between machines of the same type
  – Std dev < 55 MB/s
• Very high variation between different processor types

Disks – Warm Up Effect
• When first using /mnt (ephemeral storage) there are significant allocation performance artifacts
• Oddly, these seem to correlate with the processor type

Disks – Warm Up Effect

Disks – Long Term Performance
• Over time, aggregate disk performance is more consistent
• Over 90% of writes occur at 40 MB/s or greater
• Average 54.88 MB/s, standard deviation 9.05 MB/s
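The long-term disk results boil down to three statistics over a stream of write-throughput samples. A minimal sketch (the sample data and function name are invented):

```ruby
# Summarize write-throughput samples (MB/s): mean, population standard
# deviation, and the fraction of writes at or above a floor, matching
# how the long-term disk results are reported.
def throughput_summary(samples, floor: 40.0)
  mean = samples.sum / samples.size.to_f
  variance = samples.sum { |s| (s - mean)**2 } / samples.size.to_f
  {
    mean: mean,
    std_dev: Math.sqrt(variance),
    at_or_above_floor: samples.count { |s| s >= floor } / samples.size.to_f
  }
end
```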

Individual Performance
• There is a much greater discrepancy between individual machines
• Most likely due to co-location with other disk-intensive customers

Original Topology
• Originally we saw three distinct sections of their network

New Topology

Network Quirks and Latency
• 96% of RTTs < 1ms
• Occasional route changes lead to weird paths and increased latency
• A current EC2 bug limits the number of concurrent connections that can be established cluster-wide
  – Expected fix in a few weeks
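A figure like "96% of RTTs < 1ms" comes from parsing ping output across node pairs and taking a fraction. A small sketch, assuming the standard `ping` output format (the sample line and cutoff are illustrative):

```ruby
# Pull RTT values (ms) out of standard `ping` output lines.
def parse_rtts(ping_output)
  ping_output.scan(/time=([\d.]+) ms/).map { |(t)| t.to_f }
end

# Fraction of round-trip times under a cutoff, e.g. 1ms.
def fraction_under(rtts_ms, cutoff_ms = 1.0)
  rtts_ms.count { |r| r < cutoff_ms } / rtts_ms.size.to_f
end
```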

Network Performance

Long Term Network

Long Term Network

Large – CPU
• No CPU cap
  – Slower CPU, but full speed

Large – Disk
• Similar performance

Large – Network
• 2x performance at 4x cost

Small vs. Large
• Small instance shows statistically better performance/$
  – Due to the lack of Xen I/O caps coupled with low overall cluster utilization
• Large instance has higher network bandwidth
  – Explicitly specified
  – Not proportional to its extra cost
• No CPU caps for the large instance
  – Seems not to share a physical core with other instances
  – No future guarantee

Evaluation for Research
• Typical experimental environments for (distributed) systems research:
  – Single machine (simulation)
  – Private cluster (many nodes)
  – Emulab (network topology emulation)
  – PlanetLab (geographically distributed)
• Many experiments on private clusters and Emulab can be done better on EC2

                       Size     Cost       Usability   Control
  Private Cluster      Small    Expensive  Easy        Full
  Emulab / PlanetLab   Medium   Free       Hard        Partial
  EC2                  Large    Cheap      Easy        Little

Evaluation for Research
• EC2 is useful when you do:
  – Experiments on a "large" cluster
    • distributed storage / computation / services…
  – Computation with the "power" of a large cluster
    • computational biology, vision, ML…
  – Research on a virtualized environment just like EC2, of course
    • cloud computing…

Recommendations
• Design for I/O variability
  – Do disk writes in bulk in the background
  – Be topology aware
  – Dynamically redistribute work
• Be aware of the differences between small & large instances
  – Overall, better performance/$ for small
  – Some advantages for large
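The first recommendation, buffering writes and flushing them in bulk from a background thread so foreground work never blocks on slow EC2 disk, might look like this minimal sketch. The class name, batch size, and callable sink are assumptions for illustration.

```ruby
# Buffer records in memory and flush them to the sink in bulk from a
# background thread; the sink is any callable that accepts an array
# (e.g. one that appends to a log file in a single write).
class BulkWriter
  def initialize(sink, batch_size: 100)
    @queue = Queue.new
    @writer = Thread.new do
      batch = []
      while (item = @queue.pop)   # nil sentinel ends the loop
        batch << item
        if batch.size >= batch_size
          sink.call(batch)
          batch = []
        end
      end
      sink.call(batch) unless batch.empty?  # flush final partial batch
    end
  end

  def write(record)
    @queue << record
  end

  def close
    @queue << nil
    @writer.join
  end
end
```

Because `Queue` is thread-safe, many worker threads can call `write` concurrently while only the background thread touches the disk.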

Q&A
• Thanks!
