Td Mxc Intel Tech Session Stewart

Optimizing OpenSolaris* for Xeon

May, 2008

For Sun Tech Days

Agenda

– Intel Server Advances
– Intel and Sun collaboration
– Key Development Areas
– Summary/Call to Action

Intel is a trademark of Intel Corporation in the U.S. and other countries. * Other names and brands may be claimed as the property of others.

Executive Summary

• One year Anniversary of Intel/Sun collaboration agreement
• Engineering teams show excellent collaboration
• Collaboration intensifies in 2008, more projects in flight
• Both companies are very upbeat about collaboration on SW, meeting all the goals
• Deep and long term engineering engagement and relationship

– Solaris + Intel Architecture + 1 year = New Opportunities for our developers and customers in 2008
• Strong Intel roadmap
• Best in class mission critical OS positioned to take advantage of new Intel server technologies
• Solaris Openness, Indiana
• IBM, Dell to OEM Solaris
• Choice of Virtualization environments
• Expansion of Sun SW portfolio

Intel Server Advances

Intel’s Sustained Architecture Leadership

Stable roadmap for continued software innovation

“Tick Tock”: Shrink/Derivative (Shrink), then New Microarchitecture (Innovate). Each process generation is labeled 2 YEARS.
• 65nm
  – Shrink/Derivative: Presler · Yonah · Dempsey
  – New Microarchitecture: Intel® Core™ Microarchitecture
• 45nm
  – Shrink/Derivative: Penryn Family
  – New Microarchitecture: Nehalem
• 32nm
  – Shrink/Derivative: Westmere
  – New Microarchitecture: Sandy Bridge

See “Intel Architecture and Silicon Cadence” whitepaper: http://download.intel.com/technology/eep/cadence-paper.pdf

Source: Intel. All future products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel® Quad-Core - A Superior Design

Dual-die vs Monolithic:
• Faster to design: 6-9 mos
• Lower cost
  – Smaller die size
  – Better yield (~20%)
  – Lower mfg cost (~12%)
• Better supply
• Extends to 45nm

Socket compatible:
• From dual-core through to 45nm quad-core

Intel Core™ uArch:
• Leading Perf and Perf/W
• 64-bit
• Intel Virtualization Tech.

[Diagram: two dual-core dies (Core 0 and Core 1, Core 2 and Core 3). Each core has a 32KB L1 I cache and a 32KB L1 D cache; each die has a 4 MB shared L2 cache and a Front Side Bus interface.]

Large L2 cache:
• 2X competitors size
• Lower latency (vs L3)
• Fewer cache misses
• More efficient inclusive design
• Reduces bus traffic

Front-side Bus: up to 1333MHz
• Enables uniform access to shared memory

Leading performance, low cost and extensible

Legal Disclaimers Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit http://www.intel.com/performance/resources/limits.htm or call (U.S.) 1-800-628-8686 or 1-916-356-3104. All dates and products specified are for planning purposes only and are subject to change without notice Relative performance is calculated by assigning a baseline value of 1.0 to one benchmark result, and then dividing the actual benchmark result for the baseline platform into each of the specific benchmark results of each of the other platforms, and assigning them a relative performance number that correlates with the performance improvements reported. SPEC, SPECint2000, SPECfp2000, SPECint2006, SPECfp2006, SPECjbb, SPECWeb are trademarks of the Standard Performance Evaluation Corporation. See http://www.spec.org for more information. Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, virtual machine monitor (VMM) and, for some uses, certain platform software enabled for it. Functionality, performance or other benefits will vary depending on hardware and software configurations and may require a BIOS update. Software applications may not be compatible with all operating systems. Please check with your application vendor. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor series, not across different processor sequences. See http://www.intel.com/products/processor_number for details. 
Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications. All dates and products specified are for planning purposes only and are subject to change without notice * Other names and brands may be claimed as the property of others. Copyright © 2007-2007 Intel Corporation. All rights reserved. Intel, the Intel logo, Xeon and Intel Core are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
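
The relative-performance rule stated in the disclaimers above (baseline set to 1.0, every other result divided by the baseline result) can be sketched as follows. This is only an illustration for higher-is-better scores; the function name and all scores are hypothetical and are not taken from any published result.

```python
def relative_performance(results, baseline_platform):
    """Scale each platform's score so that the baseline platform equals 1.0."""
    baseline = results[baseline_platform]
    if baseline <= 0:
        raise ValueError("baseline result must be positive")
    return {name: score / baseline for name, score in results.items()}


# Hypothetical scores for illustration only.
scores = {"baseline_system": 80.0, "system_a": 100.0, "system_b": 120.0}
relative = relative_performance(scores, "baseline_system")
for name, value in relative.items():
    print(f"{name}: {value:.2f} relative to baseline")
```

A relative value of 1.25 here means the platform scored 25% higher than the baseline platform on that benchmark.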

Quad-Core Intel® Xeon® Processor 5400 series based platforms

[Chart: Performance Comparison of 5400 Series versus AMD Opteron*. Relative Performance, higher is better. Series: Quad-Core Intel Xeon 5400 Series; Quad-Core AMD Opteron 1.9 GHz, 2.0 GHz, 2.3 GHz, and 2.5 GHz. Best available Dual-Core AMD Opteron* results used as baseline. Benchmarks: Linpack*Φ, BlackScholes*Φ, Fluent 6.3 (9 Workloads bmk)β, Cinebench*Φ, SAP-SD* 2-TierΦ, SPECint*_rate2006€, 3dsmax*Φ, Abaqus Explicit 6.6-1β, SPECint*_rate_base2006€, TPC-C*Φ, SPECWeb*2005€, SPECfp*_rate2006€, SPECOMPM*2001€, SPECfp*_rate_base2006€, SPECjbb*2005€. Legible callouts: Top500 Linpack 312%, TPCC 96%. The remaining bar values did not survive extraction.]

Quad-Core Intel Xeon’s sustained leadership continues

Data Source: Published, measured, submitted or approved results as of April 7, 2008. See backup for details.
€ Dual-Core AMD Opteron* Model 2222SE (3.0 GHz); Φ Dual-Core AMD Opteron* Model 2220SE (2.80 GHz); β Dual-Core AMD Opteron* Model 2218 (2.60 GHz)


Quad-Core Intel® Xeon® Processor 7300 Series based Servers

Comparison to AMD Opteron* MP on Performance and Energy-Efficiency (Perf/Watt)

Data source: Published/Measured/Submitted results as of Sept 12, 2007. See backup for details

[Chart: two panels, “Performance Comparison using Xeon 7350” and “Perf/Watt Comparison using Xeon 7340”. Baseline: Best published Dual-Core AMD Opteron* results. Quad-Core AMD Opteron* 2.0GHz results are also shown. Benchmarks on the axes (partly legible): SPECjbb*2005, SPECint*_rate2006, SPECint*_rate_base2006, BlackScholes*, SAP-SD* 2-Tier, TPC-C*, SPECWeb*2005. Legible callouts: Java 147%, Integer 92% (DC), Integer 33% (QC), SAPSD 78%, TPCC 55%, FSI 73%, Java 131%.]

Xeon 7350 – Quad-Core Intel® Xeon® Processor X7350; Xeon 7340 – Quad-Core Intel® Xeon® Processor E7340; # Dual-Core AMD Opteron* Model 8222SE (3.0 GHz); $ Dual-Core AMD Opteron* Model 8220SE (2.80 GHz); ^ Dual-Core AMD Opteron* Model 8220 (2.80 GHz, 95 Watt TDP)

Sun and Intel Collaboration

Sun-Intel Solaris Collaboration is Significant
• Broad multi-year strategic alliance
• Sun roadmap commitment
  – 1P, 2P, 4P, >4P
  – Telco, WS, Enterprise
• Intel endorsement of Solaris* as a mainstream OS for Intel® Xeon® processors
• Joint investment in engineering, design, and marketing alliance for Solaris (and Java*)

Get the best software and hardware for mission-critical applications

Solaris on Xeon Experience

• What worked before (Solaris 10):
  – Over 800 x86 systems supported in Hardware Compatibility List http://www.sun.com/bigadmin/hcl
  – 146 Intel-based servers supported by Solaris 10 (vs. 86 AMD servers, 61 SPARC servers)
  – Majority of existing certified x64 Solaris applications already run on Intel Arch.
  – Fully supported by Sun on Intel servers, workstations, mobile, and desktop from multiple OEMs

• What we’re improving:
  – Power-on, errata, memory/string operation speedups for Penryn, Nehalem
  – Microcode update for serviceability
  – Drivers for Intel wireless, graphics, ICH storage, manageability
  – Xen / VT roadmap, IOAT
  – Xeon enhancements for Fault Management
  – Performance optimizations to win World Record SPECint using Sun Studio 12
  – Power optimizations (Powertop, improved P-States, C-States, NPTM, etc)

Solaris Development Model

[Diagram labels: Sun, Community, Intel, IntelPlatform project, Hidden projects, Sun Firewall, OpenSolaris, Nevada, OpenSolaris (6m beat rate), Selective Backports, Solaris 10, Updates (6m beat rate), Intel-based HW.]

Key Development Areas

Areas of Development
• Performance Enhancement
• CPU Performance tools
• Compiler Vectorization and Tools
• Power Management
• Driver Support
• I/O Acceleration Technology
• Virtualization Technology
• Predictive Self-Healing

Join us: http://opensolaris.org

Joint work touches most significant areas of OS

Intel Core Silicon Enabling
• microcode update – increased serviceability
• iommu – increased DMA capability and security
• extended xAPIC – extends processor addressability for interrupt delivery (up to 4G-1 cores)
• ICHx – LAN, AHCI, manageability (AMT)
• CPUID – led to a 4.5% SPECint performance gain on C2D
• MONITOR/MWAIT – replacing halt leads to 1.2x in certain microbenchmarks

Join us at http://opensolaris.org/os/project/intel-platform/

Performance Enhancements
• Goal: Use Intel current and future technologies to improve Solaris performance

Libc optimizations
• memcpy(), memmove(), and memset()
  – Optimized to use SSE2 and/or SSSE3 instructions
  – Significant performance improvements as measured by libMicro
  – Available soon in OpenSolaris
• str(n)cpy(), str(n)cmp() and strlen()
  – Optimized to use SSE2 and/or SSSE3 instructions
  – Significant performance improvements as measured by libMicro
  – Available soon in OpenSolaris

Kernel optimizations in progress
  – bzero(), bcopy(), kcopy(), etc.

Power Management
• P-states – Active Power Management
  – Performance states. Different P-states run at different frequency and voltage. You actually save energy.
• C-states – Idle Power Management
  – C0 – the CPU executes code; no other C-state executes code
  – C1 – HALT instruction; no instructions get executed
  – C2 – like C1, no code executed; clock stopped
  – C3 – FSB shut down; no snooping, and caches can be shut off
• T-state
  – Emergency brake
• Parts of ACPI:
  – Static tables that the BIOS creates
  – Capture the platform power capabilities (how many P/C-states, power, switch latency, etc.)

Stay as long as you can in deeper C-states
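
A rough model of the two ideas on this slide, under stated assumptions. For P-states it uses the standard approximation that dynamic power scales with C·V²·f, so for a fixed number of cycles the frequency cancels and energy scales with V² (leakage is ignored). For C-states it picks the deepest state whose exit latency still fits the expected idle period. The function names, the capacitance, the voltages, and the latency table are hypothetical; a real kernel would take the latencies from the ACPI tables.

```python
def energy_per_task(switched_capacitance, volts, cycles):
    """Dynamic energy for a fixed amount of work.

    Power is roughly C * V^2 * f and the task takes cycles / f seconds,
    so frequency cancels and energy scales with V^2.
    """
    return switched_capacitance * volts ** 2 * cycles


def pick_c_state(expected_idle_us, c_states):
    """Return the deepest C-state whose exit latency fits the expected idle time.

    c_states is a list of (name, exit_latency_us) ordered from shallow to deep.
    Falls back to the shallowest state when nothing fits.
    """
    chosen = c_states[0][0]
    for name, exit_latency_us in c_states:
        if exit_latency_us <= expected_idle_us:
            chosen = name
    return chosen


# Hypothetical exit latencies in microseconds.
C_STATES = [("C1", 1), ("C2", 20), ("C3", 100)]

high = energy_per_task(1e-9, 1.2, 1_000_000)  # higher-voltage P-state
low = energy_per_task(1e-9, 1.0, 1_000_000)   # lower-voltage P-state
print(f"energy ratio low/high: {low / high:.2f}")
print(pick_c_state(5, C_STATES), pick_c_state(50, C_STATES), pick_c_state(500, C_STATES))
```

Longer expected idle periods allow deeper states, which is why reducing needless wakeups matters for idle power.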

Power Management Development Areas
• PowerTOP available for Solaris
  – Shows what wakes your system up from power-saving mode
  – Uses DTrace
  – Displays P-State residency, C-State residency, ACPI info, and top causes for wakeup

Download at http://www.opensolaris.org/os/project/tesla/work/powertop

Power Management Development Areas
• Lots of kernel improvement areas
• Tickless kernel
• Power-friendly scheduling
• P-State improvement
• C-state support
• HPET timer
• Interrupt binding

Join us at http://opensolaris.org/os/project/tesla/

OpenSolaris vs “best in class” power use

[Chart: power use by component (CPU, GMH, ICH, Memory, PCI x16 slot, LAN, Backlight, PS2, Serial I/O, CLK, SATA, USB) for OpenSolaris SNV b87 versus a “Best in Class” OS.]

We have more work to do for Solaris to be best-in-class

IO Acceleration Technologies

Intel® I/O Acceleration Technology: Supported Features

Controller columns:
• Intel® GbE Controller (Gilgal, Ophir)
• Intel® 82575 Gigabit Controller, Intel® 82598 10GbE Controller
• Next gen Gigabit Controller, Next gen 10GbE Controller

Feature rows (most per-controller check marks did not survive extraction):
• Intel® QuickData Technology
• LAN stateless offloads
• Header/data split
• Receive Side Scaling
• TX/RX checksum offload
• TCP segmentation
• Header-splitting / replication
• Receive Side Coalescing (Intel® 82598 10GbE Controller) (Next gen 10GbE Controller)
• Direct Cache Access
• Low Latency Interrupt
• Message Signaled Interrupts: MSI, MSI-X, MSI-X

IOAT v1 and v2 in progress

Today’s Virtualization Usage Models

[Diagram: usage models plotted as End User Value over Time; each model shows App/OS stacks running on VMM and HW.]

• Static Server Consolidation
  – Reduce CapEx, increase utilization
• Mainframe Migration
  – OS and HW freedom for mission critical applications
• Multi-OS Workstation
  – Workstation Consolidation without compromise on graphics performance
• High Availability/Disaster Recovery
  – Maintain high levels of business continuity
• Dynamic Load Balancing
  – Reduce OpEx, streamline resource utilization balancing realtime computing demands with capacity

Without hardware support

[Diagram: VM1 through VMn, each with App and OS, running on a Virtual Machine Monitor over Shared Physical Hardware: Memory, Network, Processors, Storage, Graphics, KY/MS.]

• What the VMM Does …
  – Emulates a complete hardware environment for every Virtual Machine
  – Allocates platform resources
  – Isolates execution in each virtual machine

Virtualization solutions without hardware support work, but they have limitations and require frequent software intervention

Sun xVM and Innotek VirtualBox

Complete Virtualization and Management: Desktop to Datacenter

Unlocking Virtualization on Xeon
• Intel® Virtualization Technology
• Interoperability
• Performance optimizations
• Manageability at scale
• Availability
• Security and compliance

Intel® Virtualization Technology Evolution

Vector 1: Processor Focus
• VT-x: Close basic processor “virtualization holes” in Intel® 64 & Itanium CPUs
• VT-x2: Richer/faster: Intel VT FlexPriority, FlexMigration, EPT, VPID, ECRR, APIC-V
• VT-x3: Perf improvements for interrupt intensive env, faster VM boot
• Simpler and more secure VMM through use of hardware VT support

Vector 2: Chipset Focus
• VT-d: Core support for IO robustness & performance via DMA remapping
• VT-d2: Interrupt filtering & remapping, VT-d extensions to track PCI-SIG IOV
• Better IO/CPU perf and functionality via hardware-mediated access to memory

Vector 3: IO Device Focus
• VT-c: Assists for IO sharing:
  – PCI IOV compliant devs
  – VMDq: Multi-context IO
  – End-point DMA translation caching
  – IO virtualization assists
• Richer IO-device functionality and IO resource sharing

VMM Software Evolution
• Software-only VMMs: Binary translation, Paravirtualization, Device emulations
• Timeline: Past, 2005, 2010. VMM software evolution over time, with hardware support

We are adding VT-d, VT-d2, VT-x, and VT-x2 into Solaris xVM

xVM Server Enabling in Solaris

• xVM Server
  – Today: V1.0 will support VT-x, extended page tables, VTPR, WBINVD for better performance, reliability
  – Tomorrow: Future version supports VT-d and VT-d2 device assignment and interrupt remapping for higher performance
• xVM VirtualBox
  – Today: VT-x, good performance
  – Tomorrow: Blazing fast on Intel Architecture
• xVM Ops Center
  – Today: Intel Architecture support
  – Tomorrow: Device assignment

Join us at http://opensolaris.org/os/community/xen/

Fault Management Architecture

• Error – an incorrect signal, datum, result
  – Observation that is a symptom of a fault
  – Old systems only know how to report errors
  – Diagnosis left to humans

• Fault – a defect that may produce errors
  – The output of the diagnosis of errors
  – Something we can associate with an impact and a corrective action
  – Diagnosis software automates the steps
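
The error-versus-fault distinction above can be illustrated with a very small diagnosis rule: individual error reports are only observations, and a fault is diagnosed once N reports for the same resource arrive within T seconds. This is a simplified sketch, not the Solaris FMA implementation; the class name, the threshold, the window, and the resource names are all hypothetical.

```python
from collections import defaultdict, deque


class ErrorRateEngine:
    """Diagnose a fault once n error reports for a resource arrive within t seconds."""

    def __init__(self, n, t):
        self.n = n
        self.t = t
        self.events = defaultdict(deque)
        self.faulted = set()

    def record(self, resource, timestamp):
        """Record an error report; return True if it triggers a fault diagnosis."""
        if resource in self.faulted:
            return False  # already diagnosed, nothing new to report
        window = self.events[resource]
        window.append(timestamp)
        # Drop reports that have fallen out of the time window.
        while window and timestamp - window[0] > self.t:
            window.popleft()
        if len(window) >= self.n:
            self.faulted.add(resource)
            return True
        return False


engine = ErrorRateEngine(n=3, t=60.0)
reports = [("dimm0", 0.0), ("dimm1", 5.0), ("dimm0", 100.0),
           ("dimm0", 110.0), ("dimm0", 120.0)]
diagnosed = [res for res, ts in reports if engine.record(res, ts)]
print(diagnosed)  # ['dimm0']
```

The isolated reports at 0.0 and 5.0 stay errors; only the burst on dimm0 becomes a fault that can carry an impact and a corrective action.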

FMA and Intel® Xeon® processors
• Fault Management Architecture in Solaris saves millions in service costs
• Intel platform support – Bensley and Caneland platforms
• Reporting of physical location of failed DIMMs
• Future processors – new RAS features in Nehalem
• Error injection: ensures that FMA code paths work correctly

[Diagram: Intel platform FMA model, highlighting location of failed DIMMs and error injection. Board components shown: CPU1, CPU2, VRM 4+4, North Bridge, DDR2 FBD 16GB, ESB2, PXH-V x8, LAN Zoar x4, PCI-e x16, PCI-e x8 in x16, PCI-X-133/100, PCI-X-100 ZCR, PCI-33, SATA x6, SCSI, FLPY, IDE-M, IDE-S, LP IPMI, PWR.]

RAS support is great for 2 and 4 socket servers

Developer Tools
• Sun Studio 12 Compiler (released June 2007) with Xeon-specific optimizations
• Sun Studio Performance Analyzer: latest Intel Architecture performance counters
• Threading Building Blocks for Solaris
  – threadingbuildingblocks.org
• Transitive® QuickTransit®
  – Run Solaris/SPARC binaries on Solaris/Xeon

Sun Studio Compiler Optimization Flags

• Aggressive – For large projects
  – -fast -xtarget=woodcrest -m64 -xvector=simd,lib -xipo -xprofile=collect/use -Wu,sched_first_pass=1
  – -xtarget=woodcrest expands to “-xarch=ssse3 -xchip=core2 -xcache=32/64/8:4096/64/16”
  – SSSE3 code generation, core2 architecture optimization and cache configuration selections, O5 level optimization, and inter-procedural optimization
  – Enable instruction scheduler for FP calculation on IA
  – Profile guided optimization

• Medium – For most applications
  – -fast -xtarget=woodcrest -m64 -Wu,-sched_first_pass=1
  – All aggressive optimization but no IPO and profile guidance

• Low – For extra precise floating point calculations
  – -O -xtarget=woodcrest -m64
  – O3 (medium) optimization level
  – Quickest compilation

Use Sun Studio to optimize your application on Solaris/IA
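
To make the cache description in the -xtarget=woodcrest expansion concrete, the sketch below decodes the -xcache value, assuming each colon-separated level is size in KB / line size in bytes / associativity. That field order is an assumption stated here, and the helper name is hypothetical. The number of sets is derived as size / (line size × ways).

```python
def parse_xcache(spec):
    """Parse 'size/line/assoc[:size/line/assoc...]' into per-level cache geometry.

    Assumes sizes are in kilobytes, line sizes in bytes, and assoc is the way count.
    """
    levels = []
    for level, part in enumerate(spec.split(":"), start=1):
        size_kb, line_bytes, ways = (int(field) for field in part.split("/"))
        sets = size_kb * 1024 // (line_bytes * ways)
        levels.append({"level": level, "size_kb": size_kb,
                       "line_bytes": line_bytes, "ways": ways, "sets": sets})
    return levels


caches = parse_xcache("32/64/8:4096/64/16")
for cache in caches:
    print(cache)
```

Under that reading, the value describes a 32 KB, 8-way L1 and a 4096 KB, 16-way L2, both with 64-byte lines, which matches the 32KB L1 D and 4 MB shared L2 caches shown earlier in the deck.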

Solaris system-level tuning
• Tuning is critical for best performance
  – Solaris is designed for safe handling of heavy, mixed workloads “out-of-the-box”; tune for optimal handling of specific workload characteristics
• Processor binding/scheduling
  – Monitor application for threads that dominate CPU
  – Tie these to CPUs in dedicated processor set to guarantee resource without contention
  – Shield application CPUs from interrupts
  – Use Fixed-Priority scheduling class for critical processes
• Network stack tuning
  – Update driver: tuning as new NICs appear
  – Solaris buffers – size to avoid retransmissions without consuming too much memory
• Look at applications that communicate with the app
  – Analyse with DTrace
  – InfiniBand for lowest-latency interconnection
  – Running on the same box using containers for ultimate low-latency
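
For the buffer-sizing point under network stack tuning, a common rule of thumb (not stated on the slide, so treat it as an assumption) is to start from the bandwidth-delay product: link rate times round-trip time gives the bytes that can be in flight. The function name and the example paths below are hypothetical.

```python
def bandwidth_delay_product_bytes(link_bits_per_second, round_trip_microseconds):
    """Bytes that can be in flight on the path; a starting point for buffer size.

    Integer arithmetic avoids floating-point rounding:
    bits/s * us / 1_000_000 gives bits, and dividing by 8 gives bytes.
    """
    return link_bits_per_second * round_trip_microseconds // 8_000_000


# Hypothetical paths for illustration.
lan = bandwidth_delay_product_bytes(1_000_000_000, 200)       # 1 Gb/s, 0.2 ms RTT
wan = bandwidth_delay_product_bytes(10_000_000_000, 10_000)   # 10 Gb/s, 10 ms RTT
print(lan, wan)
```

Buffers much smaller than this figure limit throughput on the path, while buffers much larger mostly consume memory, which is the trade-off the slide describes.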

Desktop/Mobile Driver Support - Wireless Driver
• www.opensourcewireless.org
• Focus on 4965; future WiFi planned
• Downloadable uCode and dual licensed header files
• Phase 1 – completed
  – 802.11 A/B/G
  – Infrastructure mode
  – power/temperature calibration (FCC regulatory)
  – Rx sensitivity calibration
  – WEP
• Phase 2 – expected completion Jun
  – 802.11 A/N
  – WPA

The 4965 wireless driver is working and improvements are on the way

Device Drivers Support - Others
• Graphics
  – All Intel graphics silicon is supported
• AMT
  – HECI driver and LMS service are available for AMT 3.0
  – AMT 4/5 are under planning
• NICs
• Others
  – Audio codec, USB, etc.

Intel platform laptop/desktop is supported.

Summary/Call to Action
• Intel platform and Solaris bring the best technology to the end user
• Intel and Sun teams at full strength through the community
• Result is significant in various kernel areas
  – Performance, drivers, FMA, virtualization, etc.
• Call to action
  – Run OpenSolaris/Solaris on latest Intel server platforms
  – Join development with us at OpenSolaris projects
