RadFS
Random Access DFS
DFS in a slide
Client locates the DataNode, opens a socket, says hi
DataNode allocates a thread to stream block contents from the opened position
Client reads in as it processes
Great for streaming entire blocks
Positioned reads cut across the grain
A Different Approach
Everything is a positioned read
All interactions with DN are stateless
Wrapped ByteService interfaces
− Caching, Network, Checksum
− Each configurable, can be optimized for the task
Connections pooled and re-used often
Fewer threads on server
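A minimal sketch of the layering this implies, assuming a stateless positioned-read contract; the actual ByteService interface and class names in the HDFS-516 patch may differ:

```java
import java.io.IOException;

/** Sketch of the stateless, positioned-read contract (names assumed, not from the patch). */
public interface ByteService {
    /** Read up to length bytes at an absolute file position; no seek state, no server-side thread held. */
    int read(long position, byte[] buffer, int offset, int length) throws IOException;
}

/** Hypothetical layering: checksum -> caching -> network, each a ByteService wrapping the next. */
class CachingByteService implements ByteService {
    private final ByteService delegate;   // e.g. a network-backed service

    CachingByteService(ByteService delegate) { this.delegate = delegate; }

    public int read(long position, byte[] buffer, int offset, int length) throws IOException {
        // A real implementation would consult a client-side cache here and only
        // fall through to the delegate (the network) on a miss.
        return delegate.read(position, buffer, offset, length);
    }
}
```

Because every layer sees only (position, length) requests, each can be swapped or tuned independently per workload.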
DFS Anatomy – seek + read()
DFS seek+read, cont'd
Locates the preferred block, caches DN locations
Opens a new Socket and BlockReader for each random read
Reads from the Socket
Object creation means GC debt
No optimizations for repeated reads of the same data
Threads consume resources on the server after client hangup – files aren't automatically closed
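For context, this is roughly what a positioned read looks like through the stock DFS client (the path is supplied by the caller; nothing here is RadFS-specific). Each such call pays the per-read socket/BlockReader setup described above:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PreadExample {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        FSDataInputStream in = fs.open(new Path(args[0]));   // file to read, passed on the command line
        byte[] buf = new byte[2048];
        // Positioned read: does not move the stream's seek pointer, but on
        // stock DFS each call sets up a fresh socket + BlockReader.
        int n = in.read(1024L * 1024L, buf, 0, buf.length);
        System.out.println("read " + n + " bytes at offset 1MB");
        in.close();
    }
}
```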
RadFS Overview
RadFS seek+read, cont'd
Transparently caches frequently read data
Automatically pools/manages file handles
Reduces network congestion (in theory)
Lower DataNode workload
− 3 threads total instead of 1 per Xceiver
Configurable on client side for the task at hand
Network latency penalty on long-running reads
Checksum implementation means 2 reads per random read if caching is disabled
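A sketch of what the client-side configurability could look like; these property names are placeholders for the idea, not the actual keys used by the patch:

```java
import org.apache.hadoop.conf.Configuration;

public class RadClientConfig {
    /** Hypothetical tuning for a random-read-heavy job; key names are assumptions. */
    public static Configuration tuneForRandomReads() {
        Configuration conf = new Configuration();
        conf.setBoolean("radfs.cache.enabled", true);        // placeholder key: turn the client cache on
        conf.setLong("radfs.cache.size.bytes", 16L << 20);    // placeholder key: 16MB cache
        return conf;
    }
}
```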
Implementation Notes
Checksum is currently generated by wrapping ChecksumFileSystem around RadFileSystem
− Inefficient: reads 2 files over DFS
− Improper: what if the checksum block is corrupt?
CachingByteService implements lookahead (good) by copying bytes twice (bad)
Permissions happen “by accident” at the namenode
− Attackable by searching the blockid space on DNs
− Could exchange a UserAccessToken on each request
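The current checksum wiring is roughly analogous to how Hadoop's LocalFileSystem wraps RawLocalFileSystem; the subclass name below is an assumption, but it shows why every checksummed read becomes two DFS reads (data file plus .crc file):

```java
import org.apache.hadoop.fs.ChecksumFileSystem;

/** Hypothetical glue class; the patch's actual class name may differ. */
public class ChecksummedRadFileSystem extends ChecksumFileSystem {
    public ChecksummedRadFileSystem() {
        // Wraps the raw RadFileSystem the same way LocalFileSystem wraps
        // RawLocalFileSystem: the .crc sidecar file is itself stored in DFS,
        // so a checksummed random read touches two files over the network.
        super(new RadFileSystem());
    }
}
```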
Benchmark Environment
EC2 “Medium” - 2x2GHz, 1.7GB, shared I/O
Operations against 20GB sequence file
All tests run single-threaded from the lightly loaded namenode
Fast internal network; adequate memory, but not enough to page the entire file
All benchmarks in a given set were run on the same instance; middle value from 3 runs reported
Random Reads - 2k
10,000 random reads of 2k each over the length of a 20GB file
DFS averaged 7.8ms per read, while Rad with no cache averaged 4.4ms
Caching added a full 2ms – the hardcoded lookahead was no help, and there was lots of unnecessary byte copying
[Chart: Random Reads – 2kb, average latency in ms, comparing DFS, RadFS, and Rad with no cache]
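A sketch of the benchmark loop as described (10,000 positioned 2k reads at random offsets); the original harness isn't reproduced here, so paths and warm-up details are assumptions:

```java
import java.util.Random;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RandomReadBench {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path file = new Path(args[0]);                      // the 20GB sequence file
        long fileLen = fs.getFileStatus(file).getLen();
        byte[] buf = new byte[2048];
        Random rnd = new Random(42);

        FSDataInputStream in = fs.open(file);
        long start = System.currentTimeMillis();
        for (int i = 0; i < 10000; i++) {
            long pos = (long) (rnd.nextDouble() * (fileLen - buf.length));
            in.readFully(pos, buf, 0, buf.length);          // one positioned read per iteration
        }
        long elapsed = System.currentTimeMillis() - start;
        in.close();
        System.out.println("avg ms per read: " + (elapsed / 10000.0));
    }
}
```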
SequenceFile Search
Binary search over a 10GB sequence file (simplified sketch below)
DFS, RadFS with various cache settings
Indicative of potential filesystem uses
− Lucene
− Large numbers of secondary indices
− Ease of development
− Read-only RDBMS-like systems built from ETLs or other long-running processes
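A simplified sketch of the search pattern: sorted fixed-length records probed with positioned reads. The real benchmark searched an actual SequenceFile (which needs sync-marker handling); the record layout and key encoding here are illustrative assumptions:

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;

public class BinarySearchOverDfs {
    static final int RECORD_SIZE = 2048;   // assumed fixed record size

    /** Returns the index of the record whose leading 8-byte key matches, or -1. */
    static long search(FSDataInputStream in, long numRecords, long targetKey) throws IOException {
        byte[] rec = new byte[RECORD_SIZE];
        long lo = 0, hi = numRecords - 1;
        while (lo <= hi) {
            long mid = lo + (hi - lo) / 2;
            in.readFully(mid * RECORD_SIZE, rec, 0, RECORD_SIZE);   // one positioned read per probe
            long key = toLong(rec);                                  // key assumed to be the first 8 bytes
            if (key == targetKey) return mid;
            if (key < targetKey) lo = mid + 1; else hi = mid - 1;
        }
        return -1;
    }

    static long toLong(byte[] b) {
        long v = 0;
        for (int i = 0; i < 8; i++) v = (v << 8) | (b[i] & 0xff);
        return v;
    }
}
```

Every search re-reads the records near the top of the search tree, which is exactly the kind of repeated access a client-side cache is meant to exploit.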
[Chart: Sequence File Binary Search – 5000 searches, avg ms per search, comparing DFS, RAD0, RAD16MB, and RAD128MB]
Streaming
DFS is inherently faster for streaming due to the dedicated server thread
Checksumming is expensive! Early RadFS builds beat DFS at 1-byte read()s because they didn't have checksumming
Streaming jobs would require a PipeliningByteService that requests from the DataNode, then streams in and checksums on a separate client-side thread
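A minimal sketch of the PipeliningByteService idea, reusing the hypothetical ByteService interface sketched earlier: while the caller processes one chunk, a single background thread fetches (and would checksum) the next. Names and chunking details are assumptions, not the patch's design:

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PipeliningByteService implements ByteService {
    private final ByteService delegate;               // e.g. the network-backed service
    private final ExecutorService prefetcher = Executors.newSingleThreadExecutor();
    private final int chunkSize;
    private Future<byte[]> pending;                   // chunk being fetched ahead
    private long pendingPos = -1;

    PipeliningByteService(ByteService delegate, int chunkSize) {
        this.delegate = delegate;
        this.chunkSize = chunkSize;
    }

    public synchronized int read(long position, byte[] buffer, int offset, int length) throws IOException {
        try {
            byte[] chunk = (pending != null && position == pendingPos)
                    ? pending.get()                                   // sequential read: prefetch hit
                    : fetch(position, Math.max(length, chunkSize));   // otherwise fetch synchronously
            int n = Math.min(length, chunk.length);
            if (n == 0) return -1;                                    // end of data
            System.arraycopy(chunk, 0, buffer, offset, n);

            // Kick off the next fetch so it overlaps with the caller's processing.
            pendingPos = position + n;
            final long next = pendingPos;
            pending = prefetcher.submit(new Callable<byte[]>() {
                public byte[] call() throws IOException { return fetch(next, chunkSize); }
            });
            return n;
        } catch (Exception e) {
            throw new IOException("prefetch failed", e);
        }
    }

    /** Reads up to len bytes from the delegate; checksum verification would also happen here. */
    private byte[] fetch(long pos, int len) throws IOException {
        byte[] b = new byte[len];
        int n = delegate.read(pos, b, 0, len);
        if (n <= 0) return new byte[0];
        byte[] out = new byte[n];
        System.arraycopy(b, 0, out, 0, n);
        return out;
    }
}
```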
[Chart: Streaming – 1GB in 1-byte reads, time in seconds, comparing DFS and RAD with no checksum]
[Chart: Streaming – 1GB in 2k reads, time in seconds, comparing DFS and RAD]
Going forward – modular reader
Going forward - Applications
Could improve HBase: solves the file handle problem and improves latency
Could be used to create low-latency lookup formats accessible from scripting languages
− Cache is automatic, simplifying development
− “Table” directory with a main store file and several secondary index files generated by ETL
− Lucene indices? Can be built with MapReduce
Going forward - Development
Copy the existing HDFS method of interleaving checksums directly from the DataNode – one read
− Audit checksumming code for CPU efficiency – reading can be CPU bound
− Implement as a ByteService instead of a clumsy wrapper around FileSystem; make it configurable
Implement PipeliningByteService to improve streaming by pre-fetching pages
Exchange a UserAccessToken at each read; could possibly be used for encryption of the blockid
Contribute!
Patch is at Apache JIRA issue HDFS-516
Will be on GitHub momentarily
Goals:
− Equivalent streaming performance to DFS
− Faster random reads, with a caching option
− Lower resource consumption on the server
3 doable tasks above
Large configuration space to explore
Email me:
[email protected]