Keith Kim
About Information Technology
February 15, 2020
Notes on Parquet and ORC
ORC (Optimized Row Columnar)
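Best suited to flat (non-nested) data
Lightweight indexes (min/max statistics) plus Bloom filters
Often better compression than Parquet
Works better with Hive
Much less GC (garbage collection) overhead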
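
To make the index and Bloom filter point concrete, here is a minimal sketch of how min/max statistics and a Bloom filter let a reader skip whole chunks of data for an equality lookup. It is a simplified model in plain Python, not the ORC file format or any ORC API: the `BloomFilter` class, `build_stripes`, `find_equal`, and the sizes used are all hypothetical names and values chosen for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=256, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive several bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_stripes(values, stripe_size):
    """Split a column into chunks and record min, max, and a Bloom filter for each."""
    stripes = []
    for start in range(0, len(values), stripe_size):
        chunk = values[start:start + stripe_size]
        bloom = BloomFilter()
        for v in chunk:
            bloom.add(v)
        stripes.append(
            {"rows": chunk, "min": min(chunk), "max": max(chunk), "bloom": bloom}
        )
    return stripes


def find_equal(stripes, target):
    """Return (matching values, number of chunks actually scanned)."""
    hits, scanned = [], 0
    for stripe in stripes:
        # Cheap check 1: the min/max range rules the chunk out.
        if target < stripe["min"] or target > stripe["max"]:
            continue
        # Cheap check 2: the Bloom filter says "definitely not here".
        if not stripe["bloom"].might_contain(target):
            continue
        scanned += 1
        hits.extend(v for v in stripe["rows"] if v == target)
    return hits, scanned


stripes = build_stripes(list(range(100)), stripe_size=10)
hits, scanned = find_equal(stripes, 42)
```

With sorted input like this, the min/max check alone narrows the lookup to a single chunk. The Bloom filter matters more when values are unsorted and the ranges of different chunks overlap.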
Parquet
Handles nested data well
Works better with Spark
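
As a rough illustration of what storing nested data in columns means, the sketch below shreds nested records into one column per leaf path. This is only a simplified model: Parquet's actual encoding also tracks repetition and definition levels so that lists and missing fields can be reconstructed exactly, which this sketch does not attempt. The function names `shred` and `to_columns` and the sample records are hypothetical.

```python
def shred(record, prefix=""):
    """Flatten one nested dict into {dotted.path: leaf_value}."""
    columns = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            columns.update(shred(value, path))
        else:
            columns[path] = value
    return columns


def to_columns(records):
    """Turn a list of nested records into one list of values per leaf path.

    Missing fields are filled with None (a stand-in for the null handling
    that a real columnar format does with definition levels).
    """
    rows = [shred(r) for r in records]
    paths = sorted({p for row in rows for p in row})
    return {p: [row.get(p) for row in rows] for p in paths}


records = [
    {"id": 1, "user": {"name": "a", "geo": {"city": "x"}}},
    {"id": 2, "user": {"name": "b"}},
]
columns = to_columns(records)
```

A query that only needs `user.name` can then read that one column and ignore the rest, which is the main benefit of the columnar layout for nested schemas.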
This note is in progress.