Keith Kim
February 15, 2020
Notes on Parquet and ORC
ORC (Optimized Row Columnar)
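Best suited to flat (non-nested) data
Lightweight built-in indexes (min/max statistics per stripe and row group) plus optional Bloom filters
Better compression (files are often smaller than the Parquet equivalent)
Works better with Hive
Much less GC (garbage collection) pressure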
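To make the "lightweight index + Bloom filter" point concrete, here is a toy sketch in plain Python of how a reader might skip whole stripes for an equality predicate. It is not ORC's actual file format or reader API: the names `BloomFilter`, `build_stripe`, and `stripes_to_read` are hypothetical, and the bit-set size and hash scheme are arbitrary choices for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def build_stripe(values):
    """Keep the values together with min/max statistics and a Bloom filter."""
    bloom = BloomFilter()
    for v in values:
        bloom.add(v)
    return {"min": min(values), "max": max(values), "bloom": bloom, "values": values}


def stripes_to_read(stripes, target):
    """Return the indexes of stripes that could contain `target`."""
    hits = []
    for i, stripe in enumerate(stripes):
        if not (stripe["min"] <= target <= stripe["max"]):
            continue  # ruled out by min/max statistics alone
        if not stripe["bloom"].might_contain(target):
            continue  # in range, but the Bloom filter says "definitely absent"
        hits.append(i)
    return hits


stripes = [
    build_stripe(list(range(0, 100, 2))),  # even numbers 0..98
    build_stripe(list(range(100, 200))),
    build_stripe(list(range(200, 300))),
]
print(stripes_to_read(stripes, 150))
```

The min/max check handles range pruning cheaply, while the Bloom filter helps for point lookups on values that fall inside a stripe's range but are not actually present.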
Parquet
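Handles nested data well (each leaf field of a nested schema is stored as its own column)
Works better with Spark (Parquet is Spark's default data source format)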
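As a rough illustration of what storing nested data in columns means, the sketch below shreds nested records into one value list per leaf field. It is deliberately simplified: real Parquet encodes nesting with repetition and definition levels and also supports repeated (list) fields, neither of which is modeled here. The helper names `shred` and `to_columns` are hypothetical.

```python
def shred(record, prefix=""):
    """Flatten nested dicts into {dotted.leaf.path: value}."""
    columns = {}
    for key, value in record.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            columns.update(shred(value, path))  # recurse into nested struct
        else:
            columns[path] = value
    return columns


def to_columns(records):
    """Turn a list of nested records into one value list per leaf column."""
    shredded = [shred(r) for r in records]
    paths = sorted({p for row in shredded for p in row})
    # None marks an optional field that is missing in a given record.
    return {p: [row.get(p) for row in shredded] for p in paths}


records = [
    {"id": 1, "user": {"name": "a", "geo": {"country": "KR"}}},
    {"id": 2, "user": {"name": "b"}},
]
columns = to_columns(records)
for path, values in columns.items():
    print(path, values)
```

A query that only needs `user.name` can then read that single column and ignore the rest, which is the main benefit of a columnar layout for nested schemas.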
This note is a work in progress.