
Onehouse Compute Runtime (OCR)

Onehouse Compute Runtime (OCR) is the foundation for how Onehouse manages compute resources and performance in its managed data lakehouse service. Whether it is allocating resources for data ingestion, table optimizations, or extract-transform-load (ETL) operations, the runtime is responsible for optimizing the Onehouse-managed compute resources that run in customers’ virtual private clouds.

Onehouse Compute Runtime dynamically adapts performance and configuration to users’ workload patterns at runtime. The runtime ensures that data lakehouse workloads achieve optimal performance and efficiency, with enhancements well beyond those available in open source software.

Onehouse Compute Runtime includes optimizations in three major areas:

  • Compute Management
    • Elastic cluster scaling - for handling data workload spikes and swings
    • Serverless clusters - for flexible resource allocation and isolation
    • Automatic upgrades and security patches - for reduced operational complexity
  • Adaptive Workload Optimizations
    • Multiplexed job scheduler - to minimize compute footprint by sharing compute across multiple jobs running in parallel
    • Lag-aware scheduling - to enforce latency SLAs
    • Performance profiles - to balance write and query performance
  • High Performance Lakehouse I/O
    • Vectorized columnar merging - for fast writes
    • Parallel pipelined execution - to maximize CPU efficiency
    • Optimized storage access - to reduce network requests compared with vanilla Apache Parquet readers
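
To make the idea of lag-aware scheduling more concrete, here is a minimal sketch of one way a scheduler could prioritize jobs by how close each is to its latency target. It is an illustration only and does not describe Onehouse's actual implementation: the `Job` fields, the `sla_pressure` ratio, the `pick_next` helper, and the example job names and numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    # All fields are hypothetical and chosen only for this illustration.
    name: str
    lag_seconds: float  # how far the job's output currently trails its source
    sla_seconds: float  # maximum lag the job is supposed to tolerate


def sla_pressure(job: Job) -> float:
    """Fraction of the latency budget already used (1.0 means exactly at the SLA)."""
    return job.lag_seconds / job.sla_seconds


def pick_next(jobs: list[Job]) -> Job:
    """Pick the job that is closest to, or furthest past, its latency target."""
    return max(jobs, key=sla_pressure)


jobs = [
    Job("orders_ingest", lag_seconds=240, sla_seconds=300),
    Job("clickstream_ingest", lag_seconds=90, sla_seconds=60),
    Job("nightly_compaction", lag_seconds=3600, sla_seconds=86400),
]

next_job = pick_next(jobs)
print(next_job.name)
```

In this sketch the job with the largest absolute lag (`nightly_compaction`) is not picked first, because its latency target is loose; ranking by the ratio of lag to target favors the job most at risk of missing its SLA. A real scheduler would also need to account for job cost, available capacity, and fairness, which are left out here.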

Additional resources: