Medallian Architecture

Introduction

Ortege Lakehouse is designed around the innovative Medallion Architecture, a tiered data management strategy that systematically categorizes data into three distinct layers: Bronze, Silver, and Gold. This architecture facilitates a streamlined progression of data from raw ingestion to refined analytics, ensuring users across various roles and functions can access the most relevant and optimized data for their needs. The structured approach underpinning the Medallion Architecture reflects our commitment to delivering clarity, efficiency, and value in data analysis.

Layered Approach

Bronze Layer:

The foundation of the Medallion Architecture, the Bronze Layer, houses raw data directly ingested from diverse sources. This layer prioritizes the authenticity and completeness of data, serving as the primary repository for unprocessed information.

Silver Layer:

Data transitions to the Silver Layer following initial processing, which includes cleansing, normalization, and structuring. This layer enhances the usability of data, making it more accessible and meaningful for analysis. The Silver Layer acts as a bridge between raw data and advanced analytics, providing a balanced dataset that is both rich in detail and optimized for exploration.

Gold Layer:

The primary purpose of the Gold Layer is to refine, consolidate, and curate datasets that are critical for the development and enhancement of Language Learning Models (LLM), Machine Learning (ML) models, and aggregate data models. In the rapidly evolving landscape of cryptocurrencies, the ability to derive meaningful insights from vast amounts of data is paramount. The Gold Layer is dedicated to providing a structured and enriched dataset that enables accurate modeling, forecasting, and analysis.

As cryptocurrencies continue to grow in both complexity and volume, the need for an advanced, reliable data foundation becomes increasingly important. The Gold Layer addresses this need by offering a comprehensive dataset solution that encompasses everything from individual decentralized exchange (DEX) volumes to complex, aggregated data models. This layer is not just about data storage; it’s about transforming raw data into a goldmine of insights that power our language and machine learning models, as well as our innovative aggregate data models.

Our commitment to excellence is reflected in the meticulous design and functionality of the Gold Layer. By leveraging state-of-the-art technologies and methodologies, we ensure that the data within this layer is of the highest quality, accuracy, and reliability. Whether you’re developing advanced ML models or seeking to consolidate numerous DEX volumes into a cohesive aggregation layer, the Gold Layer provides the essential datasets needed to drive your analytics forward.

In the following sections, we will delve deeper into the architecture and design of the Gold Layer, the types of datasets available, their use cases, and how you can access and integrate these datasets into your analytical tools and models. Join us as we explore the foundational layer that is setting new standards in cryptocurrency analytics.

Naming Convention

To navigate the Medallion Architecture effectively, a consistent naming convention is employed across all layers, encapsulating key attributes of data entities:

  • Type: Classifies the data entity (tbl for tables, vw for views, and mvw for materialized views), indicating its structure and purpose within the data ecosystem.

  • Environment (Env): Identifies the operational stage of the data entity (test for testing, and prod for production), facilitating environment-specific data management and access controls. Please note Generally we will only release production data to our customers, but for some cutting edge datasets we will release test to battle test them before promotion to production.

  • Layer: Denotes the tier within the Medallion Architecture (br for Bronze, sl for Silver, and gld for Gold), reflecting the data’s processing stage and intended use case.

  • Entity: Describes the content or function of the data entity, offering insights into its relevance and application.

Conclusion

The Medallion Architecture is a cornerstone of Ortege Lakehouse, providing a clear and effective framework for data management and analysis. By categorizing data into Bronze, Silver, and Gold layers, and employing a systematic naming convention, Ortege Lakehouse empowers users to harness the full potential of their data, from raw ingestion to advanced analytics. This tiered approach ensures that regardless of where data resides in the lifecycle, its value is maximized for all stakeholders involved.