Beyond the Medallion: Cost-Saving Alternatives for Microsoft Fabric Data Estates

The Medallion Architecture (Bronze → Silver → Gold) has become the industry’s de facto standard for building scalable data estates—especially in Microsoft Fabric. It’s elegant, modular, easy to explain to business users, and aligns well with modern ELT workflows.

The Medallion Architecture remains one of the most effective and scalable patterns for modern data engineering because it introduces structured refinement, clarity, and governance into a data estate. By organising data into Bronze, Silver, and Gold layers, it provides a clean separation of concerns: raw ingestion is preserved for auditability, cleaned and conformed data is standardised for consistency, and curated business-ready data is optimised for analytics. This layered approach reduces complexity, improves data quality, and makes pipelines easier to maintain and troubleshoot. It also supports incremental processing, promotes reusability of transformation logic, and enables teams to onboard new data sources without disrupting downstream consumers. For growing organisations, the Medallion Architecture offers a well-governed, scalable foundation that aligns with both modern ELT practices and enterprise data management principles.

But as many companies have discovered, a full 3-layer medallion setup can come with unexpected operational costs:

  • Too many transformation layers
  • Heavy Delta Lake I/O
  • High daily compute usage
  • BI refreshes duplicating transformations
  • Redundant data copies
  • Long nightly pipeline runtimes

The result?
Projects start simple, but the estate grows heavy, slow, and expensive.

The good news: the Medallion Architecture is not the only option. There are several real-world alternatives (and hybrids) that can reduce hosting costs by 40–80% and cut daily processing times dramatically.

This blog explores those alternatives, with in-depth explanations and examples drawn from real implementations.


Why Medallion Architectures Become Expensive

The medallion pattern emerged from Databricks. But in Fabric, some teams adopt it uncritically—even when the source data doesn’t need three layers.

Consider a common case:

A retail company stores 15 ERP tables. Every night they copy all 15 tables into Bronze, clean them into Silver, and join them into 25 Gold tables.

Even though only 3 tables change daily, the pipelines for all 15 run every day because “that’s what the architecture says.”

This is where costs balloon:

  • Storage multiplied by 3 layers
  • Pipelines running unnecessarily
  • Long-running joins across multiple layers
  • Business rules repeating in Gold tables

If this sounds familiar… you’re not alone.


1. The “Mini-Medallion”: When 2 Layers Are Enough

Not all data requires Bronze → Silver → Gold.

Sometimes two layers give you 90% of the value at 50% of the cost.

The 2-Layer Variant

  1. Raw (Bronze):
    Store the original data as-is.
  2. Optimised (Silver/Gold combined):
    Clean the data, apply business rules, and structure it for consumption (see the sketch below).
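
To make the 2-layer idea concrete, here is a minimal PySpark sketch of the Optimised layer as it might look in a Fabric notebook. The table and column names (raw_invoices, optimised_invoices, InvoiceId, and so on) are hypothetical placeholders; the point is that de-duplication, type conversions, and business rules happen in a single pass with a single write, rather than across separate Silver and Gold loads.

```python
from pyspark.sql import functions as F

# `spark` is the pre-configured SparkSession provided by the Fabric notebook runtime
raw = spark.read.table("raw_invoices")  # hypothetical raw (Bronze) Delta table

# Clean, conform, and apply business rules in one pass
optimised = (
    raw
    .dropDuplicates(["InvoiceId"])                                 # de-duplicate on the business key
    .withColumn("InvoiceDate", F.to_date("InvoiceDate"))           # type conversion
    .withColumn("Amount", F.col("Amount").cast("decimal(18,2)"))
    .withColumn("IsHighValue", F.col("Amount") > 10000)            # example business rule
    .filter(F.col("Status") != "Cancelled")                        # example business filter
)

# Write the single consumption-ready layer; no intermediate Silver copy is persisted
(optimised.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("optimised_invoices"))
```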

Real Example

A financial services client was running:

  • 120 Bronze tables
  • 140 Silver tables
  • 95 Gold tables

Their ERP data was already clean. The Silver layer added almost no value—just a few renames and type conversions. We replaced Silver and Gold with a single Optimised layer.

Impact:

  • Tables reduced from 355 to 220
  • Daily pipeline runtime cut from 9.5 hours to 3.2 hours
  • Fabric compute costs reduced by ~48%

This is why a 2-layer structure is often enough for modern systems like SAP, Dynamics 365, NetSuite, and Salesforce.


2. Direct Lake: The Biggest Cost Saver in Fabric

Direct Lake is one of Fabric’s superpowers.

It allows Power BI to read Delta tables directly from the lake, without Import mode and without a separate Gold star-schema layer.

You bypass:

  • Power BI refresh compute
  • Gold table transformations
  • Storage duplication
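
Direct Lake itself is switched on at the semantic-model level rather than in code, but it reads best from well-laid-out Delta tables. Below is a hedged sketch of how a Fabric Spark notebook might prepare a serving table for it; sales_serving and optimised_sales are hypothetical names, and the V-Order and optimize-write settings are the documented Fabric Spark options, though exact names can vary by runtime version.

```python
# `spark` is the session provided by the Fabric notebook runtime.
# Setting names follow the Fabric documentation and may differ across runtime versions.
spark.conf.set("spark.sql.parquet.vorder.enabled", "true")              # V-Order: faster Direct Lake reads
spark.conf.set("spark.microsoft.delta.optimizeWrite.enabled", "true")   # fewer, larger Parquet files

df = spark.read.table("optimised_sales")   # hypothetical upstream table

# Write the consumption table once; Power BI reads it via Direct Lake with no Import refresh
(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("sales_serving"))           # hypothetical serving table

# Periodic maintenance keeps the file layout Direct Lake friendly
# (Fabric SQL extension; syntax per the Fabric Delta optimisation docs)
spark.sql("OPTIMIZE sales_serving VORDER")
```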

Real Example

A manufacturer had 220 Gold tables feeding Power BI dashboards. They migrated 18 of their largest semantic models to Direct Lake.

Results:

  • Removed the entire Gold layer for those models
  • Saved roughly 70% on compute
  • Dropped Power BI refreshes from 30 minutes to seconds
  • End users saw faster dashboards with no Import-mode refreshes

If your business intelligence relies heavily on Fabric + Power BI, Direct Lake is one of the biggest levers available.


3. ELT-on-Demand: Only Process What Changed

Most pipelines run on a schedule because that’s what engineers are used to. But a large portion of enterprise data does not need a daily refresh.

Better alternatives:

  • Change Data Feed (CDF)
  • Incremental watermarking (sketched after this list)
  • Event-driven processing
  • Partition-level processing
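
As a concrete illustration of the watermark pattern, the sketch below loads only rows modified since the last successful run and merges them into the target Delta table. The table names (source_orders, orders) and the OrderId / ModifiedDate columns are hypothetical; production pipelines usually persist the watermark in a control table rather than deriving it from the target.

```python
from pyspark.sql import functions as F
from delta.tables import DeltaTable

# 1. Determine the high-water mark (here derived from the target itself)
last_watermark = (
    spark.read.table("orders")                     # hypothetical target table
         .agg(F.max("ModifiedDate"))
         .collect()[0][0]
)
if last_watermark is None:
    last_watermark = "1900-01-01"                  # first run: take everything

# 2. Read only the rows that changed since the last load
changes = (
    spark.read.table("source_orders")              # hypothetical source table
         .filter(F.col("ModifiedDate") > F.lit(last_watermark))
)

# 3. Merge the changed rows instead of reloading the full table
target = DeltaTable.forName(spark, "orders")
(target.alias("t")
    .merge(changes.alias("s"), "t.OrderId = s.OrderId")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())
```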

Real Example

A logistics company moved from full daily reloads to watermark-based incremental processing.

Before:

  • 85 tables refreshed daily
  • 900GB/day scanned

After:

  • Only 14 tables refreshed
  • 70GB/day scanned
  • Pipelines dropped from 4 hours to 18 minutes
  • Compute cost fell by ~82%

Incremental processing almost always pays for itself in the first week.


4. OneBigTable: When a Wide Serving Table Is Cheaper

Sometimes the business only needs one big denormalised table for reporting. Instead of multiple Gold dimension + fact tables, you build a single optimised serving table.

This can feel “anti-architecture,” but it works.
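
A minimal sketch of the idea, assuming hypothetical fact_calls, dim_customer, and dim_region tables: the joins run once at load time, and Power BI reads a single wide Delta table instead of a star schema.

```python
fact = spark.read.table("fact_calls")          # hypothetical fact table
customer = spark.read.table("dim_customer")    # only the dimensions reporting actually uses
region = spark.read.table("dim_region")

# Denormalise once at load time so Power BI never has to perform these joins
one_big_table = (
    fact
    .join(customer, "CustomerId", "left")
    .join(region, "RegionId", "left")
    .select(
        "CallId", "CallDate", "DurationSeconds",
        "CustomerName", "CustomerSegment",
        "RegionName",
    )
)

(one_big_table.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("serving_calls_obt"))          # single wide serving table
```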

Real Example

A telco was loading:

  • 12 fact tables
  • 27 dimensions
  • Dozens of joins running nightly

Reporting only used a handful of those dimensions.

We built a single OneBigTable designed for Power BI.

Outcome:

  • Gold tables reduced by 80%
  • Daily compute reduced by 60%
  • Power BI performance improved due to fewer joins
  • Pipeline failures dropped significantly

Sometimes simple is cheaper and faster.


5. Domain-Based Lakehouses (Micro-Lakehouses)

Rather than one giant medallion, split your estate based on business domains:

  • Sales Lakehouse
  • Product Lakehouse
  • HR Lakehouse
  • Logistics Lakehouse

Each domain has:

  • Its own small Bronze/Silver/Gold
  • Pipelines that run only when that domain changes

Real Example

A retail group broke their 400-table estate into 7 domains. The nightly batch that previously ran for 6+ hours now runs:

  • Sales domain: 45 minutes
  • HR domain: 6 minutes
  • Finance domain: 1 hour
  • Others run only when data changes

Fabric compute dropped by 37% with no loss of functionality.


6. Data Vault 2.0: The Low-Cost Architecture for High-Volume History

If you have:

  • Millions of daily transactions
  • High historisation requirements
  • Many sources merging in a single domain

Data Vault often outperforms Medallion.

Why?

  • Hubs, Links, and Satellites only update what changed (see the sketch below)
  • Perfect for incremental loads
  • Excellent auditability
  • Great for multi-source integration
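
A hedged sketch of the “only update what changed” idea for a single satellite, using a hash key and a hash diff to detect change. The table and column names (stg_claims, sat_claim_details, ClaimNumber, and so on) are hypothetical, and a real Data Vault load would also populate hubs and links; the intent is simply to show that unchanged claims cost nothing to load.

```python
from pyspark.sql import functions as F
from delta.tables import DeltaTable

stg = spark.read.table("stg_claims")   # hypothetical staged source

# Hash key identifies the business entity; hash diff fingerprints the descriptive attributes
stg = (stg
    .withColumn("ClaimHashKey", F.sha2(F.col("ClaimNumber").cast("string"), 256))
    .withColumn("HashDiff", F.sha2(F.concat_ws("||", "Status", "Amount", "Provider"), 256))
    .withColumn("LoadDate", F.current_timestamp()))

sat = DeltaTable.forName(spark, "sat_claim_details")   # hypothetical satellite table

# Insert-only: a new satellite row is written only when the attributes actually changed.
# (Simplified: this compares against all history rather than only the current row.)
(sat.alias("s")
    .merge(stg.alias("n"),
           "s.ClaimHashKey = n.ClaimHashKey AND s.HashDiff = n.HashDiff")
    .whenNotMatchedInsertAll()
    .execute())
```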

Real Example

A health insurance provider stored billions of claims. Their medallion architecture was running 12–16 hours of pipelines daily.

Switching to Data Vault:

  • Stored only changed records
  • Reduced pipeline time to 45 minutes
  • Achieved 90% cost reduction

If you have high-cardinality or fast-growing data, Data Vault is often the better long-term choice.


7. KQL Databases: When Fabric SQL Is Expensive or Overkill

For logs, telemetry, IoT, or operational metrics, Fabric KQL DBs (Kusto) are:

  • Faster
  • Cheaper
  • Purpose-built for time-series
  • Simple to scale

Real Example

A mining client stored sensor data in Bronze/Silver. Delta Lake struggled with millions of small files from IoT devices.

Switching to KQL:

  • Pipeline cost dropped ~65%
  • Query time dropped from 20 seconds to < 1 second
  • Storage compressed more efficiently

Use the right store for the right job.
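
To close the section, here is a hedged sketch of how sensor telemetry in a Fabric KQL database might be consumed from Python via the azure-kusto-data client. The cluster URI, database, and table names are placeholders, and the same query runs natively in a KQL queryset.

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

cluster_uri = "https://<your-eventhouse>.kusto.fabric.microsoft.com"   # placeholder URI
kcsb = KustoConnectionStringBuilder.with_az_cli_authentication(cluster_uri)
client = KustoClient(kcsb)

# Last hour of readings, downsampled to one-minute averages per device
query = """
SensorReadings
| where Timestamp > ago(1h)
| summarize AvgValue = avg(Value) by DeviceId, bin(Timestamp, 1m)
"""

result = client.execute("TelemetryDb", query)          # placeholder database name
for row in result.primary_results[0]:
    print(row["DeviceId"], row["Timestamp"], row["AvgValue"])
```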


Putting It All Together: A Modern, Cost-Optimised Fabric Architecture

Here’s a highly efficient pattern we now recommend to most clients:

The Hybrid Optimised Model

  1. Bronze: Raw Delta, incremental only
  2. Silver: Only where cleaning is required
  3. Gold: Only for true business logic (not everything)
  4. Direct Lake → Power BI (removes the need for most Gold tables)
  5. Domain Lakehouses
  6. KQL for logs
  7. Data Vault for complex historisation

This is a far more pragmatic and cost-sensitive approach that meets the needs of modern analytics teams without following architecture dogma.


Final Thoughts

A Medallion Architecture is a great starting point—but not always the best endpoint.

As data volumes grow and budgets tighten, organisations need architectures that scale economically. The real-world examples above show how companies are modernising their estates with:

  • Fewer layers
  • Incremental processing
  • Domain-based designs
  • Direct Lake adoption
  • The right storage engines for the right data

If you’re building or maintaining a Microsoft Fabric environment, it’s worth stepping back and challenging old assumptions.

Sometimes the best architecture is the one that costs less, runs faster, and that your team can actually maintain.


The Epiphany Moment of Euphoria in a Data Estate Development Project

In our technology-driven world, engineers pave the path forward, and there are moments of clarity and triumph that stand alongside humanity’s greatest achievements. Learning from these achievements at a young age shapes our way of thinking and can be a source of inspiration that enhances the way we solve problems in our daily lives. For me, one of these profound inspirations stems from an engineering marvel: the Paul Sauer Bridge over the Storms River in Tsitsikamma, South Africa – which I first visited in 1981. This arch bridge, completed in 1956, represents more than just a physical structure. It embodies a visionary approach to problem-solving, where ingenuity, precision, and execution converge seamlessly.

The Paul Sauer Bridge across the Storms River Gorge in South Africa.

The bridge’s construction involved a bold method: engineers built two halves of the arch on opposite sides of the gorge. Each section was erected vertically and then carefully pivoted downward to meet perfectly in the middle, completing the 100m span, 120m above the river. This remarkable feat of engineering required foresight, meticulous planning, and flawless execution – a true epiphany moment of euphoria when the pieces fit perfectly.

Now, imagine applying this same philosophy to building data estate solutions. Like the bridge, these solutions must connect disparate sources, align complex processes, and culminate in a seamless result where data meets business insights.

This blog explores how to achieve this epiphany moment in data projects by drawing inspiration from this engineering triumph.

The Parallel Approach: Top-Down and Bottom-Up

Building a successful data estate solution, I believe, requires a dual approach, much like the simultaneous construction of both sides of the Storms River Bridge:

  1. Top-Down Approach:
    • Start by understanding the end goal: the reports, dashboards, and insights that your organization needs.
    • Focus on business requirements such as wireframe designs, data visualization strategies, and the decisions these insights will drive.
    • Use these goals to inform the types of data needed and the transformations required to derive meaningful insights.
  2. Bottom-Up Approach:
    • Begin at the source: identify and ingest the right raw data from various systems.
    • Ensure data quality through cleaning, validation, and enrichment.
    • Transform raw data into structured and aggregated datasets that are ready to be consumed by reports and dashboards.

These two streams work in parallel. The Top-Down approach ensures clarity of purpose, while the Bottom-Up approach ensures robust engineering. The magic happens when these two streams meet in the middle – where the transformed data aligns perfectly with reporting requirements, delivering actionable insights. This convergence is the epiphany moment of euphoria for every data team, validating the effort invested in discovery, planning, and execution.

When the Epiphany Moment Isn’t Euphoric

While the convergence of Top-Down and Bottom-Up approaches can lead to an epiphany moment of euphoria, there are times when this anticipated triumph falls flat. One of the most common reasons is discovering that the business requirements cannot be met because the source data is insufficient, incomplete, or altogether unavailable. These moments can feel like a jarring reality check, but they also offer valuable lessons for navigating data challenges.

Why This Happens

  1. Incomplete Understanding of Data Requirements:
    • The Top-Down approach may not have fully accounted for the granular details of the data needed to fulfill reporting needs.
    • Assumptions about the availability or structure of the data might not align with reality.
  2. Data Silos and Accessibility Issues:
    • Critical data might reside in silos across different systems, inaccessible due to technical or organizational barriers.
    • Ownership disputes or lack of governance policies can delay access.
  3. Poor Data Quality:
    • Data from source systems may be incomplete, outdated, or inconsistent, requiring significant remediation before use.
    • Legacy systems might not produce data in a usable format.
  4. Shifting Requirements:
    • Business users may change their reporting needs mid-project, rendering the original data pipeline insufficient.

The Emotional and Practical Fallout

Discovering such issues mid-development can be disheartening:

  • Teams may feel a sense of frustration, as their hard work in data ingestion, transformation, and modeling seems wasted.
  • Deadlines may slip, and stakeholders may grow impatient, putting additional pressure on the team.
  • The alignment between business and technical teams might fracture as miscommunications come to light.

Turning Challenges into Opportunities

These moments, though disappointing, are an opportunity to re-evaluate and recalibrate your approach. Here are some strategies to address this scenario:

1. Acknowledge the Problem Early

  • Accept that this is part of the iterative process of data projects.
  • Communicate transparently with stakeholders, explaining the issue and proposing solutions.

2. Conduct a Gap Analysis

  • Assess the specific gaps between reporting requirements and available data.
  • Determine whether the gaps can be addressed through technical means (e.g., additional ETL work) or require changes to reporting expectations.

3. Explore Alternative Data Sources

  • Investigate whether other systems or third-party data sources can supplement the missing data.
  • Consider enriching the dataset with external or public data.

4. Refine the Requirements

  • Work with stakeholders to revisit the original reporting requirements.
  • Adjust expectations to align with available data while still delivering value.

5. Enhance Data Governance

  • Develop clear ownership, governance, and documentation practices for source data.
  • Regularly audit data quality and accessibility to prevent future bottlenecks.

6. Build for Scalability

  • Future-proof your data estate by designing modular pipelines that can easily integrate new sources.
  • Implement dynamic models that can adapt to changing business needs.

7. Learn and Document the Experience

  • Treat this as a learning opportunity. Document what went wrong and how it was resolved.
  • Use these insights to improve future project planning and execution.

The New Epiphany: A Pivot to Success

While these moments may not bring the euphoria of perfect alignment, they represent an alternative kind of epiphany: the realisation that challenges are a natural part of innovation. Overcoming these obstacles often leads to a more robust and adaptable solution, and the lessons learned can significantly enhance your team’s capabilities.

In the end, the goal isn’t perfection – it’s progress. By navigating the difficulties of misalignment and incomplete or unavailable data with resilience and creativity, you’ll lay the groundwork for future successes and, ultimately, more euphoric epiphanies to come.

Steps to Ensure Success in Data Projects

To reach this transformative moment, teams must adopt structured practices and adhere to principles that drive success. Here are the key steps:

1. Define Clear Objectives

  • Identify the core business problems you aim to solve with your data estate.
  • Engage stakeholders to define reporting and dashboard requirements.
  • Develop a roadmap that aligns with organisational goals.

2. Build a Strong Foundation

  • Invest in the right infrastructure for data ingestion, storage, and processing (e.g., cloud platforms, data lakes, or warehouses).
  • Ensure scalability and flexibility to accommodate future data needs.

3. Prioritize Data Governance

  • Implement data policies to maintain security, quality, and compliance.
  • Define roles and responsibilities for data stewardship.
  • Create a single source of truth to avoid duplication and errors.

4. Embrace Parallel Development

  • Top-Down: Start designing wireframes for reports and dashboards while defining the key metrics and KPIs.
  • Bottom-Up: Simultaneously ingest and clean data, applying transformations to prepare it for analysis.
  • Use agile methodologies to iterate and refine both streams in sync.

5. Leverage Automation

  • Automate data pipelines for faster and error-free ingestion and transformation.
  • Use tools like ETL frameworks, metadata management platforms, and workflow orchestrators.

6. Foster Collaboration

  • Establish a culture of collaboration between business users, analysts, and engineers.
  • Encourage open communication to resolve misalignments early in the development cycle.

7. Test Early and Often

  • Validate data accuracy, completeness, and consistency before consumption.
  • Conduct user acceptance testing (UAT) to ensure the final reports meet business expectations.

8. Monitor and Optimize

  • After deployment, monitor the performance of your data estate.
  • Optimize processes for faster querying, better visualization, and improved user experience.

Most Importantly – do not forget that the true driving force behind technological progress lies not just in innovation but in the people who bring it to life. Investing in the right individuals and cultivating a strong, capable team is paramount. A team of skilled, passionate, and collaborative professionals forms the backbone of any successful venture, ensuring that ideas are transformed into impactful solutions. By fostering an environment where talent can thrive – through mentorship, continuous learning, and shared vision – organisations empower their teams to tackle complex challenges with confidence and creativity. After all, even the most groundbreaking technologies are only as powerful as the minds and hands that create and refine them.

Conclusion: Turning Vision into Reality

The Storms River Bridge stands as a symbol of human achievement, blending design foresight with engineering excellence. It teaches us that innovation requires foresight, collaboration, and meticulous execution. Similarly, building a successful data estate solution is not just about connecting systems or transforming data – it’s about creating a seamless convergence where insights meet business needs. By adopting a Top-Down and Bottom-Up approach, teams can navigate the complexities of data projects, aligning technical execution with business needs.

When the two streams meet – when your transformed data delivers perfectly to your reporting requirements – you’ll experience your own epiphany moment of euphoria. It’s a testament to the power of collaboration, innovation, and relentless dedication to excellence.

In both engineering and technology, the most inspiring achievements stem from the ability to transform vision into reality.

The journey isn’t always smooth. Challenges like incomplete data, shifting requirements, or unforeseen obstacles can test our resilience. However, these moments are an opportunity to grow, recalibrate, and innovate further. By adopting structured practices, fostering collaboration, and investing in the right people, organizations can navigate these challenges effectively.

Ultimately, the epiphany moment in data estate development is not just about achieving alignment, it’s about the collective people effort, learning, and perseverance that make it possible. With a clear vision, a strong foundation, and a committed team, you can create solutions that drive success and innovation, ensuring that every challenge becomes a stepping stone toward greater triumphs.

AI Missteps: Navigating the Pitfalls of Business Integration

AI technology has been at the forefront of innovation, offering businesses unprecedented opportunities for efficiency, customer engagement, and data analysis. However, the road to integrating AI into business operations is fraught with challenges, and not every endeavour ends in success. In this blog post, we will explore various instances where AI has gone wrong in a business context, delve into the reasons for these failures, and provide real examples to illustrate these points.

1. Misalignment with Business Objectives

One common mistake businesses make is pursuing AI projects without a clear alignment to their core objectives or strategic goals. This misalignment often leads to investing in technology that, whilst impressive, does not contribute to the company’s bottom line or operational efficiencies.

Example: IBM Watson Health

IBM Watson Health is a notable example. Launched with the promise of revolutionising the healthcare industry by applying AI to massive data sets, it struggled to meet expectations. Despite the technological prowess of Watson, the initiative faced challenges in providing actionable insights for healthcare providers, partly due to the complexity and variability of medical data. IBM’s ambitious project encountered difficulties in scaling and delivering tangible results to justify its investment, leading to the sale of the Watson Health assets in 2022.

2. Lack of Data Infrastructure

AI systems require vast amounts of data to learn and make informed decisions. Businesses often underestimate the need for a robust data infrastructure, including quality data collection, storage, and processing capabilities. Without this foundation, AI projects can falter, producing inaccurate results or failing to operate at scale.

Example: Amazon’s AI Recruitment Tool

Amazon developed an AI recruitment tool intended to streamline the hiring process by evaluating CVs. However, the project was abandoned when the AI exhibited bias against female candidates. The AI had been trained on CVs submitted to the company over a 10-year period, most of which came from men, reflecting the tech industry’s gender imbalance. This led to the AI penalising CVs that included words like “women’s” or indicated attendance at a women’s college, showcasing how unrepresentative training data can derail an AI project.

3. Ethical and Bias Concerns

AI systems can inadvertently perpetuate or even exacerbate biases present in their training data, leading to ethical concerns and public backlash. Businesses often struggle with implementing AI in a way that is both ethical and unbiased, particularly in sensitive applications like hiring, law enforcement, and credit scoring.

Example: COMPAS in the US Justice System

The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) is an AI system used by US courts to assess the likelihood of a defendant reoffending. Studies and investigations have revealed that COMPAS predictions are biased against African-American individuals, leading to higher risk scores compared to their white counterparts, independent of actual recidivism rates. This has sparked significant controversy and debate about the use of AI in critical decision-making processes.

4. Technological Overreach

Sometimes, businesses overestimate the current capabilities of AI technology, leading to projects that are doomed from the outset due to technological limitations. Overambitious projects can drain resources, lead to public embarrassment, and erode stakeholder trust.

Example: Facebook’s Trending Topics

Facebook’s attempt to automate its Trending Topics feature with AI led to the spread of fake news and inappropriate content. The AI was supposed to curate trending news without human bias, but it lacked the nuanced understanding of context and veracity, leading to widespread criticism and the eventual discontinuation of the feature.

Conclusion

The path to successfully integrating AI into business operations is complex and challenging. The examples mentioned highlight the importance of aligning AI projects with business objectives, ensuring robust data infrastructure, addressing ethical and bias concerns, and maintaining realistic expectations of technological capabilities. Businesses that approach AI with a strategic, informed, and ethical mindset are more likely to navigate these challenges successfully, leveraging AI to drive genuine innovation and growth.

Case Study: Renier Botha’s Leadership in the Winning NHS Professionals Tender Bid for Beyond

Introduction

Renier Botha, a seasoned technology leader, spearheaded Beyond’s successful response to a Request for Proposal (RFP) from NHS Professionals (NHSP) for outsourced data services. This case study examines the strategic approaches, leadership, and technical expertise employed by Botha and his team in securing this critical project.

Context and Challenge

NHSP sought to outsource its data engineering services to enhance data science and reporting capabilities. The challenge was multifaceted, requiring a deep understanding of NHSP’s current data operations, stringent data governance and GDPR compliance, and the integration of advanced cloud technologies.

Strategy and Implementation

1. Stakeholder Engagement:
Botha led the initial stages by conducting key stakeholder interviews and meetings to gauge the current state and expectations. This hands-on approach ensured alignment between NHSP’s needs and Beyond’s proposal.

2. Gap Analysis:
By understanding the existing Data Engineering function, Botha identified inefficiencies and gaps. His team offered strategic recommendations for process improvements, directly addressing NHSP’s operational challenges.

3. Infrastructure Assessment:
Botha’s review of the current data processing systems uncovered dependencies that could impact future scalability and integration. This was crucial for designing a solution that was not only compliant with current standards but also adaptable to future technological advancements.

4. Data Governance Review:
Given the critical importance of data security in healthcare, Botha prioritised a thorough review of data governance practices, ensuring all proposed solutions were GDPR compliant.

5. Future State Architecture:
Utilising cloud technologies, Botha proposed a high-level architecture and design for NHSP’s future data estate. This included a blend of strategic and business-as-usual (BAU) tasks aimed at transforming NHSP’s data handling capabilities.

6. Team and Service Delivery Design:
Botha defined the composition of the Data Engineering team necessary to deliver on NHSP’s objectives. This included detailed job descriptions and a clear division of responsibilities, ensuring a match between team capabilities and service delivery goals.

7. KPIs and Service Levels:
Critical to the project’s success was the definition of KPIs and proposed service levels. Botha’s strategic vision included measurable outcomes to track progress and ensure accountability.

8. RFP Response and Roadmap:
Botha provided a detailed response to the RFP, outlining a clear and actionable data engineering roadmap for the first two years of service, broken down into six-month intervals. This detailed planning demonstrated a strong understanding of NHSP’s needs and showcased Beyond’s commitment to service excellence.

9. Technical Support:
Beyond also supported NHSP with system architecture queries, ensuring that all technical aspects were addressed comprehensively.

Results and Impact

Under Botha’s leadership, Beyond won the NHSP contract by effectively demonstrating a profound understanding of the project requirements and crafting a tailored, forward-thinking solution. The strategic approach not only aligned with NHSP’s operational goals but also positioned them for future scalability and innovation.

Conclusion

Botha’s expertise in data engineering and project management was pivotal in Beyond’s success. By meticulously planning and executing each phase of the RFP response, he not only led his team to a significant business win but also contributed to the advancement of data management practices within NHSP. This project serves as a benchmark in effective stakeholder management, strategic planning, and technical execution in the field of data engineering services.