MANAGEMENT: DATA ARCHITECTURE

"DECOUPLING COMPUTE AND STORAGE IN PUBLIC CLOUDS IS MORE STRAIGHTFORWARD TO ADMINISTER AND RELATIVELY INEXPENSIVE. BESIDES, THESE COMPUTE AND STORAGE CLOUD SERVICES ARE VIRTUALLY UNLIMITED IN SCALABILITY, ELIMINATING LEGACY HARDWARE PROCUREMENT ISSUES. THEY ALSO OFFER SUPREME LEVELS OF AVAILABILITY AND PERFORMANCE."

Lastly, the independent scalability of both storage and compute made on-demand, elastic and precise resource allocation possible, adding flexibility to architecture designs.

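To make the sizing argument concrete, here is a minimal Python sketch. The node shape (10 TB and 16 cores per node) and both workloads are invented purely for illustration and are not figures from any vendor. It compares a coupled system, where storage and compute can only be bought together in whole nodes, with what each workload actually needs.

```python
import math
from dataclasses import dataclass

# Hypothetical node shape for a coupled system: every node bundles
# a fixed amount of storage with a fixed number of cores.
NODE_STORAGE_TB = 10
NODE_CORES = 16


@dataclass(frozen=True)
class Workload:
    storage_tb: float   # data that has to be kept
    compute_cores: int  # peak cores needed to run queries


def coupled_nodes(w: Workload) -> int:
    """Nodes required when storage and compute scale together.

    The larger of the two requirements dictates the cluster size.
    """
    by_storage = math.ceil(w.storage_tb / NODE_STORAGE_TB)
    by_compute = math.ceil(w.compute_cores / NODE_CORES)
    return max(by_storage, by_compute)


def idle_cores(w: Workload) -> int:
    """Cores paid for but not needed in the coupled design."""
    return coupled_nodes(w) * NODE_CORES - w.compute_cores


def idle_storage_tb(w: Workload) -> float:
    """Storage paid for but not needed in the coupled design."""
    return coupled_nodes(w) * NODE_STORAGE_TB - w.storage_tb


if __name__ == "__main__":
    archive = Workload(storage_tb=500, compute_cores=32)  # lots of data, few queries
    burst = Workload(storage_tb=20, compute_cores=640)    # little data, heavy queries
    for name, w in [("archive", archive), ("burst", burst)]:
        print(name, coupled_nodes(w), idle_cores(w), idle_storage_tb(w))
```

With these made-up numbers, the archive-heavy workload needs 50 coupled nodes just to hold its data and leaves 768 cores idle, while the query-heavy workload needs 40 nodes and leaves 380 TB unused. When the two resources scale independently, each can be sized to its own requirement instead.
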
However, these changes took time to materialise. Expensive Storage Area Networks (SANs) and less costly but often complex Network Attached Storage (NAS) systems have existed for quite a while, but both storage models were held back by administrative and procurement overheads. Mass adoption of separating compute and storage only became feasible with public cloud computing.

Decoupling compute and storage in public clouds is more straightforward to administer and relatively inexpensive. Besides, these compute and storage cloud services are virtually unlimited in scalability, eliminating legacy hardware procurement issues. They also offer supreme levels of availability and performance. Therefore, the separation of compute from data brings forth three immediate benefits:

• A significant reduction in complicated and expensive data copies and movements: instead of treating the data warehouse as the sole source of truth, organisations access data in open formats directly in the data lake, eliminating data silos.
• Open data standards and formats provide universal data access from any number of services and applications, creating the freedom to pick the best solutions.
• An open architecture ensures that future cloud services can directly access the data, without going through a data warehouse vendor's proprietary format or moving or copying data out of the data warehouse.

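The first two benefits can be illustrated with a small Python sketch. It is only a toy: a temporary directory stands in for the data lake, CSV stands in for whichever open format an organisation actually uses, and the file name, sample rows and function names are all invented for the example. The dataset is written once, and two unrelated consumers read it in place rather than each loading a private copy.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path


def write_orders(lake_dir: Path) -> Path:
    """Write the dataset once, in an open format, to shared storage."""
    path = lake_dir / "orders.csv"
    rows = [  # made-up sample data
        {"region": "EMEA", "amount": "120.0"},
        {"region": "APAC", "amount": "80.0"},
        {"region": "EMEA", "amount": "50.0"},
    ]
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["region", "amount"])
        writer.writeheader()
        writer.writerows(rows)
    return path


def revenue_by_region(path: Path) -> dict[str, float]:
    """A reporting-style consumer that aggregates straight from the file."""
    totals: dict[str, float] = defaultdict(float)
    with path.open(newline="") as f:
        for row in csv.DictReader(f):
            totals[row["region"]] += float(row["amount"])
    return dict(totals)


def largest_order(path: Path) -> float:
    """A second, unrelated consumer running a different computation."""
    with path.open(newline="") as f:
        return max(float(row["amount"]) for row in csv.DictReader(f))


def demo() -> tuple[dict[str, float], float, int]:
    with tempfile.TemporaryDirectory() as tmp:
        lake = Path(tmp)
        path = write_orders(lake)
        report = revenue_by_region(path)
        biggest = largest_order(path)
        # Both consumers worked on the same file; no copies were created.
        files_in_lake = len(list(lake.iterdir()))
    return report, biggest, files_in_lake


if __name__ == "__main__":
    print(demo())
```

Because the format is open, neither consumer depends on the tool that wrote the file, and the lake still holds a single copy of the data after both have run.
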
THE OPPORTUNITIES OF OPEN ARCHITECTURE

Cloud data warehouse providers enticed firms with the allure of scalability and cost-efficiency that on-premises solutions could not sustain. However, after uploading their data into the warehouse, organisations were restricted entirely to the vendor's ecosystem or denied access to other promising technologies that could extract more value from their data.

Open architecture is a significant advantage of the cloud data lake or lakehouse over the data warehouse. As a result, organisations are reassessing their strategies in favour of an open architecture that promotes flexibility and re-establishes ownership of their data. This shift signifies three things:

• The flexibility to use a variety of best-in-class services and engines on the company's data. This allows the use of diverse technologies such as a high-performance SQL engine, Databricks or any other data-processing tool. Given that companies have numerous use cases and requirements, using the best-suited tool yields higher productivity, especially for data teams, and lower cloud costs. It's also important to remember that no single vendor can offer all the processing capabilities a company requires.
• Not being confined to one vendor. Platform changes become profoundly challenging when dealing with a data warehouse holding up to a million tables and hundreds of complex ingestion pipelines. By comparison, if an organisation uses one SQL engine on its cloud data lake today and a new tool emerges tomorrow, it's possible to query the existing data with the new system without migrating it.
• The ability to benefit from future technological advancements. Avoiding lock-in is crucial, as it keeps vendors from exploiting a company financially. More significant still is the capacity to adopt and benefit from emerging technology, even if the current vendor remains favourable. If a superior machine learning service or a better batch processing engine is invented, organisations can have peace of mind that they are free to use it.

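The "new tool tomorrow" scenario can be sketched in a few lines of Python. Everything here is hypothetical: the `QueryEngine` interface, the two engine classes and the `events.jsonl` table are invented for illustration and do not correspond to any real product. The point is only that when the data sits in an open format in shared storage, replacing the engine is a change to the compute layer and leaves the data where it is.

```python
import json
import tempfile
from pathlib import Path
from typing import Protocol


class QueryEngine(Protocol):
    """Hypothetical minimal contract any engine must satisfy."""

    def count_where(self, table: Path, column: str, value: str) -> int: ...


class RowScanEngine:
    """Stand-in for today's engine: filters record by record."""

    def count_where(self, table: Path, column: str, value: str) -> int:
        with table.open() as f:
            return sum(1 for line in f if json.loads(line).get(column) == value)


class ColumnScanEngine:
    """Stand-in for tomorrow's engine: pulls out one column, then counts."""

    def count_where(self, table: Path, column: str, value: str) -> int:
        with table.open() as f:
            values = [json.loads(line).get(column) for line in f]
        return values.count(value)


def write_events(lake_dir: Path) -> Path:
    """Store made-up sample events as JSON Lines in the shared location."""
    path = lake_dir / "events.jsonl"
    events = [{"type": "click"}, {"type": "view"}, {"type": "click"}]
    path.write_text("".join(json.dumps(e) + "\n" for e in events))
    return path


def clicks(engine: QueryEngine, table: Path) -> int:
    return engine.count_where(table, "type", "click")


def demo() -> tuple[int, int, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        table = write_events(Path(tmp))
        modified_before = table.stat().st_mtime_ns
        old_answer = clicks(RowScanEngine(), table)
        new_answer = clicks(ColumnScanEngine(), table)
        # The table was only read, never rewritten or migrated.
        untouched = table.stat().st_mtime_ns == modified_before
    return old_answer, new_answer, untouched


if __name__ == "__main__":
    print(demo())
```

Both engines return the same answer from the same file, and the file's modification time is unchanged. In a warehouse with a proprietary storage format, the equivalent switch would involve exporting and reloading the tables first.
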
Application architectures have demonstrated that a service-oriented approach allows maximum scale, flexibility, and agility. While separating compute and storage marked an essential first step in reducing analytic costs, on its own it doesn't offer the kind of benefits visible in modern application architectures. However, by decoupling compute from data, the benefits of application design can now be applied to data analytics, especially given the critical importance of data for all businesses.

As a result, open data architecture brings forth many benefits, from flexibility, independence, and future-proofing to creating new avenues for gaining valuable business insights. In the rapidly evolving digital era, embracing open data architectures is more than a strategic choice; it's a decisive move towards a more flexible, scalable, and insightful future.

More info: www.dremio.com

www.document-manager.com
November/December 2023
@DMMagAndAwards