DM Jul-Aug 2021
TECHNOLOGY: STORAGE
"WHILE BOTH OBJECT STORAGE AND DFS ARE WELL-SUITED FOR STORING SUBSTANTIAL VOLUMES OF UNSTRUCTURED DATA, THEY SUIT DIFFERENT USE CASES. AS OBJECT STORAGE EXPOSES A REST API, IT IS ONLY SUITABLE FOR APPLICATIONS THAT ARE SPECIFICALLY INTENDED TO INTERACT WITH THIS TYPE OF STORAGE. IN CONTRAST, DFS EXPOSE A TRADITIONAL FILESYSTEM API WHICH IS SUITABLE FOR ANY APPLICATION, INCLUDING LEGACY APPLICATIONS THAT WORK OVER A HIERARCHICAL FILESYSTEM. DFS PROVIDE A DEEPER AND MORE GENERAL-PURPOSE INTERFACE TO APPLICATIONS, ALLOWING THEM TO PERFORM CERTAIN ACTIVITIES THAT ARE NOT SUITABLE FOR OBJECT STORAGE."
huge files and is less expensive per gigabyte than DFS.
IT teams considering implementing a DFS for their unstructured data must decide between two different types: clustered or federated.
CAP THEOREM
As mentioned above, a DFS may support either strong or eventual consistency. This is where computer science theory comes in: according to the CAP theorem, a distributed data store can guarantee no more than two of three properties at the same time. These three properties are:
Consistency: Every read receives the most recent write or an error
Availability: Every request receives a (non-error) response, without the guarantee that it contains the most recent write
Partition tolerance: The system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes
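
The trade-off between these properties can be made concrete with a toy model. The Python sketch below is purely illustrative and does not describe any real DFS product: the `ToyStore` class, its two in-memory replicas and the `partitioned` flag are assumptions introduced here. In "CP" mode the store refuses requests while the replicas cannot reach each other; in "AP" mode it keeps answering but may return a stale value.

```python
class ToyStore:
    """Two in-memory replicas; `partitioned` simulates a network split.

    Hypothetical teaching model, not an implementation of any real DFS.
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse rather than let replicas diverge.
                raise RuntimeError("unavailable: peer replica unreachable")
            # Availability first: accept the write on the local replica only.
            self.replicas[replica][key] = value
            return
        # No partition: every replica receives the write.
        for store in self.replicas:
            store[key] = value

    def read(self, replica, key):
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable: peer replica unreachable")
        return self.replicas[replica].get(key)


if __name__ == "__main__":
    ap = ToyStore("AP")
    ap.write(0, "report.txt", "v1")
    ap.partitioned = True
    ap.write(0, "report.txt", "v2")
    print(ap.read(0, "report.txt"))  # the replica that took the write
    print(ap.read(1, "report.txt"))  # the replica that missed it
```

In this sketch the second replica still holds the older value after the split, which is the behaviour the Availability definition allows and the Consistency definition forbids.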
As a result, there are two types of DFS currently available:
CLUSTERED DFS
Clustered DFS are made up of a closely connected cluster of nodes. They focus strictly on data consistency and are especially suitable for large-scale computing use cases at the enterprise core, such as big data analytics, high-performance computing, or databases.
Clustered DFS prioritise the consistency and availability properties of the CAP theorem. But strong consistency assurances do not come cheap. They impose significant constraints on system operation and performance, especially when nodes are separated by high-latency or unreliable connections.
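
A rough piece of arithmetic shows why distance hurts. Assume, for illustration only, that a write is acknowledged once a required number of nodes have confirmed it and that requests are sent to all nodes in parallel; the write can then be no faster than the slowest of those confirmations. The function name and the round-trip figures below are hypothetical.

```python
def sync_write_latency_ms(round_trips_ms, acks_required):
    """Latency of one write that waits for `acks_required` confirmations.

    Requests go out in parallel, so the write completes when the
    `acks_required`-th fastest node has answered.
    """
    if not 1 <= acks_required <= len(round_trips_ms):
        raise ValueError("acks_required must be between 1 and the node count")
    return sorted(round_trips_ms)[acks_required - 1]


if __name__ == "__main__":
    # Hypothetical round-trip times in milliseconds.
    local_cluster = [0.5, 0.6, 0.7]       # three nodes in one data centre
    stretched_cluster = [0.5, 0.6, 80.0]  # one node moved to a remote site

    # Every node must confirm before the write is acknowledged.
    print(sync_write_latency_ms(local_cluster, 3))
    print(sync_write_latency_ms(stretched_cluster, 3))
```

Under these assumptions a single remote node sets the pace for every write, which is why clustered designs tend to keep their nodes close together.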
FEDERATED DFS
The goal of federated DFS is to make data available over long distances while maintaining partition tolerance. Federated DFS are well-suited for loosely connected edge-to-cloud use cases, including unstructured data storage and management for remote and branch offices. Federated DFS focus on the availability and partition tolerance properties of the CAP theorem, rather than the strict consistency guarantee.
In federated DFS, read and write operations on an open file are routed to a locally cached copy. When a modified file is closed, the modified sections are copied back from the edge to a central file server. Update conflicts may arise and are resolved automatically. Federated DFS can therefore be said to combine the semantics of a file system with the eventual consistency model of object storage. This makes federated DFS well suited to use cases including archiving, backup, media libraries, mobile data access, content distribution to edge locations, content ingestion from edge to cloud, remote and branch office storage, and hybrid cloud storage.
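
The open, edit locally, write back on close cycle described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not any vendor's implementation: the `CentralServer` and `EdgeCache` classes are hypothetical, whole files are copied rather than only the modified sections, and the conflict policy shown (keep the server copy and store the late writer's version under a separate name) is just one possible way of resolving conflicts automatically.

```python
class CentralServer:
    """Hypothetical central file server holding (version, content) per path."""

    def __init__(self):
        self.files = {}

    def fetch(self, path):
        return self.files.get(path, (0, ""))

    def commit(self, path, base_version, content):
        current_version, _ = self.files.get(path, (0, ""))
        if current_version != base_version:
            # Someone else committed first. Assumed policy: keep both copies.
            conflict_path = f"{path}.conflict-v{base_version}"
            self.files[conflict_path] = (1, content)
            return conflict_path
        self.files[path] = (current_version + 1, content)
        return path


class EdgeCache:
    """Hypothetical edge node: open files are served from a local copy."""

    def __init__(self, server):
        self.server = server
        self.open_files = {}  # path -> [base_version, content, dirty]

    def open(self, path):
        version, content = self.server.fetch(path)
        self.open_files[path] = [version, content, False]

    def read(self, path):
        return self.open_files[path][1]

    def write(self, path, content):
        entry = self.open_files[path]
        entry[1] = content
        entry[2] = True

    def close(self, path):
        version, content, dirty = self.open_files.pop(path)
        if dirty:
            # Only modified files are sent back to the central server.
            return self.server.commit(path, version, content)
        return path


if __name__ == "__main__":
    server = CentralServer()
    london, sydney = EdgeCache(server), EdgeCache(server)
    london.open("plan.txt")
    sydney.open("plan.txt")
    london.write("plan.txt", "London edit")
    sydney.write("plan.txt", "Sydney edit")
    print(london.close("plan.txt"))  # first to close wins the original path
    print(sydney.close("plan.txt"))  # second writer lands on a conflict path
```

Both offices work at local speed while their files are open; the cost is that the central copy only catches up when each file is closed, and concurrent edits have to be reconciled afterwards.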
Both clustered and federated DFS have applications in the enterprise. To reap the full benefits of a DFS, IT teams must be familiar with how clustered and federated DFS differ in order to choose the option most suited to their application requirements.
The market is undergoing a significant shift towards DFS and object storage, at the same time as organisations are looking for more efficient methods to not only cope with but thrive on the explosion of unstructured data. The optimal decision, whether object storage, clustered or federated DFS, or a combination, lies in careful consideration of the organisation's requirements and use cases.
More info: www.ctera.com
www.document-manager.com