The Convergence Blog

The Convergence is sponsored by Data-Mania
… it’s just another way we’re giving back to the data community from which we sprung.

The Convergence - An online community space that's dedicated to empowering operators in the data industry by providing news and education about evergreen strategies, late-breaking data & AI developments, and free or low-cost upskilling resources that you need to thrive as a leader in the data & AI space.

Storing Data in the Public Cloud: What You Should Know

Data-Mania Writer's Guild

Reading Time: 6 minutes

Do you know what a public cloud is, and how it can help with your business's storage needs? Read this article to learn about storing data in the public cloud, the main cloud storage options, and best practices for using them well.

Public Cloud: A Definition

A public cloud is a platform that relies on the standard cloud computing model, making resources accessible to users remotely. Such resources include applications, virtual machines, and storage. Cloud costs are typically operating expenses (OpEx) rather than capital expenses (CapEx), because most cloud services are offered on a pay-per-use basis.

The public cloud provides an alternative approach to application development, differing from traditional on-premises IT architectures. In the typical public cloud model, a third-party vendor hosts on-demand, scalable IT resources and makes them available to users via a network connection. This connection is either over a dedicated network or the public Internet. 

In this article, I’ll explain the options for storing your data in the public cloud, and explore specific cloud storage services by the world’s leading cloud providers.

Storing Data in the Public Cloud and What You Should Know About Storage Options: Object vs File vs Block Storage

There are three common technologies for managing storage in the cloud:

  • Object storage

    It stores data as objects, which are self-contained units arranged in a flat structure rather than a hierarchy. Object storage does not use a file-and-folder tree. Instead, each object carries metadata that facilitates organization, search, and retrieval. Objects can hold any type of data or file, both structured and unstructured. This storage is elastically scalable, because objects can easily be distributed across multiple physical storage devices.

  • File storage

    This is similar to the storage system used on a PC or file server. It stores data in files with a file extension that determines which application to use to view or edit the file. File storage can be deployed on a hard drive directly attached to a computer, or computers can remotely access files stored elsewhere, for example on network attached storage (NAS), using protocols like NFS or SMB. Cloud-based file storage makes it easy to migrate legacy applications to the cloud, because it behaves similarly to on-premises systems.

  • Block storage

    This splits data into blocks of predetermined size, with unique identifiers. When the data needs to be retrieved, it is pulled from multiple blocks and reassembled. Block storage is the storage technology used by hard disk drives, as well as enterprise storage systems like Storage Area Networks (SAN). The main advantage of block storage is that it supports high performance and high throughput scenarios.
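
To make the difference between these models concrete, here is a minimal in-memory sketch in Python. It is a toy illustration, not how any cloud provider implements storage: the `ObjectStore` class, the `split_into_blocks` and `reassemble` helpers, and the sample keys and metadata are all hypothetical names chosen for this example.

```python
class ObjectStore:
    """Toy object store: a flat mapping of key -> (data, metadata)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, **metadata):
        # The key may look like a path, but it is just a string in a flat namespace.
        self._objects[key] = (data, dict(metadata))

    def get(self, key):
        return self._objects[key][0]

    def find(self, **criteria):
        """Return the keys whose metadata matches every given criterion."""
        return sorted(
            key
            for key, (_, meta) in self._objects.items()
            if all(meta.get(name) == value for name, value in criteria.items())
        )


def split_into_blocks(data, block_size):
    """Split bytes into fixed-size blocks, keyed by block number."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return {
        offset // block_size: data[offset:offset + block_size]
        for offset in range(0, len(data), block_size)
    }


def reassemble(blocks):
    """Rebuild the original bytes by joining blocks in block-number order."""
    return b"".join(blocks[number] for number in sorted(blocks))


store = ObjectStore()
store.put("reports/q1.csv", b"a,b\n1,2\n", team="finance", kind="report")
store.put("logo.png", b"\x89PNG", team="design", kind="image")
finance_keys = store.find(team="finance")

blocks = split_into_blocks(b"hello block storage", block_size=8)
restored = reassemble(blocks)
```

The object store finds data by metadata rather than by walking folders, while the block helpers show the split-and-reassemble cycle that block storage performs with fixed-size units.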

Examples of Public Cloud Storage Services

Below are some examples of public cloud storage services:

Amazon Web Services

Amazon Web Services (AWS) is a cloud services platform. The platform provides database storage, cloud computing infrastructure, API support, content delivery, and bandwidth. In addition, it has several PaaS and IaaS services.

Key AWS storage services include:

  • Simple Storage Service (S3)

    S3 is an object storage service that offers virtually unlimited scalability and high data durability. With it, you can create data lakes, back up and archive data, and store static data for web and mobile applications. It has management features that let you organize access to data and automate data lifecycles.

  • Elastic File System (EFS)

    It is a serverless elastic file system that scales up and down on demand without disrupting applications. It integrates easily with legacy applications, enabling lift and shift of workloads to the cloud. EFS provides a web services interface that lets you create and configure file systems.

  • Amazon FSx

    This is a service that enables running large-scale high-performance file systems. It supports Lustre, NetApp ONTAP, Windows File Server, and OpenZFS.

  • Elastic Block Store (EBS)

    A block storage solution that provisions virtual hard disks you can attach to Elastic Compute Cloud (EC2) instances. They can be based on HDD or SSD technology, and can be dynamically configured based on requirements.
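
As an illustration of the S3 lifecycle automation mentioned above, the sketch below builds a lifecycle configuration in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` call accepts. The helper names, the bucket name, the `logs/` prefix, and the day thresholds are assumptions made for this example, and `apply_lifecycle_config` needs the third-party `boto3` package and valid AWS credentials to run.

```python
def build_lifecycle_config(prefix, ia_after_days=30, archive_after_days=90,
                           expire_after_days=365):
    """Build a rule that tiers objects down over time and finally expires them."""
    if not 0 < ia_after_days < archive_after_days < expire_after_days:
        raise ValueError("thresholds must be positive and strictly increasing")
    rule_name = prefix.strip("/").replace("/", "-") or "all"
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{rule_name}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": archive_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


def apply_lifecycle_config(bucket, config):
    """Send the configuration to S3 (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_lifecycle_config("logs/")
# apply_lifecycle_config("my-example-bucket", config)  # hypothetical bucket name
```

Keeping the builder separate from the API call lets you review or version-control the rule before applying it to a real bucket.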

Microsoft Azure

Microsoft Azure is Microsoft's public cloud computing platform, which provides a wide range of services for computing, data storage, data analytics, and networking. Azure is a common platform for hosting databases in the cloud. Microsoft offers managed relational databases such as Azure SQL Database and non-relational (NoSQL) databases such as Azure Cosmos DB.

The platform is frequently used for backup and disaster recovery, and many organizations use Azure storage as an archive to meet their long-term data retention requirements.

Key Azure storage services include:

  • Azure Files

    It is a managed file service based on the Server Message Block (SMB) protocol. It lets you mount cloud file shares from any Windows, Linux, or macOS machine, whether on-premises or in the Azure cloud.

  • Blob Storage

    This is an object storage service that enables storage of unstructured data in any format. You can use Blob Storage in combination with Azure Data Lake to easily build an enterprise data lake and support big data analytics.

  • Disks

    These are virtual hard drives that can be mounted and accessed from an Azure virtual machine (VM).

  • Queues

    This service supports asynchronous, high-speed message queueing between applications in the Azure cloud.

  • Tables

    This is a key-value data store with a schemaless NoSQL design.

Google Cloud Platform

Google Cloud Platform provides PaaS and IaaS services such as data storage, computing, and networking. It also offers developer tools and applications that run on Google hardware. Its offerings include App Engine, Compute Engine, Cloud Storage, Kubernetes Engine (formerly Container Engine), and BigQuery.

Google Cloud Storage is a public cloud storage platform that enterprises use to retain sizable unstructured data sets. Organizations can buy storage for both frequently accessed and infrequently used data.

  • Google Cloud Storage

    An object storage service that can be used for production or archive data. It provides four storage tiers, lets you store data in one or multiple Google Cloud regions, and supports archiving for frequent or infrequent access.

  • Cloud Filestore

    A cloud-based file service that creates file shares that can be mounted over the NFS protocol, much like network attached storage (NAS), and supports high-performance workloads.

  • Google Cloud Persistent Disks

    These are virtual hard drives that can be attached to Google Cloud VMs, and are used for persistent storage in Google Kubernetes Engine.
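
Choosing among Cloud Storage's four tiers (Google calls them storage classes: Standard, Nearline, Coldline, and Archive) mostly comes down to how often you read the data and how long you keep it. The sketch below encodes one simple rule of thumb based on the minimum storage durations Google documents for the colder classes. The `suggest_storage_class` function and its selection logic are assumptions for illustration, not an official Google recommendation.

```python
# Colder classes first, each with its documented minimum storage duration in days.
COLD_CLASSES = [
    ("ARCHIVE", 365),
    ("COLDLINE", 90),
    ("NEARLINE", 30),
]


def suggest_storage_class(days_between_reads, retention_days):
    """Pick the coldest class whose minimum duration fits both access and retention.

    Heuristic (an assumption of this sketch): a class is a candidate only if the
    data is read no more often than once per minimum-duration period and is kept
    at least that long, which avoids early-deletion charges.
    """
    for name, min_days in COLD_CLASSES:
        if days_between_reads >= min_days and retention_days >= min_days:
            return name
    return "STANDARD"


hot_data = suggest_storage_class(days_between_reads=1, retention_days=10)
yearly_audit_logs = suggest_storage_class(days_between_reads=400, retention_days=3650)
```

In this sketch, frequently read data stays in Standard, while logs that are read about once a year and kept for a decade land in Archive.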

Storing Data in the Public Cloud: What You Should Know About Cloud Storage Best Practices

Below are some best practices that can help you make better use of your cloud storage:

  • Consider your cloud migration strategy

    Migrating too much data to the cloud at once is often a mistake. As with any new system, adopting the cloud gradually allows your organization to test and adapt to the new environment. Create a migration strategy and start small, by migrating smaller datasets that are not mission critical.

  • Cloud backup and disaster recovery

    There is a common misconception that data stored in the cloud is automatically backed up. It is true that many cloud services have built-in backup and archiving features; however, you need to correctly configure them first to protect the data.

    The responsibility for data backup in the cloud rests with the cloud storage user. It is therefore best to consider cloud-native backup options, such as storage tiering and replicating storage units to other cloud data centers.

  • Watch cloud storage costs

    Many organizations migrate to the cloud to save costs, so it is important to validate that your cloud migration does indeed reduce them. Cloud storage eliminates upfront investments in storage equipment and ongoing maintenance, but it creates an ongoing operating expense that grows with your data volumes and can escalate quickly if they grow fast. Define a clear budget for your cloud storage deployment and track usage to ensure it does not exceed the budget.

  • Avoid vendor lock-in

    Public cloud providers have various strategies for encouraging customers to make more extensive use of their services and avoid switching to other providers. Avoid using cloud services in a way that will lock you into a specific cloud provider and make it difficult to migrate data away in the future. Prefer industry-standard data formats and protocols to make datasets easily portable. Conduct due diligence to understand cloud provider offerings and to avoid lock-in.
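
To see why storage costs deserve a budget, it can help to project them under a growth assumption. The sketch below is a simplified model: the function names are hypothetical, the price of 0.02 per GB per month is a made-up figure rather than any provider's actual rate, and it ignores request, retrieval, and data transfer charges.

```python
def project_monthly_costs(start_gb, monthly_growth_rate, price_per_gb_month, months):
    """Return the storage cost for each month, with data volume compounding monthly."""
    costs = []
    size_gb = start_gb
    for _ in range(months):
        costs.append(round(size_gb * price_per_gb_month, 2))
        size_gb *= 1 + monthly_growth_rate
    return costs


def first_month_over_budget(costs, monthly_budget):
    """Return the 1-based month in which cost first exceeds the budget, or None."""
    for month, cost in enumerate(costs, start=1):
        if cost > monthly_budget:
            return month
    return None


# Hypothetical inputs: 1,000 GB today, 10% monthly growth, 0.02 per GB-month.
costs = project_monthly_costs(1000, 0.10, 0.02, months=12)
breach_month = first_month_over_budget(costs, monthly_budget=30)
```

With these assumed inputs, the bill starts at 20.00 per month and passes a budget of 30 in month 6, which shows how steady percentage growth can outrun a flat budget within the first year.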

Capping off: Storing data in the public cloud, what you should know

In this article, I explained the basics of public cloud storage, described the three main technologies used to store data in the cloud (object storage, file storage, and block storage), and briefly showed how the three biggest cloud providers package and provide their storage services.

I also provided several best practices that can help you make better use of cloud storage. First, consider your migration strategy before moving to the cloud. Second, remember that backup is your responsibility. Third, watch costs. Finally, avoid vendor lock-in.

I hope this will be useful as you explore your organization’s use of the public cloud as an elastic, flexible storage option.


