GCP's take on databases on Kubernetes

Today, more and more applications are being deployed in containers on Kubernetes—so much so that we’ve heard Kubernetes called the Linux of the cloud. Despite all that growth on the application layer, the data layer hasn’t gotten as much traction with containerization. That’s not surprising, since containerized workloads inherently have to be resilient to restarts, scale-out, virtualization, and other constraints. So handling things like state (the database), availability to other layers of the application, and redundancy for a database can have very specific requirements. That makes it challenging to run a database in a distributed environment.

However, the data layer is getting more attention, since many developers want to treat data infrastructure the same as application stacks. Operators want to use the same tools for databases and applications, and get the same benefits as the application layer in the data layer: rapid spin-up and repeatability across environments. In this blog, we’ll explore when and what types of databases can be effectively run on Kubernetes.

Before we dive into the considerations for running a database on Kubernetes, let’s briefly review our options for running databases on Google Cloud Platform (GCP) and what they’re best used for.

• Fully managed databases. This includes Cloud Spanner, Cloud Bigtable and Cloud SQL, among others. This is the low-ops choice, since Google Cloud handles many of the maintenance tasks, like backups, patching and scaling. As a developer or operator, you don't need to mess with them. You just create a database, build your app, and let Google Cloud scale it for you. This also means you might not have access to the exact version, extension, or flavor of database that you want.

• Do-it-yourself on a VM. This might best be described as the full-ops option, where you take full responsibility for building your database, scaling it, managing reliability, setting up backups, and more. All of that can be a lot of work, but you have all the features and database flavors at your disposal.

• Run it on Kubernetes. Running a database on Kubernetes is closer to the full-ops option, but you do get some benefits in terms of the automation Kubernetes provides to keep the database application running. That said, it is important to remember that pods (the database application containers) are transient, so the likelihood of database application restarts or failovers is higher. Also, some of the more database-specific administrative tasks (backups, scaling, tuning, and so on) are different due to the added abstractions that come with containerization.

Tips for running your database on Kubernetes
When choosing to go down the Kubernetes route, think about what database you will be running, and how well it will work given the trade-offs discussed above. Since pods are mortal, the likelihood of failover events is higher than with a traditionally hosted or fully managed database. It will be easier to run a database on Kubernetes if it has concepts like sharding, failover elections, and replication built into its DNA (for example, Elasticsearch, Cassandra, or MongoDB). Some open source projects provide custom resources and operators to help with managing the database.

Next, consider the function that database performs in the context of your application and business. Databases that store more transient data, such as caching layers, are better fits for Kubernetes. Data layers of that type typically have more resilience built into the applications, making for a better overall experience.

Finally, be sure you understand the replication modes available in the database. Asynchronous modes of replication leave room for data loss, because transactions might be committed to the primary database but not to the secondary database(s). So, be sure to understand whether you might incur data loss, and how much of that is acceptable in the context of your application.
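For example, in PostgreSQL the difference between asynchronous and synchronous replication comes down to a couple of settings. The snippet below is a minimal sketch rather than a production recipe; the standby names and the use of $PGDATA are illustrative assumptions.

```bash
# Minimal sketch: make PostgreSQL replication synchronous so a commit waits for
# at least one standby. Standby names and the $PGDATA path are hypothetical.
cat >> "$PGDATA/postgresql.conf" <<'EOF'
synchronous_commit = on
# With synchronous_standby_names left empty (the default), replication is
# asynchronous and a primary failure can lose recently committed transactions.
synchronous_standby_names = 'ANY 1 (standby1, standby2)'
EOF
pg_ctl reload -D "$PGDATA"   # pick up the new settings without a restart
```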

How to deploy a database on Kubernetes
Now, let’s dive into more details on how to deploy a database on Kubernetes using StatefulSets. With a StatefulSet, your data can be stored on persistent volumes, decoupling the database application from the persistent storage, so when a pod (such as the database application) is recreated, all the data is still there. Additionally, when a pod is recreated in a StatefulSet, it keeps the same name, so you have a consistent endpoint to connect to. Persistent data and consistent naming are two of the largest benefits of StatefulSets. You can check out the Kubernetes documentation for more details.
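To make that concrete, here is a minimal sketch of a StatefulSet running a single PostgreSQL instance behind a headless Service, with its data on a PersistentVolumeClaim. The names, image tag, storage size, and inline password are illustrative assumptions, not a production configuration; in practice you would pull credentials from a Secret and add resource requests, probes, and anti-affinity.

```bash
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: postgres
spec:
  clusterIP: None              # headless Service: gives pods stable DNS names
  selector:
    app: postgres
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres        # ties pod DNS (postgres-0.postgres) to the Service
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15   # illustrative image and tag
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_PASSWORD
              value: example-only              # use a Secret in real deployments
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:        # one PersistentVolumeClaim per pod, reattached on restart
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
EOF
```

If the postgres-0 pod is deleted and recreated, it comes back with the same name and reattaches the same volume, which is exactly the persistent-data and consistent-naming behavior described above.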

If you need to run a database that doesn't perfectly fit the model of a Kubernetes-friendly database (such as MySQL or PostgreSQL), consider using Kubernetes Operators or projects that wrap those databases with additional features. Operators will help you spin up those databases and perform database maintenance tasks like backups and replication. In particular, take a look at the Oracle MySQL Operator for MySQL and Crunchy Data for PostgreSQL.

Operators use custom resources and controllers to expose application-specific operations through the Kubernetes API. For example, to perform a backup using Crunchy Data, simply execute pgo backup [cluster_name]. To add a Postgres replica, use pgo scale cluster [cluster_name].
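A rough sketch of that workflow is below; the cluster name is a placeholder, and the exact pgo syntax varies between operator versions, so treat these commands as illustrative rather than exact.

```bash
# Hypothetical pgo session; "mycluster" is a placeholder and command syntax
# differs between pgo versions, so check the operator's documentation.
pgo create cluster mycluster      # ask the operator to provision a Postgres cluster
pgo backup mycluster              # take a backup, as described above
pgo scale cluster mycluster       # add a Postgres replica
kubectl get pods                  # the operator runs the replica as ordinary pods
```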

There are some other projects out there that you might explore, such as Patroni for PostgreSQL. These projects use Operators, but go one step further. They’ve built many tools around their respective databases to aid their operation inside of Kubernetes. They may include additional features like sharding, leader election, and failover functionality needed to successfully deploy MySQL or PostgreSQL in Kubernetes.

While running a database in Kubernetes is gaining traction, it is still far from an exact science. There is a lot of work being done in this area, so keep an eye out as technologies and tools evolve toward making running databases in Kubernetes much more the norm.

For more details, refer to the original article from GCP.

AWS Aurora PostgreSQL with Machine Learning

Amazon Aurora with PostgreSQL compatibility is now available with machine learning capabilities, an option to export data into Amazon S3, and compatibility with updated PostgreSQL versions.

You can use Aurora to add machine learning (ML) based predictions to your applications, using a simple, optimized, and secure integration with Amazon SageMaker and Amazon Comprehend. Aurora machine learning is based on the familiar SQL programming language, so you don’t need to build custom integrations, move data around, learn separate tools, or have prior machine learning experience. This functionality is also available for Aurora with MySQL 5.7 compatibility.
Aurora machine learning supports any ML model available in SageMaker, or you can run sentiment analysis using Comprehend. It’s available for PostgreSQL 10 and 11, with no additional charge beyond the price of the AWS services that you are using. For more information, read the launch blog, the Aurora ML feature page, and the Aurora documentation.
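As a small illustration of what the SQL-level integration looks like, the sketch below calls the Comprehend sentiment function from an Aurora PostgreSQL cluster. The connection string is a placeholder, and it assumes the cluster's IAM role has already been granted access to Comprehend; check the Aurora documentation for the current extension and function names.

```bash
# Hypothetical psql session against an Aurora PostgreSQL cluster; $AURORA_DSN
# is a placeholder, and the cluster must be allowed to call Amazon Comprehend.
psql "$AURORA_DSN" <<'SQL'
CREATE EXTENSION IF NOT EXISTS aws_ml CASCADE;
-- Returns the detected sentiment (e.g. POSITIVE or NEGATIVE) with a confidence score
SELECT * FROM aws_comprehend.detect_sentiment('The new checkout flow works great', 'en');
SQL
```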

Azure Database Consulting Solutions

We know that migrating to the cloud, if you're not there already, is a daunting project, especially if you're not familiar with the proper steps and preparations. We specialize in Azure database migrations, in addition to AWS database and Business Intelligence solutions. We're a Microsoft partner and work closely with Microsoft resources when needed to get your project to the Azure cloud.

Our Azure consulting practice has engineers with over a decade of cloud experience. As an experienced Azure database consulting company, we make your migration happen seamlessly with little or no impact on your current production environment, whether you want significant control by building your own VMs and installing and configuring Enterprise or other editions of SQL Server, or prefer to go to the cloud with minimal management using Azure SQL or Azure SQL Data Warehouse.

We can also migrate select components of your database or Business Intelligence environment and proceed in steps, rather than migrating everything at once and struggling to get familiar with, and manage, it all at the same time.

For our cloud-skeptical customers, our Azure database consulting solutions also include migration through proofs of concept (POCs). How does that work? We plan POCs by isolating and sampling logical portions of your applications and proving them out in the cloud. This lets us bench-test and gather performance and other metrics, so you know you're getting the best of what the cloud has to offer and your stakeholders have concrete numbers to make decisions quickly. We also balance cloud technologies against your budget, choosing budget-friendly options that align with your company's goals and allow for proper scaling.

We have delivered extensive cloud solutions that affect tens of millions of dollars in revenue and have saved customers millions in Business Intelligence delivery, logistics, and the costs associated with maintaining on-site hardware.

As a cloud consulting services provider, we enabled a global cable services company with millions of end users and hundreds of customers to go from zero cloud to everything in the cloud within six months to a year; see our case studies. Built around a multi-terabyte data warehouse, this solution combines SQL and NoSQL technologies, including specialty caching and document-management databases, with end-to-end monitoring of the entire cloud environment. Our cloud solutions also ensure globally and geographically sustainable high availability and performance, so end users see response times similar to those of locally hosted data centers.

Our SWAT (System-Wide Assessment Team) is ready to provide a full or partial assessment of your environment and to recommend which cloud technologies are appropriate for you.

Call us to learn more about Azure cloud consulting, or write to us.