A genuine and considerable question that comes in our mind when we hear about NoSQL databases is “Would it put back our all-time favourite SQL databases?” If no, then in what circumstances we need to play with our favouritism for SQL – ‘The King of databases with unanimous consent’? We all deserve to know this with logic rather than just accepting it with mob mentality. And for understanding it completely, we will also have to duly acknowledge the snags that we hit while working with SQL databases along with their precedence accepted unanimously.
Let’s first cast a brief look at the structure, features and foibles of both type of databases to understand the choice of database for an application.
SQL Databases: The most popular and standard design for the daily needs of most businesses for decades have been SQL databases which are also known as Relational Databases. These databases model data in tabular relations and use Structured Query Language to communicate with data. SQL is the standard for database interoperability and is the foundation for all of the popular database applications available today, from Access to Oracle.
These are useful for handling structured data that organizes elements of data and standardizes how they relate to one another and to different properties. The Relational database model is implemented in many database products either using a SQL database for processing or SQL statements for requests and database updates. These databases extend from small, desktop systems to huge cloud-based systems. They come in a variety of open source and internally supported systems, open source with commercial support systems, and commercial closed-source systems. Few of the most popular relational databases include Microsoft SQL Server, Oracle, MySQL, PostgreSQL and IBM DB2.
NoSQL Databases: The NoSQL databases which originally refer to “Non-SQL” or “Non-Relational” databases model data in non-tabular relations and use more flexible data models. These databases are designed to be used across large distrusted systems.
These are useful for managing unstructured data and can store data in any record in a schema-less or free-form fashion. NoSQL databases follow four common models for storing data, which set up the four categories of NoSQL systems viz. Document databases, KeyValue Databases, Wide-column Databases, and Graph Databases. Let’s discuss them one by one
Document Databases: This type of NoSQL databases follows document storage model and stores data as a document in JSON or XML format. Since these databases do not need to have a schema and are quite flexible, these are easy to modify. These are a good option for the applications requiring the ability to store varying attributes along with large amounts of data and are mostly used for CMS systems, blogging platforms, real-time analytics and e-commerce applications. Most popular examples of Document based databases are MongoDB and CouchDB.
Key-Value Databases: The Key-Value is quite simple in its type as it stores data as a dictionary or hash table where the key is the dictionary word, and the value is the definition of that word. Each key is unique, and the value can be in form of JSON, BLOB (Binary Large Objects) or string etc. These databases have no query language and provide a simple way to manage data. The simplicity of this model makes them fast, easy to use, scalable, portable, and flexible. These are a great choice for storing large amounts of data without a need to perform complex queries to retrieve it. Redis and DynanoDB are the most popular in this category.
Wide-Column Databases: A wide-column database is also known as column-based database. Similar to a relational database, it also stores data in tabular form with rows and columns but with a lot of flexibility as each row is not required to have the same columns. These databases club columns of related data together which can be retrieved in a single operation through a query
Wide-column databases are best to store large amounts of data with the idea of query patterns to be used. These are widely used to manage data warehouses, business intelligence, CRM, and Library Card Catalogues. Cassandra and HBase are two of the most popular databases of this type of NoSQL databases.
Graph Databases: Graph based NoSQL databases store data in Nodes and Edges where nodes store information about entities and edges define relationship between nodes. Each node and edge has a unique identifier. Graph databases are multi-relational in nature and are best to use where various data connections need to be visualized and analysed for different patterns. This being the case, these are mostly used for social networks, fraud detection, logistics, network analysis, spatial data and recommendation engines. Examples of Graph databases include Neo4j, InfiniteGraph and JanusGraph.
Which database to choose and when?
Now, since we have discussed both types of databases with examples, we should be able to make a choice on database for any application to be developed. The advantages of both databases come with a price; be it low or high.
Though the conventional SQL (Relational) databases are well known for their consistency and reliability, proved with real-world stress testing, they fall short in performance due to their rigid schema and hence are less suitable for the applications striving for flexibility and high operational speed. On the other hand, NoSQL databases indisputably have come a long way with speed, global availability and scalability due to their flexible schema and support for unstructured data but they have compromised substantially with the ACID (Atomicity, Consistency, Isolation, and Durability) properties of a transaction.
Consequently, we need to consider the following factors while selecting a SQL or NoSQL database:
Data Structure: The foremost element that can prove our choice of database “good” or “bad” is what our data looks like. If the data is primarily structured and ACID compliance is must, there is no better choice than a SQL database. Being more consistent and ensuring data integrity with ACID properties, SQL databases are a great fit for heavy duty or complex transactions and hence for transaction-oriented systems like CRM tools, accounting software, and e-commerce platforms.
Alternatively, if the data is not structured or we are not sure about the data requirements, NoSQL may be our chalk horse as it does not need a predefined schema. NoSQL facilitates storing all types of data together as and when needed.
Data Query: The next important factor to consider is the ability to query data, that is, how frequently and swiftly we need to query our data. Being well structured and organised, SQL databases are very friendly in querying even complex data whereas NoSQL databases are not much query-efficient and require extra processing on the data to run queries, which may prove to be costly. However, some Document based NoSQL databases like MongoDB and CouchDB do come with querying capabilities.
Scaling: Both SQL and NoSQL databases are highly scalable but they scale differently. So, our choice of database should be based on the future growth of our data set.
The loyalty for the ACID properties confined the SQL databases to run on a single server and they scale vertically. It plainly means that in order to scale a SQL database, the capacity of a single server needs to be increased.
On the contrary, NoSQL databases scale horizontally which implies more servers can be added any time to reinforce the growing database.
Can both SQL & NoSQL be merged?
With this long discussion on the features and foibles of SQL and NoSQL databases, one question that is hitting my mind ceaselessly is – “Is there a way to combine both the databases and get the maximum of them?”
Both types of databases have their own pros and cons and are specifically suitable for the systems of their kind. Though SQL is well established and has mass acceptance for all types of applications, many SQL databases have started adopting features of NoSQL databases to go with the varying technological needs. Accepting JSON documents as a native data type and performing queries against that data, and imposing constraints on JSON data are to name a few.
On the other hand, NoSQL databases are also embracing many knacks of conventional SQL databases along with adding SQL like query languages. For example, 2 document based NoSQL databases – MongoDB and RavenDB promise to be ACID compliant whereas MongoDB and CouchDB are as powerful in querying data as any relational database.
Examples: MySQL Document Store, an enhancement to the most popular open-source relational database MySQL, gives the provision to combine the SQL relational tables and schema-less JSON collections in the same database as well as the same application.
MongoDB, one of the most popular NoSQL databases, offers multi-document ACID transactions. DynamoDB, another NoSQL database managed by AWS also provides ACIDcompliant transaction functionality
Conclusion:
Although NoSQL databases have made their presence felt with irrefutable features like flexibility, scalability and speed by following the BASE (Basic Availability, Soft State, Eventual consistency) consistency model, it has a long way to meet the inimitable features like consistency and reliability of SQL databases. NoSQL should actually be understood as “Not Only SQL” rather than “Non SQL” which means it does support SQL along with having a lot more features and it can in no way replace the traditional SQL databases. The choice purely stands on the requirements of data structure, query handling and scalability for an application.