Buzz Word NO-SQL
Yes, for the last few months I had been hearing a lot about No-SQL but knew nothing about it. This is one more blog post among the thousands you can find online. It is just a collection of the information I gathered while trying to understand what No-SQL is. The aim here is not to give complete information, but to introduce you to the No-SQL world: its important terms, its different types and how they relate to each other.
What is NO-SQL?
No-SQL is a family of databases that store and manage unstructured or loosely structured data. To see what that means, it helps to first understand what a traditional database is. A traditional database is an RDBMS, a relational database management system. In this kind of database we generally store structured data. Structured data means that we define tables and columns, and each column has a fixed type. When we want to store data, we first convert it into the format laid down by the table definition; where we have no value for a column, we generally store null. We also define relations between tables, which we use to combine data when fetching it with queries.
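To make the relational side concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows are made up for illustration; the point is only to show typed columns, a null for a missing value, and a join across a relation.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Structured data: every column is declared up front with a type.
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT NOT NULL, email TEXT)"
)
conn.execute(
    "CREATE TABLE score (person_id INTEGER REFERENCES person(id), subject TEXT, percentage REAL)"
)

# The second person has no e-mail, so that column is stored as NULL.
conn.executemany(
    "INSERT INTO person (id, first_name, email) VALUES (?, ?, ?)",
    [(1, "Asha", "asha@example.com"), (2, "Ravi", None)],
)
conn.executemany(
    "INSERT INTO score (person_id, subject, percentage) VALUES (?, ?, ?)",
    [(1, "math", 91.5), (2, "math", 78.0)],
)

# The relation between the tables lets one query combine both.
rows = conn.execute(
    "SELECT p.first_name, p.email, s.percentage "
    "FROM person AS p JOIN score AS s ON s.person_id = p.id "
    "ORDER BY p.id"
).fetchall()
print(rows)
```

Python reports the NULL e-mail as None in the second row.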
In No-SQL we still define some structure; some databases call it a document and others call it a table. These structures exist more to help fetch data than to enforce rules on how data is stored. No-SQL databases generally do not define relationships between documents or tables. Each piece of data is a separate entity. A relation may exist in the Java world or in whatever application uses the data, but it is not defined in the database. Most No-SQL databases do not offer joins when fetching data; instead we need to run separate queries and combine the results ourselves.
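The idea of running separate queries instead of a join can be sketched with plain Python dictionaries standing in for two collections. This is not the API of any real No-SQL database; the names persons, scores, find_person, find_scores and person_with_scores are hypothetical and exist only for this illustration.

```python
# Two independent "collections"; the store knows of no link between them.
persons = {
    "p1": {"first_name": "Asha", "last_name": "Rao"},
    "p2": {"first_name": "Ravi"},  # no last_name: documents need not share one shape
}
scores = {
    "s1": {"person_id": "p1", "subject": "math", "percentage": 91.5},
    "s2": {"person_id": "p2", "subject": "math", "percentage": 78.0},
}

def find_person(person_id):
    """First query: fetch one person document by its key."""
    return persons[person_id]

def find_scores(person_id):
    """Second query: fetch the score documents that mention this person."""
    return [s for s in scores.values() if s["person_id"] == person_id]

def person_with_scores(person_id):
    """The 'join' happens here, in application code, not in the database."""
    person = dict(find_person(person_id))  # copy, so the stored document is untouched
    person["scores"] = find_scores(person_id)
    return person

result = person_with_scores("p1")
print(result)
```

The relation between a person and a score lives only in the person_id field and in the application code that knows how to follow it.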
Why use No-SQL databases?
No-SQL databases are lightweight, easily scalable, high performing and can be run with zero downtime.
- Lightweight Databases: The RAM and disk space taken by the database itself is very small. As I remember, a No-SQL database such as MongoDB was one of the preferred choices for storing data locally in mobile applications. It comes in different versions.
- Easily Scalable: RDBMS databases are scalable too, but there is a difference between scaling up and scaling out. Scaling up means increasing the hardware resources of the same machine, whereas scaling out means adding one more machine to the cluster to increase capacity. RDBMS databases can also scale out, but it is not an easy step. Most No-SQL databases are built to run in a cluster, and the machines in the cluster can be ordinary desktop-class machines.
- High Performance: Most No-SQL databases bet on high performance, and you can find plenty of results showing lower response times than relational databases. I find some of these comparisons to be apples to oranges, though, because there are different types of No-SQL databases, each suited to different needs. No-SQL databases also have their own advantages and disadvantages in how data is fetched; one disadvantage is the lack of table joins while fetching data.
- Zero Downtime: One more reason to go for a No-SQL database is zero downtime. The cluster can be designed so that if a machine fails while running, or is taken down for maintenance, another machine in the cluster serves the request, and the outside system (the requester) is not affected by the database issue. This is an inbuilt feature of most No-SQL databases, because it follows from the core design on which these systems are built.
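Scaling out and zero downtime can be pictured with a toy cluster in which every machine holds a full copy of the data. This is only a sketch under strong assumptions: the Node and Cluster classes are hypothetical, and real systems also deal with partitioning, consistency and re-syncing a machine that missed writes while it was down, none of which is shown here.

```python
class Node:
    """A hypothetical cluster member holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.up = True


class Cluster:
    def __init__(self, nodes):
        self.nodes = list(nodes)

    def add_node(self, node):
        """Scale out: a new machine joins and copies the data from a live node."""
        source = next((n for n in self.nodes if n.up), None)
        if source is not None:
            node.data = dict(source.data)
        self.nodes.append(node)

    def put(self, key, value):
        """Write to every node that is currently up."""
        for node in self.nodes:
            if node.up:
                node.data[key] = value

    def get(self, key):
        """Read from the first node that is up; the caller never sees which one."""
        for node in self.nodes:
            if node.up:
                return node.data[key]
        raise RuntimeError("no node available")


cluster = Cluster([Node("a"), Node("b")])
cluster.put("user:1", "Asha")
cluster.nodes[0].up = False      # machine "a" goes down for maintenance
value = cluster.get("user:1")    # still answered, this time by "b"
cluster.add_node(Node("c"))      # capacity is added without stopping the cluster
print(value)
```

The requester only ever calls get and put; which machine answers is hidden inside the cluster.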
What is Normalization and De-Normalization?
Most relational databases store data in normalized form. In simple words, data is not duplicated; instead, data in different tables is linked using relations such as foreign keys. When we want related data, we generally use the join keyword in the query to join the tables and fetch the data. The advantage is that we reduce data duplication, which in turn saves disk space, and we can update a piece of data in only one place or table.
In the No-SQL world it is said that disk space is cheaper than CPU. No-SQL databases do not understand relations between data and do not provide joins that we could use while fetching it. To overcome this difficulty, the usual advice is to create smaller tables shaped by your queries and to duplicate the data into them. For example, if your queries have a where clause on first name and percentage scored, and these live in different tables, just create one more table that holds both and add the data to it. While fetching you read from this table, and if needed you can fetch the full records from the respective tables.
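The first name and percentage example can be sketched as follows. Plain dictionaries stand in for tables, and the names students, results, by_first_name and find are hypothetical; a real database would maintain the extra table through its own write path.

```python
# Base "tables": each holds one kind of data, keyed by student id.
students = {
    1: {"first_name": "Asha"},
    2: {"first_name": "Ravi"},
    3: {"first_name": "Asha"},
}
results = {
    1: {"percentage": 91.5},
    2: {"percentage": 78.0},
    3: {"percentage": 64.0},
}

# Extra table built for one query: "first name = X and percentage >= Y".
# The data is duplicated on purpose so that no join is needed at read time.
by_first_name = {}
for student_id, student in students.items():
    row = {
        "student_id": student_id,
        "first_name": student["first_name"],
        "percentage": results[student_id]["percentage"],
    }
    by_first_name.setdefault(student["first_name"], []).append(row)

def find(first_name, min_percentage):
    """Answer the query from the duplicated table alone."""
    rows = by_first_name.get(first_name, [])
    return [r for r in rows if r["percentage"] >= min_percentage]

matches = find("Asha", 80)
print(matches)

# If more detail is needed, follow the id back to the respective table.
full_record = students[matches[0]["student_id"]]
```

The cost of this design is that every write must update both the base tables and the query table, which is the trade of disk space and write effort for cheaper reads.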
Some No-SQL databases, however, allow rows to be embedded inside one record; these are generally the document-oriented No-SQL databases.
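An embedded record can be pictured as a JSON document with a nested list. The sketch below uses only Python's json module; the field names are invented for illustration and do not come from any particular database.

```python
import json

person = {
    "first_name": "Asha",
    "scores": [  # embedded records live inside the parent document
        {"subject": "math", "percentage": 91.5},
        {"subject": "physics", "percentage": 84.0},
    ],
}

stored = json.dumps(person)   # roughly what a document store would persist
loaded = json.loads(stored)   # one read returns the person and the scores together

best = max(loaded["scores"], key=lambda s: s["percentage"])
print(best["subject"])
```

Because the scores travel with the person, no second query is needed to read them.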
No-SQL Classification
No-SQL is not a single database; it is a classification of databases based on the kind of data stored in them and some other features. There are different kinds of databases under the No-SQL or Big Data umbrella.
- Column: Column-oriented No-SQL databases store data column-wise. If the data is person data such as FirstName, LastName, EmailId and Password, it is stored by those columns and fetched in the same way.
- Document: A document-oriented No-SQL database stores data as documents. A document here means a text document, not binary data such as images and other files. In most cases it is a JSON object: the whole JSON object is added to the database, and the database returns JSON objects as its output/response.
- Key-value: Key-value databases are like a big hash map in which all data is stored as keys and values. They work like a session-maintaining container, with the added value that the data can be persisted.
- Graph: These databases store graph data, such as hierarchical data, where the data is mostly node based.
Some databases of each type:
- Column: Accumulo, Cassandra, Druid, HBase, Vertica
- Document: Apache CouchDB, Clusterpoint, Couchbase, DocumentDB, HyperDex, Lotus Notes, MarkLogic, MongoDB, OrientDB, Qizx, RethinkDB
- Key-value: Aerospike, Couchbase, Dynamo, FairCom c-treeACE, FoundationDB, HyperDex, MemcacheDB, MUMPS, Oracle NoSQL Database, OrientDB, Redis, Riak, Berkeley DB
- Graph: AllegroGraph, InfiniteGraph, MarkLogic, Neo4J, OrientDB, Virtuoso, Stardog
- Multi-model: Alchemy Database, ArangoDB, CortexDB, FoundationDB, MarkLogic, OrientDB
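The four data models described above can be pictured with plain Python structures. This is only a mental model, not how any of the listed products store data internally; all names and sample values are made up.

```python
# Column: the values of one column are kept together.
column_store = {
    "first_name": ["Asha", "Ravi"],
    "last_name": ["Rao", "Shah"],
}

# Document: a whole JSON-like object is stored and returned as one unit.
document_store = {
    "p1": {"first_name": "Asha", "emails": ["asha@example.com"]},
}

# Key-value: a big hash map; the value is opaque to the database.
key_value_store = {"session:42": "opaque-session-blob"}

# Graph: nodes plus the edges that connect them.
graph_store = {"Asha": ["Ravi"], "Ravi": ["Meena"], "Meena": []}

def reachable(graph, start):
    """Walk the edges from start and collect every node that can be reached."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen

print(column_store["first_name"])        # read one column without touching the others
print(document_store["p1"])              # read one whole document
print(key_value_store["session:42"])     # look up a value by its key
print(reachable(graph_store, "Asha"))    # follow relationships between nodes
```

Each model makes one kind of access cheap: a single column, a whole document, a lookup by key, or a walk along relationships.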
Where to use No-SQL databases?
No-SQL databases are mainly used when the data is huge and needs 100% availability. They are often used for report generation and data analysis, where reports are generated from a huge amount of data. Because scaling is easy, they suit databases that keep growing, such as captures of user actions, log data and audit data. They are also used as back-end databases in micro-service architectures, where each service needs its own specific data. Because some of these databases work with JSON objects, they are used for lightweight web applications where data is stored directly as JSON and read back as JSON. Some No-SQL databases (the key-value type) are used to store and maintain sessions across multiple application servers.