Tuesday, October 30, 2018

Product Information Management - Microservice

PIM

PIM stands for Product Information Management. A PIM system maintains product information and serves as a backend for many important systems such as e-commerce sites, fulfillment systems and so on.
This system is an addition to the PIM systems already out there; most of them are paid products and tightly coupled to one particular system. With the system described below, we tried to build a PIM that is independent and easy to plug in so it can communicate with the other systems in the ecosystem.
It is built on a microservice stack (full stack), which makes it easily scalable and allows zero-downtime operation.

Let's jump into the project and answer a few basic questions:

What is the architecture of the system?

We have N different microservices, each framed around the functionality it adds to the system.
Below is the list of microservices that make up the PIM system.

·       Catalog Service: The prime service; it acts as the orchestration layer and handles all operations on the catalog data (adding, updating, deleting and fetching).
·       Audit Service: This layer adds auditing to the system. It maintains audit records for changes to the catalog data (including who made the change and what was changed).
·       Search Service: This layer adds search capability over the catalog data and is powered by Elasticsearch. Since the data is present in ES, it also provides reporting and data-analytics capability, with Kibana added on top to visualize the data.
·       Spark Service: This service mainly feeds data into and out of the system. It uses Spark batch processing to import the initial data load and Spark streaming to process continuous input feeds. It also provides a feature to refresh the Elasticsearch data from the data in Cassandra.
·       Kafka: The underlying messaging system between the different services; it also works as a secondary communication channel that the services use as a fallback. It is the primary channel the Spark service uses to consume and publish data feeds to external systems.
·       OAuth Service: This service handles user data and maintains role-based access. It is primarily responsible for authenticating and creating users. It acts as the OAuth server and uses JWT tokens for identity and security. It can be integrated with an external IDM system.
·       Config Service: This service maintains configuration for the different services. It uses GitHub to store configuration settings keyed by label and environment. Internally it also uses RabbitMQ messaging to push configuration changes to the services automatically.

·       Eureka Discovery Service: The standard discovery service from Netflix; it helps with load balancing and with discovering services so they can communicate with each other (a minimal registration sketch follows this list).
·       Zuul Service: Acts as the gateway service and proxy layer. It also authenticates all API calls and works as the resource server.
·       Redux/React or Angular Service: These are the frontend servers, which consume the endpoints and render the UI for the users. Either module can be used; we personally recommend Angular, as it gives a single-page application.
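
As a rough illustration of how each of these services plugs into the stack, below is a minimal sketch of a Spring Boot service registering itself with the Eureka discovery service. The class and package names (for example CatalogServiceApplication) are illustrative only and are not taken from the actual PIM repository.

// Minimal sketch: a PIM microservice registering with the Eureka discovery service.
// Class and package names here are illustrative, not the actual PIM source.
package com.example.pim.catalog;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

@SpringBootApplication
@EnableDiscoveryClient   // registers this service with Eureka so Zuul and peer services can find it
public class CatalogServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(CatalogServiceApplication.class, args);
    }
}

With Spring Cloud on the classpath, the Eureka server location is usually supplied through the eureka.client.serviceUrl.defaultZone property, which in this architecture would typically be served out of the Config Service.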


Architecture of PIM (diagram)



How to install the application locally or in the cloud?

      All the microservices, including the database applications, run as Docker containers. The only prerequisites for installing and running the whole application on a local machine are:
  • Maven (MVN) and JDK 1.8 installed.
  • Docker installed and running.
Below are the steps to run the application.
1. Clone the git repository.
    git clone https://github.com/Pradeep-ak/PIM
2. Build the project using the Maven command.
    mvn clean install
3. Build and run with docker-compose.
    docker-compose up --build
4. Access the application on the Docker host.
    http://0.0.0.0/pim/


We are in the process of creating the files required for Docker Swarm and Kubernetes to make container orchestration easier.

How to do monitoring & reporting of data?

      One of the important aspects of a PIM system is monitoring of the data and the system itself, since in many cases PIM will be the master system maintaining the product data.
      Watcher: There is an open-source Kibana plugin called Sentinl which provides the ability to set up alerts for any given threshold breach. We can set it up to measure different parameters and configure the respective thresholds.
      It also provides the ability to set up rich reports (with data, trends and graphical representations of the data) which are scheduled to run at a given period.

What are the different features of the PIM system?

      It provides several features related to creating, maintaining and distributing product-related information.
  • Role-based access to view, create and update data.
  • Asset-level life-cycle auditing.
  • Stream/batch-based external system integration for moving data into and out of the system.
  • Search functionality that is useful for narrowing down results, plotting trends and analysing data.
Please find the code for the PIM system here: PIM (https://github.com/Pradeep-ak/PIM).



Monday, December 26, 2016

ElasticSearch - NoSQL Search Engine

In the world of large data, it is important that we give the user/customer a simple tool to find the right pieces of information quickly. A search engine is exactly what this requirement calls for. In the e-commerce world it is especially important, as the user has lots of options to buy from and retail companies have a wide variety of products to offer the customer.

There are many search engines on the market; some are commercial, licensed products like Oracle Endeca, and some are open-source engines like Solr and ElasticSearch. Solr and ElasticSearch offer many, largely similar features, and both use Apache Lucene as the core component for indexing data.

What is Elastic Search?
It is an open-source (but operated by a single company) search engine. It is built with cloud-based deployment in mind: the indexed data is spread across different nodes, with replica nodes to cover the failure of any node, and node synchronization is built in. When a new node is added to the cluster, it is brought up to date, the system rebalances, and the node is then allowed to serve queries. This gives easy horizontal scalability. It is a completely REST-API-based search engine with very high indexing throughput, which makes it suitable for many different use cases.

Key Points to Know
·        In ES, a unit of data is called a "document"; it is comparable to a row in an RDBMS.
·        ES is a schema-less search engine.
·        An ES document can be structure-less; it is not mandatory for a document to follow a given structure.
·        Each piece of data in a document is called a field; it is comparable to a column of a row in a DB.
·        We can still describe the document structure in a mapping file.
·        We still have the index and type, which can reduce the scope of data changes and searches and in turn improve performance.
·        All actions are performed over the REST API, including updating settings.
·        All API calls take data in JSON format.
·        ES supports nested documents, as we will see in the examples.

Best use cases of ElasticSearch

ElasticSearch is used in different technology stacks. As it is widely popular for text-based searching, it is used as the search engine in log-analysis stacks such as ELK (ElasticSearch, Logstash, Kibana), which are used as data-analysis tools to get trends and reports.

As it has very high write (indexing) throughput, it is also used to provide search over NoSQL data: the backend is a NoSQL database such as Cassandra, and ElasticSearch fills the gap in search capability. Because of its high indexing throughput, data changes are consumed quickly and easily, and ElasticSearch has plugins to synchronize data between the two systems.

It is also used as a search engine in the e-commerce world, as it supports facets (aggregations), autocomplete and fuzzy search. It is not as popular there as Solr and Endeca, but slowly we see a few retail sites powered by ElasticSearch.

Getting Started with ES
Before getting started, let's install the Java JRE (as ElasticSearch is developed in Java) and Fiddler (or any tool to create REST requests) or a POST plugin in the browser.
  • Download ElasticSearch from the ElasticSearch site.
  • Unzip the archive; you will find folders such as bin, config, data, lib, logs, modules, plugins.
    • config contains 'elasticsearch.yml', which provides configuration for nodes, backups and so on.
    • The bin folder contains the .bat files used to start the search engine.
  • To start the search engine, go to the bin folder and run elasticsearch.bat.
  • The default port is 9200 (it can be changed in elasticsearch.yml).
  • Go to the browser and access http://localhost:9200/
  • You should see a page with cluster information, including the lucene_version.
So now the ES engine is up and running with the default settings. The next step is loading data to search.

Data Indexing
As we know, ES is an API-based engine; there are two kinds of APIs for data upload.
·        Single document create, update and delete.
·        Bulk create, update and delete.
Single document
          ES has APIs that act on a single document; they can be used when we have to operate on one document at a time.
curl -XPOST 'http://localhost:9200/<index>/<type>/1' -d '{
  "Id": "1",
  "Name": "Pradeep",
  "Address": {
    "Street": "sapient office",
    "City": "Bangalore",
    "Zip code": "560098",
    "Country": "India"
  },
  "Location": [34.05, -118.98],
  "Rating": "4.5"
}'

Here, for example, the index could be sapientOffice and the type employee. Using the above curl we can create a new document or update the existing document at id=1; both operations use the POST method.
The Address field is an example of a nested document, which is supported by ES.

curl -XDELETE 'http://localhost:9200/<index>/<type>/1' will delete the document from ES.

Bulk Document
          ES also provides an API for bulk upload of data for indexing. Below is the syntax of the bulk upload API.
curl -XPOST 'http://localhost:9200/<index>/<type>/_bulk' -d '
{"index":{}}
{"Name": "Pradeep", "Address": {"Street": "sapient office", "City": "Bangalore", "Zip code": "560098", "Country": "India"}, "Location": [34.05, -118.98], "Rating": "4.5"}
{"index":{}}
{"Name": "Pradeep", "Address": {"Street": "sapient office", "City": "Bangalore", "Zip code": "560098", "Country": "India"}, "Location": [34.05, -118.98], "Rating": "4.5"}
{"index":{}}
{"Name": "Pradeep", "Address": {"Street": "sapient office", "City": "Bangalore", "Zip code": "560098", "Country": "India"}, "Location": [34.05, -118.98], "Rating": "4.5"}
'

Again, the index here could be sapientOffice and the type employee. Each action line ({"index":{}}) is followed on the next line by the document to index; with this single call we can create new documents or update existing ones, and the whole request uses the POST method.

curl -XDELETE 'http://localhost:9200/<index>/' will delete the whole index, and with it all the documents, from ES.

Now that we know how to load data into ES, let's see how to get data out of it.

ES Query
          One of the key measures of a search engine is how fast we can retrieve data and how relevant that data is. ES provides different query syntaxes for fetching data, which can be adapted to suit our requirements.

Again, queries are made over API calls, and both request and response are in JSON format. ES provides a rich, flexible query language called the query DSL (domain-specific language), which allows us to build much more complicated, robust queries.

All search-related queries go through the "_search" API.

Let's see different kinds of queries.
1.     The query below returns all documents across all indices and all types.
curl -XGET 'http://localhost:9200/_search' -d '{
  "query": {
    "match_all": {}
  }
}'

2.     The query below returns all documents of all types under a given index.
curl -XGET 'http://localhost:9200/<index>/_search' -d '{
  "query": {
    "match_all": {}
  }
}'

3.     The query below returns all documents under a given type of an index.
curl -XGET 'http://localhost:9200/<index>/<type>/_search' -d '{
  "query": {
    "match_all": {}
  }
}'

4.     The query below returns all documents under the type of an index that contain the search term "Pradeep" anywhere (in any field) in the document.
curl -XGET 'http://localhost:9200/<index>/<type>/_search' -d '{
  "query": {
    "query_string": {
      "query": "Pradeep"
    }
  }
}'

5.     The query below returns all documents under the type of an index that contain the search term "Pradeep" in the Name field or the address's street field.
curl -XGET 'http://localhost:9200/<index>/<type>/_search' -d '{
  "query": {
    "query_string": {
      "query": "Pradeep",
      "fields": ["Name", "address.street"]
    }
  }
}'

Using filters (filters put a boundary around the search)
6.     The query below returns all documents under the type of an index that contain the search term "Pradeep" in the Name field or the address's street field and also have a rating of 4.0 or higher.
curl -XGET 'http://localhost:9200/<index>/<type>/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "rating": { "gte": 4.0 }
        }
      },
      "query": {
        "query_string": {
          "query": "Pradeep",
          "fields": ["Name", "address.street"]
        }
      }
    }
  }
}'

7.     The query below returns all documents under the type of an index that have a rating of 4.0 or higher.
curl -XGET 'http://localhost:9200/<index>/<type>/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "rating": { "gte": 4.0 }
        }
      }
    }
  }
}'
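
Since the rest of this blog lives in the Java world, below is a hedged sketch of issuing one of these _search calls from plain Java, using only the JDK's HttpURLConnection (no client library). The body combines a query_string search with a simple terms aggregation (the facet support mentioned earlier); the index (sapientOffice), type (employee) and field names are just the sample values used above, so adjust them to your own data.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class EsSearchExample {

    public static void main(String[] args) throws Exception {
        // _search also accepts POST, which is simpler than sending a body with GET.
        URL url = new URL("http://localhost:9200/sapientOffice/employee/_search");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);

        // query_string search plus a terms aggregation (facet) on Address.City.
        String body = "{"
                + "\"query\": {\"query_string\": {\"query\": \"Pradeep\"}},"
                + "\"aggs\": {\"by_city\": {\"terms\": {\"field\": \"Address.City\"}}}"
                + "}";

        try (OutputStream os = conn.getOutputStream()) {
            os.write(body.getBytes(StandardCharsets.UTF_8));
        }

        // Print the raw JSON response; the hits and the aggregation buckets are both in it.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}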



Thursday, May 12, 2016

Buzzz Word "NO-SQL"

Buzzword NO-SQL: yes, over the last few months I had been hearing a lot about NoSQL but knew nothing about it. This is one more blog among the thousands you can find online; it is just a collection of the information I gathered while trying to understand what NoSQL is. The aim here is not to give complete information, but to introduce you to the NoSQL world, its important terms, its different types and how they relate to each other.

What is NO-SQL?

NoSQL is a class of databases that store or manage unstructured data. To understand it, it helps to recall what a traditional database is: an RDBMS, a relational database. In this kind of database we generally store structured data, meaning we define tables and columns, and each column has a type. When we want to store any data, we convert it into the format the table definition expects, and when a value is missing we generally insert null. We also define relations between the tables, which we use to combine data when fetching it with queries.
In NoSQL we still define some structures; in some databases they are called documents and in others tables. These structures are more about helping to fetch data than about defining rules for how data is stored. NoSQL databases do not define relationships between the documents/tables; each record is a separate entity. A relation may exist in the Java world or elsewhere in the application, but it is not defined in the database. NoSQL databases do not have joins when fetching data; instead we need to run separate queries to fetch the related data.

Why use No-SQL databases?

NoSQL databases are lightweight, easily scalable, high performance, and can have zero downtime.

  • Lightweight databases: The RAM and memory taken by the database itself is very low. I remember that NoSQL databases such as MongoDB were among the preferred databases for storing data locally in mobile applications, with different editions available for that purpose.
  • Easily scalable: RDBMS databases are scalable too, but there is an important difference between scaling up and scaling out. Scaling up means increasing the hardware of the same machine, whereas scaling out means adding one more machine to the cluster to increase capacity. RDBMS databases can also scale out, but it is not an easy step. Most NoSQL databases run in a cluster, and every machine in the cluster can be a simple desktop machine like the ones we use.
  • High performance: Most NoSQL databases bet on high performance, and you can find plenty of benchmarks showing lower response times. But I find this comparison somewhat apples to oranges, as there are different types of NoSQL databases, each suited to different needs. NoSQL databases also have their own advantages and disadvantages around how data is fetched; one of them is the lack of table joins at fetch time.
  • Zero downtime: One more reason to go to a NoSQL database is zero downtime. The cluster can be designed so that if a machine fails while running, or machines are taken down for maintenance, other machines in the cluster serve the requests, and the outside system (the requester) is not affected by the database issue. This is an inbuilt feature of most of these databases, as it comes from the core design on which NoSQL systems are built.


What is Normalization and De-Normalization?

Most relational databases store data in normalized form; in simple words, no data is duplicated, and data is instead linked between tables using relations like foreign keys. When we want related data, we generally use the join keyword in the query to join the tables and fetch the data. The advantage is that we reduce data duplication, which in turn saves disk space, and we can update a piece of data in only one place or table.
In the NoSQL world, the saying goes that disk space is cheaper than CPU. NoSQL does not understand relations between data and does not provide joins at fetch time. To overcome this, the advice is to create smaller tables shaped by your queries and duplicate the data into them. For example, if your queries have a where clause on first name and percentage scored, and those values live in different tables, just create one more table holding both and add the data there; while fetching you read from this table, and if needed you fetch the other records from their respective tables.
Some NoSQL databases also allow rows to be embedded inside a single record; those are generally the document-oriented NoSQL databases.

No-SQL Classification

NoSQL is not a single database; it is a classification of databases based on what kind of data is stored and some other features. We have different kinds of databases under the NoSQL or Big Data umbrella.
  • Column: Column-oriented NoSQL databases store data column-wise; if the data is person data such as FirstName, LastName, EmailId and Password, it is stored and fetched by those columns.
  • Document: Document-oriented NoSQL databases store the data as a document. A document here is a text document (majorly a JSON object), not binary data like images or other files; the whole JSON object is added to the database, and as output/response the database returns JSON objects.
  • Key-value: Key-value databases are like a big hash map, where all data is stored as keys and values. They work like a session-maintaining container, but with the added value that they can persist the data.
  • Graph: These databases store graph data, such as hierarchical data, where the data is node based.
  1. Column: Accumulo, Cassandra, Druid, HBase, Vertica
  2. Document: Apache CouchDB, Clusterpoint, Couchbase, DocumentDB, HyperDex, Lotus Notes, MarkLogic, MongoDB, OrientDB, Qizx, RethinkDB
  3. Key-value: Aerospike, Couchbase, Dynamo, FairCom c-treeACE, FoundationDB, HyperDex, MemcacheDB, MUMPS, Oracle NoSQL Database, OrientDB, Redis, Riak, Berkeley DB
  4. Graph: AllegroGraph, InfiniteGraph, MarkLogic, Neo4J, OrientDB, Virtuoso, Stardog
  5. Multi-model: Alchemy Database, ArangoDB, CortexDB, FoundationDB, MarkLogic, OrientDB

Where to use No-SQL databases?

NoSQL databases are used mainly when the data is huge and needs 100% availability. They are commonly used for report generation and data analysis, where the data is huge and reports are generated from it. As scaling is easy, they suit databases that keep growing, such as capturing user actions, log data and audit data. They are also used as backend databases in microservice architectures where each service needs its own specific data. As some of these databases use JSON objects, they suit lightweight web applications where data is stored directly as JSON and read back as JSON. Some NoSQL databases (the key-value type) are used to store and maintain sessions across multiple application servers.


Why is this part of ATG Blog?

Coming Soon.... “ATG to Cassandra (No-SQL) integration plugin code”.

Wednesday, April 8, 2015

Single Page Checkout in ATG

Today most e-commerce sites are focusing on how to make the user experience smoother and easier, which ultimately results in more order conversions. One of the steps towards a better user experience is Single Page Checkout (SPC). Let's see what SPC is, how it can be implemented using ATG, and how it can be achieved using a custom-made framework that makes it easier and simpler to develop and maintain.

What is SPC or Single Page Checkout?

It basically lets the user add their details and complete the checkout on a single page instead of moving through multiple pages. As browsers started supporting AJAX, this feature became easier to implement: the forms are submitted using Ajax and the responses are handled using JavaScript.
We really do not need any new OOTB support, as the major change is in how we present it to the user, not in how the server side updates the data. We already have the OOTB handlers defined for handling shipping, billing, payment and order confirmation. It comes down to our design skills and how we use them to achieve the same result, since in most projects the JSPs are customized as per requirements anyway.

How can SPC be implemented using ATG?

The whole functionality can be divided into: Ajax form submission using JavaScript; handling the request and performing the validation and business operations; and the response, which can be XML or JSON and is used by the JavaScript for display.

Let's have a look at the different sub-functionalities involved:
Ajax form submission using JavaScript: There are many JavaScript frameworks, such as jQuery, that provide OOTB APIs for Ajax form submission and for rendering the response. We basically attach an event listener to the submit button click, serialize the form and submit it via Ajax, with a callback function to perform the next action once a successful (HTTP status 200) response is received.

Form handlers: The OOTB form handlers like ShippingGroupFormHandler, BillingFormHandler and PaymentGroupFormHandler can be used to perform the validation and business logic that update the order with the user's input data. They work just as they do in a multi-page checkout.

JSP & droplets: The response is generated using JSPs and droplets. It can be in various formats like HTML, JSON and XML.

Different issues and possible solutions or designs

When we work with checkout, one of the major issues is the variety of checkout flows, and handling them in SPC is an added challenge because the JSP is not refreshed until the whole checkout journey is completed. We normally handle the flow in the form handlers, but as seen above, different form handlers come in to perform different tasks at various stages of the checkout journey. Adding code in each form handler to decide the next action/form would spread the flow logic across handlers and make the code harder to modify and maintain.

It is better to have the whole logic in one place to decide the status of the order and the next action/form to present to the user. This can be achieved with a combination of two simple ATG building blocks: a JSP and a droplet.

We can create a single JSP to which all requests are made, while the forms are still submitted to their different form handlers. Control first goes to one of the form handlers; after that the JSP is rendered, and the JSP calls the droplet. The droplet goes through various data checks to decide which oparams should be serviced to generate the appropriate sections and forms (a rough sketch of such a droplet follows).
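
As a rough sketch (not taken from any OOTB ATG class), such a droplet could look like the class below. It assumes a hypothetical helper, resolveCheckoutStage, that inspects the order to decide the current stage, and the oparam names (shipping, billing, review) are examples only.

import java.io.IOException;
import javax.servlet.ServletException;

import atg.commerce.order.Order;
import atg.nucleus.naming.ParameterName;
import atg.servlet.DynamoHttpServletRequest;
import atg.servlet.DynamoHttpServletResponse;
import atg.servlet.DynamoServlet;

// Sketch of a droplet that decides which checkout section to render next in SPC.
// The oparam names (shipping, billing, review) are illustrative.
public class CheckoutStateDroplet extends DynamoServlet {

    private static final ParameterName SHIPPING = ParameterName.getParameterName("shipping");
    private static final ParameterName BILLING  = ParameterName.getParameterName("billing");
    private static final ParameterName REVIEW   = ParameterName.getParameterName("review");

    public void service(DynamoHttpServletRequest pRequest, DynamoHttpServletResponse pResponse)
            throws ServletException, IOException {

        Order order = (Order) pRequest.getObjectParameter("order");

        // Hypothetical helper: checks shipping groups, payment groups, validation state etc.
        String stage = resolveCheckoutStage(order);

        if ("shipping".equals(stage)) {
            pRequest.serviceLocalParameter(SHIPPING, pRequest, pResponse);
        } else if ("billing".equals(stage)) {
            pRequest.serviceLocalParameter(BILLING, pRequest, pResponse);
        } else {
            pRequest.serviceLocalParameter(REVIEW, pRequest, pResponse);
        }
    }

    private String resolveCheckoutStage(Order pOrder) {
        // Placeholder logic; a real implementation would inspect the order state.
        return "shipping";
    }
}

From the JSP, the droplet would be invoked with a dsp:droplet tag, passing the order as an input parameter, with each oparam containing the markup and form for that section.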



Wednesday, November 5, 2014

StoreCreditCard in ATG

This page will give you an overview of the store credit card: what store credit is and how it works in an ATG application. We try to answer simple questions like: what is a store credit card, how is it used in the retail domain, and how is it implemented in ATG?

What is a store credit card?
A store credit card is a kind of credit card where the role of the bank is played by the store, and payment can only be made in the store or on the online site of the issuing retailer. It can also be considered a mode of payment, where the credit is provided by the store itself and can be used to pay for orders.

By now I guess you have an idea of what a store credit card is, so let's move to the next question.

How is it used in the retail domain?
In the retail domain, many large companies provide points or a credit amount which can be used to make purchases in their stores or online sites. This happens for many reasons: sometimes it is a goodwill gesture for a new customer, and sometimes it is a kind of discount when the customer is not totally happy with the product but is willing to keep it for less. The return may be in the form of points on a loyalty card, or money returned onto a credit account maintained by the store, which can then be used to pay for other orders.
  • Loyalty points are typically earned when purchasing certain products or making a purchase above a fixed amount. The loyalty points are then converted into discount coupons which can be redeemed against an order.
  • Store credit is an amount provided by the store or a customer agent in special cases; you then have a fixed amount that can be used to make a payment at the store POS or on the online site of the same retailer.

How does it work / how is it implemented in ATG?
This is more of a technical look at how store credit is used in ATG. It is implemented differently on different retail platforms, and even within the ATG platform it works in different styles depending on the customizations made for the retailer. But everywhere, it is simply another mode of making a payment.

     Introducing the store credit card makes the site a multiple-payment / split-payment capable site, since every site will already have the basic credit card payment mode enabled. So it is good to have an idea of how split/multi payment modes work in ATG. In simple terms, the user is able to pay for the order in multiple ways, choosing to split the amount across the different modes.

     Store credits are part of the claimable repository in ATG and are added to the system by a CSC agent or by a third-party integration (if the credits are distributed/maintained outside the ATG system).
Store credits are related to profiles using the ownerId property, which is the profile ID of the user they belong to. When the user is taken to the payment page, we can display the available store credits and the amount on each one. A store credit can be used partially or completely, depending on the retailer's implementation. Once the user decides to use a store credit card, he/she adds the store credit card number and the amount to take from the SCC. The user can use multiple SCCs (as per OOTB functionality), which can be customized as per the retailer's requirements.

     Each SCC is treated as a separate payment group and is added to the order; its order-payment relationship holds the amount that needs to be taken from that SCC.

     In the ATG system it is treated much like a credit card; by this we mean that the store credit payment group also has three major operations associated with it: authorize, debit and credit.

Below are the tasks performed as part of the three operations.
Authorize: Adds the amount to the authorized-amount property of the store credit, so it is locked for this order and cannot be used for another order. It is similar to the authorization call made to the bank in the case of a credit card, but as our site itself acts as the bank for the store credit, no external call needs to be made.

Debit: Debits the amount from the store credit; the amount is subtracted from the amountRemaining property, which means it has been consumed by the store, and it is also subtracted from the authorized-amount property. It is similar to the settlement made with the bank when an order is fulfilled. This call is made from the fulfillment pipeline of the commerce pipeline.

Credit: Credits the amount back to the store credit card; the amount is added back to the remaining amount. This is done when the order is returned or a refund is performed for an order that was paid using the store credit card. It is similar to the refund call made to the bank to return the amount when an order return or refund is performed by the user/store.

Let us see some of the important classes involved in the store credit implementation:
  • AvailableStoreCredits: A droplet used to display store credit information to the user. It takes the profile as input and provides the list of store credits (storeCredits), along with the total amount of credit (the sum across all store credits) that can be used to pay for the order.
  • PaymentGroupFormHandler: There is no built-in functionality here; we can add custom code to add the payment group and the CommerceIdentifierPaymentInfoContainer entries, then call applyPaymentGroups to apply the store credit card.
  • StoreCredit: The bean class for the store credit item descriptor in the order repository, used to handle the payment group related to store credits. It holds the key information relating the order and the store credit, such as when it was applied, the amount taken from the SCC, the profileId, the storeCreditNumber, and the authorization, debit and credit status objects.
  • StoreCreditStatus: The bean class for the store credit status item descriptor in the order repository. It is a property of the StoreCredit and holds status-related information such as the transaction ID, transaction time, expiration time, and the type of status (authorization, debit or credit).
  • OrderTools: One of the main components, as it holds the bean-to-item-descriptor mapping and bean-to-type mapping used to create the store credit.
  • GenericStoreCreditInfo: A bean class used by the StoreCreditCardProcessor. It contains the basic information about the store credit needed to perform the authorization, debit or credit. It is the input parameter for those methods and is populated in ProcCreateStoreCreditInfo before being passed to the StoreCreditCardProcessor to perform the operations.
  • StoreCreditCardProcessorImpl: The class called by the PaymentManager to perform the authorization, debit or credit for a store credit. It calls the respective methods of the ClaimableManager to modify the store credit card and updates the order with the appropriate status information.
  • PaymentManager: The component that runs the payment pipeline chains to perform the three operations on the store credit. It adds the required data to the args and runs the processor chain, which in turn calls the processing methods to perform the operations and update the states in the payment group (StoreCredit). It also holds the mapping of the payment group bean (StoreCredit) to the respective processing components, with the chain name mapped to the bean class.

 Store credit processing can be invoked from both the payment pipeline and the commerce pipeline; a rough sketch of driving these three operations follows below.
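
To tie the three operations together, here is a rough, hedged sketch of a helper that drives authorize, debit and credit for a store credit payment group through the PaymentManager. It assumes PaymentManager overloads that take the order, the payment group and an amount (verify these against your ATG version's API); the class itself is not part of OOTB ATG.

import atg.commerce.CommerceException;
import atg.commerce.order.Order;
import atg.commerce.order.PaymentGroup;
import atg.commerce.payment.PaymentManager;

// Hedged sketch only: assumes authorize/debit/credit overloads taking
// (Order, PaymentGroup, double); confirm against your ATG version before using.
public class StoreCreditPaymentExample {

    private PaymentManager mPaymentManager;

    public void setPaymentManager(PaymentManager pPaymentManager) {
        mPaymentManager = pPaymentManager;
    }

    public void authorizeStoreCredit(Order pOrder, PaymentGroup pStoreCredit, double pAmount)
            throws CommerceException {
        // Locks the amount on the store credit; no external bank call is needed.
        mPaymentManager.authorize(pOrder, pStoreCredit, pAmount);
    }

    public void debitStoreCredit(Order pOrder, PaymentGroup pStoreCredit, double pAmount)
            throws CommerceException {
        // Typically invoked from the fulfillment pipeline once the order ships.
        mPaymentManager.debit(pOrder, pStoreCredit, pAmount);
    }

    public void creditStoreCredit(Order pOrder, PaymentGroup pStoreCredit, double pAmount)
            throws CommerceException {
        // Returns the amount to the store credit when a return/refund is processed.
        mPaymentManager.credit(pOrder, pStoreCredit, pAmount);
    }
}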