Bye Bye 2015! HNY 2016!

Another year has passed. We are nearing the end of the year, and it is time to reflect on 2015. Every year has its good and bad things, and this year was no different for me. Maybe the share of not-so-good things was higher in 2015 :(

Good things to remember in 2015

  • Towards the end of 2014, had an opportunity to work on creating a SaaS based Real time streaming solution for a large security company. 2015 started with the win of this project and what a way to kick start the year!
  • Got a wonderful opportunity to work on integrating data from various marketing and digital channels and to create a data lake for the world’s leading postal company.
  • Got an opportunity to work with a team to implement an in-memory (HANA-based) business intelligence platform for the world’s largest ketchup manufacturer. Worked with their Enterprise Architecture team to create a security blueprint and an IDM implementation roadmap.
  • Did a fair amount of work with DevOps in 2014. Got a great opportunity to present/demonstrate DevOps capabilities to a great set of technical folks. Helped in winning the deal.
  • Got opportunities to understand and work with teams to implement HOLAP and Stream analytics.
  • Got exposure to Customer Service in an Enterprise SaaS world.
  • Understood the difference between Reporting and Dashboards (the hard way).
  • The way I looked at SaaS architecture changed dramatically. Now, I look at the Maturity models based on the business landscape.

Personally, the start-up bug started biting me towards the end of 2014. We were close to starting a Services business on our own. Explored a couple of Product ideas as well. Took one of the product ideas and detailed it to a great extent. Did a fair amount of Customer Discovery and Validation.

What I initially thought of as a vertical add-on eventually became a mammoth horizontal platform idea. Realized that it would take at least 3 to 4 years before I could do anything with it. Dropped it after living & breathing the idea for almost 3 months!

Learnt a lot in the whole process. Though it was a failed attempt, I at least learnt what it means to take a hypothesis, visualize, conceptualize and start something :)

If you don’t code yourself, better not get into the start-up thought process.

Thanks to Neelam, Subhajit, Sudhakar, Ashish, Gayathri and Sendhil for their help in the validation exercise.

Made some not so good (crazy) decisions with my career this year! Though they were not great decisions, at least I no longer have the feeling that I haven’t tried anything new.

No matter how many mistakes you make or how slow you progress, you are still way ahead of everyone who isn’t trying ~ Tony Robbins

On the technology front, 2015 was a year full of data-related projects for me. My understanding of this space has become much better in 2015.

Overall, 2015 was a decent year! A year filled with career adventures, self-disruption, and lots and lots of learning!

Hope 2016 will be a better year!

Happy New Year 2016!

Wishing you all a very Happy New Year!

May the NEW YEAR bring you GOOD HEALTH, PEACE and HAPPINESS.

Image Source: http://happynewyear2.com/tag/happy-new-year-2016-greeting-cards/

“Data is long-term, Applications are temporary.”

Think data first. Data is long-term, applications are temporary. I recently read this in a blog post, and I couldn’t agree more. Data remains one of the most strategic projects for most companies.

Every fifth person you talk to, every other start-up you come across and most job postings have something or other to say about data, analytics etc. But when I speak to the people I come across in my ecosystem, a lot of them think it is only about doing cool stuff in R.

Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.

If someone has been an application developer for the last 10 years, can he/she suddenly become an expert in statistics and in algorithms? Can you suddenly start calling yourself a Data Scientist? Maybe… Nothing is impossible. But if that were your passion, you wouldn’t have been an application developer for the last 10 years. Right?

Is there anything else one can learn and contribute in the data world? I thought of sharing a couple of valuable links which can give you a very good idea of the various aspects and of where one can fit in.

#1 Will Balkanization of Data Science lead to one Empire or many Republics? via http://www.kdnuggets.com/2015/11/balkanization-data-science.html
#2 Becoming a Data Scientist via http://nirvacana.com/thoughts/becoming-a-data-scientist/
#3 Difference between Data Engineering and Data Science via http://www.galvanize.com/blog/difference-between-data-engineering-and-data-science/
#4 The world of data science: Who does what in the data world? Via http://cloudtweaks.com/2015/11/booming-world-data-science/matrix-1013612_640

Data is one of the hottest stacks right now and it is growing at a crazy speed. It would be extremely difficult for any single individual to cope with this change unless one’s basics are right.

Once you have the basics right, it is about Meta learning and evolving from there.

Having worked on various large-scale data-related projects for the last 15 months, the following is my high-level list of items one needs to know to have a reasonable understanding of data (Big/Small). This list is in no specific order. :(

General
  • A basic overview of Descriptive, Diagnostic, Prescriptive, Predictive and Cognitive Analytics: understand the concepts and the differences
Data Warehouses
  • OLAP VS OLTP
  • Dimensional Modelling (Star Schemas, Snowflake Schemas)
  • Difference between Multi-Dimensional, Relational, Hybrid
  • In-Memory OLAP
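
If dimensional modelling is new to you, a tiny example helps more than a definition. Below is a minimal sketch of a star schema using Python’s built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns and the sample rows are all made up for illustration; a real warehouse would have many more dimensions and rows.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    # Illustrative rows only.
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Books"), (2, "Toys")])
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(10, 2015, 11), (11, 2015, 12)])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 10, 100.0), (1, 11, 50.0), (2, 11, 70.0)])
    return conn


def sales_by_category_and_month(conn):
    """OLAP-style query: join the facts to their dimensions, then aggregate."""
    return conn.execute("""
        SELECT p.category, d.year, d.month, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_date d ON d.date_id = f.date_id
        GROUP BY p.category, d.year, d.month
        ORDER BY p.category, d.year, d.month
    """).fetchall()
```

The point to notice is the shape: the fact table holds only keys and measures, and every analytical question becomes "join to the dimensions, group, aggregate". An OLTP schema for the same data would be tuned for single-row inserts and updates instead.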
NoSQL Databases
  • CAP Theorem
  • If you come from application development, this is where the most important change will be. So far, you would have dealt primarily with key-value stores and document stores. For analytics purposes (write efficient), it is important to start understanding column databases (e.g., Cassandra) and graph databases (e.g., Neo4j). This is again a big shift from what you would have done as an application developer. Spend some time on it.
  • In-Memory databases in general.
  • Apart from Cassandra and Neo4j, get an understanding of what MemSQL offers. Yes, it is MemSQL and not MySQL :) It seems very impressive.
Outside EDWs
  • MPPs/PDWs – Difference between traditional EDWs and MPPs?
  • DWH on cloud: AWS Redshift, Azure SQL Data Warehouse
Data Mining
  • What does it mean?
  • Data Mining Algorithms
Hadoop
  • Hadoop and Various Hadoop Components
  • When to use Hadoop?
  • Parallelization and Map Reduce Fundamentals
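
To get a feel for the Map Reduce fundamentals before touching a cluster, here is a minimal single-machine sketch of the classic word count in plain Python. It does not use Hadoop at all; the function names (map_phase, shuffle_phase, reduce_phase) are my own labels for the three steps, and everything runs sequentially here.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse the list of values for each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run map over every document, then shuffle and reduce the output."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return reduce_phase(shuffle_phase(pairs))
```

Each call to map_phase depends only on its own document, and each key in reduce_phase depends only on its own values. That independence is what lets a framework spread the work across many machines.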
Outside Hadoop
  • Difference between Hadoop, Spark and Storm (I personally prefer Spark. RDDs give me the same comfort that I had with ADO.NET)
  • When to use Hadoop/Spark/Storm over MPP?
ETL
  • Data Munging/Wrangling
  • Scrubbing
  • Transforming
  • Reading and Loading Data
  • Exception Handling
  • Jobs/Tasks
Real time Analytics
Working with Streams: Real-time analytics is something everyone talks about. But without understanding what stream processing means, you will never be able to figure it out.
From an application background

  • Reactive Architecture (Responsive, Resilient, Elastic and Message driven)
  • Understand the difference between an Event and a Transaction.
  • Event Processing (CQRS, Actor Model [Akka], Complex Event Processing)

If you don’t understand the above, it will be difficult to move forward. Spend time on these before moving on to the other items.
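
One common way to see the difference between an event and a transaction is to model the same account both ways. The sketch below is only an illustration under my own assumptions: the Event class, the "deposited"/"withdrawn" event kinds and the function names are hypothetical, not taken from any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """An immutable fact that something happened."""
    account: str
    kind: str    # "deposited" or "withdrawn" (illustrative event kinds)
    amount: int


def apply_transaction(balances, account, delta):
    """Transaction style: update the current state in place; history is lost."""
    balances[account] = balances.get(account, 0) + delta
    return balances


def replay(events):
    """Event style: derive the current state by folding over the event log."""
    balances = {}
    for event in events:
        sign = 1 if event.kind == "deposited" else -1
        balances[event.account] = balances.get(event.account, 0) + sign * event.amount
    return balances
```

Both approaches end up with the same balance, but only the event log can answer "how did we get here?", and it can be replayed into a different read model later. That is the intuition to carry into CQRS and stream processing.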
Messaging/Data bus

  • Kafka

Processing Streams

  • Spark/Storm

Lambda Architecture
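
The idea behind the Lambda Architecture fits in a few lines: a batch layer recomputes views from the full dataset, a speed layer covers the events the last batch run has not seen, and queries merge the two. Here is a toy sketch in plain Python; the page-view events and all the names (batch_view, SpeedLayer, query) are made up for illustration, and in a real system each layer would be a separate system.

```python
from collections import Counter


def batch_view(master_dataset):
    """Batch layer: recompute page-view counts from the full, immutable dataset."""
    return Counter(event["page"] for event in master_dataset)


class SpeedLayer:
    """Speed layer: incrementally count events not yet covered by a batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, event):
        self.realtime_view[event["page"]] += 1


def query(page, batch, speed):
    """Serving layer: merge the batch view with the real-time view."""
    return batch[page] + speed.realtime_view[page]
```

When the next batch run finishes, the real-time view for the period it now covers is discarded. That is the trade-off of this design: two code paths, in exchange for views that are both complete and fresh.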

Machine Learning

  • Difference between Data Mining and Machine Learning
  • ML Algorithms

A couple of very good posts to read on this:
Machine Learning for Programmers: Leap from developer to machine learning practitioner via http://machinelearningmastery.com/machine-learning-for-programmers/
What Every Manager Should Know About Machine Learning via https://hbr.org/2015/07/what-every-manager-should-know-about-machine-learning
Most of what we are doing can be achieved at some level using Excel Analytics Data Pack. In fact, I would say Excel is the most powerful tool out there.

Recommendation Engines
  • Collaborative Filtering
  • Content-based Filtering
  • Hybrid

Once you are clear on the concepts, start implementing them using Apache Mahout.
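
Before reaching for a library, it helps to see how small the core of collaborative filtering is. The following is a minimal sketch of user-based collaborative filtering with cosine similarity in plain Python. It is not Mahout’s API; the function names and the ratings layout (a dict of user to item ratings) are my own assumptions for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two users' rating dicts (missing items count as 0)."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Predict a rating as the similarity-weighted average of other users' ratings."""
    weighted_sum = 0.0
    similarity_sum = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        similarity = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += similarity * other_ratings[item]
        similarity_sum += similarity
    # None means nobody comparable has rated this item.
    return weighted_sum / similarity_sum if similarity_sum else None
```

Content-based filtering would compare item attributes instead of other users’ ratings, and a hybrid would blend the two scores.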

Communication Protocols
  • JSON, Avro, Protocol Buffers, and Thrift: If you come from application development, you would have used JSON extensively. It is time to understand the other ones as well. I keep arguing about this with my friend Sendhil (IMO, Avro seems to be the way to go where things are evolving and there is a need for self-documentation; Cowboys Friendly).
Time Series
  • Modelling
  • Databases (OpenTSDB)
  • Forecasting
  • Trend Analysis
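
Two of the simplest building blocks for trend analysis and forecasting are the moving average and simple exponential smoothing. Here is a minimal sketch of both in plain Python; the function names and the choice of these two techniques are mine, picked only because they are the easiest to follow.

```python
def moving_average(series, window):
    """Trend estimate: the mean of each sliding window of `window` points."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i:i + window]) / window
            for i in range(len(series) - window + 1)]


def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast using simple exponential smoothing.

    alpha close to 1 trusts recent points; alpha close to 0 trusts history.
    """
    if not series:
        raise ValueError("series must not be empty")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level
```

Neither handles seasonality or a persistent trend well, which is exactly why the modelling item comes first in the list.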
Modern day HOLAP Engines
  • Apache Kylin (My favourite at this point)
Data Visualization
Self-service is the mantra here. Read this article: Data Scientists Should be Good Storytellers

“Most of the people in an organization cannot understand the outcome of analytics, however they do need the proof of analysis and data. Data storytellers incorporate data and analytics in a compelling way as their stories involve real people and organizations” via https://dzone.com/articles/data-scientists-should-be-good-storytellers

  • How to represent data (Graphs/Charts)?
  • Excel Power Pivot/ Power BI (Polybase)
  • Lumira
  • D3.js
Deep Learning
Though it may or may not be important at this point, try to understand what deep learning is. Read this: Deep Learning in a Nutshell: Core Concepts via http://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/
Data Lake
One of my favorite topics, and something I learnt after burning my hands.

  • Understand what Data Lakes mean? Why do you need one? How to build a data lake on your own?
  • Extract Load and Transform (ELT)
  • ELT vs ETL

Read this: https://azure.microsoft.com/en-in/solutions/data-lake/

Language
Though there are a bunch of things you can do with Python, R, Java etc., my choice is Scala. I love the way the language lets you express yourself. Wish someone could afford me as a developer again :)

If you have a good grasp of the above, then it is time for you to figure out when to use what (Creating Solutions).

“If all you have is a hammer, everything looks like a nail”

Read this: The Ethics of Wielding an Analytical Hammer via http://sloanreview.mit.edu/article/the-ethics-of-wielding-an-analytical-hammer/

Data is having an impact on business models and profitability. It’s hard to find a non-trivial application that doesn’t use data in a significant manner ~ Ben Lorica, O’Reilly Media

Ok, this looks like a large list. Where do I start?

  1. Focus on the basics. Get a good overview of the ecosystem
  2. Decide your area of specialization.
  3. Focus on your specialization and build skills.
  4. Iterate and change course as required.
  • If you have more than 10 years of experience, understand the business situation and figure out when to use what. Maybe pick 1 or 2 items and start implementing them in your environment.
  • If you have less than 10 years of experience, pick a scenario, try to implement it and see if it makes any business sense.

What I have not covered in the list? I haven’t gone into the details of

  1. Hadoop Ecosystem and components (Pig/Hive etc.)
  2. Algorithms
    1. Nearest Neighbour
    2. K-Means Clustering
    3. Linear Regression
    4. Decision Trees etc.
  3. R in detail
  4. Infrastructure
    1. Env Setup
    2. Zookeeper, Yarn, Mesos
    3. Replication
  5. Vertical Industry Solutions
  6. Operational Systems (like Splunk)
  7. Data Governance

I keep hearing/seeing people who have never seen more than 1 GB of data saying that they do Big Data Analytics. Don’t learn or do something for the sake of doing it.

There is no short cut to a place worth going.

My favorite books on this topic.

If you want to know more about what I am learning, you can follow me on Twitter.

Happy Learning!

Airtel : Customer Service : Does it really exist?

I have been having regular issues with my Airtel broadband connection for the last 6 months. I would have called their customer support at least once every 15 days. Somehow things were working till last Friday, but after whatever has happened in the last 5 days, I have decided to terminate my internet connection with Airtel (after 8 years).

Here is the sequence of events.

Friday 23rd (1:10 PM)
I started getting only 3 Mbps speed (instead of 16 Mbps). I called their customer support and the support guy spent 15 minutes with me over the phone trying to change settings, restarting etc. After 15 minutes, he said he would raise a complaint and I would get a resolution in 24 hours.

Friday 23rd (3:00 PM)
I get a call from one of the field engineers regarding the issue, and he asks what time he can visit me to troubleshoot it.

Friday 23rd (5:00 PM)

A field engineer visited my place and tried something for 15 minutes (note: I had a slow internet connection; after the engineer did something, my internet connection stopped working). He said he was going to get a test modem and would be back in 10 minutes.

I was waiting for him till 6:45, lost my patience and called him. No response. After 30 minutes (2 tries), he finally responded to my call. He said there is some issue in the backend and it will get resolved in an hour (23rd 8:00 PM).

Saturday 24th (08:39 AM)

Internet connection is still not working. No calls from the support team. Got furious. Called Customer Care and said I had had enough and was cancelling my connection.

 

Saturday 24th (07:00 PM)

Karthik from the Airtel Special Resolution team calls. He spoke to me for 30 minutes, tried to understand the issues and said he was shocked that I had to call them so many times in the last 6 months. He asked me to give him a chance and said he would waive my fees for 15 days. After 30 minutes of discussion, I said ok… will give you one more day, but make sure my connection works.

Sunday 25th (07:00 PM)

Karthik from the Airtel Special Resolution team calls again. He asks how my connection speed is now and whether everything is working. I told him that no one from Airtel had reached out to me in 24 hours, so how would the issue get resolved? Karthik says he got an email from his team that the issue was resolved, hence he wanted to talk to me and give his assurance.

He realized that nothing had happened and wanted one more day to fix the issue. Since I didn’t have anything to lose, I told him fine, but I don’t have any hopes.

PS : Thanks to Karthik. I think he gave a sincere try.

Monday 26th (2:27 PM)

I get an SMS from Airtel thanking me for contacting them and saying my issue will be resolved by 11:00 AM on the 27th. (Wow!! I didn’t call them. Great SLA game guys!)

Monday 26th (7:00 PM)

Someone from Airtel field support calls and asks for my house location (?). He says that I have asked for a replacement router.

Monday 26th (9:45 PM)

I call Airtel support again and say, please CANCEL my connection. I don’t have any hope of getting internet connectivity now.

Tuesday 27th (8:30 AM)

A support engineer from Airtel calls and asks when he can come and look into the issue. He mentions that I raised a complaint only yesterday. I lost my cool and told him that I had cancelled the connection.

Tuesday 27th (10:00 AM)

Another call from some service team. She wants one more chance. I told her that I don’t have any energy left to explain things to her and thanked her for calling after 5 days.

Tuesday 27th (11:00 AM)

Another call from Airtel support, saying I have asked for a WiFi connection. I told her thanks for calling and disconnected the call.

Few more pointers

  • My Airtel cable has been replaced at least two or three times in the last 3 months, citing a problem with the cable (?)
  • My modem was replaced only a couple of months back
  • I have also called their so-called highest level of escalation (appellate) in the last 15 days. Not even a call back after that to check whether my issues were resolved.

Following are my conclusions after 5 days of frustration.

  • I understand Airtel doesn’t have a lot of competition in the Indian market. They have the maximum share.
  • I have had this connection for the last 8 years. Airtel doesn’t seem to care about its existing customers. There are enough people in India to give them business, and their business seems to be growing.
  • SLA games with their outsourced field service seem to be a normal way of life.

I decided to pay the Rs. 1500 (what I was paying to Airtel) to some other provider. I understand that it is not a lot of money for a company like Airtel. But I don’t want to spend 1500 for 20 days of internet connection every month.

Quote : 7 Rules of Life

  1. Make peace with your past so it won’t screw up the present.
  2. What others think of you is none of your business.
  3. Time heals almost everything. Give it time.
  4. Don’t compare your life to others and don’t judge them. You have no idea what their journey is all about.
  5. Stop thinking too much, it’s alright not to know the answers. They will come to you when you least expect it.
  6. No one is in charge of your happiness, except you.
  7. Smile. You don’t own all the problems in the world

via
http://emilysquotes.com/7-rules-of-life/

Quote

“Cities were always like people, showing their varying personalities to the traveler. Depending on the city and on the traveler, there might begin a mutual love, or dislike, friendship, or enmity. Where one city will rise a certain individual to glory, it will destroy another who is not suited to its personality. Only through travel can we know where we belong or not, where we are loved and where we are rejected.”
― Roman Payne, Cities & Countries

Microservices : Reading List

Modern-day businesses require agility to survive and to lead. Translated into a technology requirement, this means X deploys a day (time to market).

The big, bloated, complex applications that we have built over a period of time do not allow us to meet this X deploys a day without compromising quality. If there is a way to decompose the big, bloated monolithic application blocks into smaller chunks, it will help the business to extend, manage and deploy them, and eventually X deploys a day could become a reality.

How do we get there? Is there a way to achieve this? Microservices (lots of small applications) is one of the ways that could help in achieving this.

Microservices means developing a single, small, meaningful functional feature as single service, each service has its own process and communicate with lightweight mechanism, deployed in single or multiple servers.
Source
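
To make that definition a little more tangible, here is a minimal sketch of what "one small feature as one service, in its own process, talking over a lightweight mechanism" can look like, using only Python’s standard library. The price-lookup feature, the /price/<item> route, the PRICES data and the port number are all invented for illustration; real services would add configuration, logging and health checks along the lines of the reading list below.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up data owned exclusively by this one service.
PRICES = {"book": 12.5, "pen": 1.2}


def lookup_price(path):
    """The single feature of this service: map /price/<item> to (status, body)."""
    prefix = "/price/"
    if not path.startswith(prefix):
        return 404, {"error": "unknown route"}
    item = path[len(prefix):]
    if item not in PRICES:
        return 404, {"error": "unknown item"}
    return 200, {"item": item, "price": PRICES[item]}


class PriceHandler(BaseHTTPRequestHandler):
    """Expose lookup_price over plain HTTP with JSON responses."""

    def do_GET(self):
        status, body = lookup_price(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Run the service in its own process; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), PriceHandler).serve_forever()
```

Calling serve() starts the process, and a GET on /price/book returns the JSON for that item. Keeping the feature logic in lookup_price, separate from the HTTP handler, means it can be tested without starting a server.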

Additional Reading List
The Twelve-Factor App
http://12factor.net/

Microservices Reading List
http://www.mattstine.com/microservices

Understanding Microservices
http://kpbird.com/2014/11/Monolithic-vs-MicroService-Architecture/
http://shakayumi.tumblr.com/post/95688359079/whats-the-big-idea-with-microservices
http://kpbird.com/2014/06/Microservice-Architecture-A-Quick-Guide/
http://www.infoq.com/articles/microservices-intro
http://www.slideshare.net/mstine/microservices-cf-summit
http://java.dzone.com/articles/microservice-architecture
http://tech.gilt.com/post/35711763311/how-gilt-com-give-came-to-be

Microservices Architecture and Scalability
http://www.pst.ifi.lmu.de/Lehre/wise-14-15/mse/microservice-architectures.pdf
http://technologyconversations.com/2015/01/26/microservices-development-with-scala-spray-mongodb-docker-and-ansible/

Microservices Patterns
http://blog.arungupta.me/microservice-design-patterns/
http://microservices.io/patterns/index.html

Simon Brown’s Video : Software Architecture & Balance with Agility
https://vimeo.com/user22258446/review/79382531/91467930a4

Books
Building Microservices
Software Architecture for Developers

Frameworks
http://gilliam.github.io/concepts.html
http://projects.spring.io/spring-boot/
http://fabric8.io/
http://azure.microsoft.com/en-us/campaigns/service-fabric/