Big Data for beginners

I am pretty sure almost everybody connected to IT in any way has heard the term Big Data. Now, before you close the article thinking it doesn’t have anything to do with Siebel, STOP. What I have realized in my new stint is that technology is changing too fast. If you confine yourself to just one skill, you will soon become a dinosaur of the IT world, and that means you are well on your way to becoming extinct. Even if you are not proficient, you should stay informed about what is happening in the IT world.

Big Data is the buzzword today, just like cloud was a few years ago. This post is intended to introduce you to the basics of Big Data and the role it plays in the CRM world. So, let’s get started:

What is Big Data?
You can find a lot of mumbo jumbo and complicated definitions online if you search, but the way I understand it, Big Data is a large volume of datasets, both structured and unstructured. So, you might ask, why not just use the term Data?

I agree, it could just be called data, but then it wouldn’t catch your eye, would it? So, some marketing wizard devised this term, just like somebody came up with the term Cloud. Now that we know Big Data is just plain old data in massive volume, that brings up the next question.

Why is there so much hype around Big Data?
The hype exists because the volume of digital data grew so drastically that the traditional methods used to crunch and analyze it failed miserably. Traditional data warehousing tools and techniques were too expensive and too slow to be of any use on this data. This gave birth to a new set of tools created specifically to solve these problems. The aim was to crunch massive amounts of data (think petabytes) in the least amount of time and at the least cost.

These tools promised to let businesses find meaning in this vast amount of seemingly meaningless data in a jiffy. The message that got conveyed was that Big Data can transform your business and solve all your business problems. No wonder everybody thought they had found the ultimate business intelligence tool, and it became so popular.

What is Hadoop?

Hadoop is sometimes used as a synonym for Big Data, but that is not accurate. As I pointed out above, Big Data is just a concept, a marketing term created to draw attention. Data was always there, but now the quantity has become massive and the sources and formats have become diverse. Simply put, Hadoop is a Java-based tool that implements Big Data concepts to crunch data. It can use your cluster of commodity hardware to store and analyze this data. It has a programming component, MapReduce, that you use to write algorithms to analyze this data.
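To give you a feel for what a MapReduce algorithm looks like, here is a tiny sketch of the classic word count problem in plain Python. Please note that this is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) are my own, made up for illustration. It only mimics the idea on a single machine: a map step emits key/value pairs, the pairs are grouped by key, and a reduce step combines the values for each key.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: combine all values seen for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run the map, shuffle, and reduce steps in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data is just data", "data in massive volume"]
    print(word_count(sample))
```

In a real cluster the map and reduce steps would run in parallel on many machines, each working on its own slice of the data, which is what makes this style of programming suitable for massive volumes.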

Is Hadoop the only tool available?

No, Hadoop is not the only tool available to implement Big Data concepts, but it is by far the most popular one. In a practical environment you will always use a bunch of other tools alongside Hadoop to solve your business problems.

What kind of jobs are out there for Big Data?

Jobs in Big Data can be classified broadly into the following categories.

Programmers: People who actually write MapReduce algorithms using Java or other programming languages.
Administrators: People who set up Big Data tools and perform administrative tasks such as starting and stopping services (very similar to Siebel Administrators).
Architects: People who are responsible for creating a road map and identifying the different Big Data tools that are needed, based on the business problems that need solving.

What is the future of Big Data?

The amount of data is only going to increase, as are its complexity and variety. There will always be a need for new tools to explore this data. So, I would say that Big Data has just started and has a long road ahead. Big Data is still at a conceptual stage, and most of the projects undertaken so far have not been able to justify the investments made, but things are changing. Tools are becoming more sophisticated and easier to implement. Hadoop 2.0 is very different from the original Hadoop, and there are many other names, such as Hortonworks, MapR, and Spark, that are making waves.


These are some of the questions I thought would get you started with Big Data. If you are interested in knowing more, send me your specific questions, and do let me know your thoughts in the comments.
