Data Infrastructure, Data Pipeline and Analytics – Useful Links – September, 2016

Some of the interesting links I have read in the last couple of weeks on data, in-memory databases, pipeline development and analytics.
In Search of Database Nirvana
An excellent post providing an in-depth look at the possibilities and the challenges for companies that long for a single query engine to rule them all.
https://www.oreilly.com/ideas/in-search-of-database-nirvana
http://www.slideshare.net/RohitJain0813/in-search-of-database-nirvana-the-challenges-of-delivering-hybrid-transactionanalytical-processing
Aerospike Vs Cassandra Comparison
A comparison of Aerospike with Apache Cassandra. Cassandra is a wide-column NoSQL database that is well suited to ingesting and analyzing hundreds of terabytes of data stored on rotational disks. Aerospike is an in-memory NoSQL key-value store that can run purely in RAM and is also optimized for storing data in Flash (SSDs).
http://www.aerospike.com/when-to-use-aerospike-vs-cassandra/
An overview of Apache Streaming Technologies
A very good comparison of the technologies around simple event processors, stream processors, and complex event processors.
https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/
Flow-Based Programming
Flow-Based Programming defines applications using the metaphor of a “data factory”. It views an application not as a single, sequential process, which starts at a point in time, and then does one thing at a time until it is finished, but as a network of asynchronous processes communicating by means of streams of structured data chunks, called “information packets” (IPs).
http://www.jpaulmorrison.com/fbp/introduction.html
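To make the "data factory" idea concrete, here is a minimal sketch in Python. It is not taken from the linked article: the component names (`source`, `transform`, `sink`), the `END` marker and the use of threads with bounded queues are all assumptions chosen for illustration. Each component runs as its own asynchronous process and only talks to its neighbours by passing information packets over a connection.

```python
import queue
import threading

# Hypothetical end-of-stream marker; real FBP runtimes signal closure differently.
END = object()


def source(items, out):
    """Emit each item as an information packet (IP), then close the stream."""
    for item in items:
        out.put(item)
    out.put(END)


def transform(fn, inp, out):
    """Apply fn to every IP received and forward the result downstream."""
    while True:
        ip = inp.get()
        if ip is END:
            out.put(END)
            return
        out.put(fn(ip))


def sink(inp, results):
    """Collect incoming IPs into a list until the stream is closed."""
    while True:
        ip = inp.get()
        if ip is END:
            return
        results.append(ip)


def run_network(items):
    """Wire source -> transform -> sink with bounded connections and run it."""
    # Bounded queues stand in for FBP's fixed-capacity connections.
    a = queue.Queue(maxsize=2)
    b = queue.Queue(maxsize=2)
    results = []
    processes = [
        threading.Thread(target=source, args=(items, a)),
        threading.Thread(target=transform, args=(str.upper, a, b)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return results


if __name__ == "__main__":
    print(run_network(["alpha", "beta", "gamma"]))
```

The components know nothing about each other; the network is defined entirely by how the queues are wired in `run_network`, which is the part you would change to rearrange the "factory".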
Hadoop Deployment Cheat Sheet
If you are using, or planning to use, the Hadoop framework for big data and Business Intelligence (BI), this document can help you navigate some of the technology and terminology, and guide you in setting up and configuring the system.
http://jethro.io/hadoop-deployment-cheat-sheet/
Amazon Redshift for Custom Analytics
A summary of the experience of building custom analytics on top of Redshift.
https://www.alooma.com/blog/custom-analytics-amazon-redshift
Building Analytics at 500px
A summary of how 500px built its analytics ecosystem.
https://medium.com/@samson_hu/building-analytics-at-500px-92e9a7005c83#.pdkk7xrui
How Artificial Intelligence Will Kickstart the Internet of Things?
IoT will produce a tsunami of big data. As the rapid expansion of devices and sensors connected to the Internet of Things continues, the sheer volume of data they create will increase to an astronomical level. This data will hold extremely valuable insights into what's working well and what's not.
https://datafloq.com/read/Artificial-Intelligence-Kickstart-Internet-Things/1776

Happy Learning!
