htrunk.com
How Apache Spark complements Hadoop
Apache Spark is a general-purpose, lightning-fast data processing engine suitable for use in a wide range of circumstances. Spark leverages Hadoop’s strengths in cluster management, data persistence, and compliance. Spark was developed in 2009 at UC Berkeley’s AMPLab and open sourced in 2010. According to stats on Apache.org, Spark can “run