Want to see how fast we can summarize a billion rows of data? Don’t blink. It takes just six seconds.
Remember the good ol’ days, before Apache Spark 2.0, when life was simpler and processing 173 GB of data took a whole 30 seconds? You actually had time to sip your coffee before the bar charts rendered.
Those days are gone. In this post-Spark 2.0 world, using the combined force of the Syncfusion Big Data Platform and the Dashboard Platform, you can process and visualize that same amount of data within six seconds. Generally speaking, advances in Apache Spark have made Syncfusion’s data processing five to ten times faster.
You can find all the details on how this is done by reading “Using Business Dashboards to Summarize a Billion Records in Seconds,” a whitepaper written by two of Syncfusion’s big data experts.
The speed test described in the whitepaper used data that the New York City Taxi & Limousine Commission collected on approximately 1.1 billion taxi trips taken from 2009 to 2015. That’s a lot of fares.
After using the Big Data Platform to upload the data, tune performance, cache the data, and partition it, we called upon the Dashboard Platform to create a visualization of it.
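To give a feel for why partitioning and caching make summaries fast, here is a toy sketch in plain Python. It is not the Syncfusion or Spark API, and it is not the method used in the whitepaper. The sample records and the names `TRIPS`, `partition_by_year`, `summarize`, and `average_fare` are all invented for illustration. The idea is simply that a query for one year reads only that year's partition, and a repeated query is answered from the cache.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical sample records: (pickup_year, fare_amount).
# These values are invented for illustration, not taken from the taxi data set.
TRIPS = [
    (2009, 10.0),
    (2009, 14.0),
    (2012, 8.5),
    (2015, 20.0),
    (2015, 12.0),
]


def partition_by_year(trips):
    """Group fares by year so a query for one year touches only that year's rows."""
    partitions = defaultdict(list)
    for year, fare in trips:
        partitions[year].append(fare)
    return dict(partitions)


PARTITIONS = partition_by_year(TRIPS)


@lru_cache(maxsize=None)
def summarize(year):
    """Return (trip_count, total_fare) for one partition.

    The result is cached, so asking for the same year again skips the scan.
    """
    fares = PARTITIONS.get(year, [])
    return len(fares), sum(fares)


def average_fare(year):
    """Average fare for a year, or 0.0 if the year has no trips."""
    count, total = summarize(year)
    return total / count if count else 0.0
```

A dashboard that asks for `average_fare(2015)` several times in a row would scan the 2015 partition once and then reuse the cached summary. A real cluster applies the same idea across many machines and far more data.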
Running on nothing more than commodity hardware that anyone can access, these Syncfusion platforms visualized massive amounts of data in just a few seconds. Getting started with them is easy. Simply refer to the following Syncfusion resources, and you’ll be on your way.
- Whitepaper: “Using Business Dashboards to Summarize a Billion Records in Seconds.”
- Videos: “Syncfusion Tutorial: Big Data Cluster Manager Introduction” and Syncfusion’s Dashboard Video Tutorial series.
- Product Webpages: Big Data Platform Homepage; Dashboard Platform Homepage.
- Documentation: Big Data Platform; Dashboard Platform.
- E-books: Spark Succinctly; Hadoop Succinctly; Hive Succinctly; HDInsight Succinctly; Statistics Fundamentals Succinctly.