Visualization has come into vogue in the past few years. Historically this was the world of straight reporting and key performance indicators. That is no longer the case. FlowingData has brought interactive data analysis to the forefront, thanks to our ability to process massive amounts of data and the uptake of what I like to call fluid data systems. If you have seen the books Beautiful Visualization, Data Analysis or Visualize This, then you know there are several frameworks and libraries to choose from in this area. Lately I have been asked a ton of questions about the what/why/when of these libraries, so I decided to put them all in one place. While by no means comprehensive (e.g. I didn’t include Haskell, MATLAB, R, et al.), here is the list. Hit me up if you find something else, as I will keep a running list.
Although Axiis claims to be open source, Flex is not open source. Axiis is an open source data visualization framework designed for beginner and expert developers alike. Whether you are building elegant charts for executive briefings or exploring the boundaries of advanced data visualization research, Axiis has something for you. Axiis provides both pre-built visualization components and abstract layout patterns and rendering classes that allow you to create your own unique visualizations.
Axiis is built upon the Degrafa graphics framework and Adobe Flex 3.
We have used prefuse in the past and it is very flexible. It is Java based and can be modified to run extremely efficiently. That said, one must have a programming background to use it.
Prefuse is a set of software tools for creating rich interactive data visualizations. The original prefuse toolkit provides a visualization framework for the Java programming language. The prefuse flare toolkit provides visualization and animation tools for ActionScript and the Adobe Flash Player.
Prefuse supports a rich set of features for data modeling, visualization, and interaction. It provides optimized data structures for tables, graphs, and trees, a host of layout and visual encoding techniques, and support for animation, dynamic queries, integrated search, and database connectivity. Prefuse is written in Java, using the Java 2D graphics library, and is easily integrated into Java Swing applications or web applets. Prefuse is licensed under the terms of a BSD license, and can be freely used for both commercial and non-commercial purposes.
Protovis composes custom views of data with simple marks such as bars and dots. Unlike low-level graphics libraries that quickly become tedious for visualization, Protovis defines marks through dynamic properties that encode data, allowing inheritance, scales and layouts to simplify construction.
The JavaScript InfoVis Toolkit implements advanced features of information visualization like TreeMaps, an adapted visualization of trees based on the SpaceTree, a focus+context technique to plot Hyperbolic Trees, a radial layout of trees with advanced animations called RGraph, and other visualizations.
In November 2010 the toolkit was acquired by the Sencha Labs Foundation.
Processing.js is the sister project of the popular Processing visual programming language, designed for the web. Processing.js makes your data visualizations, digital art, interactive animations, educational graphs, video games, etc. work using web standards and without any plug-ins. You write code using the Processing language, include it in your web page, and Processing.js does the rest. It’s not magic, but almost.
While more of a diagramming solution than a visualization library, mxGraph/JGraph is used in conjunction with the Swing library. JGraph has been providing leading diagramming software components since 2001, first with the ever popular JGraph Swing library, then in 2006 with the leading edge development of mxGraph. The libraries are designed to drop in and offer you the complete range of functionality required to add complex diagramming to your application or web site.
If you're into snake charming and use Python, then Matplotlib is for you. While the installation is not straightforward, the flexibility is there from a user perspective. I see a huge uptake in this library and personally use it.
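As a taste of that flexibility, here is a minimal sketch of a bar chart, assuming Matplotlib is already installed. The library names and question counts are made-up illustration data, and the output filename is arbitrary.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical data, invented purely for illustration.
libraries = ["prefuse", "Protovis", "Matplotlib"]
questions = [4, 7, 12]

fig, ax = plt.subplots(figsize=(6, 3))
bars = ax.bar(libraries, questions, color="steelblue")  # one bar per library
ax.set_ylabel("Questions received")
ax.set_title("Hypothetical question counts per library")
fig.tight_layout()
fig.savefig("questions.png", dpi=100)  # writes the chart to a PNG file
```

If you are working interactively, drop the `matplotlib.use("Agg")` line and call `plt.show()` instead of saving to a file.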
This library works extremely well for development on the iPhone. If you need complete graphing functionality that is cost effective, download this lib.
Core Plot is a plotting framework for Mac OS X and iOS. It provides 2D visualization of data, and is tightly integrated with Apple technologies like Core Animation, Core Data, and Cocoa Bindings.
This is a derivative of the InfoViz libs from Sencha. I do not know much about it, but it looks very useful.
Written in C++, the Tulip framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of the goals of Tulip is to facilitate the reuse of components and allow developers to focus on programming their application. This development pipeline makes the framework efficient for research prototyping as well as the development of end-user applications. They have libraries for several implementations (Python, OpenGL, etc.).
We tried out GGobi a long time ago. While it does appear noteworthy, we opted not to use it on that project. GGobi is an open source visualization program for exploring high-dimensional data. It provides highly dynamic and interactive graphics such as tours, as well as familiar graphics such as the scatterplot, barchart and parallel coordinates plots. Plots are interactive and linked with brushing and identification.
Mondrian is a general purpose statistical data-visualization system. It features outstanding interactive visualization techniques for data of almost any kind, and has particular strengths, compared to other tools, for working with Categorical Data, Geographical Data and LARGE Data.
All plots in Mondrian are fully linked, and offer many interactions and queries. Any case selected in a plot in Mondrian is highlighted in all other plots.
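The linking idea can be sketched in a few lines of Python. This is only a conceptual illustration, not how Mondrian (which is a Java application) implements it: the `LinkedSelection` and `Plot` classes are hypothetical names, and the "plots" here just record which rows they would highlight.

```python
from dataclasses import dataclass, field


@dataclass
class LinkedSelection:
    """Shared selection state: the set of currently selected row indices."""
    selected: set = field(default_factory=set)
    listeners: list = field(default_factory=list)

    def select(self, rows):
        # Replace the selection and notify every registered plot.
        self.selected = set(rows)
        for notify in self.listeners:
            notify(self.selected)


class Plot:
    """Stand-in for a real plot; it only tracks which rows to highlight."""

    def __init__(self, name, selection):
        self.name = name
        self.highlighted = set()
        selection.listeners.append(self.on_selection)

    def on_selection(self, rows):
        self.highlighted = set(rows)


selection = LinkedSelection()
histogram = Plot("histogram", selection)
scatter = Plot("scatterplot", selection)

# Brushing rows 2 and 5 in any one plot updates every linked plot.
selection.select([2, 5])
print(histogram.highlighted == scatter.highlighted)
```

The design choice is that plots never talk to each other directly; they all observe one shared selection, which is what keeps every view in sync.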
Currently implemented plots comprise Histograms, Boxplots y by x, Scatterplots, Barcharts, Mosaicplots, Missing Value Plots, Parallel Coordinates/Boxplots, SPLOMs and Maps.
Mondrian works with data in standard tab-delimited or comma-separated ASCII files and can load data from R workspaces. There is basic support for working directly on data in Databases (please email for further info).
Mondrian is written in Java and is distributed as a native application (wrapper) for Mac OS X and Windows. Linux users need to start the jar-file. It is heavily used alongside R.
Once again, while this is not a comprehensive list, it is more than adequate to get your feet wet in the world of visualization and data science. One of the most valuable things you can do for yourself is learn a little scripting or programming. A little math doesn't hurt either, so do not be scared of a summation sign. Dust off those old stat books and get to it; then you can start delving into more advanced topics in BigData.
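To show how little there is to fear, here is a small sketch in which each summation sign becomes a single call to `sum`. The data values are made up for illustration, and the formulas are the usual mean and sample variance from any introductory stat book.

```python
import math
import statistics

data = [2.0, 4.0, 4.0, 5.0, 7.0]  # made-up sample values
n = len(data)

# Mean: (1/n) * sum of x_i for i = 1..n
mean = sum(x for x in data) / n

# Sample variance: (1/(n-1)) * sum of (x_i - mean)^2 for i = 1..n
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Cross-check the hand-written sums against the standard library.
print(math.isclose(mean, statistics.mean(data)))
print(math.isclose(variance, statistics.variance(data)))
```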
Go Big or Go Home,