Review of InfoZoom

 

Presented on Feb 1, 2006


Overview

 

InfoZoom is visualization software that lets users display and quickly analyze large amounts of tabular data.  Although InfoZoom is designed mainly for business analysis, it can be used in many other areas.  InfoZoom works like a zoom lens: users can magnify any section of the information and quickly find data dependencies hidden in a dataset.

This software tool is developed by HumanIT, whose headquarters are in Bonn, Germany.  HumanIT is a wholly owned subsidiary of the ERP provider proALPHA Software AG, Weilerbach.

In this report, we first discuss how to get the software and the data source formats of InfoZoom.  Then we investigate the three view modes of InfoZoom.  Finally, we review some interesting and useful features of InfoZoom, including Filter, Derived Attributes, Color Coding and Graphics Report.

 

How to Get

 

You can get this software from http://www.infozoom.com/enu/download/index.htm.  InfoZoom Viewer is free, but it can only view existing InfoZoom files.  If you want to import your own datasets, you must download InfoZoom Light Edition, which you need to buy or use as a thirty-day trial version.

 

Data Source

 

InfoZoom defines its own file format, with the suffix foc, fox or fop.  It also provides an import tool to convert a text file or a Microsoft Excel file into an InfoZoom file.  Moreover, users can access their data through an ODBC or OLE DB connection.

To demo InfoZoom, I downloaded the Cars dataset from http://lib.stat.cmu.edu/datasets/, converted it into a Microsoft Excel file, and then imported it into InfoZoom.  This dataset describes 392 cars.  Each car has seven attributes: MPG, Cylinders, Horsepower, Weight, Acceleration, Year and Origin (coded 1 to 3, labeled 'USA', 'Japan' or 'Europe').
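To make the seven-attribute record layout concrete, here is a minimal Python sketch of loading such a table from text.  It is not part of InfoZoom or its import tool; the helper name load_cars and the three sample rows are made up for illustration and are not values from the real Cars dataset.

```python
import csv
import io

# Made-up sample rows in the seven-attribute layout described above;
# these are NOT values taken from the real Cars dataset.
SAMPLE_CSV = """MPG,Cylinders,Horsepower,Weight,Acceleration,Year,Origin
20.0,8,150,3500,12.0,70,USA
30.0,4,80,2200,16.0,75,Japan
25.0,4,70,2000,18.0,72,Europe
"""

# Every attribute except Origin is numeric.
NUMERIC = ("MPG", "Cylinders", "Horsepower", "Weight", "Acceleration", "Year")


def load_cars(text):
    """Parse CSV text into a list of dicts, converting numeric columns to float."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for name in NUMERIC:
            row[name] = float(row[name])
        records.append(row)
    return records


cars = load_cars(SAMPLE_CSV)
print(len(cars), "records,", len(cars[0]), "attributes")
```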

 

Three View Modes

 

InfoZoom provides three view modes: wide table, compressed table and overview.

The wide table is a classic 2-D table.  Each record occupies a column and each attribute occupies a row.  Users can scroll the table vertically and horizontally if the InfoZoom window is not big enough to hold all of the records or attributes.  Figure 1 shows a sample wide table.

Figure 1 Wide table of Cars Dataset
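The wide table layout, with one row per attribute and one column per record, can be imitated in a few lines of Python.  This is only a sketch of the layout, not InfoZoom's implementation; the function wide_table and the sample records are hypothetical.

```python
# Made-up records for illustration only.
cars = [
    {"MPG": 20.0, "Cylinders": 8, "Origin": "USA"},
    {"MPG": 30.0, "Cylinders": 4, "Origin": "Japan"},
    {"MPG": 25.0, "Cylinders": 4, "Origin": "Europe"},
]


def wide_table(records):
    """Return one text line per attribute, with one column per record."""
    lines = []
    for name in records[0]:
        # Each record contributes one right-aligned cell to this attribute row.
        cells = "".join(f"{str(r[name]):>8}" for r in records)
        lines.append(f"{name:<10}{cells}")
    return lines


table = wide_table(cars)
print("\n".join(table))
```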

In the compressed table, all object columns are pushed together to fit all records onto one screen width.  Numerical values are represented by the vertical position of dots.  The numbers themselves also appear if there is enough space.  Horizontal scrolling is no longer needed.  In this view mode, users can sort individual attributes in ascending or descending order.  Therefore, users can easily see the distribution of one attribute or the relationship among attributes visually.  In figure 2, the Cars dataset is presented in a compressed table and sorted by the number of cylinders.  We can easily see the roughly positive relationship between Cylinders and Horsepower.

Figure 2 Compressed Table of Cars Dataset Sorted by Cylinders
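The idea behind the compressed table, sorting the records by one attribute and drawing another attribute as dots whose height encodes the value, can be approximated in plain text.  The sketch below is an assumption about how such a view could be built, not InfoZoom's actual rendering; sort_by, dot_rows and the sample values are invented for illustration.

```python
# Made-up records for illustration only.
cars = [
    {"Cylinders": 4, "Horsepower": 80},
    {"Cylinders": 8, "Horsepower": 150},
    {"Cylinders": 6, "Horsepower": 100},
    {"Cylinders": 4, "Horsepower": 70},
    {"Cylinders": 8, "Horsepower": 160},
]


def sort_by(records, attribute, descending=False):
    """Sort whole records by one attribute (stable, so ties keep their order)."""
    return sorted(records, key=lambda r: r[attribute], reverse=descending)


def dot_rows(values, height=4):
    """Render values as dots: one column per record, higher row = larger value."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    # Level 0 is the bottom text row, height - 1 is the top text row.
    levels = [round((v - lo) / span * (height - 1)) for v in values]
    return ["".join("." if lvl == row else " " for lvl in levels)
            for row in range(height - 1, -1, -1)]


ordered = sort_by(cars, "Cylinders")
rows = dot_rows([r["Horsepower"] for r in ordered])
print("\n".join(rows))
```

With the records sorted by Cylinders, the Horsepower dots rise from left to right, which is the kind of roughly positive relationship the compressed table makes visible.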

In the overview mode, the contents of each attribute row are sorted independently of one another.  Identical values are merged into a single cell, and the width of the cell represents the frequency of that value.  We can easily see the distribution of every attribute in overview mode, but one column no longer corresponds to one object.  Figure 3 shows the overview mode of the Cars dataset.

Figure 3 Overview of Cars Dataset
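The overview mode amounts to counting how often each value occurs in an attribute and giving each distinct value a cell whose width is proportional to that count.  A minimal Python sketch of this calculation follows; the function overview_row, the total width of 20 characters and the sample records are assumptions made for illustration.

```python
from collections import Counter

# Made-up records for illustration only.
cars = [
    {"Cylinders": 4, "Origin": "Japan"},
    {"Cylinders": 8, "Origin": "USA"},
    {"Cylinders": 6, "Origin": "USA"},
    {"Cylinders": 4, "Origin": "Europe"},
    {"Cylinders": 8, "Origin": "USA"},
]


def overview_row(records, attribute, width=20):
    """Return (value, cell_width) pairs for one attribute, sorted by value."""
    counts = Counter(r[attribute] for r in records)
    total = len(records)
    # Cell width is the value's share of all records, scaled to the row width.
    return [(value, round(width * n / total))
            for value, n in sorted(counts.items())]


cylinders_row = overview_row(cars, "Cylinders")
origin_row = overview_row(cars, "Origin")
print(cylinders_row)
print(origin_row)
```

Because each attribute row is computed on its own, a given horizontal position in one row says nothing about the same position in another row, which is why a column no longer corresponds to one object.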

Filter

Filter is a useful feature that helps users search data or find patterns in a dataset.  To use this feature, users first select a specific value or value range on an attribute, which selects the records with those values on that attribute.  Users can then zoom in on the selected records to look for patterns in the dataset.  In figure 4, the cars with 8 cylinders are selected and zoomed in.

Figure 4 A Magnified View of Cars with 8 Cylinders
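In data terms, the Filter feature selects the records that match a value or fall within a value range on one attribute.  The following Python sketch shows both kinds of selection; the function names filter_equal and filter_range and the sample records are hypothetical and do not come from InfoZoom.

```python
# Made-up records for illustration only.
cars = [
    {"MPG": 30.0, "Cylinders": 4},
    {"MPG": 15.0, "Cylinders": 8},
    {"MPG": 20.0, "Cylinders": 6},
    {"MPG": 17.0, "Cylinders": 8},
]


def filter_equal(records, attribute, value):
    """Keep the records whose attribute equals the selected value."""
    return [r for r in records if r[attribute] == value]


def filter_range(records, attribute, low, high):
    """Keep the records whose attribute lies in the inclusive range [low, high]."""
    return [r for r in records if low <= r[attribute] <= high]


eight_cylinder = filter_equal(cars, "Cylinders", 8)
mid_mpg = filter_range(cars, "MPG", 16.0, 25.0)
print(len(eight_cylinder), "cars with 8 cylinders")
print(len(mid_mpg), "cars with MPG between 16 and 25")
```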

Derived Attribute

Derived attributes are new attributes defined by users whose values are calculated from the values of one or two existing attributes.  In figure 5, we define a derived attribute called MPG_On_Cylinder based on MPG and Cylinders.  InfoZoom splits the dataset into groups by the value of Cylinders and then calculates the average value of MPG separately for each group.  Figure 6 shows the compressed table of the dataset with two derived attributes.

Figure 5 The Dialog to Insert Derived Attributes

Figure 6 The Compressed Table of Cars Dataset with Two Derived Attributes
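The calculation behind MPG_On_Cylinder, grouping by Cylinders and averaging MPG within each group, can be sketched in Python as follows.  This is an illustration of the grouping step described above rather than InfoZoom's own code; add_group_mean and the sample records are made up.

```python
from collections import defaultdict
from statistics import mean

# Made-up records for illustration only.
cars = [
    {"MPG": 30.0, "Cylinders": 4},
    {"MPG": 15.0, "Cylinders": 8},
    {"MPG": 20.0, "Cylinders": 6},
    {"MPG": 26.0, "Cylinders": 4},
    {"MPG": 17.0, "Cylinders": 8},
]


def add_group_mean(records, group_attr, value_attr, new_attr):
    """Return copies of the records with the group average added as new_attr."""
    groups = defaultdict(list)
    for r in records:
        groups[r[group_attr]].append(r[value_attr])
    means = {key: mean(values) for key, values in groups.items()}
    # Every record in a group receives the same derived value.
    return [{**r, new_attr: means[r[group_attr]]} for r in records]


derived = add_group_mean(cars, "Cylinders", "MPG", "MPG_On_Cylinder")
for r in derived:
    print(r["Cylinders"], r["MPG"], r["MPG_On_Cylinder"])
```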

 

Color Coding

 

Color Coding means that users can assign different colors to records according to the value of one attribute.  In figure 7, cars with 8 cylinders are red, cars with 6 cylinders are green and cars with 4 cylinders are blue.

Figure 7 Color Coding by Cylinders
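Color coding is essentially a lookup from an attribute value to a color.  The sketch below uses the assignment from figure 7 (8 cylinders red, 6 green, 4 blue); the function color_of and the gray fallback for unlisted values are assumptions added for illustration.

```python
# Palette taken from figure 7: cylinders -> color.
CYLINDER_COLORS = {8: "red", 6: "green", 4: "blue"}

# Made-up records for illustration only.
cars = [
    {"Cylinders": 4},
    {"Cylinders": 8},
    {"Cylinders": 6},
    {"Cylinders": 3},
]


def color_of(record, attribute="Cylinders", palette=CYLINDER_COLORS,
             default="gray"):
    """Look up the color for a record; 'gray' is an assumed fallback."""
    return palette.get(record[attribute], default)


colors = [color_of(r) for r in cars]
print(colors)
```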

 

Graphics Report

 

InfoZoom can generate graphics reports in a variety of styles, including Pie, Donut, Bar, Horiz Bar, Line, Area, Point, Bubble, Volume, and so on.  Figure 8 shows a graphics report of the relationship between Cylinders and MPG_On_Cylinder, the derived attribute defined in an earlier section.

Figure 8 The Graphics Report of Cylinders and Average Values of MPG
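As a rough stand-in for a horizontal bar report like the one in figure 8, the following Python sketch prints one text bar per cylinder count.  It does not reproduce InfoZoom's charting; horizontal_bars, the scale factor and the average MPG values are invented for illustration.

```python
# Made-up average MPG per cylinder count, for illustration only.
avg_mpg = {4: 28.0, 6: 20.0, 8: 16.0}


def horizontal_bars(series, scale=0.5, mark="#"):
    """Return one text bar per label; bar length is value * scale, rounded."""
    lines = []
    for label, value in sorted(series.items()):
        bar = mark * round(value * scale)
        lines.append(f"{label:>2} | {bar} {value:g}")
    return lines


report = horizontal_bars(avg_mpg)
print("\n".join(report))
```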