Wednesday, January 10, 2007

CSVPNG – A Handy Utility for Readily-Reusable (R-R) Data

The T4 & Friends project at HP, mentioned in earlier posts, was at its foundation an effort to convert trend data from its original complex format into Readily-Reusable (R-R) format as CSV files. We discovered that any trend data could be converted this way and that in most cases the transformation was straightforward, relatively painless, and often quite rapid.

As a result of the T4 project, more complex data from a variety of sources was converted to R-R format. Anyone interested in this data could then look at it with Excel, load it into their favorite database, or, in many cases, analyze and report on what the data meant using TLViz.

We sometimes find it convenient to use the metaphor that

  1. data is collected ‘upstream’,
  2. it is saved in R-R format in historical ‘reservoirs’ for possible future use, and
  3. it is fed, as needed, into ‘downstream’ tools for analysis, reporting, collaboration, …

Since no single downstream tool does everything you might want, the growing reservoirs of R-R data in CSV format steadily increased the payoff from building new tools, or incrementally improving existing ones, to convert that data’s potential value into real value.

CSVPNG (CSV file to Portable Network Graphics Utility) is one such development that has blossomed in the world of ready-reusability. Developed in 2003 by Pat Moran of Hewlett Packard, CSVPNG’s initial purpose was to live up to its name: to convert CSV trend data automatically into a set of PNG graphics, all embedded in a single output HTML page. This one feature alone added considerable value by automating the output of any set of standard charts that you might always want to examine.

Over the past four years, CSVPNG has grown through a series of more than 100 improvements and extensions. You can download a copy of this great tool from the T4 & Friends web page or from TrendsThatMatter.

CSVPNG is written in C and has been made available for a steadily growing number of platforms including DOS under Windows, Linux, HP-UX, and OpenVMS on both Alpha and Integrity servers.

Here’s a brief rundown of some of the things you can do with R-R CSV data using CSVPNG. For a full list of possible benefits, be sure to examine the CSVPNG.TXT file thoroughly after you have downloaded the kit. If you are generating readily-reusable data in CSV format, CSVPNG will surely save you time in managing this data and will help make sure that you don't have to reinvent the wheel.

  1. Automatically create a selected set of graphs and format them as HTML or PDF
  2. Slice and dice files to select out just the factors you want and just the time periods you are interested in. The trimmed down result can be automatically graphed and/or used to generate a new reduced file that satisfies the R-R rules. These files could then be fed into other tools such as TLViz or Excel.
  3. Combine R-R data from several files into a single synchronized file
  4. Search using expert rules for “interesting” conditions and then display only those charts for cases where one or more conditions were met
  5. Carry out column arithmetic to create new factors by recombining several original factors.
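
To make items 2 and 5 concrete, here is a minimal Python sketch of the two ideas: trimming a CSV to selected factors and a time window, then deriving a new factor by column arithmetic. It is not CSVPNG itself and does not use CSVPNG's options. The function names, the sample column names, and the assumed layout (one header row, a timestamp in the first column) are all illustrative assumptions; check CSVPNG.TXT for the actual R-R rules and syntax.

```python
import csv
import io
from datetime import datetime

# Assumption: timestamps look like this; real R-R files may differ.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def slice_and_dice(rows, keep, start, end):
    """Keep only the named factor columns for samples with start <= time <= end."""
    header, *samples = rows
    idx = [header.index(name) for name in keep]
    out = [[header[0]] + list(keep)]
    for row in samples:
        stamp = datetime.strptime(row[0], TIME_FORMAT)
        if start <= stamp <= end:
            out.append([row[0]] + [row[i] for i in idx])
    return out


def add_sum_column(rows, new_name, sources):
    """Append a new factor that is the sum of the named source columns."""
    header, *samples = rows
    idx = [header.index(name) for name in sources]
    out = [header + [new_name]]
    for row in samples:
        total = sum(float(row[i]) for i in idx)
        out.append(row + [f"{total:g}"])
    return out


# Hypothetical sample data, invented for this illustration.
SAMPLE = """\
Sample Time,CPU User,CPU System,Disk IO
2007-01-10 09:00:00,20,5,110
2007-01-10 09:01:00,35,10,240
2007-01-10 09:02:00,50,15,90
2007-01-10 09:03:00,25,5,130
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
trimmed = slice_and_dice(
    rows,
    ["CPU User", "CPU System"],
    datetime(2007, 1, 10, 9, 1, 0),
    datetime(2007, 1, 10, 9, 2, 0),
)
result = add_sum_column(trimmed, "CPU Total", ["CPU User", "CPU System"])

# Write the reduced table back out as CSV text.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(result)
print(buffer.getvalue())
```

The output keeps the timestamp column and a single header row, so under the layout assumed here the reduced file has the same shape as its input and could be passed on to another downstream tool.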

Here’s a sample graphic output from CSVPNG showing how it can find and highlight a peak period as well as display the original data with its moving average, all in a single chart. The blue represents the raw data values, the red line shows the moving average, and the highlighted time interval identifies the peak interval.
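
For readers who want to see the arithmetic behind such a chart, the following Python sketch computes a trailing moving average and locates a peak interval. It shows one plausible reading of those two terms, not CSVPNG's actual algorithm: the window size, the definition of "peak" as the fixed-width span with the largest sum, and the sample values are all assumptions made for illustration.

```python
def moving_average(values, window):
    """Trailing moving average; early points average over the samples seen so far."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def peak_interval(values, width):
    """Return (first, last) indices of the width-sample span with the largest sum."""
    if width <= 0 or width > len(values):
        raise ValueError("width must be between 1 and len(values)")
    best_start = 0
    best_sum = current = sum(values[:width])
    for start in range(1, len(values) - width + 1):
        # Slide the span one sample to the right.
        current += values[start + width - 1] - values[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_start + width - 1


# Hypothetical raw samples, invented for this illustration.
raw = [10, 12, 50, 60, 55, 14, 11, 9]
smoothed = moving_average(raw, 3)
first, last = peak_interval(raw, 3)
print(smoothed)
print(first, last)
```

In a chart like the one described, `raw` would be the blue series, `smoothed` the red line, and the samples from `first` to `last` the highlighted interval.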

The more R-R data you have to deal with, the more valuable you are likely to find CSVPNG’s capabilities for automating your work and keeping on top of all this data.
