STDA(1) | User Commands | STDA(1) |
NAME
stda - Simple Tools for Data Analysis (STDA)
DESCRIPTION
STDA includes basic tools for data analysis. You can evaluate sums, averages, integrals, derivatives, histograms or probability distribution functions of 1-d data, and optionally plot the results. The programs are stand-alone tools (supporting the standard UNIX input and output pipelines) intended for data processing from the command line.
All but one of the scripts use awk and core system utilities. Plotting requires Gnuplot (see http://gnuplot.info), because 'muplot' is a wrapper around it.
In summary, the package provides utilities for straightforward analysis of data series where a complex analytical approach is not needed and where the highest floating-point precision is not critical. Typical applications include evaluating usage statistics from server logfiles, determining a response time distribution from a series of queries to a (possibly remote) service, and producing a plot from multiple data files.
This software is an open project, meant to be extended with new command-line utilities for common data analysis tasks. Contributions and suggestions are welcome.
The following programs are included in the distribution:
* maphimbu - histogram builder for 1-d numerical and text data
* mintegrate - average/sum/integral/derivative of 1-d numerical data
* mmval - find minimum and maximum value in a dataset
* muplot - plot a multi-curve figure from multiple datasets using Gnuplot
* nnum - produce a series of equally spaced integers or floats
* prefield - prepare input file for 'muplot' to plot 2-d fields by arrows
EXAMPLES
- Evaluate the current apache2 logfile and produce a unique list of the hostnames (or IP addresses), sorted by the total number of their HTTP requests:
- maphimbu -rs2 /var/log/apache2/access.log
- On an X terminal, plot the probability function and the cumulative distribution function of a 'sin(x)' data sample:
- nnum -3.14159 3.14159 0.00001 "sin(x)" "%.17f %.7f" | maphimbu -d0.01 -x1 -ns1 | mintegrate -d0.01 -x1 -y3 -S | muplot lp - 1:3,4
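The two examples above can be approximated with awk and core utilities alone. The following is a minimal sketch under stated assumptions, not the implementation used by maphimbu or mintegrate. The helper names count_hosts and hist_cdf are hypothetical, the host is assumed to be the first whitespace-separated field of each log line (as in Apache's common and combined log formats), and the histogram reports the fraction of samples per bin rather than a normalized density.

```shell
# Hypothetical helper (not part of STDA): count requests per client host.
# Assumes the host or IP address is the first field of each log line.
# Prints "count host", most frequent first, ties broken by host name.
count_hosts() {
    awk '{ count[$1]++ } END { for (h in count) print count[h], h }' "$1" |
        sort -k1,1nr -k2,2
}

# Hypothetical helper (not part of STDA): histogram and cumulative distribution.
# Reads one value per line (column 1) from standard input, groups the values
# into bins of width $1, and prints "bin_start fraction cumulative_fraction"
# for every non-empty bin in ascending order.
hist_cdf() {
    awk -v w="$1" '
        {
            b = int($1 / w)
            if (b * w > $1) b--    # int() truncates toward zero; correct to floor
            print b
        }' |
    sort -n | uniq -c |
    awk -v w="$1" '
        { count[NR] = $1; bin[NR] = $2; total += $1 }
        END {
            for (i = 1; i <= NR; i++) {
                cum += count[i] / total
                printf "%g %g %g\n", bin[i] * w, count[i] / total, cum
            }
        }'
}
```

For instance, `count_hosts /var/log/apache2/access.log` lists the busiest hosts first, and any command that writes one number per line can be piped into `hist_cdf 0.01`. Sorting the bin indices with sort(1) before the final awk pass avoids relying on awk's unspecified array traversal order.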
COPYRIGHT
Copyright © 2009, 2011-2014 Dimitar Ivanov <dimitar.ivanov@mirendom.net>
License: GNU GPL version 3 or later
<http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO
WARRANTY, to the extent permitted by law.
August 2014 | stda 1.3.1 |