Tuesday, October 30, 2007

Hancock - A C-Style Computer Programming Language for Data Mining Data Streams

Wired has a blog article about AT&T Research Labs' Hancock Project. Hancock is a C-style programming language for data mining on the fly. It parses and filters data streams on their way into a database, which differs from how data mining is usually done: by searching the database itself after the data has been stored. Because each record is handled once as it arrives, Hancock-coded programs are damn fast and very efficient, which is what one would expect from a C-style programming language. Such speed also suggests that a carefully coded Hancock program can pull data off the wire in almost real time! More documentation can be found on the project web page. The PostScript paper and a PDF version describe the language.
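
To make the stream-versus-database difference concrete, here is a rough sketch in plain Python. It is not Hancock code and does not reproduce Hancock's syntax or API. The record layout (caller, callee, seconds), the function name `update_signatures`, the `min_seconds` threshold, and the sample phone numbers are all made up for illustration. The idea it shows is only this: each record is filtered and folded into a small per-caller summary as it goes by, so nothing has to be stored first and searched later.

```python
from collections import defaultdict


def update_signatures(stream, signatures=None, min_seconds=60):
    """Fold a stream of (caller, callee, seconds) records into per-caller totals.

    Hypothetical illustration only: each record is looked at once, in arrival
    order, and the raw records are never kept.
    """
    if signatures is None:
        signatures = defaultdict(int)
    for caller, callee, seconds in stream:
        # Filter step: ignore short calls before they ever reach storage.
        if seconds < min_seconds:
            continue
        # Update step: keep only a running total per caller.
        signatures[caller] += seconds
    return signatures


def call_records():
    """Stand-in for records arriving off the wire (invented sample data)."""
    yield ("555-0100", "555-0199", 30)
    yield ("555-0100", "555-0142", 120)
    yield ("555-0123", "555-0100", 300)
    yield ("555-0100", "555-0123", 90)


totals = update_signatures(call_records())
print(dict(totals))
```

The usual database approach would insert all four records and then run a query over them. Here the 30-second call is dropped on arrival and only two small totals survive, which is roughly why filtering in the stream can keep up with the incoming data.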

It appears that AT&T gave the government more than it asked for. Verizon must do it the old-fashioned way, by hand, because according to Wired the FBI wasn't as pleased with them. Marketing will release those records to Corporate Security only when Security pries them from their cold dead fingers. The Hancock distribution can be downloaded and used for non-commercial purposes. It runs on SGI, x86-linux, and Solaris.
