Diff for "Spinn3rHadoopDataSet"

Differences between revisions 3 and 6 (spanning 3 versions)
Revision 3 as of 2014-09-05 22:18:20
Size: 66
Editor: NikoColneric
Comment:
Revision 6 as of 2014-09-05 22:22:32
Size: 466
Editor: NikoColneric
Comment:
Deletions are prefixed with "-". Additions are prefixed with "+".
Line 1: Line 1:
- Describe Spinn3rHadoopDataSet here.
+ = Spinn3r data set on Hadoop cluster =
Line 3: Line 3:
- = Data records versions =
+ This page provides all information about the Spinn3r data set stored on the Hadoop cluster.
+
+ === Data records versions ===
+ The records are stored in several versions. It is important to know which version you are processing, because the version determines which fields are available and how the text was preprocessed. For example, some versions contain no capital letters, no raw HTML fields, etc.
 

Spinn3r data set on Hadoop cluster

This page provides all information about the Spinn3r data set stored on the Hadoop cluster.

Data records versions

The records are stored in several versions. It is important to know which version you are processing, because the version determines which fields are available and how the text was preprocessed. For example, some versions contain no capital letters, no raw HTML fields, etc.
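
As a rough illustration of why the version matters, the sketch below guards field access by record version. This page does not list the actual versions or field names, so everything here is an assumption made for the example: the version labels "A" and "B", the field names, the "version" key, and the helper get_field are all hypothetical.

```python
# Hypothetical version-to-fields table. The real versions and field names
# are not given on this page and must be taken from the data set itself.
FIELDS_BY_VERSION = {
    "A": {"url", "title", "content", "raw_html"},
    "B": {"url", "title", "content"},  # e.g. a version without raw HTML
}


def get_field(record, name):
    """Return record[name], failing loudly if the record's version lacks it."""
    version = record["version"]  # assumed key holding the version label
    available = FIELDS_BY_VERSION.get(version)
    if available is None:
        raise ValueError(f"unknown record version: {version!r}")
    if name not in available:
        raise KeyError(f"field {name!r} is not available in version {version!r}")
    return record[name]
```

Checking the version before reading a field turns a silent missing-data problem into an explicit error, which can be easier to spot in a long-running Hadoop job.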