SNAP Library 4.0, Developer Reference
2017-07-27 13:18:06
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
Table class: Relational table with columnar data storage. More...
#include <table.h>
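The summary above describes the table as relational with columnar data storage. As a rough illustration of that layout, here is a minimal standard-C++ sketch. It is not SNAP's implementation: `StringPool` and `ColumnarTable` are hypothetical names, and the choice to store string columns as integer ids into a shared pool is an assumption inferred from the "integer mapping of the string" accessors listed below.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical types for illustration only; they are not SNAP classes.

// Shared pool that assigns one integer id to each distinct string.
class StringPool {
 public:
  int Intern(const std::string& s) {
    auto it = ids_.find(s);
    if (it != ids_.end()) return it->second;
    const int id = static_cast<int>(strings_.size());
    strings_.push_back(s);
    ids_.emplace(s, id);
    return id;
  }
  const std::string& Get(int id) const { return strings_.at(id); }

 private:
  std::vector<std::string> strings_;
  std::unordered_map<std::string, int> ids_;
};

// Column-oriented table: each column is its own contiguous vector, and the
// string column stores pool ids instead of the strings themselves.
struct ColumnarTable {
  explicit ColumnarTable(StringPool* pool) : pool(pool) {}

  // Appends one row by pushing one value onto every column.
  void AddRow(int i, double f, const std::string& s) {
    intCol.push_back(i);
    fltCol.push_back(f);
    strColMap.push_back(pool->Intern(s));
  }

  // Resolves the string stored at the given row through the pool.
  const std::string& GetStr(int row) const {
    return pool->Get(strColMap.at(row));
  }

  StringPool* pool;
  std::vector<int> intCol;
  std::vector<double> fltCol;
  std::vector<int> strColMap;
};
```

With this layout, reading a whole column touches one contiguous vector, and comparing two string cells reduces to comparing two integers.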
Classes | |
class | TLoadVecInit |
Public Member Functions | |
void | AddIntCol (const TStr &ColName) |
Adds an integer column with name ColName . More... | |
void | AddFltCol (const TStr &ColName) |
Adds a float column with name ColName . More... | |
void | AddStrCol (const TStr &ColName) |
Adds a string column with name ColName . More... | |
void | GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values, using OpenMP multi-threading. More... | |
TTable () | |
TTable (TTableContext *Context) | |
TTable (const Schema &S, TTableContext *Context) | |
TTable (TSIn &SIn, TTableContext *Context) | |
TTable (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->int. More... | |
TTable (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->float. More... | |
TTable (const TTable &Table) | |
Copy constructor. More... | |
TTable (const TTable &Table, const TIntV &RowIds) | |
void | SaveSS (const TStr &OutFNm) |
Saves table schema and content to a TSV file. More... | |
void | SaveBin (const TStr &OutFNm) |
Saves table schema and content to a binary file. More... | |
void | Save (TSOut &SOut) |
Saves table schema and content to a binary format. More... | |
void | Dump (FILE *OutF=stdout) const |
Prints table contents to a text file. More... | |
void | AddRow (const TTableRow &Row) |
Adds row with values taken from given TTableRow. More... | |
TTableContext * | GetContext () |
Returns the context. More... | |
TTableContext * | ChangeContext (TTableContext *Context) |
Changes the current context. Moves all object items to the new context. More... | |
TInt | GetColIdx (const TStr &ColName) const |
Gets index of column ColName among columns of the same type in the schema. More... | |
TInt | GetIntVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of integer attribute ColName at row RowIdx . More... | |
TFlt | GetFltVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of float attribute ColName at row RowIdx . More... | |
TStr | GetStrVal (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of string attribute ColName at row RowIdx . More... | |
TInt | GetStrMapById (TInt ColIdx, TInt RowIdx) const |
Gets the integer mapping of the string at column ColIdx at row RowIdx . More... | |
TInt | GetStrMapByName (const TStr &ColName, TInt RowIdx) const |
Gets the integer mapping of the string at column ColName at row RowIdx . More... | |
TStr | GetStrValById (TInt ColIdx, TInt RowIdx) const |
Gets the value of the string attribute at column ColIdx at row RowIdx . More... | |
TStr | GetStrValByName (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of the string attribute at column ColName at row RowIdx . More... | |
TIntV | GetIntRowIdxByVal (const TStr &ColName, const TInt &Val) const |
Gets the rows containing Val in int column ColName . More... | |
TIntV | GetStrRowIdxByMap (const TStr &ColName, const TInt &Map) const |
Gets the rows containing int mapping Map in str column ColName . More... | |
TIntV | GetFltRowIdxByVal (const TStr &ColName, const TFlt &Val) const |
Gets the rows containing Val in flt column ColName . More... | |
TInt | RequestIndexInt (const TStr &ColName) |
Creates Index for Int Column ColName . More... | |
TInt | RequestIndexFlt (const TStr &ColName) |
Creates Index for Flt Column ColName . More... | |
TInt | RequestIndexStrMap (const TStr &ColName) |
Creates Index for Str Column ColName . More... | |
TStr | GetStr (const TInt &KeyId) const |
Gets the string with KeyId . More... | |
TInt | GetIntValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the integer value at column ColIdx and row RowIdx . More... | |
TFlt | GetFltValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the float value at column ColIdx and row RowIdx . More... | |
Schema | GetSchema () |
Gets the schema of this table. More... | |
TVec< PNEANet > | ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize. More... | |
TVec< PNEANet > | ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals. More... | |
TVec< PNEANet > | ToGraphPerGroup (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates a sequence of graphs based on grouping specified by GroupAttr. More... | |
PNEANet | ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToGraphPerGroupIterator (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates the graph sequence one at a time. More... | |
PNEANet | NextGraphIterator () |
Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions. More... | |
TBool | IsLastGraphOfSequence () |
Checks if the end of the graph sequence is reached. More... | |
TStr | GetSrcCol () const |
Gets the name of the column to be used as src nodes in the graph. More... | |
void | SetSrcCol (const TStr &Src) |
Sets the name of the column to be used as src nodes in the graph. More... | |
TStr | GetDstCol () const |
Gets the name of the column to be used as dst nodes in the graph. More... | |
void | SetDstCol (const TStr &Dst) |
Sets the name of the column to be used as dst nodes in the graph. More... | |
void | AddEdgeAttr (const TStr &Attr) |
Adds column to be used as graph edge attribute. More... | |
void | AddEdgeAttr (TStrV &Attrs) |
Adds columns to be used as graph edge attributes. More... | |
void | AddSrcNodeAttr (const TStr &Attr) |
Adds column to be used as src node attribute of the graph. More... | |
void | AddSrcNodeAttr (TStrV &Attrs) |
Adds columns to be used as src node attributes of the graph. More... | |
void | AddDstNodeAttr (const TStr &Attr) |
Adds column to be used as dst node attribute of the graph. More... | |
void | AddDstNodeAttr (TStrV &Attrs) |
Adds columns to be used as dst node attributes of the graph. More... | |
void | AddNodeAttr (const TStr &Attr) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | AddNodeAttr (TStrV &Attrs) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | SetCommonNodeAttrs (const TStr &SrcAttr, const TStr &DstAttr, const TStr &CommonAttrName) |
Sets the columns to be used as both src and dst node attributes. More... | |
TStrV | GetSrcNodeIntAttrV () const |
Gets src node int attribute name vector. More... | |
TStrV | GetDstNodeIntAttrV () const |
Gets dst node int attribute name vector. More... | |
TStrV | GetEdgeIntAttrV () const |
Gets edge int attribute name vector. More... | |
TStrV | GetSrcNodeFltAttrV () const |
Gets src node float attribute name vector. More... | |
TStrV | GetDstNodeFltAttrV () const |
Gets dst node float attribute name vector. More... | |
TStrV | GetEdgeFltAttrV () const |
Gets edge float attribute name vector. More... | |
TStrV | GetSrcNodeStrAttrV () const |
Gets src node str attribute name vector. More... | |
TStrV | GetDstNodeStrAttrV () const |
Gets dst node str attribute name vector. More... | |
TStrV | GetEdgeStrAttrV () const |
Gets edge str attribute name vector. More... | |
TAttrType | GetColType (const TStr &ColName) const |
Gets type of column ColName . More... | |
TInt | GetNumRows () const |
Gets total number of rows in this table. More... | |
TInt | GetNumValidRows () const |
Gets number of valid, i.e. not deleted, rows in this table. More... | |
THash< TInt, TInt > | GetRowIdMap () const |
Gets a map of logical to physical row ids. More... | |
TRowIterator | BegRI () const |
Gets iterator to the first valid row of the table. More... | |
TRowIterator | EndRI () const |
Gets iterator to the last valid row of the table. More... | |
TRowIteratorWithRemove | BegRIWR () |
Gets iterator with remove to the first valid row. More... | |
TRowIteratorWithRemove | EndRIWR () |
Gets iterator with remove to the last valid row. More... | |
void | GetPartitionRanges (TIntPrV &Partitions, TInt NumPartitions) const |
Partitions the table into NumPartitions and populates Partitions with the ranges. More... | |
void | Rename (const TStr &Column, const TStr &NewLabel) |
Renames a column. More... | |
void | Unique (const TStr &Col) |
Removes rows with duplicate values in given column. More... | |
void | Unique (const TStrV &Cols, TBool Ordered=true) |
Removes rows with duplicate values in given columns. More... | |
void | Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true) |
Selects rows that satisfy given Predicate . More... | |
void | Select (TPredicate &Predicate) |
void | Classify (TPredicate &Predicate, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true) |
Selects rows using atomic compare operation. More... | |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp) |
void | ClassifyAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true) |
Selects rows where the value of Col matches given primitive Val . More... | |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp) |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, PTable &SelectedTable) |
template<class T > | |
void | ClassifyAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | Group (const TStrV &GroupBy, const TStr &GroupColName, TBool Ordered=true, TBool UsePhysicalIds=true) |
Groups rows depending on values of GroupBy columns. More... | |
void | Count (const TStr &CountColName, const TStr &Col) |
Counts number of unique elements. More... | |
void | Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true) |
Orders the rows according to the values in columns of OrderBy (in descending lexicographic order). More... | |
void | Aggregate (const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as new attribute ResAttr. More... | |
void | AggregateCols (const TStrV &AggrAttrs, TAttrAggr AggOp, const TStr &ResAttr) |
Aggregates attributes in AggrAttrs across columns. More... | |
TVec< PTable > | SpliceByGroup (const TStrV &GroupByAttrs, TBool Ordered=true) |
Splices table into subtables according to a grouping statement. More... | |
PTable | Join (const TStr &Col1, const TTable &Table, const TStr &Col2) |
Performs equijoin. More... | |
PTable | Join (const TStr &Col1, const PTable &Table, const TStr &Col2) |
PTable | ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false) |
PTable | SelfJoin (const TStr &Col) |
Joins table with itself, on values of Col . More... | |
PTable | SelfSimJoin (const TStrV &Cols, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
PTable | SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
void | SelectFirstNRows (const TInt &N) |
Selects first N rows from the table. More... | |
void | Defrag () |
Releases memory of deleted rows, and defrags. More... | |
void | StoreIntCol (const TStr &ColName, const TIntV &ColVals) |
Adds entire int column to table. More... | |
void | StoreFltCol (const TStr &ColName, const TFltV &ColVals) |
Adds entire flt column to table. More... | |
void | StoreStrCol (const TStr &ColName, const TStrV &ColVals) |
Adds entire str column to table. More... | |
void | UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | SetFltColToConstMP (TInt UpdateColIdx, TFlt DefaultFltVal) |
PTable | Union (const TTable &Table) |
Returns union of this table with given Table . More... | |
PTable | Union (const PTable &Table) |
PTable | UnionAll (const TTable &Table) |
Returns union of this table with given Table , preserving duplicates. More... | |
PTable | UnionAll (const PTable &Table) |
void | UnionAllInPlace (const TTable &Table) |
Same as TTable::ConcatTable. More... | |
void | UnionAllInPlace (const PTable &Table) |
PTable | Intersection (const TTable &Table) |
Returns intersection of this table with given Table . More... | |
PTable | Intersection (const PTable &Table) |
PTable | Minus (TTable &Table) |
Returns table with rows that are present in this table but not in given Table . More... | |
PTable | Minus (const PTable &Table) |
PTable | Project (const TStrV &ProjectCols) |
Returns table with only the columns in ProjectCols . More... | |
void | ProjectInPlace (const TStrV &ProjectCols) |
Keeps only the columns specified in ProjectCols . More... | |
void | ColGenericOp (const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op) |
Performs columnwise arithmetic operation. More... | |
void | ColGenericOpMP (TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op) |
void | ColAdd (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise addition. See TTable::ColGenericOp. More... | |
void | ColSub (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise subtraction. See TTable::ColGenericOp. More... | |
void | ColMul (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise multiplication. See TTable::ColGenericOp. More... | |
void | ColDiv (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise division. See TTable::ColGenericOp. More... | |
void | ColMod (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise modulus. See TTable::ColGenericOp. More... | |
void | ColMin (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs min of two columns. See TTable::ColGenericOp. More... | |
void | ColMax (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs max of two columns. See TTable::ColGenericOp. More... | |
void | ColGenericOp (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr, TArithOp op, TBool AddToFirstTable) |
Performs columnwise arithmetic operation with column of given table. More... | |
void | ColAdd (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise addition with column of given table. More... | |
void | ColSub (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise subtraction with column of given table. More... | |
void | ColMul (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise multiplication with column of given table. More... | |
void | ColDiv (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise division with column of given table. More... | |
void | ColMod (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise modulus with column of given table. More... | |
void | ColGenericOp (const TStr &Attr1, const TFlt &Num, const TStr &ResAttr, TArithOp op, const TBool floatCast) |
Performs arithmetic op of column values and given Num . More... | |
void | ColGenericOpMP (const TInt &ColIdx1, const TInt &ColIdx2, TAttrType ArgType, const TFlt &Num, TArithOp op, TBool ShouldCast) |
void | ColAdd (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs addition of column values and given Num . More... | |
void | ColSub (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs subtraction of column values and given Num . More... | |
void | ColMul (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs multiplication of column values and given Num . More... | |
void | ColDiv (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs division of column values and given Num . More... | |
void | ColMod (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs modulus of column values and given Num . More... | |
void | ColConcat (const TStr &Attr1, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates two string columns. More... | |
void | ColConcat (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="", TBool AddToFirstTable=true) |
Concatenates string column with column of given table. More... | |
void | ColConcatConst (const TStr &Attr1, const TStr &Val, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates column values with given string value. More... | |
void | ReadIntCol (const TStr &ColName, TIntV &Result) const |
Reads values of entire int column into Result . More... | |
void | ReadFltCol (const TStr &ColName, TFltV &Result) const |
Reads values of entire float column into Result . More... | |
void | ReadStrCol (const TStr &ColName, TStrV &Result) const |
Reads values of entire string column into Result . More... | |
void | InitIds () |
Adds explicit row ids, initializes hash set mapping ids to physical rows. More... | |
PTable | IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="") |
Distance based filter. More... | |
void | PrintSize () |
void | PrintContextSize () |
TSize | GetMemUsedKB () |
Returns approximate memory used by table in [KB]. More... | |
TSize | GetContextMemUsedKB () |
Returns approximate memory used by table context in [KB]. More... | |
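The `Join` entry above is described only as "Performs equijoin". For readers unfamiliar with the term, the following standard-C++ sketch shows one common way to compute an equijoin, a hash join over two integer key columns. `HashEquiJoin` is a hypothetical helper, and nothing here is taken from SNAP's source; the library's actual strategy may differ.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper, not part of SNAP: hash equijoin of two integer key
// columns. Returns (left row, right row) index pairs whose keys are equal.
std::vector<std::pair<int, int>> HashEquiJoin(const std::vector<int>& left,
                                              const std::vector<int>& right) {
  // Build phase: map each key of the right column to the rows holding it.
  std::unordered_map<int, std::vector<int>> rowsByKey;
  for (int r = 0; r < static_cast<int>(right.size()); ++r) {
    rowsByKey[right[r]].push_back(r);
  }
  // Probe phase: look up every left key and emit one pair per match.
  std::vector<std::pair<int, int>> matches;
  for (int l = 0; l < static_cast<int>(left.size()); ++l) {
    auto it = rowsByKey.find(left[l]);
    if (it == rowsByKey.end()) continue;
    for (int r : it->second) matches.emplace_back(l, r);
  }
  return matches;
}
```

Each returned pair corresponds to one row of the joined table; a key that appears m times on the left and n times on the right contributes m * n pairs.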
Static Public Member Functions | |
static void | SetMP (TInt Value) |
static TInt | GetMP () |
static TStr | NormalizeColName (const TStr &ColName) |
Adds suffix to column name if it doesn't exist. More... | |
static TStrV | NormalizeColNameV (const TStrV &Cols) |
Adds suffix to column name if it doesn't exist. More... | |
static PTable | New () |
static PTable | New (TTableContext *Context) |
static PTable | New (const Schema &S, TTableContext *Context) |
static PTable | New (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->int hash. More... | |
static PTable | New (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->float hash. More... | |
static PTable | New (const PTable Table) |
Returns pointer to a new table created from given Table . More... | |
static void | GetSchema (const TStr &InFNm, Schema &S, const char &Separator= '\t') |
Gets the schema from the input file InFNm and stores it in S. More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from spreadsheet (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment title lines instead. More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const TIntV &RelevantCols, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from spreadsheet, loading only the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment title lines instead. More... | |
static PTable | Load (TSIn &SIn, TTableContext *Context) |
Loads table from a binary format. More... | |
static PTable | LoadShM (TShMIn &ShMIn, TTableContext *Context) |
Static constructor to load table from memory. More... | |
static PTable | TableFromHashMap (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->int. More... | |
static PTable | TableFromHashMap (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->float. More... | |
static PTable | GetNodeTable (const PNEANet &Network, TTableContext *Context) |
Extracts node TTable from PNEANet. More... | |
static PTable | GetEdgeTable (const PNEANet &Network, TTableContext *Context) |
Extracts edge TTable from PNEANet. More... | |
static PTable | GetEdgeTablePN (const PNGraphMP &Network, TTableContext *Context) |
Extracts edge TTable from parallel graph PNGraphMP. More... | |
static PTable | GetFltNodePropertyTable (const PNEANet &Network, const TIntFltH &Property, const TStr &NodeAttrName, const TAttrType &NodeAttrType, const TStr &PropertyAttrName, TTableContext *Context) |
Extracts node and edge property TTables from THash. More... | |
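The second `LoadSS` overload above loads only the columns named in `RelevantCols`, using `Separator` (tab by default) to split each line. The sketch below shows that per-line step in plain standard C++. `SplitRelevant` is a hypothetical helper, not a SNAP function, and it assumes that column positions are zero-based, which the listing does not state.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of SNAP: splits one line on `separator` and
// keeps only the fields whose zero-based positions appear in `relevantCols`.
std::vector<std::string> SplitRelevant(const std::string& line, char separator,
                                       const std::vector<int>& relevantCols) {
  // Split the whole line into fields first.
  std::vector<std::string> fields;
  std::stringstream stream(line);
  std::string field;
  while (std::getline(stream, field, separator)) fields.push_back(field);

  // Keep the requested positions, in the order requested; positions that
  // fall outside the line are skipped.
  std::vector<std::string> kept;
  for (int col : relevantCols) {
    if (col >= 0 && col < static_cast<int>(fields.size())) {
      kept.push_back(fields[col]);
    }
  }
  return kept;
}
```

A real loader would additionally convert each kept field to the type declared in the schema; that step is omitted here.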
Protected Member Functions | |
void | InvalidatePhysicalGroupings () |
void | InvalidateAffectedGroupings (const TStr &Attr) |
void | IncrementNext () |
Increments the next vector and sets last, NumRows and NumValidRows. More... | |
void | ClassifyAux (const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
Adds a label attribute with positive labels on selected rows and negative labels on the rest. More... | |
const char * | GetContextKey (TInt Val) const |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp. More... | |
TStr | GetStrVal (TInt ColIdx, TInt RowIdx) const |
Gets the value in column with id ColIdx at row RowIdx . More... | |
void | AddStrVal (const TInt &ColIdx, const TStr &Val) |
Adds Val in column with id ColIdx . More... | |
void | AddStrVal (const TStr &Col, const TStr &Val) |
Adds Val in column with name Col . More... | |
TStr | GetIdColName () const |
Gets name of the id column of this table. More... | |
TStr | GetSchemaColName (TInt Idx) const |
Gets name of the column with index Idx in the schema. More... | |
TAttrType | GetSchemaColType (TInt Idx) const |
Gets type of the column with index Idx in the schema. More... | |
void | AddSchemaCol (const TStr &ColName, TAttrType ColType) |
Adds column with name ColName and type ColType to the schema. More... | |
TBool | IsColName (const TStr &ColName) const |
void | AddColType (const TStr &ColName, TPair< TAttrType, TInt > ColType) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | AddColType (const TStr &ColName, TAttrType ColType, TInt Index) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | DelColType (const TStr &ColName) |
Removes column with name ColName from the ColTypeMap. More... | |
TPair< TAttrType, TInt > | GetColTypeMap (const TStr &ColName) const |
Gets column type and index of ColName . More... | |
TStr | RenumberColName (const TStr &ColName) const |
Returns a re-numbered column name based on number of existing columns with conflicting names. More... | |
TStr | DenormalizeColName (const TStr &ColName) const |
Removes suffix from column name if it exists. More... | |
Schema | DenormalizeSchema () const |
Removes suffix from column names in the Schema. More... | |
TBool | IsAttr (const TStr &Attr) |
Checks if Attr is an attribute of this table schema. More... | |
void | AddTable (const TTable &T) |
Adds all the rows of the input table. Allows duplicate rows (not a union). More... | |
void | ConcatTable (const PTable &T) |
Appends all rows of T to this table, and recalculates indices. More... | |
void | AddRow (const TRowIterator &RI) |
Adds row corresponding to RI . More... | |
void | AddRow (const TIntV &IntVals, const TFltV &FltVals, const TStrV &StrVals) |
Adds row with values corresponding to the given vectors by type. More... | |
void | AddGraphAttribute (const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds names of columns to be used as graph attributes. More... | |
void | AddGraphAttributeV (TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds vector of names of columns to be used as graph attributes. More... | |
void | CheckAndAddIntNode (PNEANet Graph, THashSet< TInt > &NodeVals, TInt NodeId) |
Checks if given NodeId was seen earlier; if not, adds it to Graph and hashmap NodeVals . More... | |
template<class T > | |
TInt | CheckAndAddFltNode (T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal) |
Checks if given FNodeVal was seen earlier; if not, adds it to Graph and hashmap NodeVals . More... | |
void | AddEdgeAttributes (PNEANet &Graph, int RowId) |
Adds attributes of edge corresponding to RowId to the Graph . More... | |
void | AddNodeAttributes (TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs) |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values). More... | |
PNEANet | BuildGraph (const TIntV &RowIds, TAttrAggr AggrPolicy) |
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes. More... | |
void | InitRowIdBuckets (int NumBuckets) |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation. More... | |
void | FillBucketsByWindow (TStr SplitAttr, TInt JumpSize, TInt WindowSize, TInt StartVal, TInt EndVal) |
Fills RowIdBuckets with sets of row ids. More... | |
void | FillBucketsByInterval (TStr SplitAttr, TIntPrV SplitIntervals) |
Fills RowIdBuckets with sets of row ids. More... | |
TVec< PNEANet > | GetGraphsFromSequence (TAttrAggr AggrPolicy) |
Returns a sequence of graphs. More... | |
PNEANet | GetFirstGraphFromSequence (TAttrAggr AggrPolicy) |
Returns the first graph of the sequence. More... | |
PNEANet | GetNextGraphFromSequence () |
Returns the next graph in sequence corresponding to RowIdBuckets. More... | |
template<class T > | |
T | AggregateVector (TVec< T > &V, TAttrAggr Policy) |
Aggregates vector into a single scalar value according to a policy. More... | |
void | GroupingSanityCheck (const TStr &GroupBy, const TAttrType &AttrType) const |
Checks if grouping key exists and matches given attr type. More... | |
template<class T > | |
void | GroupByIntCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values. More... | |
template<class T > | |
void | GroupByFltCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with float values. Returns hash table with grouping. More... | |
template<class T > | |
void | GroupByStrCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with string values. Returns hash table with grouping. More... | |
template<class T > | |
void | UpdateGrouping (THash< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a grouping hash map. More... | |
template<class T > | |
void | UpdateGrouping (THashMP< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a parallel grouping hash map. More... | |
void | PrintGrouping (const THash< TGroupKey, TIntV > &Grouping) const |
TInt | CompareRows (TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
TInt | CompareRows (TInt R1, TInt R2, const TVec< TAttrType > &CompareByTypes, const TIntV &CompareByIndices, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
TInt | GetPivot (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Gets pivot element for QSort. More... | |
TInt | Partition (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Partitions vector for QSort. More... | |
void | ISort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs insertion sort on given vector V . More... | |
void | QSort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort on given vector V . More... | |
void | Merge (TIntV &V, TInt Idx1, TInt Idx2, TInt Idx3, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Helper function for parallel QSort. More... | |
void | QSortPar (TIntV &V, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort in parallel on given vector V . More... | |
bool | IsRowValid (TInt RowIdx) const |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row. More... | |
TInt | GetLastValidRowIdx () |
Gets the id of the last valid row of the table. More... | |
void | RemoveFirstRow () |
Removes first valid row of the table. More... | |
void | RemoveRow (TInt RowIdx, TInt PrevRowIdx) |
Removes row with id RowIdx . More... | |
void | KeepSortedRows (const TIntV &KeepV) |
Removes all rows that are not mentioned in the SORTED vector KeepV . More... | |
void | SetFirstValidRow () |
Sets the first valid row of the TTable. More... | |
PTable | InitializeJointTable (const TTable &Table) |
Initializes an empty table for the join of this table with the given table. More... | |
void | AddJointRow (const TTable &T1, const TTable &T2, TInt RowIdx1, TInt RowIdx2) |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2]. More... | |
void | ThresholdJoinInputCorrectness (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2) |
void | ThresholdJoinCountCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntPr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinOutputTable (const THash< TIntPr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ThresholdJoinCountPerJoinKeyCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntTr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinPerJoinKeyOutputTable (const THash< TIntTr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ResizeTable (int RowCount) |
Resizes the table to hold RowCount rows. More... | |
int | GetEmptyRowsStart (int NewRows) |
Gets the start index to a chunk of empty rows of size NewRows . More... | |
void | AddSelectedRows (const TTable &Table, const TIntV &RowIDs) |
Adds rows from Table that correspond to ids in RowIDs . More... | |
void | AddNRows (int NewRows, const TVec< TIntV > &IntColsP, const TVec< TFltV > &FltColsP, const TVec< TIntV > &StrColMapsP) |
Adds NewRows rows from the given vectors for each column type. More... | |
void | AddNJointRowsMP (const TTable &T1, const TTable &T2, const TVec< TIntPrV > &JointRowIDSet) |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join. More... | |
void | UpdateTableForNewRow () |
Updates table state after adding one or more rows. More... | |
void | GroupAux (const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true) |
Helper function for grouping. More... | |
void | StoreGroupCol (const TStr &GroupColName, const TVec< TPair< TInt, TInt > > &GroupAndRowIds) |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported. More... | |
void | Reindex () |
Reinitializes row ids. More... | |
void | AddIdColumn (const TStr &IdColName) |
Adds a column of explicit integer identifiers to the rows. More... | |
void | GetCollidingRows (const TTable &T, THashSet< TInt > &Collisions) |
Gets set of row ids of rows common with table T . More... | |
Static Protected Member Functions | |
static void | LoadSSPar (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns. More... | |
static void | LoadSSSeq (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Sequentially loads data from input file at InFNm into NewTable. More... | |
static TInt | CompareKeyVal (const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2) |
static TInt | CheckSortedKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | ISortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | GetPivotKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | PartitionKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | QSortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
Protected Attributes | |
TTableContext * | Context |
Execution Context. More... | |
Schema | Sch |
Table Schema. More... | |
TCRef | CRef |
TInt | NumRows |
Number of rows in the table (valid and invalid). More... | |
TInt | NumValidRows |
Number of valid rows in the table (i.e. rows that were not logically removed). More... | |
TInt | FirstValidRow |
Physical index of first valid row. More... | |
TInt | LastValidRow |
Physical index of last valid row. More... | |
TIntV | Next |
A vector describing the logical order of the rows. Next[i] is the successor of row i. Table iterators follow the order dictated by Next. More... | |
TVec< TIntV > | IntCols |
Data columns of integer attributes. More... | |
TVec< TFltV > | FltCols |
Data columns of floating point attributes. More... | |
TVec< TIntV > | StrColMaps |
Data columns of integer mappings of string attributes. More... | |
THash< TStr, TPair< TAttrType, TInt > > | ColTypeMap |
A mapping from column name to column type and column index among columns of the same type. More... | |
TStr | IdColName |
TIntIntH | RowIdMap |
Mapping of permanent row ids to physical id. More... | |
THash< TStr, THash< TInt, TIntV > > | IntColIndexes |
Indexes for Int Columns. More... | |
THash< TStr, THash< TInt, TIntV > > | StrMapColIndexes |
Indexes for String Columns. More... | |
THash< TStr, THash< TFlt, TIntV > > | FltColIndexes |
Indexes for Float Columns. More... | |
THash< TStr, GroupStmt > | GroupStmtNames |
Maps user-given grouping statement names to their group-by attributes. More... | |
THash< GroupStmt, THash< TInt, TGroupKey > > | GroupIDMapping |
Maps grouping statements to their (group id –> group-by key) mapping. More... | |
THash< GroupStmt, THash < TGroupKey, TIntV > > | GroupMapping |
Maps grouping statements to their (group-by key –> group id) mapping. More... | |
TStr | SrcCol |
Column (attribute) to serve as src nodes when constructing the graph. More... | |
TStr | DstCol |
Column (attribute) to serve as dst nodes when constructing the graph. More... | |
TStrV | EdgeAttrV |
List of columns (attributes) to serve as edge attributes. More... | |
TStrV | SrcNodeAttrV |
List of columns (attributes) to serve as source node attributes. More... | |
TStrV | DstNodeAttrV |
List of columns (attributes) to serve as destination node attributes. More... | |
TStrTrV | CommonNodeAttrs |
List of attribute pairs with values common to source and destination and their common given name. More... | |
TVec< TIntV > | RowIdBuckets |
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs. More... | |
TInt | CurrBucket |
Current row id bucket - used when generating a sequence of graphs using an iterator. More... | |
TAttrAggr | AggrPolicy |
Aggregation policy used for solving conflicts between different values of an attribute of the same node. More... | |
TInt | IsNextDirty |
Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges. More... | |
Static Protected Attributes | |
static const TInt | Last = -1 |
Special value for Next vector entry - last row in table. More... | |
static const TInt | Invalid = -2 |
Special value for Next vector entry - logically removed row. More... | |
static TInt | UseMP = 1 |
Global switch for choosing multi-threaded versions of TTable functions. More... | |
Private Member Functions | |
void | GenerateColTypeMap (THash< TStr, TPair< TInt, TInt > > &ColTypeIntMap) |
void | LoadTableShM (TShMIn &ShMIn, TTableContext *ContextTable) |
Friends | |
class | TPt< TTable > |
class | TRowIterator |
class | TRowIteratorWithRemove |
template<class PGraph > | |
PGraph | TSnap::ToGraph (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
int | TSnap::LoadCrossNet (TCrossNet &Graph, PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV) |
int | TSnap::LoadMode (TModeNet &Graph, PTable Table, const TStr &NCol, TStrV &NodeAttrV) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP3 (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP2 (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
TTable::TTable | ( | ) |
TTable::TTable | ( | TTableContext * | Context | ) |
Definition at line 305 of file table.cpp.
TTable::TTable | ( | const Schema & | S, |
TTableContext * | Context | ||
) |
Definition at line 308 of file table.cpp.
References AddColType(), AddSchemaCol(), atFlt, atInt, atStr, FltCols, IntCols, TVec< TVal, TSizeTy >::Len(), and StrColMaps.
TTable::TTable | ( | TSIn & | SIn, |
TTableContext * | Context | ||
) |
Definition at line 378 of file table.cpp.
References GenerateColTypeMap().
TTable::TTable | ( | const THash< TInt, TInt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->int.
Definition at line 385 of file table.cpp.
References AddColType(), AddSchemaCol(), atInt, atStr, THash< TKey, TDat, THashFunc >::GetDatV(), THash< TKey, TDat, THashFunc >::GetKeyV(), InitIds(), IntCols, IsNextDirty, Last, Next, NumRows, and StrColMaps.
TTable::TTable | ( | const THash< TInt, TFlt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->float.
Definition at line 412 of file table.cpp.
References AddColType(), AddSchemaCol(), atFlt, atInt, atStr, FltCols, THash< TKey, TDat, THashFunc >::GetDatV(), THash< TKey, TDat, THashFunc >::GetKeyV(), InitIds(), IntCols, IsNextDirty, Last, Next, NumRows, and StrColMaps.
|
inline |
Copy constructor.
Definition at line 919 of file table.h.
Definition at line 438 of file table.cpp.
References AddSelectedRows(), ColTypeMap, FirstValidRow, FltCols, InitIds(), IntCols, IsNextDirty, LastValidRow, TVec< TVal, TSizeTy >::Len(), NumRows, NumValidRows, and StrColMaps.
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 651 of file table.h.
References THash< TKey, TDat, THashFunc >::AddDat(), ColTypeMap, and NormalizeColName().
Referenced by AddColType(), AddFltCol(), AddIdColumn(), AddIntCol(), AddStrCol(), ClassifyAux(), GenerateColTypeMap(), Order(), ProjectInPlace(), Rename(), StoreFltCol(), StoreGroupCol(), StoreIntCol(), StoreStrCol(), and TTable().
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 656 of file table.h.
References AddColType(), and NormalizeColName().
|
inline |
Adds column to be used as dst node attribute of the graph.
Definition at line 1180 of file table.h.
References AddGraphAttribute().
Referenced by AddNodeAttr().
|
inline |
Adds columns to be used as dst node attributes of the graph.
Definition at line 1182 of file table.h.
References AddGraphAttributeV().
|
inline |
Adds column to be used as graph edge attribute.
Definition at line 1172 of file table.h.
References AddGraphAttribute().
|
inline |
Adds columns to be used as graph edge attributes.
Definition at line 1174 of file table.h.
References AddGraphAttributeV().
|
inlineprotected |
Adds attributes of edge corresponding to RowId to the Graph.
Definition at line 3395 of file table.cpp.
References atFlt, atInt, atStr, EdgeAttrV, FltCols, GetColIdx(), GetColType(), GetStrVal(), IntCols, and TVec< TVal, TSizeTy >::Len().
Referenced by BuildGraph().
void TTable::AddFltCol | ( | const TStr & | ColName | ) |
Adds a float column with name ColName.
Definition at line 4680 of file table.cpp.
References AddColType(), AddSchemaCol(), atFlt, FltCols, and NumRows.
Referenced by Aggregate(), AggregateCols(), and ColGenericOp().
|
protected |
Adds names of columns to be used as graph attributes.
Definition at line 985 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), DstNodeAttrV, EdgeAttrV, IsColName(), NormalizeColName(), SrcNodeAttrV, and TExcept::Throw().
Referenced by AddDstNodeAttr(), AddEdgeAttr(), and AddSrcNodeAttr().
Adds vector of names of columns to be used as graph attributes.
Definition at line 992 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), DstNodeAttrV, EdgeAttrV, IsColName(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), SrcNodeAttrV, and TExcept::Throw().
Referenced by AddDstNodeAttr(), AddEdgeAttr(), and AddSrcNodeAttr().
|
protected |
Adds a column of explicit integer identifiers to the rows.
Definition at line 1900 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), THash< TKey, TDat, THashFunc >::AddDat(), AddSchemaCol(), atInt, BegRI(), THash< TKey, TDat, THashFunc >::Clr(), EndRI(), IntCols, TVec< TVal, TSizeTy >::Len(), NumRows, TVec< TVal, TSizeTy >::Reserve(), and RowIdMap.
Referenced by InitIds().
void TTable::AddIntCol | ( | const TStr & | ColName | ) |
Adds an integer column with name ColName.
Definition at line 4673 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.
Referenced by Aggregate(), AggregateCols(), and ColGenericOp().
|
protected |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2].
Definition at line 1957 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), TVec< TVal, TSizeTy >::Empty(), FltCols, IntCols, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, RowIdMap, and StrColMaps.
|
protected |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join.
Definition at line 4442 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), Assert, THash< TKey, TDat, THashFunc >::Clr(), FltCols, TPair< TVal1, TVal2 >::GetVal1(), TPair< TVal1, TVal2 >::GetVal2(), IntCols, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, ResizeTable(), RowIdMap, StrColMaps, and TExcept::Throw().
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1184 of file table.h.
References AddDstNodeAttr(), and AddSrcNodeAttr().
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1186 of file table.h.
References AddDstNodeAttr(), and AddSrcNodeAttr().
|
inlineprotected |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values).
Definition at line 3414 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddKey(), atFlt, atInt, CommonNodeAttrs, FltCols, GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), GetStrVal(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), and TVec< TVal, TSizeTy >::Len().
Referenced by BuildGraph().
|
protected |
Adds NewRows rows from the given vectors for each column type.
Definition at line 4421 of file table.cpp.
References FltCols, GetEmptyRowsStart(), IntCols, TVec< TVal, TSizeTy >::Len(), Next, and StrColMaps.
|
protected |
Adds row corresponding to RI.
Definition at line 4295 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, FltCols, GetColIdx(), GetColType(), TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetSchemaColName(), TRowIterator::GetStrMapByName(), IdColName, IntCols, TVec< TVal, TSizeTy >::Len(), Sch, StrColMaps, and UpdateTableForNewRow().
|
protected |
Adds row with values corresponding to the given vectors by type.
Definition at line 4317 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddStrVal(), FltCols, IntCols, TVec< TVal, TSizeTy >::Len(), and UpdateTableForNewRow().
|
inline |
Adds row with values taken from given TTableRow.
Definition at line 1002 of file table.h.
References AddRow(), TTableRow::GetFltVals(), TTableRow::GetIntVals(), and TTableRow::GetStrVals().
Referenced by AddRow().
Adds column with name ColName and type ColType to the schema.
Definition at line 642 of file table.h.
References TVec< TVal, TSizeTy >::Add(), NormalizeColName(), and Sch.
Referenced by AddFltCol(), AddIdColumn(), AddIntCol(), AddStrCol(), ClassifyAux(), GenerateColTypeMap(), GroupAux(), Order(), StoreFltCol(), StoreIntCol(), StoreStrCol(), and TTable().
Adds rows from Table that correspond to ids in RowIDs.
Definition at line 4399 of file table.cpp.
References FltCols, GetEmptyRowsStart(), IntCols, TVec< TVal, TSizeTy >::Len(), Next, and StrColMaps.
Referenced by TTable().
|
inline |
Adds column to be used as src node attribute of the graph.
Definition at line 1176 of file table.h.
References AddGraphAttribute().
Referenced by AddNodeAttr().
|
inline |
Adds columns to be used as src node attributes of the graph.
Definition at line 1178 of file table.h.
References AddGraphAttributeV().
void TTable::AddStrCol | ( | const TStr & | ColName | ) |
Adds a string column with name ColName.
Definition at line 4687 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atStr, TVec< TVal, TSizeTy >::Len(), NumRows, and StrColMaps.
Referenced by ColConcat(), and ColConcatConst().
Adds Val in column with id ColIdx.
Definition at line 971 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), TStrHash< TDat, TStringPool, THashFunc >::AddKey(), Context, StrColMaps, and TTableContext::StringVals.
Referenced by AddRow(), and AddStrVal().
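The references above (StrColMaps, Context, TTableContext::StringVals) suggest that a string column stores integer keys into a string pool held by the context rather than the strings themselves. The following is a minimal sketch of that idea using only the C++ standard library. StringPool, its members, and the free function AddStrVal are hypothetical stand-ins written for illustration; they are not SNAP's TStrHash or TTable code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a shared string pool (not SNAP's TStrHash).
struct StringPool {
  std::unordered_map<std::string, int> KeyIds;  // string -> integer key
  std::vector<std::string> Keys;                // integer key -> string

  // Returns the existing key for Val, or assigns the next free key.
  int AddKey(const std::string& Val) {
    auto It = KeyIds.find(Val);
    if (It != KeyIds.end()) { return It->second; }
    const int Id = static_cast<int>(Keys.size());
    KeyIds.emplace(Val, Id);
    Keys.push_back(Val);
    return Id;
  }

  // Looks up the string that belongs to a previously assigned key.
  const std::string& GetKey(int Id) const {
    assert(Id >= 0 && Id < static_cast<int>(Keys.size()));
    return Keys[Id];
  }
};

// Appends Val to a string column by storing only its integer key.
inline void AddStrVal(StringPool& Pool, std::vector<int>& StrColMap,
                      const std::string& Val) {
  StrColMap.push_back(Pool.AddKey(Val));
}
```

In this sketch, equal strings in a column share one pool entry, so comparing two cells for equality reduces to comparing two integers.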
Adds Val in column with name Col.
Definition at line 977 of file table.cpp.
References AddStrVal(), atStr, GetColIdx(), GetColType(), and TExcept::Throw().
|
protected |
Adds all the rows of the input table. Allows duplicate rows (not a union).
Definition at line 3975 of file table.cpp.
References TVec< TVal, TSizeTy >::AddV(), atFlt, atInt, atStr, FirstValidRow, FltCols, GetColIdx(), GetColType(), GetSchemaColName(), IdColName, IntCols, Invalid, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, Sch, StrColMaps, and TExcept::Throw().
Referenced by ConcatTable(), and UnionAllInPlace().
void TTable::Aggregate | ( | const TStrV & | GroupByAttrs, |
TAttrAggr | AggOp, | ||
const TStr & | ValAttr, | ||
const TStr & | ResAttr, | ||
TBool | Ordered = true |
||
) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as new attribute ResAttr.
Definition at line 1585 of file table.cpp.
References aaCount, aaMean, TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), AddFltCol(), AddIntCol(), atFlt, atInt, atStr, THashMP< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::BegI(), THashMP< TKey, TDat, THashFunc >::EndI(), THash< TKey, TDat, THashFunc >::EndI(), FltCols, GetColIdx(), GetColType(), THashMP< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), GetMP(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), GroupByStrCol(), GroupMapping, IdColName, IntCols, IsColName(), THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), NumValidRows, and TExcept::Throw().
Referenced by Count().
Aggregates attributes in AggrAttrs across columns.
Definition at line 1750 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddFltCol(), AddIntCol(), atFlt, atInt, BegRI(), EndRI(), FltCols, GetColIdx(), GetColTypeMap(), IntCols, TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Aggregates vector into a single scalar value according to a policy. Used for choosing an attribute value for a node when this node appears in several records and has conflicting attribute values.
Definition at line 1544 of file table.h.
References aaCount, aaFirst, aaLast, aaMax, aaMean, aaMedian, aaMin, aaSum, TVec< TVal, TSizeTy >::Len(), and TVec< TVal, TSizeTy >::Sort().
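Collapsing a vector of conflicting values under a policy can be sketched as below. This is an illustration written for this page, not the code at table.h line 1544: the enum AggrPolicy only mirrors the aa* identifiers referenced above, and the exact semantics of each policy in SNAP (for example, which element is taken as the median of an even-length vector) are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical policy enum; names echo aaMin, aaMax, aaFirst, aaLast,
// aaSum, aaMean, aaMedian and aaCount, but the semantics are assumed.
enum class AggrPolicy { Min, Max, First, Last, Sum, Mean, Median, Count };

// Collapses V into one value. V is taken by value because Median sorts it.
inline double AggregateVector(std::vector<double> V, AggrPolicy Policy) {
  if (Policy == AggrPolicy::Count) { return static_cast<double>(V.size()); }
  assert(!V.empty());  // every other policy needs at least one value
  switch (Policy) {
    case AggrPolicy::Min: return *std::min_element(V.begin(), V.end());
    case AggrPolicy::Max: return *std::max_element(V.begin(), V.end());
    case AggrPolicy::First: return V.front();
    case AggrPolicy::Last: return V.back();
    case AggrPolicy::Sum: return std::accumulate(V.begin(), V.end(), 0.0);
    case AggrPolicy::Mean:
      return std::accumulate(V.begin(), V.end(), 0.0) /
             static_cast<double>(V.size());
    case AggrPolicy::Median:
      std::sort(V.begin(), V.end());
      return V[V.size() / 2];  // upper median for even lengths (a choice made here)
    default: return 0.0;       // unreachable: Count is handled above
  }
}
```

For a node that appears in three records with values 3, 1 and 2, the Min policy in this sketch would keep 1 and the First policy would keep 3.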
|
inline |
Gets iterator to the first valid row of the table.
Definition at line 1241 of file table.h.
References FirstValidRow, and TRowIterator.
Referenced by AddIdColumn(), AggregateCols(), ChangeContext(), ColConcat(), ColConcatConst(), ColGenericOp(), Dump(), GetCollidingRows(), GetFltRowIdxByVal(), GetIntRowIdxByVal(), GetStrRowIdxByMap(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByStrCol(), Intersection(), IsNextK(), Join(), Minus(), Order(), ReadFltCol(), ReadIntCol(), ReadStrCol(), Reindex(), RequestIndexFlt(), RequestIndexInt(), RequestIndexStrMap(), SaveSS(), Select(), SelectAtomic(), SelectAtomicConst(), SelectFirstNRows(), SelfSimJoinPerGroup(), SimJoin(), StoreFltCol(), StoreIntCol(), StoreStrCol(), ThresholdJoinCountCollisions(), ThresholdJoinCountPerJoinKeyCollisions(), Union(), and UpdateFltFromTable().
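Row iteration starts at FirstValidRow and, according to the attribute descriptions on this page, follows the Next vector, in which Last (-1) marks the final row and Invalid (-2) marks a logically removed row. The sketch below walks such a vector with plain standard-library types. ValidRowsInOrder is a hypothetical helper written for illustration and is not part of TTable or TRowIterator; it also assumes that an empty table is represented by FirstValidRow equal to Last.

```cpp
#include <cassert>
#include <vector>

// Sentinel values as listed for the Next vector.
const int Last = -1;     // no successor: last row in the table
const int Invalid = -2;  // logically removed row

// Returns the physical indices of valid rows in logical order by starting at
// FirstValidRow and following Next until the Last sentinel is reached.
inline std::vector<int> ValidRowsInOrder(const std::vector<int>& Next,
                                         int FirstValidRow) {
  std::vector<int> Rows;
  for (int Row = FirstValidRow; Row != Last; Row = Next[Row]) {
    assert(Row >= 0 && Row < static_cast<int>(Next.size()));
    assert(Next[Row] != Invalid);  // a removed row must not be reachable
    Rows.push_back(Row);
  }
  return Rows;
}
```

Because the order lives in Next rather than in the physical layout, reordering or removing rows in this scheme only rewrites integers and never moves column data.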
|
inline |
Gets iterator with remove to the first valid row.
Definition at line 1245 of file table.h.
References FirstValidRow, and TRowIteratorWithRemove.
Referenced by KeepSortedRows(), Select(), SelectAtomic(), and SelectAtomicConst().
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes.
Definition at line 3445 of file table.cpp.
References AddEdgeAttributes(), AddNodeAttributes(), AggrPolicy, Assert, atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), TVec< TVal, TSizeTy >::BegI(), CheckAndAddFltNode(), Context, DstCol, DstNodeAttrV, EdgeAttrV, THash< TKey, TDat, THashFunc >::EndI(), TVec< TVal, TSizeTy >::EndI(), FltCols, GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), TStrHash< TDat, TStringPool, THashFunc >::GetKey(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), TNEANet::New(), SrcCol, SrcNodeAttrV, StrColMaps, and TTableContext::StringVals.
Referenced by GetGraphsFromSequence(), and GetNextGraphFromSequence().
TTableContext * TTable::ChangeContext | ( | TTableContext * | Context | ) |
Changes the current context. Moves all object items to the new context.
Definition at line 921 of file table.cpp.
References TStrHash< TDat, TStringPool, THashFunc >::AddKey(), atStr, BegRI(), Context, EndRI(), GetColIdx(), GetSchemaColName(), GetSchemaColType(), GetStrVal(), TVec< TVal, TSizeTy >::Len(), Sch, StrColMaps, TTableContext::StringVals, and TInt::Val.
|
protected |
Checks whether the given NodeVal was seen earlier; if not, adds it to Graph and to the hashmap NodeVals.
Definition at line 1533 of file table.h.
References THash< TKey, TDat, THashFunc >::AddDat(), THash< TKey, TDat, THashFunc >::AddKey(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and THash< TKey, TDat, THashFunc >::Len().
Referenced by BuildGraph().
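The check-then-add step described above can be sketched with a standard hash map. ToyGraph and CheckAndAddNode are hypothetical names used only for this illustration, and the id scheme (0, 1, 2, ... in order of first appearance) is an assumption rather than SNAP's documented behavior.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a graph: just the list of node ids added so far.
struct ToyGraph {
  std::vector<int> Nodes;
  void AddNode(int Id) { Nodes.push_back(Id); }
};

// Returns the node id for NodeVal, creating a node the first time it is seen.
inline int CheckAndAddNode(ToyGraph& Graph,
                           std::unordered_map<double, int>& NodeVals,
                           double NodeVal) {
  assert(Graph.Nodes.size() == NodeVals.size());  // one node per distinct value
  auto It = NodeVals.find(NodeVal);
  if (It != NodeVals.end()) { return It->second; }  // seen earlier: reuse id
  const int Id = static_cast<int>(NodeVals.size());
  NodeVals.emplace(NodeVal, Id);
  Graph.AddNode(Id);
  return Id;
}
```

Keeping the map alongside the graph lets a single pass over the rows translate float-valued endpoints into integer node ids without adding duplicate nodes.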
|
inlineprotected |
Checks whether the given NodeId was seen earlier; if not, adds it to Graph and to the hashmap NodeVals.
Definition at line 3388 of file table.cpp.
References THashSet< TKey, THashFunc >::AddKey(), and THashSet< TKey, THashFunc >::IsKey().
Definition at line 5310 of file table.cpp.
References CompareKeyVal().
Referenced by QSortKeyVal().
void TTable::Classify | ( | TPredicate & | Predicate, |
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2805 of file table.cpp.
References ClassifyAux(), and Select().
void TTable::ClassifyAtomic | ( | const TStr & | Col1, |
const TStr & | Col2, | ||
TPredComp | Cmp, | ||
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2866 of file table.cpp.
References ClassifyAux(), and SelectAtomic().
|
inline |
Definition at line 1301 of file table.h.
References ClassifyAux(), and SelectAtomicConst().
|
protected |
Adds a label attribute with positive labels on selected rows and negative labels on the rest.
Definition at line 4694 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.
Referenced by Classify(), ClassifyAtomic(), and ClassifyAtomicConst().
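The labeling step can be pictured as filling a new integer column. The sketch below assumes the selected rows are available as a list of row indices; MakeLabelCol is a hypothetical helper, not the TTable implementation, and its default labels simply copy the PositiveLabel = 1 and NegativeLabel = 0 defaults shown for Classify().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a label column of length NumRows: rows listed in SelectedV get
// PositiveLabel, every other row gets NegativeLabel.
inline std::vector<int> MakeLabelCol(std::size_t NumRows,
                                     const std::vector<std::size_t>& SelectedV,
                                     int PositiveLabel = 1,
                                     int NegativeLabel = 0) {
  std::vector<int> Labels(NumRows, NegativeLabel);
  for (std::size_t Row : SelectedV) {
    assert(Row < NumRows);  // selected indices must refer to existing rows
    Labels[Row] = PositiveLabel;
  }
  return Labels;
}
```

Unlike a selection that drops the non-matching rows, this keeps every row and records the outcome of the predicate as data.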
Performs columnwise addition. See TTable::ColGenericOp.
Definition at line 4816 of file table.cpp.
References aoAdd, and ColGenericOp().
void TTable::ColAdd | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise addition with column of given table.
Definition at line 4949 of file table.cpp.
References aoAdd, and ColGenericOp().
void TTable::ColAdd | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs addition of column values and given Num.
Definition at line 5063 of file table.cpp.
References aoAdd, and ColGenericOp().
void TTable::ColConcat | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates two string columns.
Definition at line 5083 of file table.cpp.
References TStrHash< TDat, TStringPool, THashFunc >::AddKey(), AddStrCol(), atStr, BegRI(), Context, EndRI(), GetColIdx(), GetColTypeMap(), IsAttr(), StrColMaps, TTableContext::StringVals, TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
void TTable::ColConcat | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Concatenates string column with column of given table.
Definition at line 5117 of file table.cpp.
References TStrHash< TDat, TStringPool, THashFunc >::AddKey(), AddStrCol(), atStr, BegRI(), Context, EndRI(), GetColIdx(), GetColTypeMap(), TRowIterator::GetRowIdx(), TRowIterator::GetStrAttr(), IsAttr(), NumValidRows, StrColMaps, TTableContext::StringVals, TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
void TTable::ColConcatConst | ( | const TStr & | Attr1, |
const TStr & | Val, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates column values with given string value.
Definition at line 5182 of file table.cpp.
References TStrHash< TDat, TStringPool, THashFunc >::AddKey(), AddStrCol(), atStr, BegRI(), Context, EndRI(), GetColIdx(), GetColTypeMap(), IsAttr(), StrColMaps, TTableContext::StringVals, TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Performs columnwise division. See TTable::ColGenericOp.
Definition at line 4828 of file table.cpp.
References aoDiv, and ColGenericOp().
void TTable::ColDiv | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise division with column of given table.
Definition at line 4964 of file table.cpp.
References aoDiv, and ColGenericOp().
void TTable::ColDiv | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs division of column values and given Num.
Definition at line 5075 of file table.cpp.
References aoDiv, and ColGenericOp().
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op | ||
) |
Performs columnwise arithmetic operation.
Performs Attr1 OP Attr2 and stores the result in Attr1. If ResAttr != "", the result is stored in a new column ResAttr.
Definition at line 4752 of file table.cpp.
References AddFltCol(), AddIntCol(), aoAdd, aoDiv, aoMax, aoMin, aoMod, aoMul, aoSub, atFlt, atInt, atStr, BegRI(), ColGenericOpMP(), EndRI(), FltCols, GetColIdx(), GetColTypeMap(), GetMP(), IntCols, IsAttr(), TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Referenced by ColAdd(), ColDiv(), ColMax(), ColMin(), ColMod(), ColMul(), and ColSub().
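The rule for where the result of Attr1 OP Attr2 ends up can be sketched on a toy table that maps column names to vectors of doubles. ToyTable, ArithOp, Apply, and this free-standing ColGenericOp are hypothetical stand-ins for illustration; SNAP's version works on typed integer and float columns and has a multi-threaded path, neither of which is modeled here. The sketch also assumes that a non-empty ResAttr leaves Attr1 unchanged.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical operator set echoing aoAdd, aoSub, aoMul, aoDiv, aoMod, aoMin, aoMax.
enum class ArithOp { Add, Sub, Mul, Div, Mod, Min, Max };

using ToyTable = std::map<std::string, std::vector<double>>;  // column name -> values

inline double Apply(double A, double B, ArithOp Op) {
  switch (Op) {
    case ArithOp::Add: return A + B;
    case ArithOp::Sub: return A - B;
    case ArithOp::Mul: return A * B;
    case ArithOp::Div: return A / B;
    case ArithOp::Mod: return std::fmod(A, B);
    case ArithOp::Min: return std::min(A, B);
    case ArithOp::Max: return std::max(A, B);
  }
  throw std::invalid_argument("unknown arithmetic op");
}

// Computes Attr1 OP Attr2 row by row. The result overwrites Attr1 when
// ResAttr is empty and goes to the column named ResAttr otherwise.
// std::map::at throws std::out_of_range if a column does not exist.
inline void ColGenericOp(ToyTable& T, const std::string& Attr1,
                         const std::string& Attr2, const std::string& ResAttr,
                         ArithOp Op) {
  const std::vector<double>& A = T.at(Attr1);
  const std::vector<double>& B = T.at(Attr2);
  assert(A.size() == B.size());
  std::vector<double> Res(A.size());
  for (std::size_t i = 0; i < A.size(); i++) { Res[i] = Apply(A[i], B[i], Op); }
  T[ResAttr.empty() ? Attr1 : ResAttr] = Res;
}
```

Routing every operator through one generic function, as the ColAdd(), ColSub(), ColMul(), ColDiv(), ColMod(), ColMin() and ColMax() wrappers listed above do, keeps the column lookup and result handling in a single place.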
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
TBool | AddToFirstTable | ||
) |
Performs columnwise arithmetic operation with column of given table.
Definition at line 4844 of file table.cpp.
References AddFltCol(), AddIntCol(), aoAdd, aoDiv, aoMod, aoMul, aoSub, atFlt, atInt, atStr, BegRI(), EndRI(), FltCols, GetColIdx(), GetColTypeMap(), TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), TRowIterator::GetRowIdx(), IntCols, IsAttr(), NumValidRows, TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
const TBool | floatCast | ||
) |
Performs arithmetic op of column values and given Num.
Definition at line 4975 of file table.cpp.
References AddFltCol(), AddIntCol(), aoAdd, aoDiv, aoMod, aoMul, aoSub, atFlt, atInt, atStr, BegRI(), ColGenericOpMP(), EndRI(), FltCols, GetColIdx(), GetColTypeMap(), GetMP(), IntCols, IsAttr(), TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
void TTable::ColGenericOpMP | ( | TInt | ArgColIdx1, |
TInt | ArgColIdx2, | ||
TAttrType | ArgType1, | ||
TAttrType | ArgType2, | ||
TInt | ResColIdx, | ||
TArithOp | op | ||
) |
Definition at line 4708 of file table.cpp.
References aoAdd, aoDiv, aoMax, aoMin, aoMod, aoMul, aoSub, atFlt, atInt, FltCols, TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetPartitionRanges(), TRowIterator::GetRowIdx(), IntCols, TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Referenced by ColGenericOp().
void TTable::ColGenericOpMP | ( | const TInt & | ColIdx1, |
const TInt & | ColIdx2, | ||
TAttrType | ArgType, | ||
const TFlt & | Num, | ||
TArithOp | op, | ||
TBool | ShouldCast | ||
) |
Definition at line 5032 of file table.cpp.
References aoAdd, aoDiv, aoMod, aoMul, aoSub, atFlt, atInt, FltCols, TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetPartitionRanges(), TRowIterator::GetRowIdx(), IntCols, TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Performs max of two columns. See TTable::ColGenericOp.
Definition at line 4840 of file table.cpp.
References aoMax, and ColGenericOp().
Performs min of two columns. See TTable::ColGenericOp.
Definition at line 4836 of file table.cpp.
References aoMin, and ColGenericOp().
Performs columnwise modulus. See TTable::ColGenericOp.
Definition at line 4832 of file table.cpp.
References aoMod, and ColGenericOp().
void TTable::ColMod | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise modulus with column of given table.
Definition at line 4969 of file table.cpp.
References aoMod, and ColGenericOp().
void TTable::ColMod | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs modulus of column values and given Num.
Definition at line 5079 of file table.cpp.
References aoMod, and ColGenericOp().
Performs columnwise multiplication. See TTable::ColGenericOp.
Definition at line 4824 of file table.cpp.
References aoMul, and ColGenericOp().
void TTable::ColMul | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise multiplication with column of given table.
Definition at line 4959 of file table.cpp.
References aoMul, and ColGenericOp().
void TTable::ColMul | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs multiplication of column values and given Num.
Definition at line 5071 of file table.cpp.
References aoMul, and ColGenericOp().
Performs columnwise subtraction. See TTable::ColGenericOp.
Definition at line 4820 of file table.cpp.
References aoSub, and ColGenericOp().
void TTable::ColSub | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise subtraction with column of given table.
Definition at line 4954 of file table.cpp.
References aoSub, and ColGenericOp().
void TTable::ColSub | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs subtraction of column values and given Num.
Definition at line 5067 of file table.cpp.
References aoSub, and ColGenericOp().
|
staticprotected |
Definition at line 5297 of file table.cpp.
Referenced by CheckSortedKeyVal(), GetPivotKeyVal(), ISortKeyVal(), and PartitionKeyVal().
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3064 of file table.cpp.
References atFlt, atInt, atStr, TStr::CStr(), FltCols, GetStrVal(), and IntCols.
Referenced by CompareRows(), GetPivot(), ISort(), Merge(), Partition(), and QSort().
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3088 of file table.cpp.
References CompareRows(), and TVec< TVal, TSizeTy >::Len().
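The strcmp-style contract (positive, negative, or zero) extends naturally to a comparison over several columns: compare column by column and stop at the first difference. The sketch below is a standalone illustration with a hypothetical `Row` type of integer values only; the SNAP implementation works on row indices and typed columns instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row type: one integer value per compared column.
using Row = std::vector<int>;

// Three-way comparison of two rows, column by column.
// Returns >0 if r1 is bigger, <0 if r2 is bigger, 0 if they are equal.
int CompareRowsSketch(const Row& r1, const Row& r2) {
  for (std::size_t i = 0; i < r1.size() && i < r2.size(); ++i) {
    if (r1[i] > r2[i]) return 1;
    if (r1[i] < r2[i]) return -1;
  }
  return 0;  // every compared column is equal
}
```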
|
inlineprotected |
Appends all rows of T to this table, and recalculates indices.
Definition at line 683 of file table.h.
References AddTable(), and Reindex().
Counts number of unique elements.
Counts the number of appearances of each distinct element of the column and records the results in column CountCol.
Definition at line 1802 of file table.cpp.
References aaCount, TVec< TVal, TSizeTy >::Add(), and Aggregate().
void TTable::Defrag | ( | ) |
Releases memory of deleted rows, and defrags.
Also updates meta-data, since row indices have changed. Some liveness analysis of columns is still needed.
Definition at line 3311 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), Assert, FirstValidRow, FltCols, GetColIdx(), IdColName, IntCols, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, RowIdMap, and StrColMaps.
|
inlineprotected |
Adds column with name ColName and type ColType to the ColTypeMap.
Definition at line 661 of file table.h.
References ColTypeMap, THash< TKey, TDat, THashFunc >::DelKey(), and NormalizeColName().
Referenced by Rename().
Removes the suffix from the column name, if one exists.
Definition at line 4648 of file table.cpp.
References TStr::GetCh(), TStr::GetSubStr(), TStr::Len(), TVec< TVal, TSizeTy >::Len(), and Sch.
Referenced by DenormalizeSchema(), and Save().
|
protected |
Removes the suffix from column names in the Schema.
Definition at line 4665 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), DenormalizeColName(), TVec< TVal, TSizeTy >::Len(), and Sch.
Referenced by Dump(), GetSchema(), and SaveSS().
void TTable::Dump | ( | FILE * | OutF = stdout | ) | const |
Prints table contents to a text file.
Definition at line 887 of file table.cpp.
References atFlt, atInt, atStr, BegRI(), DenormalizeSchema(), EndRI(), GetSchemaColName(), GetSchemaColType(), TVec< TVal, TSizeTy >::Len(), and Sch.
Referenced by SaveSS().
|
inline |
Gets iterator to the last valid row of the table.
Definition at line 1243 of file table.h.
References Last, and TRowIterator.
Referenced by AddIdColumn(), AggregateCols(), ChangeContext(), ColConcat(), ColConcatConst(), ColGenericOp(), Dump(), GetCollidingRows(), GetFltRowIdxByVal(), GetIntRowIdxByVal(), GetStrRowIdxByMap(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByStrCol(), Intersection(), IsNextK(), Join(), Minus(), Order(), ReadFltCol(), ReadIntCol(), ReadStrCol(), Reindex(), RequestIndexFlt(), RequestIndexInt(), RequestIndexStrMap(), SaveSS(), Select(), SelectAtomic(), SelectAtomicConst(), SelectFirstNRows(), SelfSimJoinPerGroup(), SimJoin(), StoreFltCol(), StoreIntCol(), StoreStrCol(), ThresholdJoinCountCollisions(), ThresholdJoinCountPerJoinKeyCollisions(), Union(), and UpdateFltFromTable().
|
inline |
Gets iterator with remove to the last valid row.
Definition at line 1247 of file table.h.
References Last, and TRowIteratorWithRemove.
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids, partitioned on the value of the column SplitAttr, according to the intervals specified by SplitIntervals. Called by ToVarGraphSequence and ToVarGraphSequenceIterator.
Definition at line 3599 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), GetColIdx(), InitRowIdBuckets(), IntCols, Invalid, TVec< TVal, TSizeTy >::Len(), Next, and RowIdBuckets.
Referenced by ToVarGraphSequence(), and ToVarGraphSequenceIterator().
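As a rough picture of interval-based bucketing, the sketch below assigns each row id to every bucket whose interval contains the row's split value. It is a standalone illustration, not the SNAP code: `BucketsByInterval` is a hypothetical name, and treating the intervals as half-open `[first, second)` is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Assigns each row id to every bucket whose interval contains the row's value.
// Assumption: intervals are half-open, [first, second).
std::vector<std::vector<int>> BucketsByInterval(
    const std::vector<int>& splitCol,
    const std::vector<std::pair<int, int>>& intervals) {
  std::vector<std::vector<int>> buckets(intervals.size());
  for (std::size_t row = 0; row < splitCol.size(); ++row) {
    for (std::size_t j = 0; j < intervals.size(); ++j) {
      if (splitCol[row] >= intervals[j].first &&
          splitCol[row] < intervals[j].second) {
        buckets[j].push_back(static_cast<int>(row));
      }
    }
  }
  return buckets;
}
```

Because intervals may overlap, one row can land in several buckets, and a row outside every interval lands in none.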
|
protected |
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids partitioned on the value of the column SplitAttr, according to the windows specified by JumpSize and WindowSize. Called by ToGraphSequence and ToGraphSequenceIterator.
Definition at line 3547 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Assert, GetColIdx(), InitRowIdBuckets(), IntCols, Invalid, TVec< TVal, TSizeTy >::Len(), TInt::Mn, TInt::Mx, Next, and RowIdBuckets.
Referenced by ToGraphSequence(), and ToGraphSequenceIterator().
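Window-based bucketing can be sketched in the same spirit. The code below is a standalone illustration under stated assumptions: bucket `k` is taken to cover `[startVal + k*jumpSize, startVal + k*jumpSize + windowSize)`, the number of buckets is derived from `startVal`, `endVal`, and `jumpSize`, and `BucketsByWindow` is a hypothetical name. The exact boundary conventions in table.cpp may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns row ids to sliding windows over the split column.
// Assumption: bucket k covers [startVal + k*jumpSize,
//                              startVal + k*jumpSize + windowSize).
std::vector<std::vector<int>> BucketsByWindow(const std::vector<int>& splitCol,
                                              int jumpSize, int windowSize,
                                              int startVal, int endVal) {
  std::vector<std::vector<int>> buckets;
  if (jumpSize <= 0 || endVal < startVal) return buckets;
  const int numBuckets = (endVal - startVal) / jumpSize + 1;
  buckets.resize(numBuckets);
  for (std::size_t row = 0; row < splitCol.size(); ++row) {
    const int v = splitCol[row];
    if (v < startVal || v > endVal) continue;  // outside the requested range
    for (int k = 0; k < numBuckets; ++k) {
      const int lo = startVal + k * jumpSize;
      if (v >= lo && v < lo + windowSize) {
        buckets[k].push_back(static_cast<int>(row));
      }
    }
  }
  return buckets;
}
```

When `windowSize` exceeds `jumpSize`, consecutive windows overlap and a row appears in more than one bucket.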
Definition at line 337 of file table.cpp.
References AddColType(), AddSchemaCol(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::Clr(), TVec< TVal, TSizeTy >::Clr(), ColTypeMap, TPair< TVal1, TVal2 >::GetVal1(), TPair< TVal1, TVal2 >::GetVal2(), IsNextDirty, and Sch.
Referenced by LoadTableShM(), and TTable().
Gets index of column ColName among columns of the same type in the schema.
Definition at line 1013 of file table.h.
References ColTypeMap, THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and NormalizeColName().
Referenced by AddEdgeAttributes(), AddNodeAttributes(), AddRow(), AddStrVal(), AddTable(), Aggregate(), AggregateCols(), BuildGraph(), ChangeContext(), ColConcat(), ColConcatConst(), ColGenericOp(), Defrag(), FillBucketsByInterval(), FillBucketsByWindow(), TRowIterator::GetFltAttr(), GetFltVal(), TRowIterator::GetIntAttr(), GetIntVal(), TRowIteratorWithRemove::GetNextFltAttr(), TRowIteratorWithRemove::GetNextIntAttr(), TRowIterator::GetStrMapByName(), GetStrMapByName(), GetStrVal(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), GroupByStrCol(), Join(), Order(), ProjectInPlace(), ReadFltCol(), ReadIntCol(), ReadStrCol(), Reindex(), RemoveFirstRow(), RemoveRow(), Select(), SelectAtomic(), SelectAtomicConst(), SelfSimJoinPerGroup(), SpliceByGroup(), ThresholdJoin(), UpdateFltFromTable(), and UpdateFltFromTableMP().
Gets set of row ids of rows common with table T.
Definition at line 4014 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THashSet< TKey, THashFunc >::AddKey(), atFlt, atInt, atStr, BegRI(), EndRI(), GetColTypeMap(), GetIdColName(), GroupAux(), IdColName, THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), Sch, TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Referenced by Intersection(), Minus(), and Union().
Gets type of column ColName.
Definition at line 1227 of file table.h.
References ColTypeMap, THash< TKey, TDat, THashFunc >::GetDat(), and NormalizeColName().
Referenced by AddEdgeAttributes(), AddNodeAttributes(), AddRow(), AddStrVal(), AddTable(), Aggregate(), BuildGraph(), GetDstNodeFltAttrV(), GetDstNodeIntAttrV(), GetDstNodeStrAttrV(), GetEdgeFltAttrV(), GetEdgeIntAttrV(), GetEdgeStrAttrV(), GetSrcNodeFltAttrV(), GetSrcNodeIntAttrV(), GetSrcNodeStrAttrV(), GroupingSanityCheck(), IsNextK(), Join(), Order(), Project(), ReadFltCol(), ReadIntCol(), ReadStrCol(), Select(), SelectAtomic(), SelectAtomicConst(), SelfSimJoinPerGroup(), SimJoin(), ThresholdJoin(), ThresholdJoinInputCorrectness(), Unique(), UpdateFltFromTable(), and UpdateFltFromTableMP().
Gets column type and index of ColName.
Definition at line 666 of file table.h.
References ColTypeMap, THash< TKey, TDat, THashFunc >::GetDat(), and NormalizeColName().
Referenced by AggregateCols(), ColConcat(), ColConcatConst(), ColGenericOp(), GetCollidingRows(), GroupAux(), InitializeJointTable(), and Rename().
|
inline |
|
inlineprotected |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp.
Definition at line 622 of file table.h.
References Context, TStrHash< TDat, TStringPool, THashFunc >::GetKey(), and TTableContext::StringVals.
TSize TTable::GetContextMemUsedKB | ( | ) |
Returns approximate memory used by table context in [KB].
Definition at line 3969 of file table.cpp.
References Context, TStrHash< TDat, TStringPool, THashFunc >::GetMemUsed(), and TTableContext::StringVals.
Referenced by PrintContextSize().
|
inline |
Gets the name of the column to be used as dst nodes in the graph.
Definition at line 1165 of file table.h.
References DstCol.
TStrV TTable::GetDstNodeFltAttrV | ( | ) | const |
Gets dst node float attribute name vector.
Definition at line 1049 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, DstNodeAttrV, FltCols, GetColType(), and TVec< TVal, TSizeTy >::Len().
TStrV TTable::GetDstNodeIntAttrV | ( | ) | const |
Gets dst node int attribute name vector.
Definition at line 1016 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atInt, DstNodeAttrV, GetColType(), IntCols, and TVec< TVal, TSizeTy >::Len().
TStrV TTable::GetDstNodeStrAttrV | ( | ) | const |
Gets dst node str attribute name vector.
Definition at line 1082 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atStr, DstNodeAttrV, GetColType(), TVec< TVal, TSizeTy >::Len(), and StrColMaps.
TStrV TTable::GetEdgeFltAttrV | ( | ) | const |
Gets edge float attribute name vector.
Definition at line 1060 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, EdgeAttrV, FltCols, GetColType(), and TVec< TVal, TSizeTy >::Len().
TStrV TTable::GetEdgeIntAttrV | ( | ) | const |
Gets edge int attribute name vector.
Definition at line 1027 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atInt, EdgeAttrV, GetColType(), IntCols, and TVec< TVal, TSizeTy >::Len().
TStrV TTable::GetEdgeStrAttrV | ( | ) | const |
Gets edge str attribute name vector.
Definition at line 1094 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atStr, EdgeAttrV, GetColType(), TVec< TVal, TSizeTy >::Len(), and StrColMaps.
|
static |
Extracts edge TTable from PNEANet.
Definition at line 3741 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, TNEANet::TEdgeI::GetDstNId(), TNEANet::TEdgeI::GetFltAttrNames(), TNEANet::TEdgeI::GetId(), TNEANet::TEdgeI::GetIntAttrNames(), TNEANet::TEdgeI::GetSrcNId(), TNEANet::TEdgeI::GetStrAttrNames(), Last, TVec< TVal, TSizeTy >::Len(), and New().
|
static |
Extracts edge TTable from parallel graph PNGraphMP.
Definition at line 3799 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Assert, atInt, TNGraphMP::TEdgeI::GetDstNId(), TNGraphMP::TEdgeI::GetSrcNId(), TVec< TVal, TSizeTy >::Len(), and New().
|
protected |
Gets the start index to a chunk of empty rows of size NewRows.
Definition at line 4376 of file table.cpp.
References Assert, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, and NumValidRows.
Referenced by AddNRows(), and AddSelectedRows().
Returns the first graph of the sequence.
Return the first graph of the sequence corresponding to the sets of row ids in RowIdBuckets. This is used by the ToGraph*Iterator functions.
Definition at line 3628 of file table.cpp.
References AggrPolicy, CurrBucket, and GetNextGraphFromSequence().
Referenced by ToGraphSequenceIterator(), and ToVarGraphSequenceIterator().
|
static |
Extracts node and edge property TTables from THash.
Definition at line 3852 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::GetDat(), Last, and New().
Gets the rows containing Val in flt column ColName.
Returns the RowIdxs in the float column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5453 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), EndRI(), FltColIndexes, THash< TKey, TDat, THashFunc >::GetDat(), and THash< TKey, TDat, THashFunc >::IsKey().
Gets the value of float attribute ColName at row RowIdx.
Definition at line 1024 of file table.h.
References FltCols, and GetColIdx().
Referenced by IsNextK().
Gets the float value at column ColIdx and row RowIdx.
Definition at line 1120 of file table.h.
References FltCols.
Returns a sequence of graphs.
Return a sequence of graphs, each constructed from the set of row ids corresponding to a particular bucket in RowIdBuckets.
Definition at line 3616 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BuildGraph(), TVec< TVal, TSizeTy >::Len(), and RowIdBuckets.
Referenced by ToGraphSequence(), and ToVarGraphSequence().
|
inlineprotected |
Gets name of the id column of this table.
Definition at line 636 of file table.h.
References IdColName.
Referenced by GetCollidingRows(), Intersection(), Minus(), RemoveFirstRow(), RemoveRow(), SpliceByGroup(), Union(), and UnionAll().
Gets the rows containing Val in int column ColName.
Returns the RowIdxs in the integer column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5410 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), IntColIndexes, and THash< TKey, TDat, THashFunc >::IsKey().
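The "use the index if it exists, otherwise scan" behaviour described above can be sketched as follows. This is a standalone illustration: `IntColumn`, `BuildIndex`, and `RowsWithValue` are hypothetical names standing in for a column, an index request, and the lookup; none of them is part of the SNAP API.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical integer column with an optional value -> row ids index.
struct IntColumn {
  std::vector<int> values;
  std::unordered_map<int, std::vector<int>> index;
  bool hasIndex = false;

  // Builds the index once; later lookups avoid the full scan.
  void BuildIndex() {
    index.clear();
    for (std::size_t i = 0; i < values.size(); ++i) {
      index[values[i]].push_back(static_cast<int>(i));
    }
    hasIndex = true;
  }

  // Returns the row ids holding val, or an empty vector if there are none.
  std::vector<int> RowsWithValue(int val) const {
    if (hasIndex) {
      auto it = index.find(val);
      return it == index.end() ? std::vector<int>() : it->second;
    }
    std::vector<int> rows;  // fallback: linear scan over the whole column
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (values[i] == val) rows.push_back(static_cast<int>(i));
    }
    return rows;
  }
};
```

The scan costs time linear in the number of rows on every call, which is why building the index pays off once several queries hit the same column.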
Gets the value of integer attribute ColName at row RowIdx.
Definition at line 1020 of file table.h.
References GetColIdx(), and IntCols.
Referenced by IsNextK().
Gets the integer value at column ColIdx and row RowIdx.
Definition at line 1116 of file table.h.
References IntCols.
|
protected |
Gets the id of the last valid row of the table.
TSize TTable::GetMemUsedKB | ( | ) |
Returns approximate memory used by table in [KB].
Definition at line 3940 of file table.cpp.
References FltCols, THash< TKey, TDat, THashFunc >::GetMemUsed(), TVec< TVal, TSizeTy >::GetMemUsed(), GroupIDMapping, GroupMapping, IntCols, TVec< TVal, TSizeTy >::Len(), Next, RowIdBuckets, RowIdMap, and StrColMaps.
Referenced by PrintSize().
|
inlinestatic |
Definition at line 527 of file table.h.
References UseMP.
Referenced by Aggregate(), ColGenericOp(), Join(), LoadSS(), Order(), SelectAtomicConst(), SetFltColToConstMP(), UpdateFltFromTable(), and UpdateFltFromTableMP().
|
protected |
Returns the next graph in sequence corresponding to RowIdBuckets.
Returns the next graph in sequence corresponding to RowIdBuckets. This is used to iterate over the graph sequence by constructing one graph at a time. Called by NextGraphIterator().
Definition at line 3634 of file table.cpp.
References AggrPolicy, BuildGraph(), CurrBucket, TVec< TVal, TSizeTy >::Len(), and RowIdBuckets.
Referenced by GetFirstGraphFromSequence(), and NextGraphIterator().
|
static |
Extracts node TTable from PNEANet.
Definition at line 3689 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, TNEANet::TNodeI::GetFltAttrNames(), TNEANet::TNodeI::GetId(), TNEANet::TNodeI::GetIntAttrNames(), TNEANet::TNodeI::GetStrAttrNames(), Last, TVec< TVal, TSizeTy >::Len(), and New().
|
inline |
|
inline |
Gets number of valid, i.e. not deleted, rows in this table.
Definition at line 1234 of file table.h.
References NumValidRows.
Referenced by Join().
Partitions the table into NumPartitions and populates Partitions with the ranges.
Definition at line 1177 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), FirstValidRow, IsNextDirty, TVec< TVal, TSizeTy >::Len(), Next, NumValidRows, and TVec< TVal, TSizeTy >::Reserve().
Referenced by ColGenericOpMP(), GroupByIntColMP(), Join(), SelectAtomicConst(), SetFltColToConstMP(), and UpdateFltFromTableMP().
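Splitting rows into ranges for parallel workers can be sketched as below. This is a standalone illustration that assumes rows are numbered contiguously from 0 and that ranges are inclusive `(begin, end)` pairs; `PartitionRanges` is a hypothetical name. The SNAP method instead has to follow the table's chain of valid rows, which this sketch ignores.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Splits row ids 0..numRows-1 into at most numPartitions contiguous ranges.
// Each pair is an inclusive (begin, end) range.
std::vector<std::pair<int, int>> PartitionRanges(int numRows,
                                                 int numPartitions) {
  std::vector<std::pair<int, int>> ranges;
  if (numRows <= 0 || numPartitions <= 0) return ranges;
  const int chunk = (numRows + numPartitions - 1) / numPartitions;  // ceiling
  for (int begin = 0; begin < numRows; begin += chunk) {
    const int end = std::min(begin + chunk, numRows) - 1;
    ranges.emplace_back(begin, end);
  }
  return ranges;
}
```

Each range can then be handed to one thread, for example one iteration of an OpenMP parallel loop.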
|
protected |
Gets pivot element for QSort.
Definition at line 3110 of file table.cpp.
References CompareRows(), and TInt::GetRnd().
Referenced by Partition().
Definition at line 5338 of file table.cpp.
References CompareKeyVal(), and TInt::GetRnd().
Referenced by PartitionKeyVal().
Gets a map of logical to physical row ids.
Definition at line 1237 of file table.h.
References RowIdMap.
Returns pointer to a new table created from given Table, with name set to TableName.
Automatically detects the Schema of an input file (data is assumed to be in TSV format).
Definition at line 455 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, TSsParser::Eof(), TSsParser::GetFlds(), TStr::GetSubStr(), TSsParser::IsCmt(), TSsParser::IsFlt(), TSsParser::IsInt(), TStr::Len(), TSsParser::Next(), TVec< TVal, TSizeTy >::PutAll(), and TExcept::Throw().
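Schema detection for tab-separated data can be pictured as classifying each field of a data line as integer, float, or string. The sketch below is a standalone illustration using the C standard library parsers; `AttrType`, `ClassifyField`, and `DetectSchema` are hypothetical names, and inferring the types from a single line is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the column type enum.
enum class AttrType { Int, Flt, Str };

// Classifies one field: integer if it parses fully as an integer,
// float if it parses fully as a floating-point number, string otherwise.
AttrType ClassifyField(const std::string& field) {
  if (field.empty()) return AttrType::Str;
  char* end = nullptr;
  std::strtol(field.c_str(), &end, 10);
  if (*end == '\0') return AttrType::Int;
  std::strtod(field.c_str(), &end);
  if (*end == '\0') return AttrType::Flt;
  return AttrType::Str;
}

// Splits a line on tabs and classifies every field.
std::vector<AttrType> DetectSchema(const std::string& line) {
  std::vector<AttrType> types;
  std::istringstream in(line);
  std::string field;
  while (std::getline(in, field, '\t')) types.push_back(ClassifyField(field));
  return types;
}
```

Trying the integer parse first matters, since every integer literal would also parse as a float.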
|
inline |
Gets the schema of this table.
Definition at line 1125 of file table.h.
References DenormalizeSchema().
Gets name of the column with index Idx in the schema.
Definition at line 638 of file table.h.
References Sch.
Referenced by AddRow(), AddTable(), ChangeContext(), Dump(), InitializeJointTable(), ProjectInPlace(), and SaveSS().
Gets type of the column with index Idx in the schema.
Definition at line 640 of file table.h.
References Sch.
Referenced by ChangeContext(), Dump(), InitializeJointTable(), ProjectInPlace(), and SaveSS().
|
inline |
Gets the name of the column to be used as src nodes in the graph.
Definition at line 1158 of file table.h.
References SrcCol.
TStrV TTable::GetSrcNodeFltAttrV | ( | ) | const |
Gets src node float attribute name vector.
Definition at line 1038 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, FltCols, GetColType(), TVec< TVal, TSizeTy >::Len(), and SrcNodeAttrV.
TStrV TTable::GetSrcNodeIntAttrV | ( | ) | const |
Gets src node int attribute name vector.
Definition at line 1005 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atInt, GetColType(), IntCols, TVec< TVal, TSizeTy >::Len(), and SrcNodeAttrV.
TStrV TTable::GetSrcNodeStrAttrV | ( | ) | const |
Gets src node str attribute name vector.
Definition at line 1071 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atStr, GetColType(), TVec< TVal, TSizeTy >::Len(), SrcNodeAttrV, and StrColMaps.
Gets the string with KeyId.
Definition at line 1109 of file table.h.
References Context, TStrHash< TDat, TStringPool, THashFunc >::GetKey(), and TTableContext::StringVals.
Gets the integer mapping of the string at column ColIdx at row RowIdx.
Definition at line 1033 of file table.h.
References StrColMaps.
Gets the integer mapping of the string at column ColName at row RowIdx.
Definition at line 1038 of file table.h.
References GetColIdx(), and StrColMaps.
Gets the rows containing int mapping Map in str column ColName.
Returns the RowIdxs in the string column given by ColName which have the string with integer mapping Map, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5431 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and StrMapColIndexes.
Gets the value in column with id ColIdx at row RowIdx.
Definition at line 626 of file table.h.
References Context, TStrHash< TDat, TStringPool, THashFunc >::GetKey(), StrColMaps, and TTableContext::StringVals.
Referenced by AddEdgeAttributes(), AddNodeAttributes(), ChangeContext(), CompareRows(), TRowIteratorWithRemove::GetNextStrAttr(), TRowIterator::GetStrAttr(), GetStrVal(), GetStrValById(), GetStrValByName(), IsNextK(), and Order().
Gets the value of string attribute ColName at row RowIdx.
Definition at line 1028 of file table.h.
References GetColIdx(), and GetStrVal().
Gets the value of the string attribute at column ColIdx at row RowIdx.
Definition at line 1043 of file table.h.
References GetStrVal().
Gets the value of the string attribute at column ColName at row RowIdx.
Definition at line 1048 of file table.h.
References GetStrVal().
void TTable::Group | ( | const TStrV & | GroupBy, |
const TStr & | GroupColName, | ||
TBool | Ordered = true , |
||
TBool | UsePhysicalIds = true |
||
) |
Groups rows depending on values of GroupBy columns.
Specifies the columns to group by, the name of the group column in the new table, and whether to treat the columns as ordered. If the column name is an empty string, no column is created.
Definition at line 1569 of file table.cpp.
References GroupAux(), NormalizeColName(), and NormalizeColNameV().
Referenced by Join(), and SelfSimJoinPerGroup().
|
protected |
Helper function for grouping.
If KeepUnique is true, UniqueVec will be modified to contain a row from each group. If KeepUnique is false, then normal grouping is done and a new column is added depending on whether GroupColName is empty.
Definition at line 1322 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), AddSchemaCol(), atFlt, atInt, atStr, BegRI(), EndRI(), GetColIdx(), GetColTypeMap(), GroupIDMapping, GroupMapping, GroupStmtNames, IdColName, IntCols, IsColName(), TVec< TVal, TSizeTy >::ISort(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), StoreGroupCol(), TExcept::Throw(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Referenced by Aggregate(), GetCollidingRows(), Group(), SelfSimJoinPerGroup(), SpliceByGroup(), and Unique().
|
protected |
Groups/hashes by a single column with float values. Returns hash table with grouping.
Definition at line 1626 of file table.h.
References atFlt, BegRI(), EndRI(), FltCols, GetColIdx(), GroupingSanityCheck(), IdColName, IntCols, IsRowValid(), TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Referenced by Aggregate(), Join(), and Unique().
|
protected |
Groups/hashes by a single column with integer values.
Group/hash by a single column with integer values. Returns hash table with grouping. IndexSet tells what rows to consider (vector of physical row ids). It is used only if All == true. Note that the IndexSet option is currently not used anywhere.
Definition at line 1598 of file table.h.
References atInt, BegRI(), EndRI(), GetColIdx(), GroupingSanityCheck(), IdColName, IntCols, IsRowValid(), TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Referenced by Aggregate(), Join(), ThresholdJoin(), Unique(), and UpdateFltFromTable().
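Grouping by a single integer column amounts to building a hash table from each distinct value to the ids of the rows that hold it. The following standalone sketch shows the idea; `GroupByIntColSketch` is a hypothetical name, and the handling of deleted rows, `IndexSet`, and physical versus logical ids is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Maps each distinct value of the column to the ids of the rows holding it.
std::unordered_map<int, std::vector<int>> GroupByIntColSketch(
    const std::vector<int>& col) {
  std::unordered_map<int, std::vector<int>> grouping;
  for (std::size_t row = 0; row < col.size(); ++row) {
    grouping[col[row]].push_back(static_cast<int>(row));
  }
  return grouping;
}
```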
void TTable::GroupByIntColMP | ( | const TStr & | GroupBy, |
THashMP< TInt, TIntV > & | Grouping, | ||
TBool | UsePhysicalIds = true |
||
) | const |
Groups/hashes by a single column with integer values, using OpenMP multi-threading.
Definition at line 1225 of file table.cpp.
References atInt, THashMP< TKey, TDat, THashFunc >::Gen(), GetColIdx(), TRowIterator::GetIntAttr(), GetPartitionRanges(), TRowIterator::GetRowIdx(), GroupingSanityCheck(), IdColName, TVec< TVal, TSizeTy >::Len(), NumValidRows, and TExcept::Throw().
Referenced by Aggregate(), Join(), and UpdateFltFromTableMP().
|
protected |
Groups/hashes by a single column with string values. Returns hash table with grouping.
Definition at line 1653 of file table.h.
References atStr, BegRI(), EndRI(), GetColIdx(), GroupingSanityCheck(), IdColName, IntCols, IsRowValid(), TVec< TVal, TSizeTy >::Len(), StrColMaps, and TExcept::Throw().
Referenced by Aggregate(), Join(), ThresholdJoin(), and Unique().
|
protected |
Checks if grouping key exists and matches given attr type.
Definition at line 1215 of file table.cpp.
References GetColType(), IsColName(), and TExcept::Throw().
Referenced by GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), and GroupByStrCol().
|
protected |
Increments the Next vector and sets Last, NumRows, and NumValidRows.
Definition at line 2255 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), TVec< TVal, TSizeTy >::Empty(), Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, and NumValidRows.
Initializes an empty table for the join of this table with the given table.
Definition at line 1916 of file table.cpp.
References Assert, atFlt, atInt, atStr, Context, FltCols, GetColTypeMap(), GetSchemaColName(), GetSchemaColType(), IdColName, IntCols, TVec< TVal, TSizeTy >::Len(), New(), Sch, StrColMaps, TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Referenced by IsNextK(), Join(), SelfSimJoinPerGroup(), SimJoin(), ThresholdJoinOutputTable(), and ThresholdJoinPerJoinKeyOutputTable().
void TTable::InitIds | ( | ) |
Adds explicit row ids, initialize hash set mapping ids to physical rows.
Definition at line 1883 of file table.cpp.
References AddIdColumn(), and IdColName.
Referenced by TTable().
|
protected |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation.
Definition at line 3535 of file table.cpp.
References TVec< TVal, TSizeTy >::Clr(), TVec< TVal, TSizeTy >::Gen(), TVec< TVal, TSizeTy >::Len(), and RowIdBuckets.
Referenced by FillBucketsByInterval(), and FillBucketsByWindow().
Returns intersection of this table with given Table.
Definition at line 4567 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), Context, EndRI(), GetCollidingRows(), GetIdColName(), THashSet< TKey, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), New(), and Sch.
Definition at line 1422 of file table.h.
References Intersection().
Referenced by Intersection().
|
protected |
|
protected |
Checks if Attr is an attribute of this table schema.
Definition at line 4628 of file table.cpp.
References IsColName().
Referenced by ColConcat(), ColConcatConst(), and ColGenericOp().
Definition at line 646 of file table.h.
References ColTypeMap, THash< TKey, TDat, THashFunc >::IsKey(), and NormalizeColName().
Referenced by AddGraphAttribute(), AddGraphAttributeV(), Aggregate(), GroupAux(), GroupingSanityCheck(), IsAttr(), Join(), Project(), ProjectInPlace(), ReadFltCol(), ReadIntCol(), ReadStrCol(), Rename(), SelfSimJoinPerGroup(), SetDstCol(), SetSrcCol(), SimJoin(), ThresholdJoinInputCorrectness(), and UpdateFltFromTable().
TBool TTable::IsLastGraphOfSequence | ( | ) |
Checks if the end of the graph sequence is reached.
Definition at line 3685 of file table.cpp.
References CurrBucket, TVec< TVal, TSizeTy >::Len(), and RowIdBuckets.
PTable TTable::IsNextK | ( | const TStr & | OrderCol, |
TInt | K, | ||
const TStr & | GroupBy, | ||
const TStr & | RankColName = "" |
||
) |
Distance based filter.
Creates a table T' where the rows are joint rows (T[r1],T[r2]) such that r2 is one of the successive rows to r1 when this table is ordered by OrderCol, and both r1 and r2 have the same value of the GroupBy column.
Definition at line 3891 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, BegRI(), TStr::Empty(), EndRI(), GetColType(), GetFltVal(), GetIntVal(), GetStrVal(), InitializeJointTable(), Last, Next, and Order().
|
protected |
Performs insertion sort on given vector V.
Definition at line 3096 of file table.cpp.
References CompareRows().
Referenced by QSort().
Definition at line 5321 of file table.cpp.
References CompareKeyVal().
Referenced by QSortKeyVal().
|
inlineprotected |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row.
Definition at line 801 of file table.h.
Referenced by GroupByFltCol(), GroupByIntCol(), and GroupByStrCol().
Performs equijoin.
Performs an equi-join on the given columns, i.e. keeps tuple pairs where this->Col1 == Table->Col2. Implementation: hash join. Builds a hash out of the smaller table, then hashes the larger table and checks for collisions.
Definition at line 2272 of file table.cpp.
References atFlt, atInt, atStr, BegRI(), TStr::CStr(), EndRI(), GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetMP(), GetNumValidRows(), GetPartitionRanges(), TRowIterator::GetRowIdx(), TRowIterator::GetStrMapById(), Group(), GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), GroupByStrCol(), InitializeJointTable(), IsColName(), THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), NumValidRows, TVec< TVal, TSizeTy >::Reserve(), and TExcept::Throw().
Referenced by Join(), and SelfJoin().
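The hash-join strategy described above (build on the smaller input, probe with the larger) can be sketched as follows. This is a standalone illustration on two integer key columns; `HashJoinSketch` is a hypothetical name, and the output is simply the list of matching (left row, right row) pairs rather than a joint table.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Equi-join of two key columns; returns (left row id, right row id) pairs.
std::vector<std::pair<int, int>> HashJoinSketch(const std::vector<int>& left,
                                                const std::vector<int>& right) {
  const bool leftSmaller = left.size() <= right.size();
  const std::vector<int>& build = leftSmaller ? left : right;
  const std::vector<int>& probe = leftSmaller ? right : left;

  // Build phase: hash the smaller column, key -> row ids.
  std::unordered_map<int, std::vector<int>> hash;
  for (std::size_t i = 0; i < build.size(); ++i) {
    hash[build[i]].push_back(static_cast<int>(i));
  }

  // Probe phase: look up every key of the larger column.
  std::vector<std::pair<int, int>> out;
  for (std::size_t j = 0; j < probe.size(); ++j) {
    auto it = hash.find(probe[j]);
    if (it == hash.end()) continue;
    for (int i : it->second) {
      if (leftSmaller) {
        out.emplace_back(i, static_cast<int>(j));
      } else {
        out.emplace_back(static_cast<int>(j), i);
      }
    }
  }
  return out;
}
```

Building on the smaller side keeps the hash table small, while the larger side is only streamed through once.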
Definition at line 1360 of file table.h.
References Join().
|
protected |
Removes all rows that are not mentioned in the SORTED vector KeepV.
Definition at line 1152 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddKey(), BegRIWR(), TRowIteratorWithRemove::GetNextRowIdx(), Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), and TRowIteratorWithRemove::RemoveNext().
Referenced by Unique().
|
inlinestatic |
Loads table from a binary format.
TTableContext Context must be provided as a parameter and loaded separately from a table load as it can be shared among multiple tables. Context can be loaded either before or after the table load, but must be available for operations that require string values (as opposed to string references).
Definition at line 971 of file table.h.
References TTable().
|
inlinestatic |
Static constructor to load table from memory.
Cannot perform operations that edit the edge vectors of nodes or perform illegal operations on any internal hashes (deletion or swapping keys).
Definition at line 975 of file table.h.
References LoadTableShM(), and TTable().
|
static |
Loads table from spreadsheet (TSV, CSV, etc.). Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 795 of file table.cpp.
Referenced by TempMotifCounter::TempMotifCounter().
|
static |
Loads table from spreadsheet, but only loads the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 757 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atStr, GetMP(), TVec< TVal, TSizeTy >::Len(), LoadSSPar(), LoadSSSeq(), and New().
|
staticprotected |
Loads data in parallel from input file at InFNm into NewTable. Only works when NewTable has no string columns.
Definition at line 507 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, TPt< TRec >::Clr(), TSsParserMP::CountNewLinesInRange(), TSsParserMP::GetFlds(), TSsParserMP::GetFltFromFldV(), TSsParserMP::GetIntFromFldV(), TSsParserMP::GetStartPosV(), TSsParserMP::GetStreamLen(), TSsParserMP::GetStreamPos(), Last, TVec< TVal, TSizeTy >::Len(), TSsParserMP::Next(), TSsParserMP::NextFromIndex(), NormalizeColName(), TSsParserMP::SetStreamPos(), TSsParserMP::SkipCommentLines(), and TExcept::Throw().
Referenced by LoadSS().
|
staticprotected |
Sequentially loads data from input file at InFNm into NewTable.
Definition at line 669 of file table.cpp.
References Assert, atFlt, atInt, atStr, TPt< TRec >::Clr(), TSsParser::GetFlds(), TSsParser::GetFlt(), TSsParser::GetInt(), Last, TVec< TVal, TSizeTy >::Len(), TSsParser::Next(), NormalizeColName(), and TExcept::Throw().
Referenced by LoadSS().
|
private |
Definition at line 360 of file table.cpp.
References Context, FirstValidRow, FltCols, GenerateColTypeMap(), IntCols, LastValidRow, THash< TKey, TDat, THashFunc >::LoadShM(), TVec< TVal, TSizeTy >::LoadShM(), Next, NumRows, NumValidRows, and StrColMaps.
Referenced by LoadShM().
|
protected |
Helper function for parallel QSort.
Definition at line 3178 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), and CompareRows().
Referenced by QSortPar().
Returns table with rows that are present in this table but not in given Table.
Definition at line 4592 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), Context, EndRI(), GetCollidingRows(), GetIdColName(), THashSet< TKey, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), New(), and Sch.
Definition at line 1425 of file table.h.
References Minus().
Referenced by Minus().
|
inlinestatic |
Definition at line 932 of file table.h.
References TTable().
Referenced by GetEdgeTable(), GetEdgeTablePN(), GetFltNodePropertyTable(), GetNodeTable(), InitializeJointTable(), Intersection(), LoadSS(), Minus(), Project(), SelfSimJoinPerGroup(), SpliceByGroup(), TableFromHashMap(), Union(), and UnionAll().
PNEANet TTable::NextGraphIterator()
Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions.
Definition at line 3681 of file table.cpp.
References GetNextGraphFromSequence().
Adds a suffix to the column name if the suffix is not already present.
Definition at line 530 of file table.h.
References TStr::GetCh(), and TStr::Len().
Referenced by AddColType(), AddGraphAttribute(), AddGraphAttributeV(), AddSchemaCol(), DelColType(), GetColIdx(), GetCollidingRows(), GetColType(), GetColTypeMap(), Group(), IsColName(), LoadSSPar(), LoadSSSeq(), NormalizeColNameV(), Rename(), SetCommonNodeAttrs(), SetDstCol(), SetSrcCol(), Unique(), UpdateFltFromTable(), and UpdateFltFromTableMP().
Adds a suffix to each column name in the vector if the suffix is not already present.
Definition at line 539 of file table.h.
References TVec< TVal, TSizeTy >::Add(), TVec< TVal, TSizeTy >::Len(), and NormalizeColName().
Referenced by Aggregate(), Group(), GroupAux(), ProjectInPlace(), SelfSimJoinPerGroup(), SpliceByGroup(), and Unique().
void TTable::Order(const TStrV &OrderBy, TStr OrderColName = "", TBool ResetRankByMSC = false, TBool Asc = true)
Orders the rows lexicographically according to the values in the columns of OrderBy. The direction is controlled by Asc, which defaults to true (ascending).
Definition at line 3240 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, BegRI(), TStr::Empty(), EndRI(), FirstValidRow, GetColIdx(), GetColType(), GetMP(), GetStrVal(), IntCols, IsNextDirty, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, QSort(), and QSortPar().
Referenced by IsNextK().
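The ordering described for Order() can be illustrated with a small standalone sketch. It does not use the SNAP API: the Columns alias and the OrderRows function are hypothetical names introduced here, and the sketch assumes integer columns only. It compares rows column by column, with the first column of the key being the most significant, and uses a flag in the role of Asc to pick the direction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for a table: each inner vector is one integer column.
using Columns = std::vector<std::vector<int>>;

// Returns row indices ordered lexicographically by the columns listed in
// order_by (the first listed column is the most significant key).
std::vector<std::size_t> OrderRows(const Columns& cols,
                                   const std::vector<std::size_t>& order_by,
                                   bool asc = true) {
  const std::size_t n = cols.empty() ? 0 : cols[0].size();
  for (const auto& c : cols) assert(c.size() == n);  // columns must align

  std::vector<std::size_t> rows(n);
  std::iota(rows.begin(), rows.end(), 0);  // 0, 1, ..., n-1

  std::stable_sort(rows.begin(), rows.end(),
                   [&](std::size_t a, std::size_t b) {
                     for (std::size_t c : order_by) {
                       if (cols[c][a] != cols[c][b]) {
                         return asc ? cols[c][a] < cols[c][b]
                                    : cols[c][a] > cols[c][b];
                       }
                     }
                     return false;  // equal on every key: keep input order
                   });
  return rows;
}
```

A stable sort is used so that rows with identical keys keep their original relative order; whether Order() itself gives that guarantee is not stated in this reference.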
|
protected |
Partitions vector for QSort.
Definition at line 3126 of file table.cpp.
References CompareRows(), GetPivot(), and TVec< TVal, TSizeTy >::Swap().
Referenced by QSort().
Definition at line 5355 of file table.cpp.
References CompareKeyVal(), GetPivotKeyVal(), and TVec< TVal, TSizeTy >::Swap().
Referenced by QSortKeyVal().
void TTable::PrintContextSize()
Definition at line 3959 of file table.cpp.
References Context, GetContextMemUsedKB(), TUInt64::GetStr(), TStrHash< TDat, TStringPool, THashFunc >::Len(), TStrHash< TDat, TStringPool, THashFunc >::Reserved(), and TTableContext::StringVals.
Definition at line 1788 of file table.cpp.
References THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), TVec< TVal, TSizeTy >::GetDat(), TVec< TVal, TSizeTy >::Len(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
void TTable::PrintSize()
Definition at line 3930 of file table.cpp.
References FltCols, GetMemUsedKB(), TUInt64::GetStr(), IntCols, TVec< TVal, TSizeTy >::Len(), NumRows, NumValidRows, StrColMaps, and TInt::Val.
Returns table with only the columns in ProjectCols.
Definition at line 4615 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Context, GetColType(), IsColName(), TVec< TVal, TSizeTy >::Len(), New(), and TExcept::Throw().
void TTable::ProjectInPlace(const TStrV &ProjectCols)
Keeps only the columns specified in ProjectCols.
Definition at line 5239 of file table.cpp.
References AddColType(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::Clr(), ColTypeMap, TVec< TVal, TSizeTy >::Del(), FltCols, GetColIdx(), GetSchemaColName(), GetSchemaColType(), IdColName, IntCols, IsColName(), THashSet< TKey, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), Sch, StrColMaps, and TExcept::Throw().
Referenced by SelfSimJoinPerGroup().
|
protected |
Performs QSort on given vector V.
Definition at line 3154 of file table.cpp.
References CompareRows(), ISort(), and Partition().
Referenced by Order(), and QSortPar().
Definition at line 5378 of file table.cpp.
References CheckSortedKeyVal(), ISortKeyVal(), and PartitionKeyVal().
Referenced by TSnap::ToGraphMP(), TSnap::ToNetworkMP(), and TSnap::ToNetworkMP2().
|
protected |
Performs QSort in parallel on given vector V.
Definition at line 3206 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), TVec< TVal, TSizeTy >::Clr(), TVec< TVal, TSizeTy >::Len(), Merge(), and QSort().
Referenced by Order().
Reads values of entire float column into Result.
Definition at line 5221 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, BegRI(), EndRI(), GetColIdx(), GetColType(), IsColName(), and TExcept::Throw().
Reads values of entire int column into Result.
Definition at line 5212 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atInt, BegRI(), EndRI(), GetColIdx(), GetColType(), IsColName(), and TExcept::Throw().
Reads values of entire string column into Result.
Definition at line 5230 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atStr, BegRI(), EndRI(), GetColIdx(), GetColType(), IsColName(), and TExcept::Throw().
|
protected |
Reinitializes row ids.
Registers (caches) the result of a grouping statement by a single group-by attribute. T is a hash table mapping a key x to rows keyed by x. DISABLED FOR NOW.
Definition at line 1889 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), BegRI(), THash< TKey, TDat, THashFunc >::Clr(), EndRI(), GetColIdx(), IdColName, IntCols, and RowIdMap.
Referenced by ConcatTable().
|
protected |
Removes first valid row of the table.
Definition at line 1122 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), FirstValidRow, GetColIdx(), GetIdColName(), IntCols, Invalid, LastValidRow, Next, NumValidRows, and RowIdMap.
Referenced by RemoveRow().
Removes row with id RowIdx.
Definition at line 1135 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), Assert, FirstValidRow, GetColIdx(), GetIdColName(), IntCols, Invalid, LastValidRow, Next, NumValidRows, RemoveFirstRow(), and RowIdMap.
Referenced by TRowIteratorWithRemove::RemoveNext().
Renames a column.
Definition at line 1105 of file table.cpp.
References AddColType(), DelColType(), GetColTypeMap(), IsColName(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), Sch, TVec< TVal, TSizeTy >::SetVal(), and TExcept::Throw().
Returns a re-numbered column name based on number of existing columns with conflicting names.
Definition at line 4632 of file table.cpp.
References TStr::GetCh(), TInt::GetStr(), TStr::GetSubStr(), TStr::Len(), TVec< TVal, TSizeTy >::Len(), and Sch.
Creates Index for Flt Column ColName.
Creates an Index on float column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5495 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), BegRI(), EndRI(), FltColIndexes, THash< TKey, TDat, THashFunc >::GetDat(), and THash< TKey, TDat, THashFunc >::IsKey().
Creates Index for Int Column ColName.
Creates an Index on integer column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5476 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), IntColIndexes, and THash< TKey, TDat, THashFunc >::IsKey().
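The index behaviour described above (a hash from column value to the row indices holding that value, used for lookups when present and not refreshed automatically) can be sketched in standalone C++. IntIndex, BuildIndex and RowsByVal are hypothetical names for illustration only and are not part of the SNAP API.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical index type: column value -> indices of rows holding that value.
using IntIndex = std::unordered_map<int, std::vector<std::size_t>>;

// Builds the index with a single pass over the column.
IntIndex BuildIndex(const std::vector<int>& col) {
  IntIndex index;
  for (std::size_t row = 0; row < col.size(); ++row) {
    index[col[row]].push_back(row);
  }
  return index;
}

// Looks up rows by value. Uses the index when one is supplied; otherwise
// falls back to scanning the whole column.
std::vector<std::size_t> RowsByVal(const std::vector<int>& col, int val,
                                   const IntIndex* index = nullptr) {
  if (index != nullptr) {
    auto it = index->find(val);
    return it == index->end() ? std::vector<std::size_t>{} : it->second;
  }
  std::vector<std::size_t> rows;
  for (std::size_t row = 0; row < col.size(); ++row) {
    if (col[row] == val) rows.push_back(row);
  }
  return rows;
}
```

In this sketch, appending a value to the column after BuildIndex() leaves the index stale: an indexed lookup misses the new row while a full scan finds it. That is the situation the reference warns about when it asks the caller to request the index again after modifying the table.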
Creates Index for Str Column ColName.
Creates an Index on string column given by ColName. The index is hash-based, going from the column value (that is, the integer mapping of the string value) to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5514 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and StrMapColIndexes.
|
protected |
Resizes the table to hold RowCount rows.
Definition at line 4330 of file table.cpp.
References FirstValidRow, FltCols, IntCols, Invalid, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumValidRows, TVec< TVal, TSizeTy >::Reserve(), StrColMaps, and TVec< TVal, TSizeTy >::Trunc().
Referenced by AddNJointRowsMP().
void TTable::Save(TSOut &SOut)
Saves table schema and content to a binary format.
Note that TTableContext must be saved separately as it can be shared among multiple tables.
Definition at line 854 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), ColTypeMap, DenormalizeColName(), THash< TKey, TDat, THashFunc >::EndI(), FirstValidRow, FltCols, TSOut::Flush(), TPair< TVal1, TVal2 >::GetVal1(), TPair< TVal1, TVal2 >::GetVal2(), IntCols, LastValidRow, Next, NumRows, NumValidRows, THash< TKey, TDat, THashFunc >::Save(), TVec< TVal, TSizeTy >::Save(), TInt::Save(), and StrColMaps.
Referenced by SaveBin().
void TTable::SaveBin(const TStr &OutFNm)
Saves table schema and content to a binary file.
Definition at line 849 of file table.cpp.
References Save().
void TTable::SaveSS(const TStr &OutFNm)
Saves table schema and content to a TSV file.
Definition at line 800 of file table.cpp.
References atFlt, atInt, atStr, BegRI(), TStr::CStr(), DenormalizeSchema(), Dump(), EndRI(), GetSchemaColName(), GetSchemaColType(), TVec< TVal, TSizeTy >::Len(), NumValidRows, and Sch.
void TTable::Select(TPredicate &Predicate, TIntV &SelectedRows, TBool Remove = true)
Selects rows that satisfy given Predicate.
Select. Has two modes of operation:
Definition at line 2750 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, BegRI(), BegRIWR(), EndRI(), TPredicate::Eval(), GetColIdx(), GetColType(), TRowIteratorWithRemove::GetNextFltAttr(), TRowIteratorWithRemove::GetNextIntAttr(), TRowIteratorWithRemove::GetNextRowIdx(), TRowIteratorWithRemove::GetNextStrAttr(), TPredicate::GetVariables(), Last, TVec< TVal, TSizeTy >::Len(), TRowIteratorWithRemove::RemoveNext(), TPredicate::SetFltVal(), TPredicate::SetIntVal(), and TPredicate::SetStrVal().
Referenced by Classify(), and Select().
|
inline |
Definition at line 1266 of file table.h.
References Select().
void TTable::SelectAtomic(const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove = true)
Selects rows using atomic compare operation.
Select atomic: optimized cases of select where the predicate has an atomic form, that is, it compares an attribute to another attribute or to a constant.
Definition at line 2813 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Assert, atFlt, atInt, atStr, BegRI(), BegRIWR(), Cmp(), EndRI(), TPredicate::EvalAtom(), TPredicate::EvalStrAtom(), GetColIdx(), GetColType(), TRowIteratorWithRemove::GetNextFltAttr(), TRowIteratorWithRemove::GetNextIntAttr(), TRowIteratorWithRemove::GetNextRowIdx(), TRowIteratorWithRemove::GetNextStrAttr(), Last, TRowIteratorWithRemove::RemoveNext(), SUBSTR, SUPERSTR, and TExcept::Throw().
Referenced by ClassifyAtomic(), and SelectAtomic().
Definition at line 1278 of file table.h.
References SelectAtomic().
void TTable::SelectAtomicConst(const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove = true, TBool Table = true)
Selects rows where the value of Col matches given primitive Val.
Definition at line 2873 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Assert, atStr, BegRI(), BegRIWR(), TRowIterator::CompareAtomicConst(), TRowIteratorWithRemove::CompareAtomicConst(), TRowIterator::CompareAtomicConstTStr(), EndRI(), FirstValidRow, GetColIdx(), GetColType(), GetMP(), TRowIteratorWithRemove::GetNextRowIdx(), GetPartitionRanges(), TRowIterator::GetRowIdx(), TPrimitive::GetStr(), TPrimitive::GetType(), Invalid, IsNextDirty, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumValidRows, TRowIteratorWithRemove::RemoveNext(), TVec< TVal, TSizeTy >::Reserve(), and TExcept::Throw().
Referenced by ClassifyAtomicConst(), SelectAtomicConst(), SelectAtomicFltConst(), SelectAtomicIntConst(), and SelectAtomicStrConst().
|
inline |
Definition at line 1290 of file table.h.
References SelectAtomicConst().
|
inline |
Definition at line 1296 of file table.h.
References SelectAtomicConst().
Definition at line 1323 of file table.h.
References SelectAtomicConst().
|
inline |
Definition at line 1326 of file table.h.
References SelectAtomicConst().
Definition at line 1309 of file table.h.
References SelectAtomicConst().
|
inline |
Definition at line 1312 of file table.h.
References SelectAtomicConst().
Definition at line 1316 of file table.h.
References SelectAtomicConst().
|
inline |
Definition at line 1319 of file table.h.
References SelectAtomicConst().
void TTable::SelectFirstNRows(const TInt &N)
Selects first N rows from the table.
Definition at line 3357 of file table.cpp.
References Assert, BegRI(), EndRI(), TRowIterator::GetRowIdx(), Invalid, Last, LastValidRow, Next, and NumValidRows.
Joins table with itself, on values of Col.
Definition at line 1366 of file table.h.
References Join().
|
inline |
Definition at line 1367 of file table.h.
References SimJoin().
PTable TTable::SelfSimJoinPerGroup(const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is less than the specified threshold.
Returns table with schema (GroupId1, GroupId2, Similarity).
Definition at line 2094 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), BegRI(), Context, THash< TKey, TDat, THashFunc >::EndI(), EndRI(), GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), TInt::GetStr(), Group(), IntCols, IsColName(), THash< TKey, TDat, THashFunc >::IsKey(), THash< TKey, TDat, THashFunc >::Len(), New(), StrColMaps, and TExcept::Throw().
Referenced by SelfSimJoinPerGroup().
PTable TTable::SelfSimJoinPerGroup(const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is less than the specified threshold.
SimJoinPerGroup performs SimJoin based on a set of attributes. Performs the grouping internally and returns a projection of the columns on which groupby was performed along with the similarity.
Definition at line 2180 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), THash< TKey, TDat, THashFunc >::BegI(), TVec< TVal, TSizeTy >::Clr(), THash< TKey, TDat, THashFunc >::EndI(), THash< TKey, TDat, THashFunc >::GetDat(), GroupAux(), InitializeJointTable(), THash< TKey, TDat, THashFunc >::IsKey(), TStr::IsStrIn(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), ProjectInPlace(), SelfSimJoinPerGroup(), TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
|
inline |
Sets the columns to be used as both src and dst node attributes.
Definition at line 1188 of file table.h.
References TVec< TVal, TSizeTy >::Add(), CommonNodeAttrs, and NormalizeColName().
|
inline |
Sets the name of the column to be used as dst nodes in the graph.
Definition at line 1167 of file table.h.
References DstCol, IsColName(), NormalizeColName(), and TExcept::Throw().
|
inlineprotected |
Sets the first valid row of the TTable.
Definition at line 811 of file table.h.
References FirstValidRow, Invalid, TVec< TVal, TSizeTy >::Len(), Next, and TExcept::Throw().
Definition at line 4152 of file table.cpp.
References FltCols, GetMP(), GetPartitionRanges(), TRowIterator::GetRowIdx(), TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().
Referenced by UpdateFltFromTableMP().
|
inlinestatic |
Definition at line 526 of file table.h.
References UseMP.
|
inline |
Sets the name of the column to be used as src nodes in the graph.
Definition at line 1160 of file table.h.
References IsColName(), NormalizeColName(), SrcCol, and TExcept::Throw().
PTable TTable::SimJoin(const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
Performs join if the distance between two rows is no greater than the specified threshold.
Returns the similarity-based join of two tables under a given distance metric and a given threshold. Record pairs (r1, r2) that are returned satisfy the criterion d(r1, r2) <= Threshold.
Definition at line 1994 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Assert, atFlt, atInt, BegRI(), EndRI(), GetColType(), Haversine, InitializeJointTable(), IsColName(), Jaccard, L1Norm, L2Norm, TVec< TVal, TSizeTy >::Len(), TExcept::Throw(), and TFlt::Val.
Referenced by SelfSimJoin().
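The join criterion d(r1, r2) <= Threshold can be made concrete with a brute-force sketch. This is an illustration under stated assumptions, not the library's implementation: it fixes the metric to the Euclidean (L2) norm, one of the metrics listed in the references above, and the names Row, L2Distance and SimJoinL2 are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical row type: the numeric join columns of one record.
using Row = std::vector<double>;

// Euclidean (L2) distance between two rows of equal width.
double L2Distance(const Row& a, const Row& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const double d = a[i] - b[i];
    sum += d * d;
  }
  return std::sqrt(sum);
}

// Returns every index pair (i, j) with d(left[i], right[j]) <= threshold.
std::vector<std::pair<std::size_t, std::size_t>> SimJoinL2(
    const std::vector<Row>& left, const std::vector<Row>& right,
    double threshold) {
  std::vector<std::pair<std::size_t, std::size_t>> out;
  for (std::size_t i = 0; i < left.size(); ++i) {
    for (std::size_t j = 0; j < right.size(); ++j) {
      if (L2Distance(left[i], right[j]) <= threshold) out.emplace_back(i, j);
    }
  }
  return out;
}
```

The comparison is inclusive, so a pair whose distance equals the threshold exactly is part of the result.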
Splices table into subtables according to a grouping statement.
Definition at line 1808 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), Context, THash< TKey, TDat, THashFunc >::EndI(), FltCols, GetColIdx(), THash< TKey, TDat, THashFunc >::GetDat(), TVec< TVal, TSizeTy >::GetDat(), GetIdColName(), GroupAux(), IdColName, IntCols, TVec< TVal, TSizeTy >::Len(), New(), NormalizeColNameV(), RowIdMap, Sch, StrColMaps, TPair< TVal1, TVal2 >::Val1, and TPair< TVal1, TVal2 >::Val2.
Adds entire flt column to table.
Definition at line 4104 of file table.cpp.
References AddColType(), AddSchemaCol(), atFlt, BegRI(), EndRI(), FltCols, TVec< TVal, TSizeTy >::Len(), and NumRows.
|
protected |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported.
Stores column for a group. Physical row ids have to be passed.
Definition at line 1310 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), atInt, IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.
Referenced by GroupAux().
Adds entire int column to table.
Definition at line 4087 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, BegRI(), EndRI(), IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.
Adds entire str column to table.
Definition at line 4121 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), AddColType(), TStrHash< TDat, TStringPool, THashFunc >::AddKey(), AddSchemaCol(), atStr, BegRI(), Context, EndRI(), FltCols, TStrHash< TDat, TStringPool, THashFunc >::GetKeyId(), TVec< TVal, TSizeTy >::Len(), NumRows, StrColMaps, and TTableContext::StringVals.
|
inlinestatic |
Builds table from hash table of int->int.
Definition at line 988 of file table.h.
References New().
Referenced by TSnap::MapHits(), and TSnap::MapPageRank().
|
inlinestatic |
PTable TTable::ThresholdJoin(const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey = false)
Definition at line 2644 of file table.cpp.
References atInt, atStr, GetColIdx(), GetColType(), GroupByIntCol(), GroupByStrCol(), NumValidRows, ThresholdJoinCountCollisions(), ThresholdJoinCountPerJoinKeyCollisions(), ThresholdJoinInputCorrectness(), ThresholdJoinOutputTable(), ThresholdJoinPerJoinKeyOutputTable(), and TExcept::Throw().
|
protected |
Definition at line 2506 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), atStr, BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), StrColMaps, and TTriple< TVal1, TVal2, TVal3 >::Val3.
Referenced by ThresholdJoin().
|
protected |
Definition at line 2557 of file table.cpp.
References THash< TKey, TDat, THashFunc >::AddDat(), atStr, BegRI(), EndRI(), THash< TKey, TDat, THashFunc >::GetDat(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), StrColMaps, TPair< TVal1, TVal2 >::Val1, TPair< TVal1, TVal2 >::Val2, and TTriple< TVal1, TVal2, TVal3 >::Val3.
Referenced by ThresholdJoin().
|
protected |
Definition at line 2478 of file table.cpp.
References TStr::CStr(), GetColType(), IsColName(), and TExcept::Throw().
Referenced by ThresholdJoin().
|
protected |
Definition at line 2608 of file table.cpp.
References THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), InitializeJointTable(), TTriple< TVal1, TVal2, TVal3 >::Val1, TTriple< TVal1, TVal2, TVal3 >::Val2, and TTriple< TVal1, TVal2, TVal3 >::Val3.
Referenced by ThresholdJoin().
|
protected |
Definition at line 2622 of file table.cpp.
References THashSet< TKey, THashFunc >::AddKey(), THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), InitializeJointTable(), THashSet< TKey, THashFunc >::IsKey(), TTriple< TVal1, TVal2, TVal3 >::Val1, TTriple< TVal1, TVal2, TVal3 >::Val2, and TTriple< TVal1, TVal2, TVal3 >::Val3.
Referenced by ThresholdJoin().
Creates a sequence of graphs based on grouping specified by GroupAttr.
Definition at line 3662 of file table.cpp.
References TInt::Mn, TInt::Mx, and ToGraphSequence().
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3676 of file table.cpp.
References TInt::Mn, TInt::Mx, and ToGraphSequenceIterator().
TVec< PNEANet > TTable::ToGraphSequence(TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal = TInt::Mn, TInt EndVal = TInt::Mx)
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize.
Definition at line 3651 of file table.cpp.
References FillBucketsByWindow(), and GetGraphsFromSequence().
Referenced by ToGraphPerGroup().
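How rows might be bucketed by WindowSize and JumpSize can be sketched as follows. The exact bucket boundaries used by FillBucketsByWindow() are not given in this reference, so the sketch assumes half-open windows [s, s + window) whose start s advances from the start value in steps of jump. Window, MakeWindows and BucketRows are hypothetical names, and the bounds must be finite (the library defaults TInt::Mn and TInt::Mx are not handled here).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical half-open window [begin, end) over the split column's values.
struct Window {
  int begin;
  int end;
};

// Assumed layout: windows start at start, start + jump, ... up to end_val,
// and each one covers `window` consecutive values.
std::vector<Window> MakeWindows(int start, int end_val, int window, int jump) {
  assert(window > 0 && jump > 0);
  std::vector<Window> windows;
  for (int s = start; s <= end_val; s += jump) {
    windows.push_back({s, s + window});
  }
  return windows;
}

// Assigns each row index to every window that contains its split value.
std::vector<std::vector<std::size_t>> BucketRows(
    const std::vector<int>& split_col, const std::vector<Window>& windows) {
  std::vector<std::vector<std::size_t>> buckets(windows.size());
  for (std::size_t row = 0; row < split_col.size(); ++row) {
    for (std::size_t k = 0; k < windows.size(); ++k) {
      if (split_col[row] >= windows[k].begin &&
          split_col[row] < windows[k].end) {
        buckets[k].push_back(row);
      }
    }
  }
  return buckets;
}
```

Under these assumptions, a jump smaller than the window yields overlapping buckets, so one row can contribute to several graphs in the sequence.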
PNEANet TTable::ToGraphSequenceIterator(TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal = TInt::Mn, TInt EndVal = TInt::Mx)
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3666 of file table.cpp.
References FillBucketsByWindow(), and GetFirstGraphFromSequence().
Referenced by ToGraphPerGroupIterator().
TVec< PNEANet > TTable::ToVarGraphSequence(TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals.
Definition at line 3657 of file table.cpp.
References FillBucketsByInterval(), and GetGraphsFromSequence().
PNEANet TTable::ToVarGraphSequenceIterator(TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3671 of file table.cpp.
References FillBucketsByInterval(), and GetFirstGraphFromSequence().
Returns union of this table with given Table.
Definition at line 4531 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), BegRI(), Context, EndRI(), GetCollidingRows(), GetIdColName(), THashSet< TKey, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), New(), and Sch.
Definition at line 1413 of file table.h.
References Union().
Referenced by Union().
Returns union of this table with given Table, preserving duplicates.
Definition at line 4511 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Context, GetIdColName(), TVec< TVal, TSizeTy >::Len(), New(), and Sch.
Definition at line 1416 of file table.h.
References UnionAll().
Referenced by UnionAll().
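The difference between the two union operations can be shown with a small standalone sketch. It is not the SNAP implementation: Row and Rows are hypothetical types, and the sketch assumes that the non-duplicating union keeps all rows of the first table and appends only those rows of the second table that do not already occur in the first.

```cpp
#include <set>
#include <vector>

// Hypothetical row and table types for illustration.
using Row = std::vector<int>;
using Rows = std::vector<Row>;

// UnionAll: plain concatenation, duplicates preserved.
Rows UnionAll(const Rows& a, const Rows& b) {
  Rows out = a;
  out.insert(out.end(), b.begin(), b.end());
  return out;
}

// Union (assumed semantics): rows of a, followed by the rows of b that do
// not collide with any row of a.
Rows Union(const Rows& a, const Rows& b) {
  Rows out = a;
  const std::set<Row> seen(a.begin(), a.end());
  for (const Row& r : b) {
    if (seen.count(r) == 0) out.push_back(r);
  }
  return out;
}
```

With a = {(1,2), (3,4)} and b = {(3,4), (5,6)}, UnionAll returns four rows while Union returns three, since (3,4) is appended only once.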
void TTable::UnionAllInPlace(const TTable &Table)
Same as TTable::ConcatTable.
Definition at line 4524 of file table.cpp.
References AddTable().
|
inline |
Definition at line 1419 of file table.h.
References UnionAllInPlace().
Referenced by UnionAllInPlace().
void TTable::Unique(const TStr &Col)
Removes rows with duplicate values in given column.
Definition at line 1266 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), GetColType(), GroupByFltCol(), GroupByIntCol(), GroupByStrCol(), KeepSortedRows(), and NormalizeColName().
Referenced by Unique().
Removes rows with duplicate values in given columns.
Definition at line 1298 of file table.cpp.
References GroupAux(), KeepSortedRows(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), and Unique().
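The references above suggest that duplicate removal groups rows by value and then keeps a sorted list of surviving rows. A minimal sketch of that two-step idea follows. It assumes the first row of each group is the one kept, which this reference does not state, and UniqueKeepRows and KeepSortedRows are illustrative free functions rather than the TTable members of similar name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Step 1: pick one representative row per distinct value (here: the first
// occurrence) and return the kept row indices in ascending order.
std::vector<std::size_t> UniqueKeepRows(const std::vector<int>& col) {
  std::unordered_map<int, std::size_t> first_row;
  for (std::size_t row = 0; row < col.size(); ++row) {
    first_row.emplace(col[row], row);  // no effect if the value was seen
  }
  std::vector<std::size_t> keep;
  for (const auto& kv : first_row) keep.push_back(kv.second);
  std::sort(keep.begin(), keep.end());
  return keep;
}

// Step 2: drop every row not listed in the SORTED vector keep, using a
// single linear pass over the column.
std::vector<int> KeepSortedRows(const std::vector<int>& col,
                                const std::vector<std::size_t>& keep) {
  assert(std::is_sorted(keep.begin(), keep.end()));
  std::vector<int> out;
  std::size_t k = 0;
  for (std::size_t row = 0; row < col.size() && k < keep.size(); ++row) {
    if (row == keep[k]) {
      out.push_back(col[row]);
      ++k;
    }
  }
  return out;
}
```

Requiring the keep list to be sorted is what allows the second step to run in one pass without a lookup structure.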
void TTable::UpdateFltFromTable(const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal = 0.0)
Definition at line 4242 of file table.cpp.
References atFlt, atInt, BegRI(), EndRI(), FltCols, GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), GetMP(), GroupByIntCol(), IsColName(), THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), TExcept::Throw(), and UpdateFltFromTableMP().
void TTable::UpdateFltFromTableMP(const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal = 0.0)
Definition at line 4174 of file table.cpp.
References atFlt, atInt, FltCols, GetColIdx(), GetColType(), THashMP< TKey, TDat, THashFunc >::GetDat(), TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetMP(), GetPartitionRanges(), GroupByIntColMP(), THashMP< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), NumRows, TVec< TVal, TSizeTy >::PutAll(), SetFltColToConstMP(), sync_bool_compare_and_swap(), and TExcept::Throw().
Referenced by UpdateFltFromTable().
|
protected |
Template for utility function to update a grouping hash map.
Definition at line 1680 of file table.h.
References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), THash< TKey, TDat, THashFunc >::GetDat(), and THash< TKey, TDat, THashFunc >::IsKey().
|
protected |
Template for utility function to update a parallel grouping hash map.
Definition at line 1692 of file table.h.
References TVec< TVal, TSizeTy >::Add(), THashMP< TKey, TDat, THashFunc >::AddDat(), THashMP< TKey, TDat, THashFunc >::GetDat(), and THashMP< TKey, TDat, THashFunc >::IsKey().
|
protected |
Updates table state after adding one or more rows.
Definition at line 4140 of file table.cpp.
References TVec< TVal, TSizeTy >::Add(), Last, LastValidRow, Next, NumRows, and NumValidRows.
Referenced by AddRow().
|
protected |
Aggregation policy used for solving conflicts between different values of an attribute of the same node.
Definition at line 601 of file table.h.
Referenced by BuildGraph(), GetFirstGraphFromSequence(), and GetNextGraphFromSequence().
String columns are implemented using a string pool to fight memory fragmentation. The value of string column c in row r is Context.StringVals.GetKey(StrColMaps[c][r]).
Definition at line 564 of file table.h.
Referenced by AddColType(), DelColType(), GenerateColTypeMap(), GetColIdx(), GetColType(), GetColTypeMap(), IsColName(), ProjectInPlace(), Save(), and TTable().
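The string-pool layout described for string columns, where each cell stores an integer id and the shared context maps ids back to strings, can be sketched with standard containers. StringPool below is a hypothetical stand-in for the context's string storage; its method names echo AddKey and GetKey from this reference, but the class is not the SNAP TStrHash.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the shared context's string pool.
class StringPool {
 public:
  // Returns the id of s, adding it to the pool the first time it is seen.
  int AddKey(const std::string& s) {
    auto it = ids_.find(s);
    if (it != ids_.end()) return it->second;
    const int id = static_cast<int>(keys_.size());
    keys_.push_back(s);
    ids_.emplace(s, id);
    return id;
  }

  // Resolves an id back to its string value.
  const std::string& GetKey(int id) const {
    assert(id >= 0 && id < static_cast<int>(keys_.size()));
    return keys_[id];
  }

  // Number of distinct strings stored.
  std::size_t Len() const { return keys_.size(); }

 private:
  std::vector<std::string> keys_;             // id -> string
  std::unordered_map<std::string, int> ids_;  // string -> id
};
```

A string column is then just a vector of ids: with str_col_maps[c][r] = pool.AddKey(value), the cell value is recovered as pool.GetKey(str_col_maps[c][r]). Repeated strings share one pool entry, and the columns cannot be decoded without the pool, which is consistent with the requirement that the context be available for operations that need string values.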
|
protected |
List of attribute pairs with values common to source and destination and their common given name.
Definition at line 594 of file table.h.
Referenced by AddNodeAttributes(), and SetCommonNodeAttrs().
|
protected |
Execution Context.
Definition at line 545 of file table.h.
Referenced by AddStrVal(), BuildGraph(), ChangeContext(), ColConcat(), ColConcatConst(), GetContext(), GetContextKey(), GetContextMemUsedKB(), GetStr(), GetStrVal(), InitializeJointTable(), Intersection(), LoadTableShM(), Minus(), PrintContextSize(), Project(), SelfSimJoinPerGroup(), SpliceByGroup(), StoreStrCol(), Union(), and UnionAll().
|
protected |
Current row id bucket - used when generating a sequence of graphs using an iterator.
Definition at line 600 of file table.h.
Referenced by GetFirstGraphFromSequence(), GetNextGraphFromSequence(), and IsLastGraphOfSequence().
|
protected |
Column (attribute) to serve as dst nodes when constructing the graph.
Definition at line 590 of file table.h.
Referenced by BuildGraph(), GetDstCol(), and SetDstCol().
|
protected |
List of columns (attributes) to serve as destination node attributes.
Definition at line 593 of file table.h.
Referenced by AddGraphAttribute(), AddGraphAttributeV(), BuildGraph(), GetDstNodeFltAttrV(), GetDstNodeIntAttrV(), and GetDstNodeStrAttrV().
|
protected |
List of columns (attributes) to serve as edge attributes.
Definition at line 591 of file table.h.
Referenced by AddEdgeAttributes(), AddGraphAttribute(), AddGraphAttributeV(), BuildGraph(), GetEdgeFltAttrV(), GetEdgeIntAttrV(), and GetEdgeStrAttrV().
|
protected |
Physical index of first valid row.
Definition at line 553 of file table.h.
Referenced by AddTable(), BegRI(), BegRIWR(), Defrag(), TRowIteratorWithRemove::GetNextRowIdx(), GetPartitionRanges(), TRowIteratorWithRemove::IsFirst(), LoadTableShM(), Order(), RemoveFirstRow(), RemoveRow(), ResizeTable(), Save(), SelectAtomicConst(), SetFirstValidRow(), and TTable().
Indexes for Float Columns.
Definition at line 570 of file table.h.
Referenced by GetFltRowIdxByVal(), and RequestIndexFlt().
Data columns of floating point attributes.
Definition at line 559 of file table.h.
Referenced by AddEdgeAttributes(), AddFltCol(), AddJointRow(), AddNJointRowsMP(), AddNodeAttributes(), AddNRows(), AddRow(), AddSelectedRows(), AddTable(), Aggregate(), AggregateCols(), BuildGraph(), ColGenericOp(), ColGenericOpMP(), CompareRows(), Defrag(), GetDstNodeFltAttrV(), GetEdgeFltAttrV(), TRowIterator::GetFltAttr(), GetFltVal(), GetFltValAtRowIdx(), GetMemUsedKB(), TRowIteratorWithRemove::GetNextFltAttr(), GetSrcNodeFltAttrV(), GroupByFltCol(), InitializeJointTable(), LoadTableShM(), PrintSize(), ProjectInPlace(), ResizeTable(), Save(), SetFltColToConstMP(), SpliceByGroup(), StoreFltCol(), StoreStrCol(), TTable(), UpdateFltFromTable(), and UpdateFltFromTableMP().
Maps grouping statements to their (group id -> group-by key) mapping.
A mapping from the newly added group id column name of a grouping statement to a vector of the group-by attribute names and a flag specifying whether those attributes are ordered.
Definition at line 577 of file table.h.
Referenced by GetMemUsedKB(), and GroupAux().
Maps grouping statements to their (group-by key -> group id) mapping.
A mapping from a grouping statement (group-by attribute names and 'Ordered' flag) to a hash map from group ids to their corresponding group-by keys.
Definition at line 581 of file table.h.
Referenced by Aggregate(), GetMemUsedKB(), and GroupAux().
Maps user-given grouping statement names to their group-by attributes.
Definition at line 573 of file table.h.
Referenced by GroupAux().
|
protected |
A mapping from column name to column type and column index among columns of the same type.
Name of column associated with (optional) permanent row identifiers.
Definition at line 565 of file table.h.
Referenced by AddRow(), AddTable(), Aggregate(), Defrag(), GetCollidingRows(), GetIdColName(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), GroupByStrCol(), InitializeJointTable(), InitIds(), ProjectInPlace(), Reindex(), and SpliceByGroup().
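The "column name to (column type, index among columns of the same type)" mapping mentioned above can be illustrated with a small standard C++ sketch. Everything here is an assumption made for the example: `ColumnarTable`, `ColType`, and the use of `std::unordered_map` are hypothetical and do not reproduce SNAP's types. The point is only that each type has its own vector of columns, and the map records where a named column lives.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical column type tag for this sketch.
enum class ColType { Int, Flt };

// Hypothetical columnar table: one vector of columns per value type.
struct ColumnarTable {
  std::vector<std::vector<int>> IntCols;
  std::vector<std::vector<double>> FltCols;
  // Column name -> (type, index within the columns of that type).
  std::unordered_map<std::string, std::pair<ColType, int>> ColTypeMap;

  void AddIntCol(const std::string& Name) {
    ColTypeMap[Name] =
        std::make_pair(ColType::Int, static_cast<int>(IntCols.size()));
    IntCols.emplace_back();
  }
  void AddFltCol(const std::string& Name) {
    ColTypeMap[Name] =
        std::make_pair(ColType::Flt, static_cast<int>(FltCols.size()));
    FltCols.emplace_back();
  }
  bool IsColName(const std::string& Name) const {
    return ColTypeMap.count(Name) > 0;
  }
  ColType GetColType(const std::string& Name) const {
    assert(IsColName(Name));
    return ColTypeMap.at(Name).first;
  }
  int GetColIdx(const std::string& Name) const {
    assert(IsColName(Name));
    return ColTypeMap.at(Name).second;
  }
};
```

Note that the index is per type: in this sketch the first float column gets index 0 even if integer columns were added before it.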
Indexes for Int Columns.
Definition at line 568 of file table.h.
Referenced by GetIntRowIdxByVal(), and RequestIndexInt().
Next[i] is the successor of row i. Table iterators follow the order dictated by Next.
Data columns of integer attributes.
Definition at line 558 of file table.h.
Referenced by AddEdgeAttributes(), AddIdColumn(), AddIntCol(), AddJointRow(), AddNJointRowsMP(), AddNodeAttributes(), AddNRows(), AddRow(), AddSelectedRows(), AddTable(), Aggregate(), AggregateCols(), BuildGraph(), ClassifyAux(), ColGenericOp(), ColGenericOpMP(), CompareRows(), Defrag(), FillBucketsByInterval(), FillBucketsByWindow(), GetDstNodeIntAttrV(), GetEdgeIntAttrV(), TRowIterator::GetIntAttr(), GetIntVal(), GetIntValAtRowIdx(), GetMemUsedKB(), TRowIteratorWithRemove::GetNextIntAttr(), GetSrcNodeIntAttrV(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByStrCol(), InitializeJointTable(), LoadTableShM(), Order(), PrintSize(), ProjectInPlace(), Reindex(), RemoveFirstRow(), RemoveRow(), ResizeTable(), Save(), SelfSimJoinPerGroup(), SpliceByGroup(), StoreGroupCol(), StoreIntCol(), ThresholdJoinCountCollisions(), ThresholdJoinCountPerJoinKeyCollisions(), and TTable().
|
static protected |
Special value for Next vector entry - logically removed row.
Definition at line 487 of file table.h.
Referenced by AddTable(), FillBucketsByInterval(), FillBucketsByWindow(), IsRowValid(), RemoveFirstRow(), RemoveRow(), ResizeTable(), SelectAtomicConst(), SelectFirstNRows(), and SetFirstValidRow().
|
protected |
Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges.
Definition at line 603 of file table.h.
Referenced by GenerateColTypeMap(), GetPartitionRanges(), Order(), SelectAtomicConst(), and TTable().
|
static protected |
Special value for Next vector entry - last row in table.
Definition at line 486 of file table.h.
Referenced by AddJointRow(), AddNJointRowsMP(), AddTable(), Defrag(), EndRI(), EndRIWR(), GetEdgeTable(), GetEmptyRowsStart(), GetFltNodePropertyTable(), GetNodeTable(), IncrementNext(), IsNextK(), KeepSortedRows(), LoadSSPar(), LoadSSSeq(), Order(), Select(), SelectAtomic(), SelectAtomicConst(), SelectFirstNRows(), TTable(), and UpdateTableForNewRow().
|
protected |
Physical index of last valid row.
Definition at line 554 of file table.h.
Referenced by AddJointRow(), AddNJointRowsMP(), AddTable(), Defrag(), GetEmptyRowsStart(), IncrementNext(), KeepSortedRows(), LoadTableShM(), Order(), RemoveFirstRow(), RemoveRow(), ResizeTable(), Save(), SelectAtomicConst(), SelectFirstNRows(), TTable(), and UpdateTableForNewRow().
|
protected |
A vector describing the logical order of the rows.
Definition at line 555 of file table.h.
Referenced by AddJointRow(), AddNJointRowsMP(), AddNRows(), AddSelectedRows(), AddTable(), Defrag(), FillBucketsByInterval(), FillBucketsByWindow(), GetEmptyRowsStart(), GetMemUsedKB(), TRowIteratorWithRemove::GetNextRowIdx(), GetPartitionRanges(), IncrementNext(), IsNextK(), IsRowValid(), LoadTableShM(), TRowIterator::Next(), Order(), RemoveFirstRow(), RemoveRow(), ResizeTable(), Save(), SelectAtomicConst(), SelectFirstNRows(), SetFirstValidRow(), TTable(), and UpdateTableForNewRow().
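How a successor vector like Next determines iteration order can be sketched as follows. This is a minimal illustration, not SNAP's iterator code: the function `LogicalOrder` is hypothetical, and the sentinel values `kLast = -1` and `kInvalid = -2` are assumptions chosen for the example, since the actual values are not stated in this section.

```cpp
#include <cassert>
#include <vector>

// Assumed sentinel values for this sketch only.
const int kLast = -1;     // successor entry of the last row in logical order
const int kInvalid = -2;  // successor entry of a logically removed row

// Walks the successor vector from FirstValidRow and returns the physical
// row indices in logical order. Pass kLast as FirstValidRow for an empty table.
std::vector<int> LogicalOrder(const std::vector<int>& Next, int FirstValidRow) {
  std::vector<int> Order;
  for (int Row = FirstValidRow; Row != kLast; Row = Next[Row]) {
    assert(Row >= 0 && Row < static_cast<int>(Next.size()));
    assert(Next[Row] != kInvalid);  // removed rows are never reachable
    Order.push_back(Row);
  }
  return Order;
}
```

For example, with `Next = {2, kLast, 1, kInvalid}` and a first valid row of 0, the walk visits physical rows 0, 2, 1 and never touches the removed row 3.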
|
protected |
Number of rows in the table (valid and invalid).
Definition at line 551 of file table.h.
Referenced by AddFltCol(), AddIdColumn(), AddIntCol(), AddJointRow(), AddNJointRowsMP(), AddStrCol(), AddTable(), ClassifyAux(), Defrag(), GetEmptyRowsStart(), GetNumRows(), IncrementNext(), LoadTableShM(), Order(), PrintSize(), Save(), StoreFltCol(), StoreGroupCol(), StoreIntCol(), StoreStrCol(), TTable(), UpdateFltFromTableMP(), and UpdateTableForNewRow().
|
protected |
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition at line 552 of file table.h.
Referenced by AddJointRow(), AddNJointRowsMP(), AddTable(), Aggregate(), ColConcat(), ColGenericOp(), Defrag(), GetEmptyRowsStart(), GetNumValidRows(), GetPartitionRanges(), GroupByIntColMP(), IncrementNext(), Join(), LoadTableShM(), Order(), PrintSize(), RemoveFirstRow(), RemoveRow(), ResizeTable(), Save(), SaveSS(), SelectAtomicConst(), SelectFirstNRows(), ThresholdJoin(), TTable(), and UpdateTableForNewRow().
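The difference between the total row count and the valid row count shows up when a row is removed logically rather than physically. The sketch below is an assumption-laden illustration, not SNAP's `RemoveRow()`: the `RowOrder` struct, its `RemoveRow(Row, Prev)` signature, and the sentinel values are all hypothetical. It shows one plausible way to unlink a row from the successor chain while leaving the physical storage untouched.

```cpp
#include <cassert>
#include <vector>

// Assumed sentinel values for this sketch only.
const int kLast = -1;     // successor entry of the last row in logical order
const int kInvalid = -2;  // successor entry of a logically removed row

// Hypothetical bookkeeping for row order and row counts.
struct RowOrder {
  std::vector<int> Next;
  int FirstValidRow;
  int LastValidRow;
  int NumRows;       // physical rows, valid and invalid
  int NumValidRows;  // rows not logically removed

  // Builds N rows stored in physical order 0, 1, ..., N-1.
  explicit RowOrder(int N)
      : Next(N), FirstValidRow(N > 0 ? 0 : kLast),
        LastValidRow(N > 0 ? N - 1 : kLast), NumRows(N), NumValidRows(N) {
    for (int i = 0; i < N; ++i) { Next[i] = (i + 1 < N) ? i + 1 : kLast; }
  }

  // Logically removes Row. Prev is its logical predecessor and is ignored
  // when Row is the first valid row.
  void RemoveRow(int Row, int Prev) {
    assert(Row >= 0 && Row < NumRows && Next[Row] != kInvalid);
    if (Row == FirstValidRow) {
      FirstValidRow = Next[Row];
    } else {
      assert(Prev >= 0 && Next[Prev] == Row);
      Next[Prev] = Next[Row];  // bypass the removed row
      if (Row == LastValidRow) { LastValidRow = Prev; }
    }
    if (FirstValidRow == kLast) { LastValidRow = kLast; }  // table is now empty
    Next[Row] = kInvalid;  // mark as removed; storage is not reclaimed
    --NumValidRows;        // NumRows stays the same
  }
};
```

In this sketch the physical row count only changes when storage is compacted, which corresponds to what a defragmentation step would do.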
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs.
Example: <T_1.age,T_2.age, age> - T_1.age is a src node attribute, T_2.age is a dst node attribute. However, since all nodes refer to the same universe of entities (users) we just do one assignment of age per node, and call that attribute 'age'. This list should be very small.
Definition at line 599 of file table.h.
Referenced by FillBucketsByInterval(), FillBucketsByWindow(), GetGraphsFromSequence(), GetMemUsedKB(), GetNextGraphFromSequence(), InitRowIdBuckets(), and IsLastGraphOfSequence().
|
protected |
Mapping of permanent row ids to physical id.
Definition at line 566 of file table.h.
Referenced by AddIdColumn(), AddJointRow(), AddNJointRowsMP(), Defrag(), GetMemUsedKB(), GetRowIdMap(), Reindex(), RemoveFirstRow(), RemoveRow(), and SpliceByGroup().
|
protected |
Table Schema.
Execution context includes a global string pool for all string values of tables in the current session. Access to the pool is done via Context.StringVals.
Definition at line 549 of file table.h.
Referenced by AddRow(), AddSchemaCol(), AddTable(), ChangeContext(), DenormalizeColName(), DenormalizeSchema(), Dump(), GenerateColTypeMap(), GetCollidingRows(), GetSchemaColName(), GetSchemaColType(), InitializeJointTable(), Intersection(), Minus(), ProjectInPlace(), Rename(), RenumberColName(), SaveSS(), SpliceByGroup(), Union(), and UnionAll().
|
protected |
Column (attribute) to serve as src nodes when constructing the graph.
Definition at line 589 of file table.h.
Referenced by BuildGraph(), GetSrcCol(), and SetSrcCol().
|
protected |
List of columns (attributes) to serve as source node attributes.
Definition at line 592 of file table.h.
Referenced by AddGraphAttribute(), AddGraphAttributeV(), BuildGraph(), GetSrcNodeFltAttrV(), GetSrcNodeIntAttrV(), and GetSrcNodeStrAttrV().
Data columns of integer mappings of string attributes.
Definition at line 560 of file table.h.
Referenced by AddJointRow(), AddNJointRowsMP(), AddNRows(), AddRow(), AddSelectedRows(), AddStrCol(), AddStrVal(), AddTable(), BuildGraph(), ChangeContext(), ColConcat(), ColConcatConst(), Defrag(), GetDstNodeStrAttrV(), GetEdgeStrAttrV(), GetMemUsedKB(), GetSrcNodeStrAttrV(), TRowIterator::GetStrMapById(), GetStrMapById(), TRowIterator::GetStrMapByName(), GetStrMapByName(), GetStrVal(), GroupByStrCol(), InitializeJointTable(), LoadTableShM(), PrintSize(), ProjectInPlace(), ResizeTable(), Save(), SelfSimJoinPerGroup(), SpliceByGroup(), StoreStrCol(), ThresholdJoinCountCollisions(), ThresholdJoinCountPerJoinKeyCollisions(), and TTable().
Indexes for String Columns.
Definition at line 569 of file table.h.
Referenced by GetStrRowIdxByMap(), and RequestIndexStrMap().
|
static protected |