SNAP Library 6.0, Developer Reference  2020-12-09 16:24:20
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
TTable Class Reference

Table class: Relational table with columnar data storage. More...
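As a rough illustration of what "columnar data storage" means, the sketch below keeps one vector per column, grouped by type, instead of one record per row. It uses only the C++ standard library. `MiniTable`, `ColType` and every member name are hypothetical and are not part of SNAP; the actual TTable layout may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column-oriented table; this is not SNAP code.
enum class ColType { Int, Flt };

struct MiniTable {
    // Column name -> (type, index among the columns of that type).
    std::unordered_map<std::string, std::pair<ColType, std::size_t>> colMap;
    std::vector<std::vector<int>> intCols;     // one vector per integer column
    std::vector<std::vector<double>> fltCols;  // one vector per float column
    std::size_t numRows = 0;

    void addIntCol(const std::string& name) {
        assert(colMap.count(name) == 0);  // column names must be unique
        colMap[name] = std::make_pair(ColType::Int, intCols.size());
        intCols.push_back(std::vector<int>(numRows, 0));
    }

    void addFltCol(const std::string& name) {
        assert(colMap.count(name) == 0);
        colMap[name] = std::make_pair(ColType::Flt, fltCols.size());
        fltCols.push_back(std::vector<double>(numRows, 0.0));
    }

    // Appends one row; values are passed per type, in column order.
    void addRow(const std::vector<int>& ints, const std::vector<double>& flts) {
        assert(ints.size() == intCols.size() && flts.size() == fltCols.size());
        for (std::size_t c = 0; c < ints.size(); ++c) intCols[c].push_back(ints[c]);
        for (std::size_t c = 0; c < flts.size(); ++c) fltCols[c].push_back(flts[c]);
        ++numRows;
    }

    int getIntVal(const std::string& name, std::size_t row) const {
        const auto& entry = colMap.at(name);
        assert(entry.first == ColType::Int && row < numRows);
        return intCols[entry.second][row];
    }

    double getFltVal(const std::string& name, std::size_t row) const {
        const auto& entry = colMap.at(name);
        assert(entry.first == ColType::Flt && row < numRows);
        return fltCols[entry.second][row];
    }
};
```

Reading a whole column then touches one contiguous vector, which is the usual motivation for this layout.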

#include <table.h>

Classes

class  TLoadVecInit
 

Public Member Functions

void AddIntCol (const TStr &ColName)
 Adds an integer column with name ColName. More...
 
void AddFltCol (const TStr &ColName)
 Adds a float column with name ColName. More...
 
void AddStrCol (const TStr &ColName)
 Adds a string column with name ColName. More...
 
void GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with integer values, using OpenMP multi-threading. More...
 
 TTable ()
 
 TTable (TTableContext *Context)
 
 TTable (const Schema &S, TTableContext *Context)
 
 TTable (TSIn &SIn, TTableContext *Context)
 
 TTable (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Constructor to build table out of a hash table of int->int. More...
 
 TTable (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Constructor to build table out of a hash table of int->float. More...
 
 TTable (const TTable &Table)
 Copy constructor. More...
 
 TTable (const TTable &Table, const TIntV &RowIds)
 
void SaveSS (const TStr &OutFNm)
 Saves table schema and content to a TSV file. More...
 
void SaveBin (const TStr &OutFNm)
 Saves table schema and content to a binary file. More...
 
void Save (TSOut &SOut)
 Saves table schema and content to a binary format. More...
 
void Dump (FILE *OutF=stdout) const
 Prints table contents to a text file. More...
 
void AddRow (const TTableRow &Row)
 Adds row with values taken from given TTableRow. More...
 
TTableContext * GetContext ()
 Returns the context. More...
 
TTableContext * ChangeContext (TTableContext *Context)
 Changes the current context. Moves all object items to the new context. More...
 
TInt GetColIdx (const TStr &ColName) const
 Gets index of column ColName among columns of the same type in the schema. More...
 
TInt GetIntVal (const TStr &ColName, const TInt &RowIdx)
 Gets the value of integer attribute ColName at row RowIdx. More...
 
TFlt GetFltVal (const TStr &ColName, const TInt &RowIdx)
 Gets the value of float attribute ColName at row RowIdx. More...
 
TStr GetStrVal (const TStr &ColName, const TInt &RowIdx) const
 Gets the value of string attribute ColName at row RowIdx. More...
 
TInt GetStrMapById (TInt ColIdx, TInt RowIdx) const
 Gets the integer mapping of the string at column ColIdx at row RowIdx. More...
 
TInt GetStrMapByName (const TStr &ColName, TInt RowIdx) const
 Gets the integer mapping of the string at column ColName at row RowIdx. More...
 
TStr GetStrValById (TInt ColIdx, TInt RowIdx) const
 Gets the value of the string attribute at column ColIdx at row RowIdx. More...
 
TStr GetStrValByName (const TStr &ColName, const TInt &RowIdx) const
 Gets the value of the string attribute at column ColName at row RowIdx. More...
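The "integer mapping" returned by GetStrMapById and GetStrMapByName suggests that string cells are stored as integer keys into a shared string pool. A minimal sketch of that idea follows, assuming a simple interning pool. `StringPool`, `intern` and `get` are hypothetical names, not SNAP API, and the real context pool may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string pool: each distinct string is stored once and
// referred to by a small integer key.
struct StringPool {
    std::vector<std::string> strings;          // key -> string
    std::unordered_map<std::string, int> ids;  // string -> key

    // Returns the key for s, adding s to the pool on first use.
    int intern(const std::string& s) {
        auto it = ids.find(s);
        if (it != ids.end()) return it->second;
        int id = static_cast<int>(strings.size());
        strings.push_back(s);
        ids.emplace(s, id);
        return id;
    }

    // Resolves a key back to its string.
    const std::string& get(int id) const {
        assert(id >= 0 && static_cast<std::size_t>(id) < strings.size());
        return strings[id];
    }
};
```

A string column can then be a plain vector of keys, and equality tests on strings reduce to integer comparisons.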
 
TIntV GetIntRowIdxByVal (const TStr &ColName, const TInt &Val) const
 Gets the rows containing Val in int column ColName. More...
 
TIntV GetStrRowIdxByMap (const TStr &ColName, const TInt &Map) const
 Gets the rows containing int mapping Map in str column ColName. More...
 
TIntV GetFltRowIdxByVal (const TStr &ColName, const TFlt &Val) const
 Gets the rows containing Val in flt column ColName. More...
 
TInt RequestIndexInt (const TStr &ColName)
 Creates Index for Int Column ColName. More...
 
TInt RequestIndexFlt (const TStr &ColName)
 Creates Index for Flt Column ColName. More...
 
TInt RequestIndexStrMap (const TStr &ColName)
 Creates Index for Str Column ColName. More...
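The RequestIndex* members pair naturally with the Get*RowIdxByVal lookups listed earlier. One plausible way to picture such an index is a hash map from each value to the rows that hold it, as sketched here. `IntIndex`, `buildIntIndex` and `rowsWithValue` are hypothetical helpers, and nothing here is taken from the SNAP implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical index type: value -> indices of the rows holding that value.
using IntIndex = std::unordered_map<int, std::vector<std::size_t>>;

// Builds the index in one pass over the column.
IntIndex buildIntIndex(const std::vector<int>& col) {
    IntIndex index;
    for (std::size_t row = 0; row < col.size(); ++row) {
        index[col[row]].push_back(row);
    }
    return index;
}

// Returns the rows containing val, or an empty vector if val is absent.
std::vector<std::size_t> rowsWithValue(const IntIndex& index, int val) {
    auto it = index.find(val);
    if (it == index.end()) return {};
    assert(!it->second.empty());  // entries are only created when a row is added
    return it->second;
}
```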
 
TStr GetStr (const TInt &KeyId) const
 Gets the string with KeyId. More...
 
TInt GetIntValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx)
 Gets the integer value at column ColIdx and row RowIdx. More...
 
TFlt GetFltValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx)
 Gets the float value at column ColIdx and row RowIdx. More...
 
Schema GetSchema ()
 Gets the schema of this table. More...
 
TVec< PNEANet > ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
 Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize. More...
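To make the roles of WindowSize and JumpSize concrete, the sketch below buckets row indices into sliding windows over an integer column. It assumes window k covers the half-open range [StartVal + k*JumpSize, StartVal + k*JumpSize + WindowSize), clipped at EndVal. That boundary convention is an assumption for illustration; `bucketByWindow` is a hypothetical helper and the real ToGraphSequence may treat the bounds differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: assigns row indices to sliding windows over vals.
// Explicit start and end bounds are required in this sketch.
std::vector<std::vector<std::size_t>> bucketByWindow(
        const std::vector<int>& vals, int start, int end, int window, int jump) {
    assert(window > 0 && jump > 0 && start <= end);
    std::vector<std::vector<std::size_t>> buckets;
    for (long long lo = start; lo <= end; lo += jump) {
        const long long hi = lo + window;  // exclusive upper bound of this window
        std::vector<std::size_t> rows;
        for (std::size_t i = 0; i < vals.size(); ++i) {
            if (vals[i] >= lo && vals[i] < hi && vals[i] <= end) rows.push_back(i);
        }
        buckets.push_back(rows);
    }
    return buckets;
}
```

With a jump smaller than the window, consecutive buckets overlap, so the same row can contribute to more than one graph in the sequence.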
 
TVec< PNEANet > ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
 Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals. More...
 
TVec< PNEANet > ToGraphPerGroup (TStr GroupAttr, TAttrAggr AggrPolicy)
 Creates a sequence of graphs based on grouping specified by GroupAttr. More...
 
PNEANet ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
 Creates the graph sequence one at a time. More...
 
PNEANet ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
 Creates the graph sequence one at a time. More...
 
PNEANet ToGraphPerGroupIterator (TStr GroupAttr, TAttrAggr AggrPolicy)
 Creates the graph sequence one at a time. More...
 
PNEANet NextGraphIterator ()
 Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions. More...
 
TBool IsLastGraphOfSequence ()
 Checks if the end of the graph sequence is reached. More...
 
TStr GetSrcCol () const
 Gets the name of the column to be used as src nodes in the graph. More...
 
void SetSrcCol (const TStr &Src)
 Sets the name of the column to be used as src nodes in the graph. More...
 
TStr GetDstCol () const
 Gets the name of the column to be used as dst nodes in the graph. More...
 
void SetDstCol (const TStr &Dst)
 Sets the name of the column to be used as dst nodes in the graph. More...
 
void AddEdgeAttr (const TStr &Attr)
 Adds column to be used as graph edge attribute. More...
 
void AddEdgeAttr (TStrV &Attrs)
 Adds columns to be used as graph edge attributes. More...
 
void AddSrcNodeAttr (const TStr &Attr)
 Adds column to be used as src node attribute of the graph. More...
 
void AddSrcNodeAttr (TStrV &Attrs)
 Adds columns to be used as src node attributes of the graph. More...
 
void AddDstNodeAttr (const TStr &Attr)
 Adds column to be used as dst node attribute of the graph. More...
 
void AddDstNodeAttr (TStrV &Attrs)
 Adds columns to be used as dst node attributes of the graph. More...
 
void AddNodeAttr (const TStr &Attr)
 Handles the common case where src and dst both belong to the same "universe" of entities. More...
 
void AddNodeAttr (TStrV &Attrs)
 Handles the common case where src and dst both belong to the same "universe" of entities. More...
 
void SetCommonNodeAttrs (const TStr &SrcAttr, const TStr &DstAttr, const TStr &CommonAttrName)
 Sets the columns to be used as both src and dst node attributes. More...
 
TStrV GetSrcNodeIntAttrV () const
 Gets src node int attribute name vector. More...
 
TStrV GetDstNodeIntAttrV () const
 Gets dst node int attribute name vector. More...
 
TStrV GetEdgeIntAttrV () const
 Gets edge int attribute name vector. More...
 
TStrV GetSrcNodeFltAttrV () const
 Gets src node float attribute name vector. More...
 
TStrV GetDstNodeFltAttrV () const
 Gets dst node float attribute name vector. More...
 
TStrV GetEdgeFltAttrV () const
 Gets edge float attribute name vector. More...
 
TStrV GetSrcNodeStrAttrV () const
 Gets src node str attribute name vector. More...
 
TStrV GetDstNodeStrAttrV () const
 Gets dst node str attribute name vector. More...
 
TStrV GetEdgeStrAttrV () const
 Gets edge str attribute name vector. More...
 
TAttrType GetColType (const TStr &ColName) const
 Gets type of column ColName. More...
 
TInt GetNumRows () const
 Gets total number of rows in this table. More...
 
TInt GetNumValidRows () const
 Gets number of valid, i.e. not deleted, rows in this table. More...
 
THash< TInt, TInt > GetRowIdMap () const
 Gets a map of logical to physical row ids. More...
 
TRowIterator BegRI () const
 Gets iterator to the first valid row of the table. More...
 
TRowIterator EndRI () const
 Gets iterator to the last valid row of the table. More...
 
TRowIteratorWithRemove BegRIWR ()
 Gets iterator with remove to the first valid row. More...
 
TRowIteratorWithRemove EndRIWR ()
 Gets iterator with remove to the last valid row. More...
 
void GetPartitionRanges (TIntPrV &Partitions, TInt NumPartitions) const
 Partitions the table into NumPartitions and populates Partitions with the ranges. More...
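A minimal sketch of range partitioning follows, assuming contiguous row indices and inclusive (start, end) pairs. Both conventions are assumptions: `partitionRanges` is a hypothetical helper, and the real GetPartitionRanges may walk only valid rows and define its ranges differently.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: splits rows 0..numRows-1 into at most numPartitions
// contiguous ranges of nearly equal size. Each pair is (first, last), inclusive.
std::vector<std::pair<int, int>> partitionRanges(int numRows, int numPartitions) {
    assert(numRows >= 0 && numPartitions > 0);
    std::vector<std::pair<int, int>> ranges;
    const int base = numRows / numPartitions;   // minimum rows per partition
    const int extra = numRows % numPartitions;  // first `extra` partitions get one more
    int start = 0;
    for (int p = 0; p < numPartitions && start < numRows; ++p) {
        const int len = base + (p < extra ? 1 : 0);
        ranges.emplace_back(start, start + len - 1);
        start += len;
    }
    return ranges;
}
```

Ranges like these are what a multi-threaded loop would hand out, one per worker.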
 
void Rename (const TStr &Column, const TStr &NewLabel)
 Renames a column. More...
 
void Unique (const TStr &Col)
 Removes rows with duplicate values in given column. More...
 
void Unique (const TStrV &Cols, TBool Ordered=true)
 Removes rows with duplicate values in given columns. More...
 
void Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true)
 Selects rows that satisfy given Predicate. More...
 
void Select (TPredicate &Predicate)
 
void Classify (TPredicate &Predicate, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true)
 Selects rows using atomic compare operation. More...
 
void SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp)
 
void ClassifyAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true)
 Selects rows where the value of Col matches given primitive Val. More...
 
template<class T >
void SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp)
 
template<class T >
void SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, PTable &SelectedTable)
 
template<class T >
void ClassifyAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp)
 
void SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp, PTable &SelectedTable)
 
void SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp)
 
void SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp, PTable &SelectedTable)
 
void SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp)
 
void SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp, PTable &SelectedTable)
 
void Group (const TStrV &GroupBy, const TStr &GroupColName, TBool Ordered=true, TBool UsePhysicalIds=true)
 Groups rows depending on values of GroupBy columns. More...
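Grouping on several columns can be pictured as assigning every row a group id derived from its composite key. The sketch below does this for integer key columns and returns what would be stored in the GroupColName column. `groupIds` is a hypothetical helper; the numbering of groups in first-seen order is an assumption, and the Ordered and UsePhysicalIds flags are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper: keyCols[c][r] is the value of key column c at row r.
// Rows with identical composite keys receive the same group id.
std::vector<int> groupIds(const std::vector<std::vector<int>>& keyCols) {
    assert(!keyCols.empty());
    const std::size_t n = keyCols[0].size();
    for (const auto& col : keyCols) {
        assert(col.size() == n);  // all key columns must have the same length
    }
    std::map<std::vector<int>, int> seen;  // composite key -> group id
    std::vector<int> ids(n);
    for (std::size_t r = 0; r < n; ++r) {
        std::vector<int> key;
        for (const auto& col : keyCols) key.push_back(col[r]);
        auto it = seen.find(key);
        if (it == seen.end()) {
            // First time this key is seen: allocate the next group id.
            it = seen.emplace(key, static_cast<int>(seen.size())).first;
        }
        ids[r] = it->second;
    }
    return ids;
}
```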
 
void Count (const TStr &CountColName, const TStr &Col)
 Counts number of unique elements. More...
 
void Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true)
 Orders the rows according to the values in columns of OrderBy (in descending lexicographic order). More...
 
void Aggregate (const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true)
 Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as new attribute ResAttr. More...
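The group-then-aggregate step can be sketched as follows, under the assumption that every row receives the aggregate of its own group in the result column. The `Aggr` enum is a hypothetical stand-in for TAttrAggr with only four policies, and `aggregateVector` and `aggregateByGroup` are illustrative helpers, not SNAP functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Hypothetical aggregation policies (a small subset chosen for illustration).
enum class Aggr { Sum, Min, Max, Count };

// Reduces a non-empty vector to one scalar according to the policy.
double aggregateVector(const std::vector<double>& v, Aggr policy) {
    assert(!v.empty());
    switch (policy) {
        case Aggr::Sum:   return std::accumulate(v.begin(), v.end(), 0.0);
        case Aggr::Min:   return *std::min_element(v.begin(), v.end());
        case Aggr::Max:   return *std::max_element(v.begin(), v.end());
        case Aggr::Count: return static_cast<double>(v.size());
    }
    return 0.0;  // unreachable for valid policies
}

// Returns a new column in which each row holds the aggregate of its group.
std::vector<double> aggregateByGroup(const std::vector<int>& groupOfRow,
                                     const std::vector<double>& vals,
                                     Aggr policy) {
    assert(groupOfRow.size() == vals.size());
    std::map<int, std::vector<double>> members;  // group id -> member values
    for (std::size_t r = 0; r < vals.size(); ++r) {
        members[groupOfRow[r]].push_back(vals[r]);
    }
    std::map<int, double> result;  // group id -> aggregated value
    for (const auto& kv : members) {
        result[kv.first] = aggregateVector(kv.second, policy);
    }
    std::vector<double> out(vals.size());
    for (std::size_t r = 0; r < vals.size(); ++r) out[r] = result[groupOfRow[r]];
    return out;
}
```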
 
void AggregateCols (const TStrV &AggrAttrs, TAttrAggr AggOp, const TStr &ResAttr)
 Aggregates attributes in AggrAttrs across columns. More...
 
TVec< PTable > SpliceByGroup (const TStrV &GroupByAttrs, TBool Ordered=true)
 Splices table into subtables according to a grouping statement. More...
 
PTable Join (const TStr &Col1, const TTable &Table, const TStr &Col2)
 Performs equijoin. More...
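An equijoin pairs every row of one table with every row of the other that has an equal value in the join column. The sketch below uses a hash join over two integer columns and returns matching row-index pairs. This is a common technique chosen for illustration, not a description of how TTable::Join is implemented, and `equiJoin` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns (leftRow, rightRow) for every pair of rows
// whose join-column values are equal.
std::vector<std::pair<std::size_t, std::size_t>> equiJoin(
        const std::vector<int>& left, const std::vector<int>& right) {
    // Build phase: hash the right column, value -> row indices.
    std::unordered_map<int, std::vector<std::size_t>> index;
    for (std::size_t j = 0; j < right.size(); ++j) index[right[j]].push_back(j);

    // Probe phase: look up each left value and emit all matches.
    std::vector<std::pair<std::size_t, std::size_t>> out;
    for (std::size_t i = 0; i < left.size(); ++i) {
        auto it = index.find(left[i]);
        if (it == index.end()) continue;
        for (std::size_t j : it->second) out.emplace_back(i, j);
    }
    assert(out.size() <= left.size() * right.size());  // bounded by the cross product
    return out;
}
```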
 
PTable Join (const TStr &Col1, const PTable &Table, const TStr &Col2)
 
PTable ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false)
 
PTable SelfJoin (const TStr &Col)
 Joins table with itself, on values of Col. More...
 
PTable SelfSimJoin (const TStrV &Cols, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 
PTable SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
PTable SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
PTable SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
void SelectFirstNRows (const TInt &N)
 Selects first N rows from the table. More...
 
void Defrag ()
 Releases memory of deleted rows, and defrags. More...
 
void StoreIntCol (const TStr &ColName, const TIntV &ColVals)
 Adds entire int column to table. More...
 
void StoreFltCol (const TStr &ColName, const TFltV &ColVals)
 Adds entire flt column to table. More...
 
void StoreStrCol (const TStr &ColName, const TStrV &ColVals)
 Adds entire str column to table. More...
 
void UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
 
void UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
 
void SetFltColToConstMP (TInt UpdateColIdx, TFlt DefaultFltVal)
 
PTable Union (const TTable &Table)
 Returns union of this table with given Table. More...
 
PTable Union (const PTable &Table)
 
PTable UnionAll (const TTable &Table)
 Returns union of this table with given Table, preserving duplicates. More...
 
PTable UnionAll (const PTable &Table)
 
void UnionAllInPlace (const TTable &Table)
 Same as TTable::ConcatTable. More...
 
void UnionAllInPlace (const PTable &Table)
 
PTable Intersection (const TTable &Table)
 Returns intersection of this table with given Table. More...
 
PTable Intersection (const PTable &Table)
 
PTable Minus (TTable &Table)
 Returns table with rows that are present in this table but not in given Table. More...
 
PTable Minus (const PTable &Table)
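The set-style operations above differ mainly in how they treat duplicates and membership. The sketch below models rows as integer tuples. `Row`, `Rows` and the four functions are hypothetical, and the duplicate handling inside the intersection and difference is an assumption, since the summaries above do not specify it.

```cpp
#include <cassert>
#include <set>
#include <vector>

using Row = std::vector<int>;
using Rows = std::vector<Row>;

// Concatenation that preserves duplicates (the UnionAll idea).
Rows unionAll(const Rows& a, const Rows& b) {
    assert(a.empty() || b.empty() || a[0].size() == b[0].size());  // same width
    Rows out = a;
    out.insert(out.end(), b.begin(), b.end());
    return out;
}

// Each distinct row once, in first-seen order (the Union idea).
Rows unionDistinct(const Rows& a, const Rows& b) {
    std::set<Row> seen;
    Rows out;
    for (const Row& r : unionAll(a, b)) {
        if (seen.insert(r).second) out.push_back(r);
    }
    return out;
}

// Rows of a that also occur in b (assumption: duplicates in a are kept).
Rows rowsIntersection(const Rows& a, const Rows& b) {
    std::set<Row> inB(b.begin(), b.end());
    Rows out;
    for (const Row& r : a) {
        if (inB.count(r) > 0) out.push_back(r);
    }
    return out;
}

// Rows of a that do not occur in b (assumption: duplicates in a are kept).
Rows rowsMinus(const Rows& a, const Rows& b) {
    std::set<Row> inB(b.begin(), b.end());
    Rows out;
    for (const Row& r : a) {
        if (inB.count(r) == 0) out.push_back(r);
    }
    return out;
}
```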
 
PTable Project (const TStrV &ProjectCols)
 Returns table with only the columns in ProjectCols. More...
 
void ProjectInPlace (const TStrV &ProjectCols)
 Keeps only the columns specified in ProjectCols. More...
 
void ColGenericOp (const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
 Performs columnwise arithmetic operation. More...
 
void ColGenericOpMP (TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op)
 
void ColAdd (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise addition. See TTable::ColGenericOp. More...
 
void ColSub (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise subtraction. See TTable::ColGenericOp. More...
 
void ColMul (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise multiplication. See TTable::ColGenericOp. More...
 
void ColDiv (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise division. See TTable::ColGenericOp. More...
 
void ColMod (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise modulus. See TTable::ColGenericOp. More...
 
void ColMin (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs min of two columns. See TTable::ColGenericOp. More...
 
void ColMax (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs max of two columns. See TTable::ColGenericOp. More...
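ColAdd through ColMax all defer to ColGenericOp, which suggests one element-wise loop parameterized by an operation code. A minimal sketch over integer columns follows. `ArithOp` here is a hypothetical enum standing in for TArithOp, `applyOp` and `colGenericOp` are illustrative helpers, and type promotion between int and float columns is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operation codes for the element-wise loop.
enum class ArithOp { Add, Sub, Mul, Div, Mod, Min, Max };

// Applies one operation to a pair of values.
int applyOp(int a, int b, ArithOp op) {
    switch (op) {
        case ArithOp::Add: return a + b;
        case ArithOp::Sub: return a - b;
        case ArithOp::Mul: return a * b;
        case ArithOp::Div: assert(b != 0); return a / b;  // integer division
        case ArithOp::Mod: assert(b != 0); return a % b;
        case ArithOp::Min: return std::min(a, b);
        case ArithOp::Max: return std::max(a, b);
    }
    return 0;  // unreachable for valid operation codes
}

// Produces a result column from two input columns of equal length.
std::vector<int> colGenericOp(const std::vector<int>& c1,
                              const std::vector<int>& c2, ArithOp op) {
    assert(c1.size() == c2.size());
    std::vector<int> res(c1.size());
    for (std::size_t r = 0; r < c1.size(); ++r) res[r] = applyOp(c1[r], c2[r], op);
    return res;
}
```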
 
void ColGenericOp (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr, TArithOp op, TBool AddToFirstTable)
 Performs columnwise arithmetic operation with column of given table. More...
 
void ColAdd (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise addition with column of given table. More...
 
void ColSub (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise subtraction with column of given table. More...
 
void ColMul (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise multiplication with column of given table. More...
 
void ColDiv (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise division with column of given table. More...
 
void ColMod (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise modulus with column of given table. More...
 
void ColGenericOp (const TStr &Attr1, const TFlt &Num, const TStr &ResAttr, TArithOp op, const TBool floatCast)
 Performs arithmetic op of column values and given Num. More...
 
void ColGenericOpMP (const TInt &ColIdx1, const TInt &ColIdx2, TAttrType ArgType, const TFlt &Num, TArithOp op, TBool ShouldCast)
 
void ColAdd (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs addition of column values and given Num. More...
 
void ColSub (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs subtraction of column values and given Num. More...
 
void ColMul (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs multiplication of column values and given Num. More...
 
void ColDiv (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs division of column values and given Num. More...
 
void ColMod (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs modulus of column values and given Num. More...
 
void ColConcat (const TStr &Attr1, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="")
 Concatenates two string columns. More...
 
void ColConcat (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="", TBool AddToFirstTable=true)
 Concatenates string column with column of given table. More...
 
void ColConcatConst (const TStr &Attr1, const TStr &Val, const TStr &Sep="", const TStr &ResAttr="")
 Concatenates column values with given string value. More...
 
void ReadIntCol (const TStr &ColName, TIntV &Result) const
 Reads values of entire int column into Result. More...
 
void ReadFltCol (const TStr &ColName, TFltV &Result) const
 Reads values of entire float column into Result. More...
 
void ReadStrCol (const TStr &ColName, TStrV &Result) const
 Reads values of entire string column into Result. More...
 
void InitIds ()
 Adds explicit row ids and initializes the hash set mapping ids to physical rows. More...
 
PTable IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="")
 Distance based filter. More...
 
void PrintSize ()
 
void PrintContextSize ()
 
TSize GetMemUsedKB ()
 Returns approximate memory used by table in [KB]. More...
 
TSize GetContextMemUsedKB ()
 Returns approximate memory used by table context in [KB]. More...
 

Static Public Member Functions

static void SetMP (TInt Value)
 
static TInt GetMP ()
 
static TStr NormalizeColName (const TStr &ColName)
 Adds suffix to column name if it doesn't exist. More...
 
static TStrV NormalizeColNameV (const TStrV &Cols)
 Adds suffix to column name if it doesn't exist. More...
 
static PTable New ()
 
static PTable New (TTableContext *Context)
 
static PTable New (const Schema &S, TTableContext *Context)
 
static PTable New (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Returns pointer to a table constructed from given int->int hash. More...
 
static PTable New (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Returns pointer to a table constructed from given int->float hash. More...
 
static PTable New (const PTable Table)
 Returns pointer to a new table created from given Table. More...
 
static void GetSchema (const TStr &InFNm, Schema &S, const char &Separator= '\t')
 Reads the schema from file InFNm into S, using Separator as the field delimiter. More...
 
static PTable LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const char &Separator= '\t', TBool HasTitleLine=false)
 Loads table from spread sheet (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment title lines instead. More...
 
static PTable LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const TIntV &RelevantCols, const char &Separator= '\t', TBool HasTitleLine=false)
 Loads table from spread sheet - but only load the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment title lines instead. More...
 
static PTable Load (TSIn &SIn, TTableContext *Context)
 Loads table from a binary format. More...
 
static PTable LoadShM (TShMIn &ShMIn, TTableContext *Context)
 Static constructor to load table from memory. More...
 
static PTable TableFromHashMap (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Builds table from hash table of int->int. More...
 
static PTable TableFromHashMap (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Builds table from hash table of int->float. More...
 
static PTable GetNodeTable (const PNEANet &Network, TTableContext *Context)
 Extracts node TTable from PNEANet. More...
 
static PTable GetEdgeTable (const PNEANet &Network, TTableContext *Context)
 Extracts edge TTable from PNEANet. More...
 
static PTable GetEdgeTablePN (const PNGraphMP &Network, TTableContext *Context)
 Extracts edge TTable from parallel graph PNGraphMP. More...
 
static PTable GetFltNodePropertyTable (const PNEANet &Network, const TIntFltH &Property, const TStr &NodeAttrName, const TAttrType &NodeAttrType, const TStr &PropertyAttrName, TTableContext *Context)
 Extracts node and edge property TTables from THash. More...
 

Protected Member Functions

void InvalidatePhysicalGroupings ()
 
void InvalidateAffectedGroupings (const TStr &Attr)
 
void IncrementNext ()
 Increments the next vector and sets last, NumRows and NumValidRows. More...
 
void ClassifyAux (const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 Adds a label attribute with positive labels on selected rows and negative labels on the rest. More...
 
const char * GetContextKey (TInt Val) const
 Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp. More...
 
TStr GetStrValIdx (TInt ColIdx, TInt RowIdx) const
 Gets the value in column with id ColIdx at row RowIdx. More...
 
void AddStrVal (const TInt &ColIdx, const TStr &Val)
 Adds Val in column with id ColIdx. More...
 
void AddStrVal (const TStr &Col, const TStr &Val)
 Adds Val in column with name Col. More...
 
TStr GetIdColName () const
 Gets name of the id column of this table. More...
 
TStr GetSchemaColName (TInt Idx) const
 Gets name of the column with index Idx in the schema. More...
 
TAttrType GetSchemaColType (TInt Idx) const
 Gets type of the column with index Idx in the schema. More...
 
void AddSchemaCol (const TStr &ColName, TAttrType ColType)
 Adds column with name ColName and type ColType to the schema. More...
 
TBool IsColName (const TStr &ColName) const
 
void AddColType (const TStr &ColName, TPair< TAttrType, TInt > ColType)
 Adds column with name ColName and type ColType to the ColTypeMap. More...
 
void AddColType (const TStr &ColName, TAttrType ColType, TInt Index)
 Adds column with name ColName and type ColType to the ColTypeMap. More...
 
void DelColType (const TStr &ColName)
 Removes column with name ColName from the ColTypeMap. More...
 
TPair< TAttrType, TInt > GetColTypeMap (const TStr &ColName) const
 Gets column type and index of ColName. More...
 
TStr RenumberColName (const TStr &ColName) const
 Returns a re-numbered column name based on number of existing columns with conflicting names. More...
 
TStr DenormalizeColName (const TStr &ColName) const
 Removes suffix from column name if it exists. More...
 
Schema DenormalizeSchema () const
 Removes suffix from column names in the Schema. More...
 
TBool IsAttr (const TStr &Attr)
 Checks if Attr is an attribute of this table schema. More...
 
void AddTable (const TTable &T)
 Adds all the rows of the input table. Allows duplicate rows (not a union). More...
 
void ConcatTable (const PTable &T)
 Appends all rows of T to this table and recalculates indices. More...
 
void AddRowI (const TRowIterator &RI)
 Adds row corresponding to RI. More...
 
void AddRowV (const TIntV &IntVals, const TFltV &FltVals, const TStrV &StrVals)
 Adds row with values corresponding to the given vectors by type. More...
 
void AddGraphAttribute (const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst)
 Adds names of columns to be used as graph attributes. More...
 
void AddGraphAttributeV (TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst)
 Adds vector of names of columns to be used as graph attributes. More...
 
void CheckAndAddIntNode (PNEANet Graph, THashSet< TInt > &NodeVals, TInt NodeId)
 Checks if given NodeId was seen earlier; if not, adds it to Graph and hashmap NodeVals. More...
 
template<class T >
TInt CheckAndAddFltNode (T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal)
 Checks if given NodeVal was seen earlier; if not, adds it to Graph and hashmap NodeVals. More...
 
void AddEdgeAttributes (PNEANet &Graph, int RowId)
 Adds attributes of edge corresponding to RowId to the Graph. More...
 
void AddNodeAttributes (TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs)
 Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values). More...
 
PNEANet BuildGraph (const TIntV &RowIds, TAttrAggr AggrPolicy)
 Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes. More...
 
void InitRowIdBuckets (int NumBuckets)
 Initializes the RowIdBuckets vector which will be used for the graph sequence creation. More...
 
void FillBucketsByWindow (TStr SplitAttr, TInt JumpSize, TInt WindowSize, TInt StartVal, TInt EndVal)
 Fills RowIdBuckets with sets of row ids. More...
 
void FillBucketsByInterval (TStr SplitAttr, TIntPrV SplitIntervals)
 Fills RowIdBuckets with sets of row ids. More...
 
TVec< PNEANet > GetGraphsFromSequence (TAttrAggr AggrPolicy)
 Returns a sequence of graphs. More...
 
PNEANet GetFirstGraphFromSequence (TAttrAggr AggrPolicy)
 Returns the first graph of the sequence. More...
 
PNEANet GetNextGraphFromSequence ()
 Returns the next graph in sequence corresponding to RowIdBuckets. More...
 
template<class T >
T AggregateVector (TVec< T > &V, TAttrAggr Policy)
 Aggregates vector into a single scalar value according to a policy. More...
 
void GroupingSanityCheck (const TStr &GroupBy, const TAttrType &AttrType) const
 Checks if grouping key exists and matches given attr type. More...
 
template<class T >
void GroupByIntCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with integer values. More...
 
template<class T >
void GroupByFltCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with float values. Returns hash table with grouping. More...
 
template<class T >
void GroupByStrCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with string values. Returns hash table with grouping. More...
 
template<class T >
void UpdateGrouping (THash< T, TIntV > &Grouping, T Key, TInt Val) const
 Template for utility function to update a grouping hash map. More...
 
template<class T >
void UpdateGrouping (THashMP< T, TIntV > &Grouping, T Key, TInt Val) const
 Template for utility function to update a parallel grouping hash map. More...
 
void PrintGrouping (const THash< TGroupKey, TIntV > &Grouping) const
 
TInt CompareRows (TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true)
 Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More...
 
TInt CompareRows (TInt R1, TInt R2, const TVec< TAttrType > &CompareByTypes, const TIntV &CompareByIndices, TBool Asc=true)
 Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More...
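The strcmp-style contract of CompareRows extends to several columns by comparing them in priority order and stopping at the first difference. The sketch below shows that for integer columns and uses it to sort row ids. `compareRows` and `sortedRowIds` are hypothetical helpers, and flipping the sign when `asc` is false is an assumption about how the Asc flag behaves.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: cols[c][r] is the value of sort column c at row r.
// Returns >0 if row r1 is bigger, <0 if row r2 is bigger, 0 if equal.
int compareRows(const std::vector<std::vector<int>>& cols,
                std::size_t r1, std::size_t r2, bool asc = true) {
    for (const auto& col : cols) {
        assert(r1 < col.size() && r2 < col.size());
        if (col[r1] != col[r2]) {
            const int cmp = col[r1] < col[r2] ? -1 : 1;
            return asc ? cmp : -cmp;  // assumption: descending flips the sign
        }
    }
    return 0;  // equal on every sort column
}

// Sorts row ids, not the data itself, using the comparator above.
std::vector<std::size_t> sortedRowIds(const std::vector<std::vector<int>>& cols,
                                      bool asc = true) {
    const std::size_t n = cols.empty() ? 0 : cols[0].size();
    std::vector<std::size_t> ids(n);
    std::iota(ids.begin(), ids.end(), std::size_t{0});
    std::stable_sort(ids.begin(), ids.end(),
                     [&](std::size_t a, std::size_t b) {
                         return compareRows(cols, a, b, asc) < 0;
                     });
    return ids;
}
```

Sorting a vector of row ids leaves the column data in place, which matches the `TIntV &V` parameters of the sort helpers listed below.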
 
TInt GetPivot (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc)
 Gets pivot element for QSort. More...
 
TInt Partition (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc)
 Partitions vector for QSort. More...
 
void ISort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs insertion sort on given vector V. More...
 
void QSort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs QSort on given vector V. More...
 
void Merge (TIntV &V, TInt Idx1, TInt Idx2, TInt Idx3, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Helper function for parallel QSort. More...
 
void QSortPar (TIntV &V, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs QSort in parallel on given vector V. More...
 
bool IsRowValid (TInt RowIdx) const
 Checks if RowIdx corresponds to a valid (i.e. not deleted) row. More...
 
TInt GetLastValidRowIdx ()
 Gets the id of the last valid row of the table. More...
 
void RemoveFirstRow ()
 Removes first valid row of the table. More...
 
void RemoveRow (TInt RowIdx, TInt PrevRowIdx)
 Removes row with id RowIdx. More...
 
void KeepSortedRows (const TIntV &KeepV)
 Removes all rows that are not mentioned in the SORTED vector KeepV. More...
 
void SetFirstValidRow ()
 Sets the first valid row of the TTable. More...
 
PTable InitializeJointTable (const TTable &Table)
 Initializes an empty table for the join of this table with the given table. More...
 
void AddJointRow (const TTable &T1, const TTable &T2, TInt RowIdx1, TInt RowIdx2)
 Adds joint row T1[RowIdx1]<=>T2[RowIdx2]. More...
 
void ThresholdJoinInputCorrectness (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2)
 
void ThresholdJoinCountCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntPr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType)
 
PTable ThresholdJoinOutputTable (const THash< TIntPr, TIntTr > &Counters, TInt Threshold, const TTable &Table)
 
void ThresholdJoinCountPerJoinKeyCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntTr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType)
 
PTable ThresholdJoinPerJoinKeyOutputTable (const THash< TIntTr, TIntTr > &Counters, TInt Threshold, const TTable &Table)
 
void ResizeTable (int RowCount)
 Resizes the table to hold RowCount rows. More...
 
int GetEmptyRowsStart (int NewRows)
 Gets the start index to a chunk of empty rows of size NewRows. More...
 
void AddSelectedRows (const TTable &Table, const TIntV &RowIDs)
 Adds rows from Table that correspond to ids in RowIDs. More...
 
void AddNRows (int NewRows, const TVec< TIntV > &IntColsP, const TVec< TFltV > &FltColsP, const TVec< TIntV > &StrColMapsP)
 Adds NewRows rows from the given vectors for each column type. More...
 
void AddNJointRowsMP (const TTable &T1, const TTable &T2, const TVec< TIntPrV > &JointRowIDSet)
 Adds rows from T1 and T2 to this table in a parallel manner. Used by Join. More...
 
void UpdateTableForNewRow ()
 Updates table state after adding one or more rows. More...
 
void GroupAux (const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true)
 Helper function for grouping. More...
 
void StoreGroupCol (const TStr &GroupColName, const TVec< TPair< TInt, TInt > > &GroupAndRowIds)
 Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported. More...
 
void Reindex ()
 Reinitializes row ids. More...
 
void AddIdColumn (const TStr &IdColName)
 Adds a column of explicit integer identifiers to the rows. More...
 
void GetCollidingRows (const TTable &T, THashSet< TInt > &Collisions)
 Gets set of row ids of rows common with table T. More...
 

Static Protected Member Functions

static void LoadSSPar (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine)
 Loads data in parallel from the input file at InFNm into NewTable. Works only when NewTable has no string columns. More...
 
static void LoadSSSeq (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine)
 Sequentially loads data from input file at InFNm into NewTable. More...
 
static TInt CompareKeyVal (const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2)
 
static TInt CheckSortedKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static void ISortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static TInt GetPivotKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static TInt PartitionKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static void QSortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 

Protected Attributes

TTableContext * Context
 Execution Context. More...
 
Schema Sch
 Table Schema. More...
 
TCRef CRef
 
TInt NumRows
 Number of rows in the table (valid and invalid). More...
 
TInt NumValidRows
 Number of valid rows in the table (i.e. rows that were not logically removed). More...
 
TInt FirstValidRow
 Physical index of first valid row. More...
 
TInt LastValidRow
 Physical index of last valid row. More...
 
TIntV Next
 A vector describing the logical order of the rows. Next[i] is the successor of row i. Table iterators follow the order dictated by Next. More...
 
TVec< TIntV > IntCols
 Data columns of integer attributes. More...
 
TVec< TFltV > FltCols
 Data columns of floating point attributes. More...
 
TVec< TIntV > StrColMaps
 Data columns of integer mappings of string attributes. More...
 
THash< TStr, TPair< TAttrType, TInt > > ColTypeMap
 A mapping from column name to column type and column index among columns of the same type. More...
 
TStr IdColName
 
 
TIntIntH RowIdMap
 Mapping of permanent row ids to physical id. More...
 
THash< TStr, THash< TInt, TIntV > > IntColIndexes
 Indexes for Int Columns. More...
 
THash< TStr, THash< TInt, TIntV > > StrMapColIndexes
 Indexes for String Columns. More...
 
THash< TStr, THash< TFlt, TIntV > > FltColIndexes
 Indexes for Float Columns. More...
 
THash< TStr, GroupStmt > GroupStmtNames
 Maps user-given grouping statement names to their group-by attributes. More...
 
THash< GroupStmt, THash< TInt, TGroupKey > > GroupIDMapping
 Maps grouping statements to their (group id –> group-by key) mapping. More...
 
THash< GroupStmt, THash< TGroupKey, TIntV > > GroupMapping
 Maps grouping statements to their (group-by key –> group id) mapping. More...
 
TStr SrcCol
 Column (attribute) to serve as src nodes when constructing the graph. More...
 
TStr DstCol
 Column (attribute) to serve as dst nodes when constructing the graph. More...
 
TStrV EdgeAttrV
 List of columns (attributes) to serve as edge attributes. More...
 
TStrV SrcNodeAttrV
 List of columns (attributes) to serve as source node attributes. More...
 
TStrV DstNodeAttrV
 List of columns (attributes) to serve as destination node attributes. More...
 
TStrTrV CommonNodeAttrs
 List of attribute pairs with values common to source and destination and their common given name. More...
 
TVec< TIntV > RowIdBuckets
 Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs. More...
 
TInt CurrBucket
 Current row id bucket - used when generating a sequence of graphs using an iterator. More...
 
TAttrAggr AggrPolicy
 Aggregation policy used for solving conflicts between different values of an attribute of the same node. More...
 
TInt IsNextDirty
 Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges. More...
 

Static Protected Attributes

static const TInt Last = -1
 Special value for Next vector entry - last row in table. More...
 
static const TInt Invalid = -2
 Special value for Next vector entry - logically removed row. More...
 
static TInt UseMP = 1
 Global switch for choosing multi-threaded versions of TTable functions. More...
 

Private Member Functions

void GenerateColTypeMap (THash< TStr, TPair< TInt, TInt > > &ColTypeIntMap)
 
void LoadTableShM (TShMIn &ShMIn, TTableContext *ContextTable)
 

Friends

class TPt< TTable >
 
class TRowIterator
 
class TRowIteratorWithRemove
 
template<class PGraph >
PGraph TSnap::ToGraph (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy)
 
int TSnap::LoadCrossNet (TCrossNet &Graph, PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV)
 
int TSnap::LoadMode (TModeNet &Graph, PTable Table, const TStr &NCol, TStrV &NodeAttrV)
 
template<class PGraphMP >
PGraphMP TSnap::ToGraphMP (PTable Table, const TStr &SrcCol, const TStr &DstCol)
 
template<class PGraphMP >
PGraphMP TSnap::ToGraphMP3 (PTable Table, const TStr &SrcCol, const TStr &DstCol)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP2 (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy)
 

Detailed Description

Table class: Relational table with columnar data storage.

Definition at line 484 of file table.h.

Constructor & Destructor Documentation

TTable::TTable ( )

Definition at line 302 of file table.cpp.

Referenced by Load(), LoadShM(), and New().

302  : Context(new TTableContext), NumRows(0), NumValidRows(0),
303  FirstValidRow(0), LastValidRow(-1) {}

TTable::TTable ( TTableContext * Context )

Definition at line 305 of file table.cpp.

305  : Context(Context), NumRows(0),

TTable::TTable ( const Schema & S,
TTableContext * Context
)

Definition at line 308 of file table.cpp.

References AddColType(), AddSchemaCol(), atFlt, atInt, atStr, FltCols, IntCols, TVec< TVal, TSizeTy >::Len(), and StrColMaps.

308  : Context(Context),
310  TInt IntColCnt = 0;
311  TInt FltColCnt = 0;
312  TInt StrColCnt = 0;
313  for (TInt i = 0; i < TableSchema.Len(); i++) {
314  TStr ColName = TableSchema[i].Val1;
315  TAttrType ColType = TableSchema[i].Val2;
316  AddSchemaCol(ColName, ColType);
317  switch (ColType) {
318  case atInt:
319  AddColType(ColName, atInt, IntColCnt);
320  IntColCnt++;
321  break;
322  case atFlt:
323  AddColType(ColName, atFlt, FltColCnt);
324  FltColCnt++;
325  break;
326  case atStr:
327  AddColType(ColName, atStr, StrColCnt);
328  StrColCnt++;
329  break;
330  }
331  }
332  IntCols = TVec<TIntV>(IntColCnt);
333  FltCols = TVec<TFltV>(FltColCnt);
334  StrColMaps = TVec<TIntV>(StrColCnt);
335 }
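
The loop in this constructor assigns every column an index among the columns of its own type, which is what ColTypeMap stores. A minimal standalone sketch of that bookkeeping follows; AttrType, SchemaSketch, ColTypeMapSketch, ColCounts and BuildColTypeMap are hypothetical stand-ins built on the standard library, not SNAP types.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TAttrType.
enum AttrType { atInt, atFlt, atStr };

typedef std::vector<std::pair<std::string, AttrType> > SchemaSketch;
typedef std::map<std::string, std::pair<AttrType, int> > ColTypeMapSketch;

// Per-type column counts; in the listing these size IntCols, FltCols, StrColMaps.
struct ColCounts {
  int IntColCnt;
  int FltColCnt;
  int StrColCnt;
};

// Walks the schema once and records, for each column, its type and its
// index among the columns of that same type.
ColCounts BuildColTypeMap(const SchemaSketch& Schema, ColTypeMapSketch& ColTypeMap) {
  ColCounts Counts = {0, 0, 0};
  for (std::size_t i = 0; i < Schema.size(); i++) {
    const std::string& ColName = Schema[i].first;
    const AttrType ColType = Schema[i].second;
    assert(ColTypeMap.count(ColName) == 0);  // sketch-only check: unique names
    switch (ColType) {
      case atInt:
        ColTypeMap[ColName] = std::make_pair(ColType, Counts.IntColCnt);
        Counts.IntColCnt++;
        break;
      case atFlt:
        ColTypeMap[ColName] = std::make_pair(ColType, Counts.FltColCnt);
        Counts.FltColCnt++;
        break;
      case atStr:
        ColTypeMap[ColName] = std::make_pair(ColType, Counts.StrColCnt);
        Counts.StrColCnt++;
        break;
    }
  }
  return Counts;
}
```

For a schema (Src:int, Weight:float, Dst:int), the sketch maps Dst to (atInt, 1): it is the second integer column even though it is the third column overall.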

TTable::TTable ( TSIn & SIn,
TTableContext * Context
)

Definition at line 378 of file table.cpp.

References GenerateColTypeMap().

378  : Context(Context), NumRows(SIn),
379  NumValidRows(SIn), FirstValidRow(SIn), LastValidRow(SIn), Next(SIn), IntCols(SIn),
380  FltCols(SIn), StrColMaps(SIn) {
381  THash<TStr,TPair<TInt,TInt> > ColTypeIntMap(SIn);
382  GenerateColTypeMap(ColTypeIntMap);
383 }

TTable::TTable ( const THash< TInt, TInt > & H,
const TStr & Col1,
const TStr & Col2,
TTableContext * Context,
const TBool IsStrKeys = false
)

Constructor to build table out of a hash table of int->int.

Definition at line 385 of file table.cpp.

References AddColType(), AddSchemaCol(), atInt, atStr, THash< TKey, TDat, THashFunc >::GetDatV(), THash< TKey, TDat, THashFunc >::GetKeyV(), InitIds(), IntCols, IsNextDirty, Last, Next, NumRows, and StrColMaps.

386  : Context(Context), NumRows(H.Len()),
387  NumValidRows(H.Len()), FirstValidRow(0), LastValidRow(H.Len()-1) {
388  TAttrType KeyType = IsStrKeys ? atStr : atInt;
389  AddSchemaCol(Col1, KeyType);
390  AddSchemaCol(Col2, atInt);
391  AddColType(Col1, KeyType, 0);
392  AddColType(Col2, atInt, 1);
393  if (IsStrKeys) {
394  StrColMaps = TVec<TIntV>(1);
395  IntCols = TVec<TIntV>(1);
396  H.GetKeyV(StrColMaps[0]);
397  H.GetDatV(IntCols[0]);
398  } else {
399  IntCols = TVec<TIntV>(2);
400  H.GetKeyV(IntCols[0]);
401  H.GetDatV(IntCols[1]);
402  }
403  Next = TIntV(NumRows);
404  for (TInt i = 0; i < NumRows; i++) {
405  Next[i] = i+1;
406  }
407  Next[NumRows-1] = Last;
408  IsNextDirty = 0;
409  InitIds();
410 }
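
The non-string branch of this constructor splits the hash table into a key column and a value column, then links the rows in physical order. The following is a minimal standalone sketch of that layout, assuming std::unordered_map in place of THash; TwoColTableSketch and FromHash are hypothetical names, and the guard for an empty input is an addition of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical two-column table: IntCols[0] holds keys, IntCols[1] holds values.
struct TwoColTableSketch {
  std::vector<std::vector<int> > IntCols;
  std::vector<int> Next;  // Next[i] is the successor of row i, -1 marks the last row
  int NumRows;
};

TwoColTableSketch FromHash(const std::unordered_map<int, int>& H) {
  TwoColTableSketch T;
  T.NumRows = static_cast<int>(H.size());
  T.IntCols.resize(2);
  // One pass fills both columns, so row i always pairs a key with its own value.
  for (const auto& KeyDat : H) {
    T.IntCols[0].push_back(KeyDat.first);
    T.IntCols[1].push_back(KeyDat.second);
  }
  assert(T.IntCols[0].size() == T.IntCols[1].size());
  // Initial logical order equals physical order: 0 -> 1 -> ... -> NumRows-1.
  T.Next.resize(static_cast<std::size_t>(T.NumRows));
  for (int i = 0; i < T.NumRows; i++) { T.Next[i] = i + 1; }
  if (T.NumRows > 0) { T.Next[T.NumRows - 1] = -1; }  // guard added by this sketch
  return T;
}
```

The row order produced by the sketch follows the iteration order of the map, so callers should rely only on the key/value pairing within a row, not on any particular ordering of rows.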

TTable::TTable ( const THash< TInt, TFlt > & H,
const TStr & Col1,
const TStr & Col2,
TTableContext * Context,
const TBool IsStrKeys = false
)

Constructor to build table out of a hash table of int->float.

Definition at line 412 of file table.cpp.

References AddColType(), AddSchemaCol(), atFlt, atInt, atStr, FltCols, THash< TKey, TDat, THashFunc >::GetDatV(), THash< TKey, TDat, THashFunc >::GetKeyV(), InitIds(), IntCols, IsNextDirty, Last, Next, NumRows, and StrColMaps.

413  : Context(Context),
414  NumRows(H.Len()), NumValidRows(H.Len()), FirstValidRow(0), LastValidRow(H.Len()-1) {
415  TAttrType KeyType = IsStrKeys ? atStr : atInt;
416  AddSchemaCol(Col1, KeyType);
417  AddSchemaCol(Col2, atFlt);
418  AddColType(Col1, KeyType, 0);
419  AddColType(Col2, atFlt, 0);
420  if (IsStrKeys) {
421  StrColMaps = TVec<TIntV>(1);
422  H.GetKeyV(StrColMaps[0]);
423  } else {
424  IntCols = TVec<TIntV>(1);
425  H.GetKeyV(IntCols[0]);
426  }
427  FltCols = TVec<TFltV>(1);
428  H.GetDatV(FltCols[0]);
429  Next = TIntV(NumRows);
430  for (TInt i = 0; i < NumRows; i++) {
431  Next[i] = i+1;
432  }
433  Next[NumRows-1] = Last;
434  IsNextDirty = 0;
435  InitIds();
436 }

TTable::TTable ( const TTable & Table )
inline

Copy constructor.

Definition at line 919 of file table.h.

919  : Context(Table.Context), Sch(Table.Sch),
921  LastValidRow(Table.LastValidRow), Next(Table.Next), IntCols(Table.IntCols),
922  FltCols(Table.FltCols), StrColMaps(Table.StrColMaps), ColTypeMap(Table.ColTypeMap),
925  SrcCol(Table.SrcCol), DstCol(Table.DstCol),
928  IsNextDirty(Table.IsNextDirty) {}

TTable::TTable ( const TTable & Table,
const TIntV & RowIds
)

Definition at line 438 of file table.cpp.

References AddSelectedRows(), ColTypeMap, FirstValidRow, FltCols, InitIds(), IntCols, IsNextDirty, LastValidRow, TVec< TVal, TSizeTy >::Len(), NumRows, NumValidRows, and StrColMaps.

438  : Context(Table.Context),
439  Sch(Table.Sch), SrcCol(Table.SrcCol), DstCol(Table.DstCol), EdgeAttrV(Table.EdgeAttrV),
442  ColTypeMap = Table.ColTypeMap;
443  IntCols = TVec<TIntV>(Table.IntCols.Len());
444  FltCols = TVec<TFltV>(Table.FltCols.Len());
446  FirstValidRow = 0;
447  LastValidRow = -1;
448  NumRows = 0;
449  NumValidRows = 0;
450  AddSelectedRows(Table, RowIDs);
451  IsNextDirty = 0;
452  InitIds();
453 }

Member Function Documentation

void TTable::AddColType ( const TStr & ColName,
TPair< TAttrType, TInt > ColType
)
inline protected

Adds column with name ColName and type ColType to the ColTypeMap.

Definition at line 651 of file table.h.

References THash< TKey, TDat, THashFunc >::AddDat(), and NormalizeColName().

Referenced by AddColType(), AddFltCol(), AddIdColumn(), AddIntCol(), AddStrCol(), ClassifyAux(), GenerateColTypeMap(), Order(), ProjectInPlace(), Rename(), StoreFltCol(), StoreGroupCol(), StoreIntCol(), StoreStrCol(), and TTable().

651  {
652  TStr NColName = NormalizeColName(ColName);
653  ColTypeMap.AddDat(NColName, ColType);
654  }

void TTable::AddColType ( const TStr & ColName,
TAttrType ColType,
TInt Index
)
inline protected

Adds column with name ColName and type ColType to the ColTypeMap.

Definition at line 656 of file table.h.

References AddColType(), and NormalizeColName().

656  {
657  TStr NColName = NormalizeColName(ColName);
658  AddColType(NColName, TPair<TAttrType,TInt>(ColType, Index));
659  }

void TTable::AddDstNodeAttr ( const TStr & Attr )
inline

Adds column to be used as dst node attribute of the graph.

Definition at line 1180 of file table.h.

References AddGraphAttribute().

Referenced by AddNodeAttr().

1180 { AddGraphAttribute(Attr, false, false, true); }

void TTable::AddDstNodeAttr ( TStrV & Attrs )
inline

Adds columns to be used as dst node attributes of the graph.

Definition at line 1182 of file table.h.

References AddGraphAttributeV().

1182 { AddGraphAttributeV(Attrs, false, false, true); }

void TTable::AddEdgeAttr ( const TStr & Attr )
inline

Adds column to be used as graph edge attribute.

Definition at line 1172 of file table.h.

References AddGraphAttribute().

1172 { AddGraphAttribute(Attr, true, false, false); }

void TTable::AddEdgeAttr ( TStrV & Attrs )
inline

Adds columns to be used as graph edge attributes.

Definition at line 1174 of file table.h.

References AddGraphAttributeV().

1174 { AddGraphAttributeV(Attrs, true, false, false); }

void TTable::AddEdgeAttributes ( PNEANet Graph,
int  RowId 
)
inline protected

Adds attributes of edge corresponding to RowId to the Graph.

Definition at line 3395 of file table.cpp.

References atFlt, atInt, atStr, EdgeAttrV, FltCols, GetColIdx(), GetColType(), GetStrValIdx(), IntCols, and TVec< TVal, TSizeTy >::Len().

Referenced by BuildGraph().

3395  {
3396  for (TInt i = 0; i < EdgeAttrV.Len(); i++) {
3397  TStr ColName = EdgeAttrV[i];
3398  TAttrType T = GetColType(ColName);
3399  TInt Index = GetColIdx(ColName);
3400  switch (T) {
3401  case atInt:
3402  Graph->AddIntAttrDatE(RowId, IntCols[Index][RowId], ColName);
3403  break;
3404  case atFlt:
3405  Graph->AddFltAttrDatE(RowId, FltCols[Index][RowId], ColName);
3406  break;
3407  case atStr:
3408  Graph->AddStrAttrDatE(RowId, GetStrValIdx(Index, RowId), ColName);
3409  break;
3410  }
3411  }
3412 }
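
The listing above reads one cell per attribute by first resolving the column's type and its index among same-type columns, then indexing the matching column family. The block below sketches that lookup in isolation; ColumnarSketch, StrPool and CellToString are hypothetical names, and the string pool is an assumed simplification of how string values are resolved from their integer mappings.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TAttrType.
enum AttrType { atInt, atFlt, atStr };

// Hypothetical columnar table: one vector of columns per attribute type.
struct ColumnarSketch {
  std::vector<std::vector<int> > IntCols;
  std::vector<std::vector<double> > FltCols;
  std::vector<std::vector<int> > StrColMaps;  // integer ids into StrPool
  std::vector<std::string> StrPool;           // assumed shared string storage
  std::map<std::string, std::pair<AttrType, int> > ColTypeMap;
};

// Returns the cell at (ColName, RowId) rendered as text.
std::string CellToString(const ColumnarSketch& T, const std::string& ColName, int RowId) {
  assert(RowId >= 0);
  auto It = T.ColTypeMap.find(ColName);
  if (It == T.ColTypeMap.end()) {
    throw std::invalid_argument(ColName + ": No such column");
  }
  const AttrType Type = It->second.first;  // which column family to read
  const int Index = It->second.second;     // position inside that family
  switch (Type) {
    case atInt: return std::to_string(T.IntCols[Index][RowId]);
    case atFlt: return std::to_string(T.FltCols[Index][RowId]);
    case atStr: return T.StrPool[T.StrColMaps[Index][RowId]];
  }
  return std::string();  // not reached for a valid AttrType
}
```

Keeping the type and the per-type index together in one map entry means a cell read costs one name lookup followed by two vector index operations.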

void TTable::AddFltCol ( const TStr & ColName )

Adds a float column with name ColName.

Definition at line 4680 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atFlt, FltCols, TVec< TVal, TSizeTy >::Len(), and NumRows.

Referenced by Aggregate(), AggregateCols(), and ColGenericOp().

4680  {
4681  AddSchemaCol(ColName, atFlt);
4683  TInt L = FltCols.Len();
4684  AddColType(ColName, atFlt, L-1);
4685 }
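
The listing above registers the new column at index L-1, that is, at the last position of FltCols. Line 4682 is not shown; the References entry (TVec::Add(), NumRows) suggests that a column with NumRows entries is appended there, and the sketch below treats that as an assumption. FltTableSketch and AddFltColSketch are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical table holding only float columns.
struct FltTableSketch {
  int NumRows;
  std::vector<std::vector<double> > FltCols;
  std::map<std::string, int> FltColIdx;  // column name -> index in FltCols
};

void AddFltColSketch(FltTableSketch& T, const std::string& ColName) {
  assert(T.FltColIdx.count(ColName) == 0);  // sketch-only check: new name
  // Assumed step: append a zero-filled column covering every existing row.
  T.FltCols.push_back(std::vector<double>(T.NumRows));
  const int L = static_cast<int>(T.FltCols.size());
  T.FltColIdx[ColName] = L - 1;  // the column just added is the last one
}
```

Computing the index after the append is why the listing uses L-1 rather than L.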

void TTable::AddGraphAttribute ( const TStr & Attr,
TBool  IsEdge,
TBool  IsSrc,
TBool  IsDst 
)
protected

Adds names of columns to be used as graph attributes.

Definition at line 985 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), DstNodeAttrV, EdgeAttrV, IsColName(), NormalizeColName(), SrcNodeAttrV, and TExcept::Throw().

Referenced by AddDstNodeAttr(), AddEdgeAttr(), and AddSrcNodeAttr().

985 void TTable::AddGraphAttribute(const TStr& Attr, TBool IsEdge, TBool IsSrc, TBool IsDst) {
986  if (!IsColName(Attr)) { TExcept::Throw(Attr + ": No such column"); }
987  if (IsEdge) { EdgeAttrV.Add(NormalizeColName(Attr)); }
988  if (IsSrc) { SrcNodeAttrV.Add(NormalizeColName(Attr)); }
989  if (IsDst) { DstNodeAttrV.Add(NormalizeColName(Attr)); }
990 }

void TTable::AddGraphAttributeV ( TStrV & Attrs,
TBool  IsEdge,
TBool  IsSrc,
TBool  IsDst 
)
protected

Adds vector of names of columns to be used as graph attributes.

Definition at line 992 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), DstNodeAttrV, EdgeAttrV, IsColName(), TVec< TVal, TSizeTy >::Len(), NormalizeColName(), SrcNodeAttrV, and TExcept::Throw().

Referenced by AddDstNodeAttr(), AddEdgeAttr(), and AddSrcNodeAttr().

992 void TTable::AddGraphAttributeV(TStrV& Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst) {
993  for (TInt i = 0; i < Attrs.Len(); i++) {
994  if (!IsColName(Attrs[i])) {
995  TExcept::Throw(Attrs[i] + ": no such column");
996  }
997  }
998  for (TInt i = 0; i < Attrs.Len(); i++) {
999  if (IsEdge) { EdgeAttrV.Add(NormalizeColName(Attrs[i])); }
1000  if (IsSrc) { SrcNodeAttrV.Add(NormalizeColName(Attrs[i])); }
1001  if (IsDst) { DstNodeAttrV.Add(NormalizeColName(Attrs[i])); }
1002  }
1003 }
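
The listing validates every name in a first pass and only then appends, so a bad name cannot leave the attribute lists half updated. A minimal sketch of that validate-then-commit pattern follows. It uses standard containers, and `RegisterAttrs`, `KnownCols`, and `Registered` are hypothetical names that do not exist in SNAP.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: registers Attrs only if every name is a known column.
// Pass 1 validates without mutating; pass 2 appends. A bad name therefore
// leaves Registered exactly as it was before the call.
void RegisterAttrs(const std::set<std::string>& KnownCols,
                   const std::vector<std::string>& Attrs,
                   std::vector<std::string>& Registered) {
  for (const std::string& A : Attrs) {
    if (KnownCols.count(A) == 0) {
      throw std::invalid_argument(A + ": no such column");
    }
  }
  for (const std::string& A : Attrs) {
    Registered.push_back(A);
  }
}
```

Checking and appending in a single loop would be shorter, but an exception on the second name would then leave the first name already registered.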

void TTable::AddIdColumn ( const TStr & IdColName)
protected

Adds a column of explicit integer identifiers to the rows.

Definition at line 1900 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddColType(), THash< TKey, TDat, THashFunc >::AddDat(), AddSchemaCol(), atInt, BegRI(), THash< TKey, TDat, THashFunc >::Clr(), EndRI(), IntCols, TVec< TVal, TSizeTy >::Len(), NumRows, TVec< TVal, TSizeTy >::Reserve(), and RowIdMap.

Referenced by InitIds().

1900 void TTable::AddIdColumn(const TStr& ColName) {
1901  //printf("NumRows: %d\n", NumRows.Val);
1902  TInt IdCol = IntCols.Add();
1903  IntCols[IdCol].Reserve(NumRows, NumRows);
1904  //printf("IdCol Reserved\n");
1905  TInt IdCnt = 0;
1906  RowIdMap.Clr();
1907  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1908  IntCols[IdCol][RI.GetRowIdx()] = IdCnt;
1909  RowIdMap.AddDat(IdCnt, RI.GetRowIdx());
1910  IdCnt++;
1911  }
1912  AddSchemaCol(ColName, atInt);
1913  AddColType(ColName, atInt, IntCols.Len()-1);
1914 }
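
The id assignment above can be sketched as a walk over the rows in logical order. The sketch assumes that row iteration follows a `Next` vector, as the description of the Next member suggests. `AssignIds`, `kLast`, and the `-2` placeholder for unvisited rows are hypothetical choices made for this example, not SNAP definitions.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

constexpr int kLast = -1;  // hypothetical end-of-list sentinel for this sketch

// Walks rows in logical order (following Next from First) and gives each
// visited row a consecutive id. Returns the id column indexed by physical
// row; rows never visited keep -2. IdToPhys maps each id to its physical row.
std::vector<int> AssignIds(const std::vector<int>& Next, int First,
                           std::unordered_map<int, int>& IdToPhys) {
  std::vector<int> IdCol(Next.size(), -2);
  IdToPhys.clear();
  int IdCnt = 0;
  for (int Row = First; Row != kLast; Row = Next[Row]) {
    IdCol[Row] = IdCnt;
    IdToPhys[IdCnt] = Row;
    ++IdCnt;
  }
  return IdCol;
}
```

Ids follow the logical order, so the row that iteration visits first receives id 0 even when it is not stored at physical index 0.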

void TTable::AddIntCol ( const TStr & ColName)

Adds an integer column with name ColName.

Definition at line 4673 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.

Referenced by Aggregate(), AggregateCols(), and ColGenericOp().

4673 void TTable::AddIntCol(const TStr& ColName) {
4674  AddSchemaCol(ColName, atInt);
4675  IntCols.Add(TIntV(NumRows));
4676  TInt L = IntCols.Len();
4677  AddColType(ColName, atInt, L-1);
4678 }

void TTable::AddJointRow ( const TTable & T1,
const TTable & T2,
TInt  RowIdx1,
TInt  RowIdx2 
)
protected

Adds joint row T1[RowIdx1]<=>T2[RowIdx2].

Definition at line 1957 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), TVec< TVal, TSizeTy >::Empty(), FltCols, IntCols, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, RowIdMap, and StrColMaps.

1957 void TTable::AddJointRow(const TTable& T1, const TTable& T2, TInt RowIdx1, TInt RowIdx2) {
1958  for (TInt i = 0; i < T1.IntCols.Len(); i++) {
1959  IntCols[i].Add(T1.IntCols[i][RowIdx1]);
1960  }
1961  for (TInt i = 0; i < T1.FltCols.Len(); i++) {
1962  FltCols[i].Add(T1.FltCols[i][RowIdx1]);
1963  }
1964  for (TInt i = 0; i < T1.StrColMaps.Len(); i++) {
1965  StrColMaps[i].Add(T1.StrColMaps[i][RowIdx1]);
1966  }
1967  TInt IntOffset = T1.IntCols.Len();
1968  TInt FltOffset = T1.FltCols.Len();
1969  TInt StrOffset = T1.StrColMaps.Len();
1970  for (TInt i = 0; i < T2.IntCols.Len(); i++) {
1971  IntCols[i+IntOffset].Add(T2.IntCols[i][RowIdx2]);
1972  }
1973  for (TInt i = 0; i < T2.FltCols.Len(); i++) {
1974  FltCols[i+FltOffset].Add(T2.FltCols[i][RowIdx2]);
1975  }
1976  for (TInt i = 0; i < T2.StrColMaps.Len(); i++) {
1977  StrColMaps[i+StrOffset].Add(T2.StrColMaps[i][RowIdx2]);
1978  }
1979  TInt IdOffset = IntOffset + T2.IntCols.Len();
1980  NumRows++;
1981  NumValidRows++;
1982  if (!Next.Empty()) {
1983  Next[Next.Len()-1] = NumValidRows-1;
1985  }
1986  Next.Add(Last);
1988  IntCols[IdOffset].Add(NumRows-1);
1989 }
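
The column layout used above places the columns of T1 first and the columns of T2 after them, shifted by the number of T1 columns of the same type. A minimal sketch for a single column type, with `std::vector` standing in for `TVec` (this free function and the `IntCol` alias are hypothetical, not part of SNAP):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using IntCol = std::vector<int>;

// Appends the joint row T1[Row1] | T2[Row2] to Out, where every table is a
// vector of integer columns. Out must already have T1.size() + T2.size()
// columns; column i of T2 lands at Out[i + Offset].
void AddJointRow(const std::vector<IntCol>& T1, const std::vector<IntCol>& T2,
                 std::size_t Row1, std::size_t Row2, std::vector<IntCol>& Out) {
  assert(Out.size() == T1.size() + T2.size());
  for (std::size_t i = 0; i < T1.size(); ++i) {
    Out[i].push_back(T1[i][Row1]);
  }
  const std::size_t Offset = T1.size();
  for (std::size_t i = 0; i < T2.size(); ++i) {
    Out[i + Offset].push_back(T2[i][Row2]);
  }
}
```

In the listing the same offset arithmetic is repeated for the float and string columns, and the integer column after the last T2 integer column holds the row id.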

void TTable::AddNJointRowsMP ( const TTable & T1,
const TTable & T2,
const TVec< TIntPrV > &  JointRowIDSet 
)
protected

Adds rows from T1 and T2 to this table in a parallel manner. Used by Join.

Definition at line 4442 of file table.cpp.

References THash< TKey, TDat, THashFunc >::AddDat(), Assert, THash< TKey, TDat, THashFunc >::Clr(), FltCols, TPair< TVal1, TVal2 >::GetVal1(), TPair< TVal1, TVal2 >::GetVal2(), IntCols, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, ResizeTable(), RowIdMap, StrColMaps, and TExcept::Throw().

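4442 void TTable::AddNJointRowsMP(const TTable& T1, const TTable& T2, const TVec<TIntPrV>& JointRowIDSet) {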
4443  //double startFn = omp_get_wtime();
4444  int JointTableSize = 0;
4445  TIntV StartOffsets(JointRowIDSet.Len());
4446  for (int i = 0; i < JointRowIDSet.Len(); i++) {
4447  StartOffsets[i] = JointTableSize;
4448  JointTableSize += JointRowIDSet[i].Len();
4449  }
4450  if (JointTableSize == 0) {
4451  TExcept::Throw("Joint table is empty");
4452  }
4453  //double endOffsets = omp_get_wtime();
4454  //printf("Offsets time = %f\n",endOffsets-startFn);
4455  ResizeTable(JointTableSize);
4456  //double endResize = omp_get_wtime();
4457  //printf("Resize time = %f\n",endResize-endOffsets);
4458  NumRows = JointTableSize;
4459  NumValidRows = JointTableSize;
4460  Assert(NumRows <= Next.Len());
4461 
4462  TInt IntOffset = T1.IntCols.Len();
4463  TInt FltOffset = T1.FltCols.Len();
4464  TInt StrOffset = T1.StrColMaps.Len();
4465 
4466  TInt IdOffset = IntOffset + T2.IntCols.Len();
4467  RowIdMap.Clr();
4468  for (TInt IdCnt = 0; IdCnt < JointTableSize; IdCnt++) {
4469  RowIdMap.AddDat(IdCnt, IdCnt);
4470  }
4471 
4472  #pragma omp parallel for schedule(dynamic, CHUNKS_PER_THREAD)
4473  for (int j = 0; j < JointRowIDSet.Len(); j++) {
4474  const TIntPrV& RowIDs = JointRowIDSet[j];
4475  int start = StartOffsets[j];
4476  int NewRows = RowIDs.Len();
4477  if (NewRows == 0) {continue;}
4478  for (TInt r = 0; r < NewRows; r++){
4479  TIntPr CurrRowIdPr = RowIDs[r];
4480  for(TInt i = 0; i < T1.IntCols.Len(); i++){
4481  IntCols[i][start+r] = T1.IntCols[i][CurrRowIdPr.GetVal1()];
4482  }
4483  for(TInt i = 0; i < T1.FltCols.Len(); i++){
4484  FltCols[i][start+r] = T1.FltCols[i][CurrRowIdPr.GetVal1()];
4485  }
4486  for(TInt i = 0; i < T1.StrColMaps.Len(); i++){
4487  StrColMaps[i][start+r] = T1.StrColMaps[i][CurrRowIdPr.GetVal1()];
4488  }
4489  for(TInt i = 0; i < T2.IntCols.Len(); i++){
4490  IntCols[i+IntOffset][start+r] = T2.IntCols[i][CurrRowIdPr.GetVal2()];
4491  }
4492  for(TInt i = 0; i < T2.FltCols.Len(); i++){
4493  FltCols[i+FltOffset][start+r] = T2.FltCols[i][CurrRowIdPr.GetVal2()];
4494  }
4495  for(TInt i = 0; i < T2.StrColMaps.Len(); i++){
4496  StrColMaps[i+StrOffset][start+r] = T2.StrColMaps[i][CurrRowIdPr.GetVal2()];
4497  }
4498  IntCols[IdOffset][start+r] = start+r;
4499  }
4500  for(TInt r = 0; r < NewRows; r++){
4501  Next[start+r] = start+r+1;
4502  }
4503  }
4504  LastValidRow = JointTableSize-1;
4505  Next[LastValidRow] = Last;
4506  //double endIterate = omp_get_wtime();
4507  //printf("Iterate time = %f\n",endIterate-endResize);
4508 }
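
The listing first computes a running sum of the bucket sizes so that each bucket of row pairs owns a disjoint range of output rows; the buckets can then be filled independently. The sketch below shows that offset computation with standard containers. It is sequential (the listing distributes the outer loop with an OpenMP pragma), it copies one column only, and `StartOffsets`, `FillFirstCol`, and `RowPair` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using RowPair = std::pair<int, int>;  // (row index in T1, row index in T2)

// Computes, for each bucket of row pairs, the first output row it owns.
// Bucket j writes rows [Start[j], Start[j] + Buckets[j].size()), so buckets
// never overlap. Total receives the overall row count.
std::vector<int> StartOffsets(const std::vector<std::vector<RowPair>>& Buckets,
                              int& Total) {
  std::vector<int> Start(Buckets.size());
  Total = 0;
  for (std::size_t j = 0; j < Buckets.size(); ++j) {
    Start[j] = Total;
    Total += static_cast<int>(Buckets[j].size());
  }
  return Start;
}

// Fills one output column from a T1 column; each bucket touches only its own
// range. The loop over j is the one a parallel version would distribute.
std::vector<int> FillFirstCol(const std::vector<int>& T1Col,
                              const std::vector<std::vector<RowPair>>& Buckets) {
  int Total = 0;
  const std::vector<int> Start = StartOffsets(Buckets, Total);
  std::vector<int> Out(Total);
  for (std::size_t j = 0; j < Buckets.size(); ++j) {
    for (std::size_t r = 0; r < Buckets[j].size(); ++r) {
      Out[Start[j] + r] = T1Col[Buckets[j][r].first];
    }
  }
  return Out;
}
```

Because the output is sized once up front and every write targets a precomputed index, no two buckets write the same element.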

void TTable::AddNodeAttr ( const TStr & Attr)
inline

Adds Attr as both a source and a destination node attribute. Handles the common case where src and dst both belong to the same "universe" of entities.

Definition at line 1184 of file table.h.

References AddDstNodeAttr(), and AddSrcNodeAttr().

1184 void AddNodeAttr(const TStr& Attr) { AddSrcNodeAttr(Attr); AddDstNodeAttr(Attr); }

void TTable::AddNodeAttr ( TStrV & Attrs)
inline

Adds the columns in Attrs as both source and destination node attributes. Handles the common case where src and dst both belong to the same "universe" of entities.

Definition at line 1186 of file table.h.

References AddDstNodeAttr(), and AddSrcNodeAttr().

1186 void AddNodeAttr(TStrV& Attrs) { AddSrcNodeAttr(Attrs); AddDstNodeAttr(Attrs); }

void TTable::AddNodeAttributes ( TInt  NId,
TStrV  NodeAttrV,
TInt  RowId,
THash< TInt, TStrIntVH > &  NodeIntAttrs,
THash< TInt, TStrFltVH > &  NodeFltAttrs,
THash< TInt, TStrStrVH > &  NodeStrAttrs 
)
inline protected

Updates the maps NodeXAttrs (NodeIntAttrs, NodeFltAttrs, NodeStrAttrs) passed as parameters. Each one maps node id -> (attribute name -> vector of attribute values).

Definition at line 3414 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddKey(), atFlt, atInt, CommonNodeAttrs, FltCols, GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), GetStrValIdx(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), and TVec< TVal, TSizeTy >::Len().

Referenced by BuildGraph().

3415  {
3416  for (TInt i = 0; i < NodeAttrV.Len(); i++) {
3417  TStr ColAttr = NodeAttrV[i];
3418  TAttrType CT = GetColType(ColAttr);
3419  int ColId = GetColIdx(ColAttr);
3420  // check if this is a common src-dst attribute
3421  for (TInt i = 0; i < CommonNodeAttrs.Len(); i++) {
3422  if (CommonNodeAttrs[i].Val1 == ColAttr || CommonNodeAttrs[i].Val2 == ColAttr) {
3423  ColAttr = CommonNodeAttrs[i].Val3;
3424  break;
3425  }
3426  }
3427  if (CT == atInt) {
3428  if (!NodeIntAttrs.IsKey(NId)) { NodeIntAttrs.AddKey(NId); }
3429  if (!NodeIntAttrs.GetDat(NId).IsKey(ColAttr)) { NodeIntAttrs.GetDat(NId).AddKey(ColAttr); }
3430  NodeIntAttrs.GetDat(NId).GetDat(ColAttr).Add(IntCols[ColId][RowId]);
3431  } else if (CT == atFlt) {
3432  if (!NodeFltAttrs.IsKey(NId)) { NodeFltAttrs.AddKey(NId); }
3433  if (!NodeFltAttrs.GetDat(NId).IsKey(ColAttr)) { NodeFltAttrs.GetDat(NId).AddKey(ColAttr); }
3434  NodeFltAttrs.GetDat(NId).GetDat(ColAttr).Add(FltCols[ColId][RowId]);
3435  } else {
3436  if (!NodeStrAttrs.IsKey(NId)) { NodeStrAttrs.AddKey(NId); }
3437  if (!NodeStrAttrs.GetDat(NId).IsKey(ColAttr)) { NodeStrAttrs.GetDat(NId).AddKey(ColAttr); }
3438  NodeStrAttrs.GetDat(NId).GetDat(ColAttr).Add(GetStrValIdx(ColId, RowId));
3439  }
3440  }
3441 }
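
The nested map shape described above (node id, then attribute name, then a vector of values) can be sketched with `std::map`. This is an illustration with standard containers rather than `THash`; `NodeIntAttrs` as a type alias and `AddNodeIntAttr` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Node id -> (attribute name -> values seen for that node).
using NodeIntAttrs = std::map<int, std::map<std::string, std::vector<int>>>;

// Records one attribute value for a node. operator[] creates the missing
// inner map and vector on first use, which plays the role of the explicit
// IsKey/AddKey checks in the listing.
void AddNodeIntAttr(NodeIntAttrs& Attrs, int NId, const std::string& Name, int Val) {
  Attrs[NId][Name].push_back(Val);
}
```

A node that appears in several rows accumulates one value per row, which is why the innermost container is a vector and not a single value.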

void TTable::AddNRows ( int  NewRows,
const TVec< TIntV > &  IntColsP,
const TVec< TFltV > &  FltColsP,
const TVec< TIntV > &  StrColMapsP 
)
protected

Adds NewRows rows from the given vectors for each column type.

Definition at line 4421 of file table.cpp.

References FltCols, GetEmptyRowsStart(), IntCols, TVec< TVal, TSizeTy >::Len(), Next, and StrColMaps.

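4421 void TTable::AddNRows(int NewRows, const TVec<TIntV>& IntColsP, const TVec<TFltV>& FltColsP, const TVec<TIntV>& StrColMapsP) {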
4422  if (NewRows == 0) { return; }
4423  // this call should be thread-safe
4424  int start = GetEmptyRowsStart(NewRows);
4425  for (TInt r = 0; r < NewRows; r++) {
4426  for (TInt i = 0; i < IntColsP.Len(); i++) {
4427  IntCols[i][start+r] = IntColsP[i][r];
4428  }
4429  for (TInt i = 0; i < FltColsP.Len(); i++) {
4430  FltCols[i][start+r] = FltColsP[i][r];
4431  }
4432  for (TInt i = 0; i < StrColMapsP.Len(); i++) {
4433  StrColMaps[i][start+r] = StrColMapsP[i][r];
4434  }
4435  }
4436  for (TInt r = 0; r < NewRows-1; r++) {
4437  Next[start+r] = start+r+1;
4438  }
4439 }
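
The final loop of the listing chains the new rows together but stops one short of the end of the chunk. A minimal sketch of that linking step, with `std::vector<int>` standing in for the Next vector (`LinkChunk` is a hypothetical name):

```cpp
#include <cassert>
#include <vector>

// Chains the physical rows [Start, Start + NewRows) in order by setting
// Next[r] = r + 1 for every row of the chunk except the last. The last
// row's entry is left untouched for the caller to connect.
void LinkChunk(std::vector<int>& Next, int Start, int NewRows) {
  for (int r = 0; r < NewRows - 1; ++r) {
    Next[Start + r] = Start + r + 1;
  }
}
```

In this sketch the entry for the last row of the chunk keeps whatever value it had before the call.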

void TTable::AddRow ( const TTableRow & Row)
inline

Adds row with values taken from given TTableRow.

Definition at line 1002 of file table.h.

References AddRowV(), TTableRow::GetFltVals(), TTableRow::GetIntVals(), and TTableRow::GetStrVals().

1002 void AddRow(const TTableRow& Row) { AddRowV(Row.GetIntVals(), Row.GetFltVals(), Row.GetStrVals()); };

void TTable::AddRowI ( const TRowIterator & RI)
protected

Adds row corresponding to RI.

Definition at line 4295 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), atFlt, atInt, atStr, FltCols, GetColIdx(), GetColType(), TRowIterator::GetFltAttr(), TRowIterator::GetIntAttr(), GetSchemaColName(), TRowIterator::GetStrMapByName(), IdColName, IntCols, TVec< TVal, TSizeTy >::Len(), Sch, StrColMaps, and UpdateTableForNewRow().

4295 void TTable::AddRowI(const TRowIterator& RI) {
4296  for (TInt c = 0; c < Sch.Len(); c++) {
4297  TStr ColName = GetSchemaColName(c);
4298  if (ColName == IdColName) { continue; }
4299 
4300  TInt ColIdx = GetColIdx(ColName);
4301 
4302  switch (GetColType(ColName)) {
4303  case atInt:
4304  IntCols[ColIdx].Add(RI.GetIntAttr(ColName));
4305  break;
4306  case atFlt:
4307  FltCols[ColIdx].Add(RI.GetFltAttr(ColName));
4308  break;
4309  case atStr:
4310  StrColMaps[ColIdx].Add(RI.GetStrMapByName(ColName));
4311  break;
4312  }
4313  }
4314  UpdateTableForNewRow();
4315 }

void TTable::AddRowV ( const TIntV & IntVals,
const TFltV & FltVals,
const TStrV & StrVals 
)
protected

Adds row with values corresponding to the given vectors by type.

Definition at line 4317 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddStrVal(), FltCols, IntCols, TVec< TVal, TSizeTy >::Len(), and UpdateTableForNewRow().

Referenced by AddRow().

4317 void TTable::AddRowV(const TIntV& IntVals, const TFltV& FltVals, const TStrV& StrVals) {
4318  for (TInt c = 0; c < IntVals.Len(); c++) {
4319  IntCols[c].Add(IntVals[c]);
4320  }
4321  for (TInt c = 0; c < FltVals.Len(); c++) {
4322  FltCols[c].Add(FltVals[c]);
4323  }
4324  for (TInt c = 0; c < StrVals.Len(); c++) {
4325  AddStrVal(c, StrVals[c]);
4326  }
4327  UpdateTableForNewRow();
4328 }

void TTable::AddSchemaCol ( const TStr & ColName,
TAttrType  ColType 
)
inline protected

Adds column with name ColName and type ColType to the schema.

Definition at line 642 of file table.h.

References TVec< TVal, TSizeTy >::Add(), and NormalizeColName().

Referenced by AddFltCol(), AddIdColumn(), AddIntCol(), AddStrCol(), ClassifyAux(), GenerateColTypeMap(), GroupAux(), Order(), StoreFltCol(), StoreIntCol(), StoreStrCol(), and TTable().

642 void AddSchemaCol(const TStr& ColName, TAttrType ColType) {
643  TStr NColName = NormalizeColName(ColName);
644  Sch.Add(TPair<TStr,TAttrType>(NColName, ColType));
645  }

void TTable::AddSelectedRows ( const TTable & Table,
const TIntV & RowIDs 
)
protected

Adds rows from Table that correspond to ids in RowIDs.

Definition at line 4399 of file table.cpp.

References FltCols, GetEmptyRowsStart(), IntCols, TVec< TVal, TSizeTy >::Len(), Next, and StrColMaps.

Referenced by TTable().

4399 void TTable::AddSelectedRows(const TTable& Table, const TIntV& RowIDs) {
4400  int NewRows = RowIDs.Len();
4401  if (NewRows == 0) { return; }
4402  // this call should be thread-safe
4403  int start = GetEmptyRowsStart(NewRows);
4404  for (TInt r = 0; r < NewRows; r++) {
4405  TInt CurrRowIdx = RowIDs[r];
4406  for (TInt i = 0; i < Table.IntCols.Len(); i++) {
4407  IntCols[i][start+r] = Table.IntCols[i][CurrRowIdx];
4408  }
4409  for (TInt i = 0; i < Table.FltCols.Len(); i++) {
4410  FltCols[i][start+r] = Table.FltCols[i][CurrRowIdx];
4411  }
4412  for (TInt i = 0; i < Table.StrColMaps.Len(); i++) {
4413  StrColMaps[i][start+r] = Table.StrColMaps[i][CurrRowIdx];
4414  }
4415  }
4416  for (TInt r = 0; r < NewRows-1; r++) {
4417  Next[start+r] = start+r+1;
4418  }
4419 }
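
The copy loop above is a gather: output row Start + r receives source row RowIDs[r], column by column. A minimal sketch for integer columns only, using standard containers (`CopySelectedRows` and its parameters are hypothetical and not part of SNAP):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies the rows of Src listed in RowIds into Dst, starting at row Start.
// Both tables are vectors of columns; every column of Dst must already have
// room for Start + RowIds.size() rows.
void CopySelectedRows(const std::vector<std::vector<int>>& Src,
                      const std::vector<std::size_t>& RowIds,
                      std::size_t Start,
                      std::vector<std::vector<int>>& Dst) {
  for (std::size_t r = 0; r < RowIds.size(); ++r) {
    for (std::size_t i = 0; i < Src.size(); ++i) {
      Dst[i][Start + r] = Src[i][RowIds[r]];
    }
  }
}
```

The order of RowIds decides the order of the copied rows, so the same source row may be listed more than once.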

void TTable::AddSrcNodeAttr ( const TStr & Attr)
inline

Adds column to be used as src node attribute of the graph.

Definition at line 1176 of file table.h.

References AddGraphAttribute().

Referenced by AddNodeAttr().

1176 void AddSrcNodeAttr(const TStr& Attr) { AddGraphAttribute(Attr, false, true, false); }

void TTable::AddSrcNodeAttr ( TStrV & Attrs)
inline

Adds columns to be used as src node attributes of the graph.

Definition at line 1178 of file table.h.

References AddGraphAttributeV().

1178 void AddSrcNodeAttr(TStrV& Attrs) { AddGraphAttributeV(Attrs, false, true, false); }

void TTable::AddStrCol ( const TStr & ColName)

Adds a string column with name ColName.

Definition at line 4687 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atStr, TVec< TVal, TSizeTy >::Len(), NumRows, and StrColMaps.

Referenced by ColConcat(), and ColConcatConst().

4687 void TTable::AddStrCol(const TStr& ColName) {
4688  AddSchemaCol(ColName, atStr);
4689  StrColMaps.Add(TIntV(NumRows));
4690  TInt L = StrColMaps.Len();
4691  AddColType(ColName, atStr, L-1);
4692 }

void TTable::AddStrVal ( const TInt & ColIdx,
const TStr & Val 
)
protected

Adds Val in column with id ColIdx.

Definition at line 971 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), TStrHash< TDat, TStringPool, THashFunc >::AddKey(), Context, StrColMaps, and TTableContext::StringVals.

Referenced by AddRowV(), and AddStrVal().

971 void TTable::AddStrVal(const TInt& ColIdx, const TStr& Key) {
972  TInt KeyId = TInt(Context->StringVals.AddKey(Key));
973  //printf("TTable::AddStrVal2 %d .%s. %d\n", ColIdx.Val, Key.CStr(), KeyId.Val);
974  StrColMaps[ColIdx].Add(KeyId);
975 }
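
The listing stores an integer key in the column and leaves the string itself in a shared pool owned by the context. A minimal sketch of that interning scheme with standard containers follows. `StringPool` and the free function `AddStrVal` are hypothetical stand-ins, and the id numbering (consecutive from 0) is a choice made for this example and is not a statement about the SNAP string pool.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Minimal string pool: each distinct string gets a small integer id.
class StringPool {
 public:
  // Returns the id of Key, adding it to the pool if it is new.
  int AddKey(const std::string& Key) {
    auto It = Ids_.find(Key);
    if (It != Ids_.end()) { return It->second; }
    const int Id = static_cast<int>(Strings_.size());
    Strings_.push_back(Key);
    Ids_.emplace(Key, Id);
    return Id;
  }
  // Returns the string stored under Id; throws std::out_of_range if unknown.
  const std::string& GetKey(int Id) const { return Strings_.at(Id); }

 private:
  std::unordered_map<std::string, int> Ids_;
  std::vector<std::string> Strings_;
};

// A string column stores pool ids rather than the strings themselves.
void AddStrVal(StringPool& Pool, std::vector<int>& StrCol, const std::string& Val) {
  StrCol.push_back(Pool.AddKey(Val));
}
```

Repeated values cost one integer per row, and two cells hold equal strings exactly when their ids are equal.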

void TTable::AddStrVal ( const TStr & Col,
const TStr & Val 
)
protected

Adds Val in column with name Col.

Definition at line 977 of file table.cpp.

References AddStrVal(), atStr, GetColIdx(), GetColType(), and TExcept::Throw().

977 void TTable::AddStrVal(const TStr& Col, const TStr& Key) {
978  if (GetColType(Col) != atStr) {
979  TExcept::Throw(Col + " is not a string valued column");
980  }
981  //printf("TTable::AddStrVal1 .%s. .%s.\n", Col.CStr(), Key.CStr());
982  AddStrVal(GetColIdx(Col), Key);
983 }
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
void AddStrVal(const TInt &ColIdx, const TStr &Val)
Adds Val in column with id ColIdx.
Definition: table.cpp:971

void TTable::AddTable ( const TTable & T)
protected

Adds all the rows of the input table. Allows duplicate rows (not a union).

Definition at line 3975 of file table.cpp.

References TVec< TVal, TSizeTy >::AddV(), atFlt, atInt, atStr, FirstValidRow, FltCols, GetColIdx(), GetColType(), GetSchemaColName(), IdColName, IntCols, Invalid, Last, LastValidRow, TVec< TVal, TSizeTy >::Len(), Next, NumRows, NumValidRows, Sch, StrColMaps, and TExcept::Throw().

Referenced by ConcatTable(), and UnionAllInPlace().

3975  {
3976  //for (TInt c = 0; c < S.Len(); c++) {
3977  // if (S[c] != T.S[c]) { printf("(%s,%d) != (%s,%d)\n", S[c].Val1.CStr(), S[c].Val2, T.S[c].Val1.CStr(), T.S[c].Val2); TExcept::Throw("when adding tables, their schemas must match!"); }
3978  //}
3979  for (TInt c = 0; c < Sch.Len(); c++) {
3980  TStr ColName = GetSchemaColName(c);
3981  TInt ColIdx = GetColIdx(ColName);
3982  TInt TColIdx = ColName == IdColName ? T.GetColIdx(T.IdColName) : T.GetColIdx(ColName);
3983  if (TColIdx < 0) { TExcept::Throw("when adding a table, it must contain all columns of source table!"); }
3984  switch (GetColType(ColName)) {
3985  case atInt:
3986  IntCols[ColIdx].AddV(T.IntCols[TColIdx]);
3987  break;
3988  case atFlt:
3989  FltCols[ColIdx].AddV(T.FltCols[TColIdx]);
3990  break;
3991  case atStr:
3992  StrColMaps[ColIdx].AddV(T.StrColMaps[TColIdx]);
3993  break;
3994  }
3995  }
3996 
3997  TIntV TNext(T.Next);
3998  for (TInt i = 0; i < TNext.Len(); i++) {
3999  if (TNext[i] != Last && TNext[i] != Invalid) { TNext[i] += NumRows; }
4000  }
4001 
4002  Next.AddV(TNext);
4003  // checks if table is empty
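4004  if (LastValidRow >= 0) {
4005  Next[LastValidRow] = NumRows + T.FirstValidRow;
4006  }
4007  LastValidRow = NumRows + T.LastValidRow;
4008  NumRows += T.NumRows;
4009  NumValidRows += T.NumValidRows;
4010 }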
TInt FirstValidRow
Physical index of first valid row.
Definition: table.h:553
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
static const TInt Last
Special value for Next vector entry - last row in table.
Definition: table.h:486
Schema Sch
Table Schema.
Definition: table.h:549
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TStr IdColName
Name of the column holding row ids.
Definition: table.h:565
TInt LastValidRow
Physical index of last valid row.
Definition: table.h:554
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStr GetSchemaColName(TInt Idx) const
Gets name of the column with index Idx in the schema.
Definition: table.h:638
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
static const TInt Invalid
Special value for Next vector entry - logically removed row.
Definition: table.h:487
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TSizeTy AddV(const TVec< TVal, TSizeTy > &ValV)
Adds the elements of the vector ValV to the end of the vector.
Definition: ds.h:1110
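
The AddTable listing copies the other table's Next vector and shifts every real successor index by NumRows, leaving the Last and Invalid sentinels untouched. The sketch below shows that offsetting step on a plain struct. RowOrder, AppendOrder, the sentinel values, and the way the two chains are joined are all assumptions made for illustration, not SNAP's definitions.

```cpp
#include <cassert>
#include <vector>

// Sentinel values are assumptions for this sketch.
constexpr int kLast = -1;     // marks the final row in logical order
constexpr int kInvalid = -2;  // marks a logically removed row

// Hypothetical row-order bookkeeping, loosely modeled on the members above.
struct RowOrder {
  std::vector<int> next;  // next[i] is the successor of physical row i
  int firstValid = -1;
  int lastValid = -1;
  int numRows = 0;
  int numValid = 0;
};

// Appends the row order of src to dst, shifting src's links by dst.numRows.
inline void AppendOrder(RowOrder& dst, const RowOrder& src) {
  if (src.numValid == 0) {
    // No valid rows to link: every entry is a sentinel, so copy as is.
    dst.next.insert(dst.next.end(), src.next.begin(), src.next.end());
    dst.numRows += src.numRows;
    return;
  }
  const int offset = dst.numRows;
  for (int n : src.next) {
    dst.next.push_back((n == kLast || n == kInvalid) ? n : n + offset);
  }
  if (dst.lastValid >= 0) {
    dst.next[dst.lastValid] = offset + src.firstValid;  // join the two chains
  } else {
    dst.firstValid = offset + src.firstValid;  // dst had no valid rows
  }
  dst.lastValid = offset + src.lastValid;
  dst.numRows += src.numRows;
  dst.numValid += src.numValid;
}
```

Shifting by the physical row count rather than the valid row count matters because successor links are physical indices, and removed rows still occupy slots.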

void TTable::Aggregate ( const TStrV & GroupByAttrs,
TAttrAggr  AggOp,
const TStr & ValAttr,
const TStr & ResAttr,
TBool  Ordered = true 
)

Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored in a new attribute ResAttr.

Definition at line 1585 of file table.cpp.

References aaCount, aaMean, TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), AddFltCol(), AddIntCol(), atFlt, atInt, atStr, THashMP< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::BegI(), THashMP< TKey, TDat, THashFunc >::EndI(), THash< TKey, TDat, THashFunc >::EndI(), FltCols, GetColIdx(), GetColType(), THashMP< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), GetMP(), GroupAux(), GroupByFltCol(), GroupByIntCol(), GroupByIntColMP(), GroupByStrCol(), GroupMapping, IdColName, IntCols, IsColName(), THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), NormalizeColNameV(), NumValidRows, and TExcept::Throw().

Referenced by Count().

1586  {
1587 
1588  for (TInt c = 0; c < GroupByAttrs.Len(); c++) {
1589  if (!IsColName(GroupByAttrs[c])) {
1590  TExcept::Throw("no such column " + GroupByAttrs[c]);
1591  }
1592  }
1593 
1594  // double startFn = omp_get_wtime();
1595  TStrV NGroupByAttrs = NormalizeColNameV(GroupByAttrs);
1596  TBool UsePhysicalIds = (GetColIdx(IdColName) < 0);
1597 
1598  THash<TInt,TIntV> GroupByIntMapping;
1599  THash<TFlt,TIntV> GroupByFltMapping;
1600  THash<TInt,TIntV> GroupByStrMapping;
1601  THash<TGroupKey,TIntV> Mapping;
1602 #ifdef GCC_ATOMIC
1603  THashMP<TInt,TIntV> GroupByIntMapping_MP(NumValidRows);
1604  TIntV GroupByIntMPKeys(NumValidRows);
1605 #endif
1606  TInt NumOfGroups = 0;
1607  TInt GroupingCase = 0;
1608 
1609  // check if grouping already exists
1610  GroupStmt Stmt(NGroupByAttrs, Ordered, UsePhysicalIds);
1611  if (GroupMapping.IsKey(Stmt)) {
1612  Mapping = GroupMapping.GetDat(Stmt);
1613  } else{
1614  if(NGroupByAttrs.Len() == 1){
1615  switch(GetColType(NGroupByAttrs[0])){
1616  case atInt:
1617 #ifdef GCC_ATOMIC
1618  if(GetMP()){
1619  GroupByIntColMP(NGroupByAttrs[0], GroupByIntMapping_MP, UsePhysicalIds);
1620  int x = 0;
1621  for(THashMP<TInt,TIntV>::TIter it = GroupByIntMapping_MP.BegI(); it < GroupByIntMapping_MP.EndI(); it++){
1622  GroupByIntMPKeys[x] = it.GetKey();
1623  x++;
1624  /*
1625  printf("%d --> ", it.GetKey().Val);
1626  TIntV& V = it.GetDat();
1627  for(int i = 0; i < V.Len(); i++){
1628  printf(" %d", V[i].Val);
1629  }
1630  printf("\n");
1631  */
1632  }
1633  NumOfGroups = x;
1634  GroupingCase = 4;
1635  //printf("Number of groups: %d\n", NumOfGroups.Val);
1636  break;
1637  }
1638 #endif // GCC_ATOMIC
1639  GroupByIntCol(NGroupByAttrs[0], GroupByIntMapping, TIntV(), true, UsePhysicalIds);
1640  NumOfGroups = GroupByIntMapping.Len();
1641  GroupingCase = 1;
1642  break;
1643  case atFlt:
1644  GroupByFltCol(NGroupByAttrs[0], GroupByFltMapping, TIntV(), true, UsePhysicalIds);
1645  NumOfGroups = GroupByFltMapping.Len();
1646  GroupingCase = 2;
1647  break;
1648  case atStr:
1649  GroupByStrCol(NGroupByAttrs[0], GroupByStrMapping, TIntV(), true, UsePhysicalIds);
1650  NumOfGroups = GroupByStrMapping.Len();
1651  GroupingCase = 3;
1652  break;
1653  }
1654  }
1655  else{
1656  TIntV UniqueVector;
1657  THash<TGroupKey, TPair<TInt, TIntV> > Mapping_aux;
1658  GroupAux(NGroupByAttrs, Mapping_aux, Ordered, "", false, UniqueVector, UsePhysicalIds);
1659  for(THash<TGroupKey, TPair<TInt, TIntV> >::TIter it = Mapping_aux.BegI(); it < Mapping_aux.EndI(); it++){
1660  Mapping.AddDat(it.GetKey(), it.GetDat().Val2);
1661  }
1662  NumOfGroups = Mapping.Len();
1663  }
1664  }
1665 
1666  // double endGroup = omp_get_wtime();
1667  // printf("Group time = %f\n", endGroup-startFn);
1668 
1669  TAttrType T = GetColType(ValAttr);
1670 
1671  // add column corresponding to result attribute type
1672  if (AggOp == aaCount) { AddIntCol(ResAttr); }
1673  else {
1674  if (T == atInt) { AddIntCol(ResAttr); }
1675  else if (T == atFlt) { AddFltCol(ResAttr); }
1676  else {
1677  // Count is the only aggregation operation handled for Str
1678  TExcept::Throw("Invalid aggregation for Str type!");
1679  }
1680  }
1681  TInt ColIdx = GetColIdx(ResAttr);
1682  TInt AggrColIdx = GetColIdx(ValAttr);
1683 
1684  // double endAdd = omp_get_wtime();
1685  // printf("AddCol time = %f\n", endAdd-endGroup);
1686 
1687 #ifdef USE_OPENMP
1688  #pragma omp parallel for schedule(dynamic)
1689 #endif
1690  for (int g = 0; g < NumOfGroups; g++) {
1691  TIntV* GroupRows = NULL;
1692  switch(GroupingCase){
1693  case 0:
1694  GroupRows = & Mapping.GetDat(Mapping.GetKey(g));
1695  break;
1696  case 1:
1697  GroupRows = & GroupByIntMapping.GetDat(GroupByIntMapping.GetKey(g));
1698  break;
1699  case 2:
1700  GroupRows = & GroupByFltMapping.GetDat(GroupByFltMapping.GetKey(g));
1701  break;
1702  case 3:
1703  GroupRows = & GroupByStrMapping.GetDat(GroupByStrMapping.GetKey(g));
1704  break;
1705  case 4:
1706 #ifdef GCC_ATOMIC
1707  GroupRows = & GroupByIntMapping_MP.GetDat(GroupByIntMPKeys[g]);
1708 #endif
1709  break;
1710  }
1711 
1712  // find valid rows of group
1713  /*
1714  TIntV ValidRows;
1715  for (TInt i = 0; i < GroupRows.Len(); i++) {
1716  // TODO: This should not be necessary
1717  if (!RowIdMap.IsKey(GroupRows[i])) { continue; }
1718  TInt RowId = RowIdMap.GetDat(GroupRows[i]);
1719  // GroupRows has physical row indices
1720  if (RowId != Invalid) { ValidRows.Add(RowId); }
1721  }
1722  */
1723  TIntV& ValidRows = *GroupRows;
1724  TInt sz = ValidRows.Len();
1725  if (sz <= 0) continue;
1726  // Count is handled separately (other operations have aggregation policies defined in a template)
1727  if (AggOp == aaCount) {
1728  for (TInt i = 0; i < sz; i++) { IntCols[ColIdx][ValidRows[i]] = sz; }
1729  } else {
1730  // aggregate based on column type
1731  if (T == atInt) {
1732  TIntV V;
1733  for (TInt i = 0; i < sz; i++) { V.Add(IntCols[AggrColIdx][ValidRows[i]]); }
1734  TInt Res = AggregateVector<TInt>(V, AggOp);
1735  if (AggOp == aaMean) { Res = Res / sz; }
1736  for (TInt i = 0; i < sz; i++) { IntCols[ColIdx][ValidRows[i]] = Res; }
1737  } else {
1738  TFltV V;
1739  for (TInt i = 0; i < sz; i++) { V.Add(FltCols[AggrColIdx][ValidRows[i]]); }
1740  TFlt Res = AggregateVector<TFlt>(V, AggOp);
1741  if (AggOp == aaMean) { Res /= sz; }
1742  for (TInt i = 0; i < sz; i++) { FltCols[ColIdx][ValidRows[i]] = Res; }
1743  }
1744  }
1745  }
1746  // double endIter = omp_get_wtime();
1747  // printf("Iter time = %f\n", endIter-endAdd);
1748 }
THash< GroupStmt, THash< TGroupKey, TIntV > > GroupMapping
Maps grouping statements to their (group-by key –> group id) mapping.
Definition: table.h:581
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
enum TAttrType_ TAttrType
Types for tables, sparse and dense attributes.
void AddIntCol(const TStr &ColName)
Adds an integer column with name ColName.
Definition: table.cpp:4673
void GroupByIntColMP(const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const
Groups/hashes by a single column with integer values, using OpenMP multi-threading.
Definition: table.cpp:1225
TIter BegI() const
Definition: hash.h:213
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TStr IdColName
Name of the column holding row ids.
Definition: table.h:565
static TStrV NormalizeColNameV(const TStrV &Cols)
Adds suffix to column name if it doesn't exist.
Definition: table.h:539
static TInt GetMP()
Definition: table.h:527
void GroupAux(const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true)
Helper function for grouping.
Definition: table.cpp:1322
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
TIter EndI() const
Definition: hash.h:218
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
void GroupByFltCol(const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
Groups/hashes by a single column with float values. Returns hash table with grouping.
Definition: table.h:1626
TPHKeyDat * EndI
Definition: hashmp.h:47
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:838
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
void GroupByIntCol(const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
Groups/hashes by a single column with integer values.
Definition: table.h:1598
A class representing a cached grouping statement identifier.
Definition: table.h:266
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
void GroupByStrCol(const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
Groups/hashes by a single column with string values. Returns hash table with grouping.
Definition: table.h:1653
Hash-Table with multiprocessing support.
Definition: hashmp.h:81
TVec< TInt > TIntV
Definition: ds.h:1594
void AddFltCol(const TStr &ColName)
Adds a float column with name ColName.
Definition: table.cpp:4680
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
TBool IsColName(const TStr &ColName) const
Definition: table.h:646
int Len() const
Definition: hash.h:228
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
const TKey & GetKey(const int &KeyId) const
Definition: hash.h:252
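
Aggregate groups the rows by key, computes one value per group, and writes that value into the result column for every row of the group. The sketch below shows the same shape in standard C++ for a count and a sum over a single integer key. CountPerGroup and SumPerGroup are hypothetical names, and std::unordered_map stands in for SNAP's THash.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Groups rows by key, then writes each group's row count back to every
// member row, so the result has one value per input row.
inline std::vector<int> CountPerGroup(const std::vector<int>& keys) {
  std::unordered_map<int, std::vector<std::size_t>> groups;  // key -> rows
  for (std::size_t row = 0; row < keys.size(); ++row) {
    groups[keys[row]].push_back(row);
  }
  std::vector<int> result(keys.size(), 0);
  for (const auto& kv : groups) {
    const int count = static_cast<int>(kv.second.size());
    for (std::size_t row : kv.second) { result[row] = count; }
  }
  return result;
}

// Same shape for a sum: aggregate once per group, then broadcast to its rows.
// Assumes keys and vals have the same length.
inline std::vector<double> SumPerGroup(const std::vector<int>& keys,
                                       const std::vector<double>& vals) {
  std::unordered_map<int, double> sums;
  for (std::size_t row = 0; row < keys.size(); ++row) {
    sums[keys[row]] += vals[row];
  }
  std::vector<double> result(keys.size());
  for (std::size_t row = 0; row < keys.size(); ++row) {
    result[row] = sums[keys[row]];
  }
  return result;
}
```

Because the result is written per row rather than per group, the output column lines up with the existing columns and no rows are dropped.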

void TTable::AggregateCols ( const TStrV & AggrAttrs,
TAttrAggr  AggOp,
const TStr & ResAttr 
)

Aggregates attributes in AggrAttrs across columns.

Definition at line 1750 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddFltCol(), AddIntCol(), atFlt, atInt, BegRI(), EndRI(), FltCols, GetColIdx(), GetColTypeMap(), IntCols, TVec< TVal, TSizeTy >::Len(), and TExcept::Throw().

1750  {
1751  TVec<TPair<TAttrType, TInt> > Info;
1752  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1753  Info.Add(GetColTypeMap(AggrAttrs[i]));
1754  if (Info[i].Val1 != Info[0].Val1) {
1755  TExcept::Throw("AggregateCols: Aggregation attributes must have the same type");
1756  }
1757  }
1758 
1759  if (Info[0].Val1 == atInt) {
1760  AddIntCol(ResAttr);
1761  TInt ResIdx = GetColIdx(ResAttr);
1762 
1763  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1764  TInt RowIdx = RI.GetRowIdx();
1765  TIntV V;
1766  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1767  V.Add(IntCols[Info[i].Val2][RowIdx]);
1768  }
1769  IntCols[ResIdx][RowIdx] = AggregateVector<TInt>(V, AggOp);
1770  }
1771  } else if (Info[0].Val1 == atFlt) {
1772  AddFltCol(ResAttr);
1773  TInt ResIdx = GetColIdx(ResAttr);
1774 
1775  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1776  TInt RowIdx = RI.GetRowIdx();
1777  TFltV V;
1778  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1779  V.Add(FltCols[Info[i].Val2][RowIdx]);
1780  }
1781  FltCols[ResIdx][RowIdx] = AggregateVector<TFlt>(V, AggOp);
1782  }
1783  } else {
1784  TExcept::Throw("AggregateCols: Only Int and Flt aggregation supported right now");
1785  }
1786 }
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
void AddIntCol(const TStr &ColName)
Adds an integer column with name ColName.
Definition: table.cpp:4673
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TRowIterator BegRI() const
Gets iterator to the first valid row of the table.
Definition: table.h:1241
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
Iterator class for TTable rows.
Definition: table.h:330
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TPair< TAttrType, TInt > GetColTypeMap(const TStr &ColName) const
Gets column type and index of ColName.
Definition: table.h:666
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TRowIterator EndRI() const
Gets iterator to the last valid row of the table.
Definition: table.h:1243
void AddFltCol(const TStr &ColName)
Adds a float column with name ColName.
Definition: table.cpp:4680
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
Vector is a sequence of TVal objects representing an array that can change in size.
Definition: ds.h:430
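
AggregateCols aggregates across columns within each row, unlike Aggregate, which aggregates across the rows of a group. A minimal sketch of the row-wise case on a columnar layout follows, using a maximum as the aggregation. RowWiseMax is a hypothetical helper, and the columns are assumed to have equal length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Computes, for each row, the maximum over the given columns.
// Each inner vector is one column; all columns must have the same length.
inline std::vector<int> RowWiseMax(const std::vector<std::vector<int>>& cols) {
  if (cols.empty()) { return {}; }
  const std::size_t numRows = cols[0].size();
  std::vector<int> result(numRows);
  for (std::size_t row = 0; row < numRows; ++row) {
    int best = cols[0][row];
    for (std::size_t c = 1; c < cols.size(); ++c) {
      best = std::max(best, cols[c][row]);
    }
    result[row] = best;
  }
  return result;
}
```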

template<class T >
T TTable::AggregateVector ( TVec< T > &  V,
TAttrAggr  Policy 
)
protected

Aggregates vector into a single scalar value according to a policy.

Used for choosing an attribute value for a node when the node appears in several records with conflicting attribute values.

Definition at line 1544 of file table.h.

References aaCount, aaFirst, aaLast, aaMax, aaMean, aaMedian, aaMin, aaSum, TVec< TVal, TSizeTy >::Len(), and TVec< TVal, TSizeTy >::Sort().

1544  {
1545  switch (Policy) {
1546  case aaMin: {
1547  T Res = V[0];
1548  for (TInt i = 1; i < V.Len(); i++) {
1549  if (V[i] < Res) { Res = V[i]; }
1550  }
1551  return Res;
1552  }
1553  case aaMax: {
1554  T Res = V[0];
1555  for (TInt i = 1; i < V.Len(); i++) {
1556  if (V[i] > Res) { Res = V[i]; }
1557  }
1558  return Res;
1559  }
1560  case aaFirst: {
1561  return V[0];
1562  }
1563  case aaLast:{
1564  return V[V.Len()-1];
1565  }
1566  case aaSum: {
1567  T Res = V[0];
1568  for (TInt i = 1; i < V.Len(); i++) {
1569  Res = Res + V[i];
1570  }
1571  return Res;
1572  }
1573  case aaMean: {
1574  T Res = V[0];
1575  for (TInt i = 1; i < V.Len(); i++) {
1576  Res = Res + V[i];
1577  }
1578  //Res = Res / V.Len(); // TODO: Handle Str case separately?
1579  return Res;
1580  }
1581  case aaMedian: {
1582  V.Sort();
1583  return V[V.Len()/2];
1584  }
1585  case aaCount: {
1586  // NOTE: Code should never reach here
1587  // I had to put this here to avoid a compiler warning.
1588  // Is there a better way to do this?
1589  return V[0];
1590  }
1591  }
1592  // Added to remove a compiler warning.
1593  T ShouldNotComeHere;
1594  return ShouldNotComeHere;
1595 }
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
void Sort(const bool &Asc=true)
Sorts the elements of the vector.
Definition: ds.h:1318
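
The AggregateVector listing reduces a vector to one value by switching on a policy. A compact sketch of the same pattern in standard C++ is given here. The Aggr enum and AggregateValues are hypothetical names; unlike the listing, this version takes the vector by value so the median sort does not reorder the caller's data, and it rejects empty input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical policy enum, named after the policies in the listing.
enum class Aggr { Min, Max, First, Last, Sum, Median };

// Reduces a non-empty vector to one value according to the policy.
// Takes v by value because Median sorts its copy.
template <class T>
T AggregateValues(std::vector<T> v, Aggr policy) {
  if (v.empty()) { throw std::invalid_argument("empty input"); }
  switch (policy) {
    case Aggr::Min: return *std::min_element(v.begin(), v.end());
    case Aggr::Max: return *std::max_element(v.begin(), v.end());
    case Aggr::First: return v.front();
    case Aggr::Last: return v.back();
    case Aggr::Sum: {
      T res = v[0];
      for (std::size_t i = 1; i < v.size(); ++i) { res = res + v[i]; }
      return res;
    }
    case Aggr::Median: {
      std::sort(v.begin(), v.end());
      return v[v.size() / 2];  // upper median for even sizes
    }
  }
  throw std::logic_error("unknown policy");
}
```

Throwing after the switch replaces the listing's uninitialized fallback return while still silencing the missing-return warning.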

TRowIteratorWithRemove TTable::BegRIWR ( )
inline

Gets iterator with remove to the first valid row.

Definition at line 1245 of file table.h.

References TRowIteratorWithRemove.

Referenced by KeepSortedRows(), Select(), SelectAtomic(), and SelectAtomicConst().

1245 { return TRowIteratorWithRemove(FirstValidRow, this);}
TInt FirstValidRow
Physical index of first valid row.
Definition: table.h:553
friend class TRowIteratorWithRemove
Definition: table.h:1527

PNEANet TTable::BuildGraph ( const TIntV & RowIds,
TAttrAggr  AggrPolicy 
)
protected

Makes a single pass over the rows in the given row id set, creating nodes and edges and assigning node and edge attributes.

Definition at line 3445 of file table.cpp.

References AddEdgeAttributes(), AddNodeAttributes(), AggrPolicy, Assert, atFlt, atInt, atStr, THash< TKey, TDat, THashFunc >::BegI(), TVec< TVal, TSizeTy >::BegI(), CheckAndAddFltNode(), Context, DstCol, DstNodeAttrV, EdgeAttrV, THash< TKey, TDat, THashFunc >::EndI(), TVec< TVal, TSizeTy >::EndI(), FltCols, GetColIdx(), GetColType(), THash< TKey, TDat, THashFunc >::GetDat(), TStrHash< TDat, TStringPool, THashFunc >::GetKey(), IntCols, THash< TKey, TDat, THashFunc >::IsKey(), TVec< TVal, TSizeTy >::Len(), TNEANet::New(), SrcCol, SrcNodeAttrV, StrColMaps, and TTableContext::StringVals.

Referenced by GetGraphsFromSequence(), and GetNextGraphFromSequence().

3445  {
3446  PNEANet Graph = TNEANet::New();
3447 
3448  const TAttrType NodeType = GetColType(SrcCol);
3449  Assert(NodeType == GetColType(DstCol));
3450  const TInt SrcColIdx = GetColIdx(SrcCol);
3451  const TInt DstColIdx = GetColIdx(DstCol);
3452 
3453  // node values - i.e. the unique values of src/dst col
3454  //THashSet<TInt> IntNodeVals; // for both int and string node attr types.
3455  THash<TFlt, TInt> FltNodeVals;
3456 
3457  // node attributes
3458  THash<TInt, TStrIntVH> NodeIntAttrs;
3459  THash<TInt, TStrFltVH> NodeFltAttrs;
3460  THash<TInt, TStrStrVH> NodeStrAttrs;
3461 
3462  // make single pass over all rows in given row id set
3463  for (TVec<TInt>::TIter it = RowIds.BegI(); it < RowIds.EndI(); it++) {
3464  TInt CurrRowIdx = *it;
3465 
3466  // add src and dst nodes to graph if they are not seen earlier
3467  TInt SVal, DVal;
3468  if (NodeType == atFlt) {
3469  TFlt FSVal = FltCols[SrcColIdx][CurrRowIdx];
3470  SVal = CheckAndAddFltNode(Graph, FltNodeVals, FSVal);
3471  TFlt FDVal = FltCols[DstColIdx][CurrRowIdx];
3472  DVal = CheckAndAddFltNode(Graph, FltNodeVals, FDVal);
3473  } else if (NodeType == atInt || NodeType == atStr) {
3474  if (NodeType == atInt) {
3475  SVal = IntCols[SrcColIdx][CurrRowIdx];
3476  DVal = IntCols[DstColIdx][CurrRowIdx];
3477  } else {
3478  SVal = StrColMaps[SrcColIdx][CurrRowIdx];
3479  if (strlen(Context->StringVals.GetKey(SVal)) == 0) { continue; } //illegal value
3480  DVal = StrColMaps[DstColIdx][CurrRowIdx];
3481  if (strlen(Context->StringVals.GetKey(DVal)) == 0) { continue; } //illegal value
3482  }
3483  if (!Graph->IsNode(SVal)) { Graph->AddNode(SVal); }
3484  if (!Graph->IsNode(DVal)) { Graph->AddNode(DVal); }
3485  //CheckAndAddIntNode(Graph, IntNodeVals, SVal);
3486  //CheckAndAddIntNode(Graph, IntNodeVals, DVal);
3487  }
3488 
3489  // add edge and edge attributes
3490  Graph->AddEdge(SVal, DVal, CurrRowIdx);
3491  if (EdgeAttrV.Len() > 0) { AddEdgeAttributes(Graph, CurrRowIdx); }
3492 
3493  // get src and dst node attributes into hashmaps
3494  if (SrcNodeAttrV.Len() > 0) {
3495  AddNodeAttributes(SVal, SrcNodeAttrV, CurrRowIdx, NodeIntAttrs, NodeFltAttrs, NodeStrAttrs);
3496  }
3497  if (DstNodeAttrV.Len() > 0) {
3498  AddNodeAttributes(DVal, DstNodeAttrV, CurrRowIdx, NodeIntAttrs, NodeFltAttrs, NodeStrAttrs);
3499  }
3500  }
3501 
3502  // aggregate node attributes and add to graph
3503  if (SrcNodeAttrV.Len() > 0 || DstNodeAttrV.Len() > 0) {
3504  for (TNEANet::TNodeI NodeI = Graph->BegNI(); NodeI < Graph->EndNI(); NodeI++) {
3505  TInt NId = NodeI.GetId();
3506  if (NodeIntAttrs.IsKey(NId)) {
3507  TStrIntVH IntAttrVals = NodeIntAttrs.GetDat(NId);
3508  for (TStrIntVH::TIter it = IntAttrVals.BegI(); it < IntAttrVals.EndI(); it++) {
3509  TInt AttrVal = AggregateVector<TInt>(it.GetDat(), AggrPolicy);
3510  Graph->AddIntAttrDatN(NId, AttrVal, it.GetKey());
3511  }
3512  }
3513  if (NodeFltAttrs.IsKey(NId)) {
3514  TStrFltVH FltAttrVals = NodeFltAttrs.GetDat(NId);
3515  for (TStrFltVH::TIter it = FltAttrVals.BegI(); it < FltAttrVals.EndI(); it++) {
3516  TFlt AttrVal = AggregateVector<TFlt>(it.GetDat(), AggrPolicy);
3517  Graph->AddFltAttrDatN(NId, AttrVal, it.GetKey());
3518  }
3519  }
3520  if (NodeStrAttrs.IsKey(NId)) {
3521  TStrStrVH StrAttrVals = NodeStrAttrs.GetDat(NId);
3522  for (TStrStrVH::TIter it = StrAttrVals.BegI(); it < StrAttrVals.EndI(); it++) {
3523  TStr AttrVal = AggregateVector<TStr>(it.GetDat(), AggrPolicy);
3524  Graph->AddStrAttrDatN(NId, AttrVal, it.GetKey());
3525  }
3526  }
3527  }
3528  }
3529 
3530  return Graph;
3531 }
TIter EndI() const
Returns an iterator referring to the past-the-end element in the vector.
Definition: ds.h:595
TStrV EdgeAttrV
List of columns (attributes) to serve as edge attributes.
Definition: table.h:591
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
enum TAttrType_ TAttrType
Types for tables, sparse and dense attributes.
TIter BegI() const
Definition: hash.h:213
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TTableContext * Context
Execution Context.
Definition: table.h:545
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
Node iterator. Only forward iteration (operator++) is supported.
Definition: network.h:1792
TIter EndI() const
Definition: hash.h:218
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
const char * GetKey(const int &KeyId) const
Definition: hash.h:893
#define Assert(Cond)
Definition: bd.h:251
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStrV SrcNodeAttrV
List of columns (attributes) to serve as source node attributes.
Definition: table.h:592
TAttrAggr AggrPolicy
Aggregation policy used for solving conflicts between different values of an attribute of the same no...
Definition: table.h:601
TStrHash< TInt, TBigStrPool > StringVals
StringPool - stores string data values and maps them to integers.
Definition: table.h:182
TStr SrcCol
Column (attribute) to serve as src nodes when constructing the graph.
Definition: table.h:589
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TStrV DstNodeAttrV
List of columns (attributes) to serve as destination node attributes.
Definition: table.h:593
TStr DstCol
Column (attribute) to serve as dst nodes when constructing the graph.
Definition: table.h:590
TIter BegI() const
Returns an iterator pointing to the first element in the vector.
Definition: ds.h:593
void AddEdgeAttributes(PNEANet &Graph, int RowId)
Adds attributes of edge corresponding to RowId to the Graph.
Definition: table.cpp:3395
bool IsKey(const TKey &Key) const
Definition: hash.h:258
static PNEANet New()
Static constructor; returns a pointer to the graph. Ex: PNEANet Graph=TNEANet::New().
Definition: network.h:2226
TInt CheckAndAddFltNode(T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal)
Checks whether the given NodeVal has been seen before; if not, adds it to Graph and to the hashmap NodeVals.
Definition: table.h:1533
void AddNodeAttributes(TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs)
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribut...
Definition: table.cpp:3414
Vector is a sequence of TVal objects representing an array that can change in size.
Definition: ds.h:430
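
BuildGraph makes one pass over the rows, adding endpoints and edges and collecting every attribute value seen for a node, then resolves conflicting values with an aggregation policy. The sketch below reproduces that two-phase structure with standard containers. EdgeRow, SimpleGraph, and BuildFromRows are hypothetical names, and the fixed "max" policy is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// One table row: an edge plus an attribute value for its source node.
struct EdgeRow { int src; int dst; int srcAttr; };

struct SimpleGraph {
  std::set<int> nodes;
  std::vector<std::pair<int, int>> edges;
  std::map<int, int> nodeAttr;  // node id -> aggregated attribute
};

// Phase 1: a single pass adds endpoints and edges and records every
// attribute value seen per source node.
// Phase 2: conflicting values are resolved with a "max" policy.
inline SimpleGraph BuildFromRows(const std::vector<EdgeRow>& rows) {
  SimpleGraph g;
  std::map<int, std::vector<int>> seen;  // node id -> attribute values
  for (const EdgeRow& r : rows) {
    g.nodes.insert(r.src);
    g.nodes.insert(r.dst);
    g.edges.emplace_back(r.src, r.dst);
    seen[r.src].push_back(r.srcAttr);
  }
  for (const auto& kv : seen) {
    int best = kv.second.front();
    for (int v : kv.second) {
      if (v > best) { best = v; }
    }
    g.nodeAttr[kv.first] = best;
  }
  return g;
}
```

Deferring aggregation until all rows are read is what allows policies such as a maximum or a median, which need every value for the node.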

TTableContext * TTable::ChangeContext ( TTableContext * Context)

Changes the current context. Moves all object items to the new context.

Definition at line 921 of file table.cpp.

References TStrHash< TDat, TStringPool, THashFunc >::AddKey(), atStr, BegRI(), Context, EndRI(), GetColIdx(), GetSchemaColName(), GetSchemaColType(), GetStrVal(), GetStrValIdx(), TVec< TVal, TSizeTy >::Len(), Sch, StrColMaps, TTableContext::StringVals, and TInt::Val.

921  {
922  TInt L = Sch.Len();
923 
924 #if 0
925  // print table on the input, iterate over all columns
926  for (TInt i = 0; i < L; i++) {
927  // skip non-string columns
928  if (GetSchemaColType(i) != atStr) {
929  continue;
930  }
931 
932  TInt ColIdx = GetColIdx(GetSchemaColName(i));
933 
934  // iterate over all rows
935  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
936  TInt RowIdx = RowI.GetRowIdx();
937  TInt KeyId = StrColMaps[ColIdx][RowIdx];
938  printf("ChangeContext in %d %d %d .%s.\n",
939  ColIdx.Val, RowIdx.Val, KeyId.Val, GetStrVal(ColIdx, RowIdx).CStr());
940  }
941  }
942 #endif
943 
944  // add strings to the new context, change values
945  // iterate over all columns
946  for (TInt i = 0; i < L; i++) {
947  // skip non-string columns
948  if (GetSchemaColType(i) != atStr) {
949  continue;
950  }
951 
952  TInt ColIdx = GetColIdx(GetSchemaColName(i));
953 
954  // iterate over all rows
955  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
956  TInt RowIdx = RowI.GetRowIdx();
957  // get the string
958  TStr Key = GetStrValIdx(ColIdx, RowIdx);
959  // add the string to the new context
960  TInt KeyId = TInt(NewContext->StringVals.AddKey(Key));
961  // change the value in the table
962  StrColMaps[ColIdx][RowIdx] = KeyId;
963  }
964  }
965 
966  // set the new context
967  Context = NewContext;
968  return Context;
969 }
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
Schema Sch
Table Schema.
Definition: table.h:549
int Val
Definition: dt.h:1139
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TTableContext * Context
Execution Context.
Definition: table.h:545
TRowIterator BegRI() const
Gets iterator to the first valid row of the table.
Definition: table.h:1241
TStr GetStrValIdx(TInt ColIdx, TInt RowIdx) const
Gets the value in column with id ColIdx at row RowIdx.
Definition: table.h:626
Iterator class for TTable rows.
Definition: table.h:330
TAttrType GetSchemaColType(TInt Idx) const
Gets type of the column with index Idx in the schema.
Definition: table.h:640
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStr GetSchemaColName(TInt Idx) const
Gets name of the column with index Idx in the schema.
Definition: table.h:638
TRowIterator EndRI() const
Gets iterator to the last valid row of the table.
Definition: table.h:1243
TStr GetStrVal(const TStr &ColName, const TInt &RowIdx) const
Gets the value of string attribute ColName at row RowIdx.
Definition: table.h:1028
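
ChangeContext resolves each stored id back to its string in the current pool, interns that string in the new pool, and overwrites the id in place. A minimal sketch of the remapping follows. Pool and Remap are hypothetical stand-ins and are not SNAP's TTableContext.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical minimal string pool; not SNAP's TTableContext.
struct Pool {
  std::unordered_map<std::string, int> ids;
  std::vector<std::string> strings;

  // Returns the id of s, adding s to the pool the first time it is seen.
  int AddKey(const std::string& s) {
    auto it = ids.find(s);
    if (it != ids.end()) { return it->second; }
    const int id = static_cast<int>(strings.size());
    ids.emplace(s, id);
    strings.push_back(s);
    return id;
  }

  const std::string& GetKey(int id) const { return strings[id]; }
};

// Rewrites a column of ids so that they refer to newPool instead of oldPool.
inline void Remap(std::vector<int>& column, const Pool& oldPool, Pool& newPool) {
  for (int& id : column) {
    id = newPool.AddKey(oldPool.GetKey(id));
  }
}
```

Ids are only meaningful relative to one pool, so every string column has to be rewritten before the table switches pools.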

template<class T >
TInt TTable::CheckAndAddFltNode ( T Graph,
THash< TFlt, TInt > &  NodeVals,
TFlt  FNodeVal 
)
protected

Checks whether the given NodeVal has been seen before; if not, adds it to Graph and to the hashmap NodeVals.

Definition at line 1533 of file table.h.

References THash< TKey, TDat, THashFunc >::AddDat(), THash< TKey, TDat, THashFunc >::AddKey(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and THash< TKey, TDat, THashFunc >::Len().

Referenced by BuildGraph().

{
  if (!NodeVals.IsKey(FNodeVal)) {
    TInt NodeVal = NodeVals.Len();
    Graph->AddNode(NodeVal);
    NodeVals.AddKey(FNodeVal);
    NodeVals.AddDat(FNodeVal, NodeVal);
    return NodeVal;
  } else { return NodeVals.GetDat(FNodeVal); }
}
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
int AddKey(const TKey &Key)
Definition: hash.h:373
bool IsKey(const TKey &Key) const
Definition: hash.h:258
int Len() const
Definition: hash.h:228
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
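
The pattern in CheckAndAddFltNode is value interning: each distinct float value gets the next free integer node ID, and a repeated value maps back to the ID it was given earlier. A minimal standalone sketch of that idea follows. `ToyGraph` and the free function `CheckAndAddNode` are hypothetical stand-ins (not SNAP types), and `std::unordered_map` takes the place of `THash`.

```cpp
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a graph: it only records the node IDs added.
struct ToyGraph {
  std::vector<int> Nodes;
  void AddNode(int Id) { Nodes.push_back(Id); }
};

// Returns the node ID for Val, creating a node the first time Val is seen.
// New IDs are taken from the current map size, so they run 0, 1, 2, ...
int CheckAndAddNode(ToyGraph& Graph,
                    std::unordered_map<double, int>& NodeVals,
                    double Val) {
  auto It = NodeVals.find(Val);
  if (It != NodeVals.end()) { return It->second; }  // seen before
  const int NodeId = static_cast<int>(NodeVals.size());
  Graph.AddNode(NodeId);
  NodeVals.emplace(Val, NodeId);
  return NodeId;
}
```

Calling it with the values 2.5, 7.0, and 2.5 in turn yields the IDs 0, 1, and 0, and adds two nodes to the graph.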

void TTable::CheckAndAddIntNode ( PNEANet  Graph,
THashSet< TInt > &  NodeVals,
TInt  NodeId 
)
inline protected

Checks whether NodeId has been seen before; if not, adds it to Graph and to the hash set NodeVals.

Definition at line 3388 of file table.cpp.

References THashSet< TKey, THashFunc >::AddKey(), and THashSet< TKey, THashFunc >::IsKey().

{
  if (!NodeVals.IsKey(NodeId)) {
    Graph->AddNode(NodeId);
    NodeVals.AddKey(NodeId);
  }
}
bool IsKey(const TKey &Key) const
Definition: shash.h:1148
int AddKey(const TKey &Key)
Definition: shash.h:1254

TInt TTable::CheckSortedKeyVal ( TIntV &  Key,
TIntV &  Val,
TInt  Start,
TInt  End 
)
static protected

Definition at line 5310 of file table.cpp.

References CompareKeyVal().

Referenced by QSortKeyVal().

{
  TInt j;
  for (j = Start; j < End; j++) {
    if (CompareKeyVal(Key[j], Val[j], Key[j+1], Val[j+1]) > 0) {
      break;
    }
  }
  if (j >= End) { return 0; }
  else { return 1; }
}
static TInt CompareKeyVal(const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2)
Definition: table.cpp:5297
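
CheckSortedKeyVal carries no brief description. Read literally, its listing returns 0 when the (key, value) pairs at positions Start through End are in nondecreasing order, and 1 as soon as one adjacent pair is out of order. The sketch below restates that check with standard containers. The lexicographic comparison (key first, then value) is an assumption about CompareKeyVal, whose body is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed ordering: compare by key first, then by value.
int CompareKeyVal(int K1, int V1, int K2, int V2) {
  if (K1 != K2) { return K1 < K2 ? -1 : 1; }
  if (V1 != V2) { return V1 < V2 ? -1 : 1; }
  return 0;
}

// Returns 0 if the pairs at Start..End (inclusive) are in nondecreasing
// order, and 1 otherwise.
int CheckSortedKeyVal(const std::vector<int>& Key, const std::vector<int>& Val,
                      std::size_t Start, std::size_t End) {
  assert(Key.size() == Val.size() && End < Key.size());
  for (std::size_t j = Start; j < End; j++) {
    if (CompareKeyVal(Key[j], Val[j], Key[j + 1], Val[j + 1]) > 0) {
      return 1;  // pair j sorts after pair j + 1
    }
  }
  return 0;
}
```

Note that the loop reads position j + 1, so End must be a valid index rather than a past-the-end position.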

void TTable::Classify ( TPredicate &  Predicate,
const TStr &  LabelName,
const TInt &  PositiveLabel = 1,
const TInt &  NegativeLabel = 0 
)

Definition at line 2805 of file table.cpp.

References ClassifyAux(), and Select().

{
  TIntV SelectedRows;
  Select(Predicate, SelectedRows, false);
  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
}
void Select(TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true)
Selects rows that satisfy given Predicate.
Definition: table.cpp:2750
void ClassifyAux(const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
Adds a label attribute with positive labels on selected rows and negative labels on the rest...
Definition: table.cpp:4694

void TTable::ClassifyAtomic ( const TStr &  Col1,
const TStr &  Col2,
TPredComp  Cmp,
const TStr &  LabelName,
const TInt &  PositiveLabel = 1,
const TInt &  NegativeLabel = 0 
)

Definition at line 2866 of file table.cpp.

References ClassifyAux(), and SelectAtomic().

{
  TIntV SelectedRows;
  SelectAtomic(Col1, Col2, Cmp, SelectedRows, false);
  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
}
void SelectAtomic(const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true)
Selects rows using atomic compare operation.
Definition: table.cpp:2813
void ClassifyAux(const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
Adds a label attribute with positive labels on selected rows and negative labels on the rest...
Definition: table.cpp:4694

template<class T >
void TTable::ClassifyAtomicConst ( const TStr &  Col,
const T &  Val,
TPredComp  Cmp,
const TStr &  LabelName,
const TInt &  PositiveLabel = 1,
const TInt &  NegativeLabel = 0 
)
inline

Definition at line 1301 of file table.h.

References ClassifyAux(), and SelectAtomicConst().

{
  TIntV SelectedRows;
  PTable SelectedTable;
  SelectAtomicConst(Col, TPrimitive(Val), Cmp, SelectedRows, SelectedTable, false, false);
  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
}
Primitive class: Wrapper around primitive data types.
Definition: table.h:211
void SelectAtomicConst(const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true)
Selects rows where the value of Col matches given primitive Val.
Definition: table.cpp:2873
void ClassifyAux(const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
Adds a label attribute with positive labels on selected rows and negative labels on the rest...
Definition: table.cpp:4694
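
As a rough illustration of what this classification amounts to, the sketch below compares every value of one column against a constant and emits a positive or negative label per row. It is a standalone approximation, not SNAP code: `Comp` is a hypothetical stand-in for TPredComp, `EvalComp` and `ClassifyAgainstConst` are made-up names, and a plain `std::vector` plays the role of the column.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for TPredComp.
enum class Comp { LT, LTE, EQ, NEQ, GTE, GT };

// Evaluates "A <Cmp> B" for any type with the usual comparison operators.
template <class T>
bool EvalComp(const T& A, const T& B, Comp Cmp) {
  switch (Cmp) {
    case Comp::LT:  return A < B;
    case Comp::LTE: return A <= B;
    case Comp::EQ:  return A == B;
    case Comp::NEQ: return A != B;
    case Comp::GTE: return A >= B;
    case Comp::GT:  return A > B;
  }
  return false;
}

// Labels row i with PositiveLabel if "Col[i] <Cmp> Val" holds,
// and with NegativeLabel otherwise.
template <class T>
std::vector<int> ClassifyAgainstConst(const std::vector<T>& Col, const T& Val,
                                      Comp Cmp, int PositiveLabel = 1,
                                      int NegativeLabel = 0) {
  std::vector<int> Labels(Col.size(), NegativeLabel);
  for (std::size_t i = 0; i < Col.size(); i++) {
    if (EvalComp(Col[i], Val, Cmp)) { Labels[i] = PositiveLabel; }
  }
  return Labels;
}
```

For a column holding 15, 42, 30, 18, the constant 18, and `Comp::GTE`, the resulting labels are 0, 1, 1, 1.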

void TTable::ClassifyAux ( const TIntV &  SelectedRows,
const TStr &  LabelName,
const TInt &  PositiveLabel = 1,
const TInt &  NegativeLabel = 0 
)
protected

Adds a label attribute with positive labels on selected rows and negative labels on the rest.

Definition at line 4694 of file table.cpp.

References TVec< TVal, TSizeTy >::Add(), AddColType(), AddSchemaCol(), atInt, IntCols, TVec< TVal, TSizeTy >::Len(), and NumRows.

Referenced by Classify(), ClassifyAtomic(), and ClassifyAtomicConst().

{
  AddSchemaCol(LabelName, atInt);
  TInt LabelColIdx = IntCols.Len();
  AddColType(LabelName, atInt, LabelColIdx);
  // Append storage for the new label column, one entry per row.
  IntCols.Add(TIntV(NumRows));
  for (TInt i = 0; i < NumRows; i++) {
    IntCols[LabelColIdx][i] = NegativeLabel;
  }
  for (TInt i = 0; i < SelectedRows.Len(); i++) {
    IntCols[LabelColIdx][SelectedRows[i]] = PositiveLabel;
  }
}
void AddSchemaCol(const TStr &ColName, TAttrType ColType)
Adds column with name ColName and type ColType to the schema.
Definition: table.h:642
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
void AddColType(const TStr &ColName, TPair< TAttrType, TInt > ColType)
Adds column with name ColName and type ColType to the ColTypeMap.
Definition: table.h:651
TVec< TInt > TIntV
Definition: ds.h:1594
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
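
The two loops in ClassifyAux amount to "fill the new column with the negative label, then overwrite the selected rows with the positive label". A minimal sketch with standard containers follows, assuming every selected index is a valid row position. `LabelRows` is a hypothetical name, not part of SNAP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a label column of NumRows entries: PositiveLabel on the rows listed
// in SelectedRows, NegativeLabel everywhere else.
std::vector<int> LabelRows(std::size_t NumRows,
                           const std::vector<std::size_t>& SelectedRows,
                           int PositiveLabel = 1, int NegativeLabel = 0) {
  std::vector<int> Labels(NumRows, NegativeLabel);
  for (std::size_t RowIdx : SelectedRows) {
    assert(RowIdx < NumRows);  // assumption: indices are valid row positions
    Labels[RowIdx] = PositiveLabel;
  }
  return Labels;
}
```

With five rows and rows 1 and 3 selected, the default labels give 0, 1, 0, 1, 0.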

void TTable::ColAdd ( const TStr &  Attr1,
const TStr &  Attr2,
const TStr &  ResultAttrName = "" 
)

Performs columnwise addition. See TTable::ColGenericOp.

Definition at line 4816 of file table.cpp.

References aoAdd, and ColGenericOp().

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoAdd);
}
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752

void TTable::ColAdd ( const TStr &  Attr1,
TTable &  Table,
const TStr &  Attr2,
const TStr &  ResAttr = "",
TBool  AddToFirstTable = true 
)

Performs columnwise addition of column Attr1 with column Attr2 of the given Table.

Definition at line 4949 of file table.cpp.

References aoAdd, and ColGenericOp().

{
  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoAdd, AddToFirstTable);
}
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752

void TTable::ColAdd ( const TStr &  Attr1,
const TFlt &  Num,
const TStr &  ResultAttrName = "",
const TBool  floatCast = false 
)

Adds the given Num to every value in column Attr1.

Definition at line 5063 of file table.cpp.

References aoAdd, and ColGenericOp().

{
  ColGenericOp(Attr1, Num, ResultAttrName, aoAdd, floatCast);
}
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752
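
All three ColAdd overloads delegate to ColGenericOp with aoAdd. As an illustration of the arithmetic only, the sketch below adds two equally long columns elementwise and adds a constant to a column. `ColAddSketch` is a hypothetical name, `std::vector<double>` stands in for a table column, and the ResultAttrName, AddToFirstTable, and floatCast options are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Elementwise sum of two equally long columns.
std::vector<double> ColAddSketch(const std::vector<double>& A,
                                 const std::vector<double>& B) {
  assert(A.size() == B.size());
  std::vector<double> Res(A.size());
  for (std::size_t i = 0; i < A.size(); i++) { Res[i] = A[i] + B[i]; }
  return Res;
}

// Adds the constant Num to every value of a column.
std::vector<double> ColAddSketch(const std::vector<double>& A, double Num) {
  std::vector<double> Res(A.size());
  for (std::size_t i = 0; i < A.size(); i++) { Res[i] = A[i] + Num; }
  return Res;
}
```

Both versions return a new column and leave the inputs untouched; where the real methods store the result depends on the options left out of this sketch.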