SNAP Library 4.0, User Reference  2017-07-27 13:18:06
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
TTable Class Reference

Table class: Relational table with columnar data storage. More...

#include <table.h>

Classes

class  TLoadVecInit
 

Public Member Functions

void AddIntCol (const TStr &ColName)
 Adds an integer column with name ColName. More...
 
void AddFltCol (const TStr &ColName)
 Adds a float column with name ColName. More...
 
void AddStrCol (const TStr &ColName)
 Adds a string column with name ColName. More...
 
void GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with integer values, using OpenMP multi-threading. More...
 
 TTable ()
 
 TTable (TTableContext *Context)
 
 TTable (const Schema &S, TTableContext *Context)
 
 TTable (TSIn &SIn, TTableContext *Context)
 
 TTable (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Constructor to build table out of a hash table of int->int. More...
 
 TTable (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Constructor to build table out of a hash table of int->float. More...
 
 TTable (const TTable &Table)
 Copy constructor. More...
 
 TTable (const TTable &Table, const TIntV &RowIds)
 
void SaveSS (const TStr &OutFNm)
 Saves table schema and content to a TSV file. More...
 
void SaveBin (const TStr &OutFNm)
 Saves table schema and content to a binary file. More...
 
void Save (TSOut &SOut)
 Saves table schema and content to a binary format. More...
 
void Dump (FILE *OutF=stdout) const
 Prints table contents to a text file. More...
 
void AddRow (const TTableRow &Row)
 Adds row with values taken from given TTableRow. More...
 
TTableContext * GetContext ()
 Returns the context. More...
 
TTableContext * ChangeContext (TTableContext *Context)
 Changes the current context. Moves all object items to the new context. More...
 
TInt GetColIdx (const TStr &ColName) const
 Gets index of column ColName among columns of the same type in the schema. More...
 
TInt GetIntVal (const TStr &ColName, const TInt &RowIdx)
 Gets the value of integer attribute ColName at row RowIdx. More...
 
TFlt GetFltVal (const TStr &ColName, const TInt &RowIdx)
 Gets the value of float attribute ColName at row RowIdx. More...
 
TStr GetStrVal (const TStr &ColName, const TInt &RowIdx) const
 Gets the value of string attribute ColName at row RowIdx. More...
 
TInt GetStrMapById (TInt ColIdx, TInt RowIdx) const
 Gets the integer mapping of the string at column ColIdx at row RowIdx. More...
 
TInt GetStrMapByName (const TStr &ColName, TInt RowIdx) const
 Gets the integer mapping of the string at column ColName at row RowIdx. More...
 
TStr GetStrValById (TInt ColIdx, TInt RowIdx) const
 Gets the value of the string attribute at column ColIdx at row RowIdx. More...
 
TStr GetStrValByName (const TStr &ColName, const TInt &RowIdx) const
 Gets the value of the string attribute at column ColName at row RowIdx. More...
 
TIntV GetIntRowIdxByVal (const TStr &ColName, const TInt &Val) const
 Gets the rows containing Val in int column ColName. More...
 
TIntV GetStrRowIdxByMap (const TStr &ColName, const TInt &Map) const
 Gets the rows containing int mapping Map in str column ColName. More...
 
TIntV GetFltRowIdxByVal (const TStr &ColName, const TFlt &Val) const
 Gets the rows containing Val in flt column ColName. More...
 
TInt RequestIndexInt (const TStr &ColName)
 Creates Index for Int Column ColName. More...
 
TInt RequestIndexFlt (const TStr &ColName)
 Creates Index for Flt Column ColName. More...
 
TInt RequestIndexStrMap (const TStr &ColName)
 Creates Index for Str Column ColName. More...
 
TStr GetStr (const TInt &KeyId) const
 Gets the string with KeyId. More...
 
TInt GetIntValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx)
 Gets the integer value at column ColIdx and row RowIdx. More...
 
TFlt GetFltValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx)
 Gets the float value at column ColIdx and row RowIdx. More...
 
Schema GetSchema ()
 Gets the schema of this table. More...
 
TVec< PNEANet > ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
 Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize. More...
 
TVec< PNEANet > ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
 Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals. More...
 
TVec< PNEANet > ToGraphPerGroup (TStr GroupAttr, TAttrAggr AggrPolicy)
 Creates a sequence of graphs based on grouping specified by GroupAttr. More...
 
PNEANet ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx)
 Creates the graph sequence one at a time. More...
 
PNEANet ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals)
 Creates the graph sequence one at a time. More...
 
PNEANet ToGraphPerGroupIterator (TStr GroupAttr, TAttrAggr AggrPolicy)
 Creates the graph sequence one at a time. More...
 
PNEANet NextGraphIterator ()
 Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions. More...
 
TBool IsLastGraphOfSequence ()
 Checks if the end of the graph sequence is reached. More...
 
TStr GetSrcCol () const
 Gets the name of the column to be used as src nodes in the graph. More...
 
void SetSrcCol (const TStr &Src)
 Sets the name of the column to be used as src nodes in the graph. More...
 
TStr GetDstCol () const
 Gets the name of the column to be used as dst nodes in the graph. More...
 
void SetDstCol (const TStr &Dst)
 Sets the name of the column to be used as dst nodes in the graph. More...
 
void AddEdgeAttr (const TStr &Attr)
 Adds column to be used as graph edge attribute. More...
 
void AddEdgeAttr (TStrV &Attrs)
 Adds columns to be used as graph edge attributes. More...
 
void AddSrcNodeAttr (const TStr &Attr)
 Adds column to be used as src node attribute of the graph. More...
 
void AddSrcNodeAttr (TStrV &Attrs)
 Adds columns to be used as src node attributes of the graph. More...
 
void AddDstNodeAttr (const TStr &Attr)
 Adds column to be used as dst node attribute of the graph. More...
 
void AddDstNodeAttr (TStrV &Attrs)
 Adds columns to be used as dst node attributes of the graph. More...
 
void AddNodeAttr (const TStr &Attr)
 Handles the common case where src and dst both belong to the same "universe" of entities. More...
 
void AddNodeAttr (TStrV &Attrs)
 Handles the common case where src and dst both belong to the same "universe" of entities. More...
 
void SetCommonNodeAttrs (const TStr &SrcAttr, const TStr &DstAttr, const TStr &CommonAttrName)
 Sets the columns to be used as both src and dst node attributes. More...
 
TStrV GetSrcNodeIntAttrV () const
 Gets src node int attribute name vector. More...
 
TStrV GetDstNodeIntAttrV () const
 Gets dst node int attribute name vector. More...
 
TStrV GetEdgeIntAttrV () const
 Gets edge int attribute name vector. More...
 
TStrV GetSrcNodeFltAttrV () const
 Gets src node float attribute name vector. More...
 
TStrV GetDstNodeFltAttrV () const
 Gets dst node float attribute name vector. More...
 
TStrV GetEdgeFltAttrV () const
 Gets edge float attribute name vector. More...
 
TStrV GetSrcNodeStrAttrV () const
 Gets src node str attribute name vector. More...
 
TStrV GetDstNodeStrAttrV () const
 Gets dst node str attribute name vector. More...
 
TStrV GetEdgeStrAttrV () const
 Gets edge str attribute name vector. More...
 
TAttrType GetColType (const TStr &ColName) const
 Gets type of column ColName. More...
 
TInt GetNumRows () const
 Gets total number of rows in this table. More...
 
TInt GetNumValidRows () const
 Gets number of valid, i.e. not deleted, rows in this table. More...
 
THash< TInt, TInt > GetRowIdMap () const
 Gets a map of logical to physical row ids. More...
 
TRowIterator BegRI () const
 Gets iterator to the first valid row of the table. More...
 
TRowIterator EndRI () const
 Gets iterator to the last valid row of the table. More...
 
TRowIteratorWithRemove BegRIWR ()
 Gets iterator with remove to the first valid row. More...
 
TRowIteratorWithRemove EndRIWR ()
 Gets iterator with remove to the last valid row. More...
 
void GetPartitionRanges (TIntPrV &Partitions, TInt NumPartitions) const
 Partitions the table into NumPartitions partitions and populates Partitions with the ranges. More...
 
void Rename (const TStr &Column, const TStr &NewLabel)
 Renames a column. More...
 
void Unique (const TStr &Col)
 Removes rows with duplicate values in given column. More...
 
void Unique (const TStrV &Cols, TBool Ordered=true)
 Removes rows with duplicate values in given columns. More...
 
void Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true)
 Selects rows that satisfy given Predicate. More...
 
void Select (TPredicate &Predicate)
 
void Classify (TPredicate &Predicate, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true)
 Selects rows using atomic compare operation. More...
 
void SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp)
 
void ClassifyAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true)
 Selects rows where the value of Col matches given primitive Val. More...
 
template<class T >
void SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp)
 
template<class T >
void SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, PTable &SelectedTable)
 
template<class T >
void ClassifyAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 
void SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp)
 
void SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp, PTable &SelectedTable)
 
void SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp)
 
void SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp, PTable &SelectedTable)
 
void SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp)
 
void SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp, PTable &SelectedTable)
 
void Group (const TStrV &GroupBy, const TStr &GroupColName, TBool Ordered=true, TBool UsePhysicalIds=true)
 Groups rows depending on values of GroupBy columns. More...
 
void Count (const TStr &CountColName, const TStr &Col)
 Counts number of unique elements. More...
 
void Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true)
 Orders the rows according to the values in columns of OrderBy (in descending lexicographic order). More...
 
void Aggregate (const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true)
 Aggregates values of ValAttr after grouping with respect to GroupByAttrs. The result is stored as a new attribute ResAttr. More...
 
void AggregateCols (const TStrV &AggrAttrs, TAttrAggr AggOp, const TStr &ResAttr)
 Aggregates attributes in AggrAttrs across columns. More...
 
TVec< PTable > SpliceByGroup (const TStrV &GroupByAttrs, TBool Ordered=true)
 Splices table into subtables according to a grouping statement. More...
 
PTable Join (const TStr &Col1, const TTable &Table, const TStr &Col2)
 Performs equijoin. More...
 
PTable Join (const TStr &Col1, const PTable &Table, const TStr &Col2)
 
PTable ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false)
 
PTable SelfJoin (const TStr &Col)
 Joins table with itself, on values of Col. More...
 
PTable SelfSimJoin (const TStrV &Cols, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 
PTable SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
PTable SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
PTable SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold)
 Performs join if the distance between two rows is less than the specified threshold. More...
 
void SelectFirstNRows (const TInt &N)
 Selects first N rows from the table. More...
 
void Defrag ()
 Releases memory of deleted rows, and defrags. More...
 
void StoreIntCol (const TStr &ColName, const TIntV &ColVals)
 Adds entire int column to table. More...
 
void StoreFltCol (const TStr &ColName, const TFltV &ColVals)
 Adds entire flt column to table. More...
 
void StoreStrCol (const TStr &ColName, const TStrV &ColVals)
 Adds entire str column to table. More...
 
void UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
 
void UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0)
 
void SetFltColToConstMP (TInt UpdateColIdx, TFlt DefaultFltVal)
 
PTable Union (const TTable &Table)
 Returns union of this table with given Table. More...
 
PTable Union (const PTable &Table)
 
PTable UnionAll (const TTable &Table)
 Returns union of this table with given Table, preserving duplicates. More...
 
PTable UnionAll (const PTable &Table)
 
void UnionAllInPlace (const TTable &Table)
 Same as TTable::ConcatTable. More...
 
void UnionAllInPlace (const PTable &Table)
 
PTable Intersection (const TTable &Table)
 Returns intersection of this table with given Table. More...
 
PTable Intersection (const PTable &Table)
 
PTable Minus (TTable &Table)
 Returns table with rows that are present in this table but not in given Table. More...
 
PTable Minus (const PTable &Table)
 
PTable Project (const TStrV &ProjectCols)
 Returns table with only the columns in ProjectCols. More...
 
void ProjectInPlace (const TStrV &ProjectCols)
 Keeps only the columns specified in ProjectCols. More...
 
void ColGenericOp (const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
 Performs columnwise arithmetic operation. More...
 
void ColGenericOpMP (TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op)
 
void ColAdd (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise addition. See TTable::ColGenericOp. More...
 
void ColSub (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise subtraction. See TTable::ColGenericOp. More...
 
void ColMul (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise multiplication. See TTable::ColGenericOp. More...
 
void ColDiv (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise division. See TTable::ColGenericOp. More...
 
void ColMod (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs columnwise modulus. See TTable::ColGenericOp. More...
 
void ColMin (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs min of two columns. See TTable::ColGenericOp. More...
 
void ColMax (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="")
 Performs max of two columns. See TTable::ColGenericOp. More...
 
void ColGenericOp (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr, TArithOp op, TBool AddToFirstTable)
 Performs columnwise arithmetic operation with column of given table. More...
 
void ColAdd (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise addition with column of given table. More...
 
void ColSub (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise subtraction with column of given table. More...
 
void ColMul (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise multiplication with column of given table. More...
 
void ColDiv (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise division with column of given table. More...
 
void ColMod (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true)
 Performs columnwise modulus with column of given table. More...
 
void ColGenericOp (const TStr &Attr1, const TFlt &Num, const TStr &ResAttr, TArithOp op, const TBool floatCast)
 Performs arithmetic op of column values and given Num. More...
 
void ColGenericOpMP (const TInt &ColIdx1, const TInt &ColIdx2, TAttrType ArgType, const TFlt &Num, TArithOp op, TBool ShouldCast)
 
void ColAdd (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs addition of column values and given Num. More...
 
void ColSub (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs subtraction of column values and given Num. More...
 
void ColMul (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs multiplication of column values and given Num. More...
 
void ColDiv (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs division of column values and given Num. More...
 
void ColMod (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false)
 Performs modulus of column values and given Num. More...
 
void ColConcat (const TStr &Attr1, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="")
 Concatenates two string columns. More...
 
void ColConcat (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="", TBool AddToFirstTable=true)
 Concatenates string column with column of given table. More...
 
void ColConcatConst (const TStr &Attr1, const TStr &Val, const TStr &Sep="", const TStr &ResAttr="")
 Concatenates column values with given string value. More...
 
void ReadIntCol (const TStr &ColName, TIntV &Result) const
 Reads values of entire int column into Result. More...
 
void ReadFltCol (const TStr &ColName, TFltV &Result) const
 Reads values of entire float column into Result. More...
 
void ReadStrCol (const TStr &ColName, TStrV &Result) const
 Reads values of entire string column into Result. More...
 
void InitIds ()
 Adds explicit row ids and initializes the hash set mapping ids to physical rows. More...
 
PTable IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="")
 Distance based filter. More...
 
void PrintSize ()
 
void PrintContextSize ()
 
TSize GetMemUsedKB ()
 Returns approximate memory used by table in [KB]. More...
 
TSize GetContextMemUsedKB ()
 Returns approximate memory used by table context in [KB]. More...
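
The grouping and aggregation described for Aggregate can be pictured with a small standalone sketch. It uses plain standard-library containers rather than SNAP types, handles only a single integer key column with a sum policy, and assumes the result is written back as one value per row. SumPerGroup and its parameters are hypothetical names chosen for illustration and are not part of SNAP.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of SNAP: groups rows by Keys, sums Vals within
// each group, and returns a new column holding each row's group total.
std::vector<double> SumPerGroup(const std::vector<int>& Keys,
                                const std::vector<double>& Vals) {
  assert(Keys.size() == Vals.size());  // both columns must have equal length
  // Pass 1: accumulate one running sum per distinct key.
  std::unordered_map<int, double> Sums;
  for (std::size_t i = 0; i < Keys.size(); ++i) {
    Sums[Keys[i]] += Vals[i];
  }
  // Pass 2: write the group total back as a per-row result column.
  std::vector<double> Res(Keys.size());
  for (std::size_t i = 0; i < Keys.size(); ++i) {
    Res[i] = Sums[Keys[i]];
  }
  return Res;
}
```

Calling SumPerGroup on a key column and a value column of the same length yields a result column of that length, which loosely corresponds to the role of ResAttr in the description above.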
 

Static Public Member Functions

static void SetMP (TInt Value)
 
static TInt GetMP ()
 
static TStr NormalizeColName (const TStr &ColName)
 Adds suffix to column name if it doesn't exist. More...
 
static TStrV NormalizeColNameV (const TStrV &Cols)
 Adds suffix to column name if it doesn't exist. More...
 
static PTable New ()
 
static PTable New (TTableContext *Context)
 
static PTable New (const Schema &S, TTableContext *Context)
 
static PTable New (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Returns pointer to a table constructed from given int->int hash. More...
 
static PTable New (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Returns pointer to a table constructed from given int->float hash. More...
 
static PTable New (const PTable Table)
 Returns pointer to a new table created from given Table. More...
 
static void GetSchema (const TStr &InFNm, Schema &S, const char &Separator= '\t')
 Reads the schema of the table stored in file InFNm into S. More...
 
static PTable LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const char &Separator= '\t', TBool HasTitleLine=false)
 Loads table from spreadsheet (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment title lines instead. More...
 
static PTable LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const TIntV &RelevantCols, const char &Separator= '\t', TBool HasTitleLine=false)
 Loads table from spreadsheet, but only loads the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment title lines instead. More...
 
static PTable Load (TSIn &SIn, TTableContext *Context)
 Loads table from a binary format. More...
 
static PTable LoadShM (TShMIn &ShMIn, TTableContext *Context)
 Static constructor to load table from memory. More...
 
static PTable TableFromHashMap (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Builds table from hash table of int->int. More...
 
static PTable TableFromHashMap (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false)
 Builds table from hash table of int->float. More...
 
static PTable GetNodeTable (const PNEANet &Network, TTableContext *Context)
 Extracts node TTable from PNEANet. More...
 
static PTable GetEdgeTable (const PNEANet &Network, TTableContext *Context)
 Extracts edge TTable from PNEANet. More...
 
static PTable GetEdgeTablePN (const PNGraphMP &Network, TTableContext *Context)
 Extracts edge TTable from parallel graph PNGraphMP. More...
 
static PTable GetFltNodePropertyTable (const PNEANet &Network, const TIntFltH &Property, const TStr &NodeAttrName, const TAttrType &NodeAttrType, const TStr &PropertyAttrName, TTableContext *Context)
 Extracts node and edge property TTables from THash. More...
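
As a rough illustration of schema-driven loading into columnar storage, the sketch below parses separator-delimited text into one vector per column, with columns indexed among columns of the same type. It is standalone C++ and does not call SNAP. ColType, Columns and LoadColumns are hypothetical names, and treating lines that start with '#' as comments is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not SNAP types.
enum class ColType { Int, Flt, Str };

struct Columns {
  std::vector<std::vector<long>> Ints;         // one vector per int column
  std::vector<std::vector<double>> Flts;       // one vector per float column
  std::vector<std::vector<std::string>> Strs;  // one vector per string column
};

// Reads separator-delimited text into per-type column vectors.
// Assumption: lines starting with '#' are comments and are skipped.
Columns LoadColumns(const std::string& Text, const std::vector<ColType>& Schema,
                    char Sep = '\t') {
  Columns C;
  // Allocate one empty column per schema entry, grouped by type.
  for (ColType T : Schema) {
    if (T == ColType::Int) C.Ints.emplace_back();
    else if (T == ColType::Flt) C.Flts.emplace_back();
    else C.Strs.emplace_back();
  }
  std::istringstream In(Text);
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.empty() || Line[0] == '#') continue;
    std::istringstream Fields(Line);
    std::string Field;
    std::size_t IntIdx = 0, FltIdx = 0, StrIdx = 0;
    for (ColType T : Schema) {
      bool Ok = static_cast<bool>(std::getline(Fields, Field, Sep));
      assert(Ok);  // each row must supply one field per schema column
      (void)Ok;
      // Append the field to the next column of the matching type.
      if (T == ColType::Int) C.Ints[IntIdx++].push_back(std::stol(Field));
      else if (T == ColType::Flt) C.Flts[FltIdx++].push_back(std::stod(Field));
      else C.Strs[StrIdx++].push_back(Field);
    }
  }
  return C;
}
```

Commenting out the title line, as the LoadSS note suggests, lets a header stay in the file without being parsed as data.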
 

Protected Member Functions

void InvalidatePhysicalGroupings ()
 
void InvalidateAffectedGroupings (const TStr &Attr)
 
void IncrementNext ()
 Increments the next vector and sets last, NumRows and NumValidRows. More...
 
void ClassifyAux (const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0)
 Adds a label attribute with positive labels on selected rows and negative labels on the rest. More...
 
const char * GetContextKey (TInt Val) const
 Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp. More...
 
TStr GetStrVal (TInt ColIdx, TInt RowIdx) const
 Gets the value in column with id ColIdx at row RowIdx. More...
 
void AddStrVal (const TInt &ColIdx, const TStr &Val)
 Adds Val in column with id ColIdx. More...
 
void AddStrVal (const TStr &Col, const TStr &Val)
 Adds Val in column with name Col. More...
 
TStr GetIdColName () const
 Gets name of the id column of this table. More...
 
TStr GetSchemaColName (TInt Idx) const
 Gets name of the column with index Idx in the schema. More...
 
TAttrType GetSchemaColType (TInt Idx) const
 Gets type of the column with index Idx in the schema. More...
 
void AddSchemaCol (const TStr &ColName, TAttrType ColType)
 Adds column with name ColName and type ColType to the schema. More...
 
TBool IsColName (const TStr &ColName) const
 
void AddColType (const TStr &ColName, TPair< TAttrType, TInt > ColType)
 Adds column with name ColName and type ColType to the ColTypeMap. More...
 
void AddColType (const TStr &ColName, TAttrType ColType, TInt Index)
 Adds column with name ColName and type ColType to the ColTypeMap. More...
 
void DelColType (const TStr &ColName)
 Removes column with name ColName from the ColTypeMap. More...
 
TPair< TAttrType, TInt > GetColTypeMap (const TStr &ColName) const
 Gets column type and index of ColName. More...
 
TStr RenumberColName (const TStr &ColName) const
 Returns a re-numbered column name based on number of existing columns with conflicting names. More...
 
TStr DenormalizeColName (const TStr &ColName) const
 Removes suffix from column name if it exists. More...
 
Schema DenormalizeSchema () const
 Removes suffix from column names in the Schema. More...
 
TBool IsAttr (const TStr &Attr)
 Checks if Attr is an attribute of this table schema. More...
 
void AddTable (const TTable &T)
 Adds all the rows of the input table. Allows duplicate rows (not a union). More...
 
void ConcatTable (const PTable &T)
 Appends all rows of T to this table, and recalculates indices. More...
 
void AddRow (const TRowIterator &RI)
 Adds row corresponding to RI. More...
 
void AddRow (const TIntV &IntVals, const TFltV &FltVals, const TStrV &StrVals)
 Adds row with values corresponding to the given vectors by type. More...
 
void AddGraphAttribute (const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst)
 Adds names of columns to be used as graph attributes. More...
 
void AddGraphAttributeV (TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst)
 Adds vector of names of columns to be used as graph attributes. More...
 
void CheckAndAddIntNode (PNEANet Graph, THashSet< TInt > &NodeVals, TInt NodeId)
 Checks if given NodeId was seen earlier; if not, adds it to Graph and hashmap NodeVals. More...
 
template<class T >
TInt CheckAndAddFltNode (T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal)
 Checks if given NodeVal was seen earlier; if not, adds it to Graph and hashmap NodeVals. More...
 
void AddEdgeAttributes (PNEANet &Graph, int RowId)
 Adds attributes of edge corresponding to RowId to the Graph. More...
 
void AddNodeAttributes (TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs)
 Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values). More...
 
PNEANet BuildGraph (const TIntV &RowIds, TAttrAggr AggrPolicy)
 Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes. More...
 
void InitRowIdBuckets (int NumBuckets)
 Initializes the RowIdBuckets vector which will be used for the graph sequence creation. More...
 
void FillBucketsByWindow (TStr SplitAttr, TInt JumpSize, TInt WindowSize, TInt StartVal, TInt EndVal)
 Fills RowIdBuckets with sets of row ids. More...
 
void FillBucketsByInterval (TStr SplitAttr, TIntPrV SplitIntervals)
 Fills RowIdBuckets with sets of row ids. More...
 
TVec< PNEANet > GetGraphsFromSequence (TAttrAggr AggrPolicy)
 Returns a sequence of graphs. More...
 
PNEANet GetFirstGraphFromSequence (TAttrAggr AggrPolicy)
 Returns the first graph of the sequence. More...
 
PNEANet GetNextGraphFromSequence ()
 Returns the next graph in sequence corresponding to RowIdBuckets. More...
 
template<class T >
T AggregateVector (TVec< T > &V, TAttrAggr Policy)
 Aggregates vector into a single scalar value according to a policy. More...
 
void GroupingSanityCheck (const TStr &GroupBy, const TAttrType &AttrType) const
 Checks if grouping key exists and matches given attr type. More...
 
template<class T >
void GroupByIntCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with integer values. More...
 
template<class T >
void GroupByFltCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with float values. Returns hash table with grouping. More...
 
template<class T >
void GroupByStrCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const
 Groups/hashes by a single column with string values. Returns hash table with grouping. More...
 
template<class T >
void UpdateGrouping (THash< T, TIntV > &Grouping, T Key, TInt Val) const
 Template for utility function to update a grouping hash map. More...
 
template<class T >
void UpdateGrouping (THashMP< T, TIntV > &Grouping, T Key, TInt Val) const
 Template for utility function to update a parallel grouping hash map. More...
 
void PrintGrouping (const THash< TGroupKey, TIntV > &Grouping) const
 
TInt CompareRows (TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true)
 Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More...
 
TInt CompareRows (TInt R1, TInt R2, const TVec< TAttrType > &CompareByTypes, const TIntV &CompareByIndices, TBool Asc=true)
 Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More...
 
TInt GetPivot (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc)
 Gets pivot element for QSort. More...
 
TInt Partition (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc)
 Partitions vector for QSort. More...
 
void ISort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs insertion sort on given vector V. More...
 
void QSort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs QSort on given vector V. More...
 
void Merge (TIntV &V, TInt Idx1, TInt Idx2, TInt Idx3, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Helper function for parallel QSort. More...
 
void QSortPar (TIntV &V, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true)
 Performs QSort in parallel on given vector V. More...
 
bool IsRowValid (TInt RowIdx) const
 Checks if RowIdx corresponds to a valid (i.e. not deleted) row. More...
 
TInt GetLastValidRowIdx ()
 Gets the id of the last valid row of the table. More...
 
void RemoveFirstRow ()
 Removes first valid row of the table. More...
 
void RemoveRow (TInt RowIdx, TInt PrevRowIdx)
 Removes row with id RowIdx. More...
 
void KeepSortedRows (const TIntV &KeepV)
 Removes all rows that are not mentioned in the SORTED vector KeepV. More...
 
void SetFirstValidRow ()
 Sets the first valid row of the TTable. More...
 
PTable InitializeJointTable (const TTable &Table)
 Initializes an empty table for the join of this table with the given table. More...
 
void AddJointRow (const TTable &T1, const TTable &T2, TInt RowIdx1, TInt RowIdx2)
 Adds joint row T1[RowIdx1]<=>T2[RowIdx2]. More...
 
void ThresholdJoinInputCorrectness (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2)
 
void ThresholdJoinCountCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntPr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType)
 
PTable ThresholdJoinOutputTable (const THash< TIntPr, TIntTr > &Counters, TInt Threshold, const TTable &Table)
 
void ThresholdJoinCountPerJoinKeyCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntTr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType)
 
PTable ThresholdJoinPerJoinKeyOutputTable (const THash< TIntTr, TIntTr > &Counters, TInt Threshold, const TTable &Table)
 
void ResizeTable (int RowCount)
 Resizes the table to hold RowCount rows. More...
 
int GetEmptyRowsStart (int NewRows)
 Gets the start index to a chunk of empty rows of size NewRows. More...
 
void AddSelectedRows (const TTable &Table, const TIntV &RowIDs)
 Adds rows from Table that correspond to ids in RowIDs. More...
 
void AddNRows (int NewRows, const TVec< TIntV > &IntColsP, const TVec< TFltV > &FltColsP, const TVec< TIntV > &StrColMapsP)
 Adds NewRows rows from the given vectors for each column type. More...
 
void AddNJointRowsMP (const TTable &T1, const TTable &T2, const TVec< TIntPrV > &JointRowIDSet)
 Adds rows from T1 and T2 to this table in a parallel manner. Used by Join. More...
 
void UpdateTableForNewRow ()
 Updates table state after adding one or more rows. More...
 
void GroupAux (const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true)
 Helper function for grouping. More...
 
void StoreGroupCol (const TStr &GroupColName, const TVec< TPair< TInt, TInt > > &GroupAndRowIds)
 Parallel helper function for grouping. Parallel grouping by complex keys is not currently supported. More...
 
void Reindex ()
 Reinitializes row ids. More...
 
void AddIdColumn (const TStr &IdColName)
 Adds a column of explicit integer identifiers to the rows. More...
 
void GetCollidingRows (const TTable &T, THashSet< TInt > &Collisions)
 Gets set of row ids of rows common with table T. More...
 

Static Protected Member Functions

static void LoadSSPar (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine)
 Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns. More...
 
static void LoadSSSeq (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine)
 Sequentially loads data from input file at InFNm into NewTable. More...
 
static TInt CompareKeyVal (const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2)
 
static TInt CheckSortedKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static void ISortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static TInt GetPivotKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static TInt PartitionKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 
static void QSortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End)
 

Protected Attributes

TTableContext * Context
 Execution Context. More...
 
Schema Sch
 Table Schema. More...
 
TCRef CRef
 
TInt NumRows
 Number of rows in the table (valid and invalid). More...
 
TInt NumValidRows
 Number of valid rows in the table (i.e. rows that were not logically removed). More...
 
TInt FirstValidRow
 Physical index of first valid row. More...
 
TInt LastValidRow
 Physical index of last valid row. More...
 
TIntV Next
 A vector describing the logical order of the rows. Next[i] is the successor of row i. Table iterators follow the order dictated by Next. More...
 
TVec< TIntV > IntCols
 Data columns of integer attributes. More...
 
TVec< TFltV > FltCols
 Data columns of floating point attributes. More...
 
TVec< TIntV > StrColMaps
 Data columns of integer mappings of string attributes. More...
 
THash< TStr, TPair< TAttrType, TInt > > ColTypeMap
 A mapping from column name to column type and column index among columns of the same type. More...
 
TStr IdColName
 
TIntIntH RowIdMap
 Mapping of permanent row ids to physical id. More...
 
THash< TStr, THash< TInt, TIntV > > IntColIndexes
 Indexes for Int Columns. More...
 
THash< TStr, THash< TInt, TIntV > > StrMapColIndexes
 Indexes for String Columns. More...
 
THash< TStr, THash< TFlt, TIntV > > FltColIndexes
 Indexes for Float Columns. More...
 
THash< TStr, GroupStmt > GroupStmtNames
 Maps user-given grouping statement names to their group-by attributes. More...
 
THash< GroupStmt, THash< TInt, TGroupKey > > GroupIDMapping
 Maps grouping statements to their (group id –> group-by key) mapping. More...
 
THash< GroupStmt, THash< TGroupKey, TIntV > > GroupMapping
 Maps grouping statements to their (group-by key –> group id) mapping. More...
 
TStr SrcCol
 Column (attribute) to serve as src nodes when constructing the graph. More...
 
TStr DstCol
 Column (attribute) to serve as dst nodes when constructing the graph. More...
 
TStrV EdgeAttrV
 List of columns (attributes) to serve as edge attributes. More...
 
TStrV SrcNodeAttrV
 List of columns (attributes) to serve as source node attributes. More...
 
TStrV DstNodeAttrV
 List of columns (attributes) to serve as destination node attributes. More...
 
TStrTrV CommonNodeAttrs
 List of attribute pairs with values common to source and destination and their common given name. More...
 
TVec< TIntV > RowIdBuckets
 Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs. More...
 
TInt CurrBucket
 Current row id bucket - used when generating a sequence of graphs using an iterator. More...
 
TAttrAggr AggrPolicy
 Aggregation policy used for solving conflicts between different values of an attribute of the same node. More...
 
TInt IsNextDirty
 Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges. More...
 

Static Protected Attributes

static const TInt Last = -1
 Special value for Next vector entry - last row in table. More...
 
static const TInt Invalid = -2
 Special value for Next vector entry - logically removed row. More...
 
static TInt UseMP = 1
 Global switch for choosing multi-threaded versions of TTable functions. More...
 

Private Member Functions

void GenerateColTypeMap (THash< TStr, TPair< TInt, TInt > > &ColTypeIntMap)
 
void LoadTableShM (TShMIn &ShMIn, TTableContext *ContextTable)
 

Friends

class TPt< TTable >
 
class TRowIterator
 
class TRowIteratorWithRemove
 
template<class PGraph >
PGraph TSnap::ToGraph (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy)
 
template<class PGraph >
PGraph TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy)
 
int TSnap::LoadCrossNet (TCrossNet &Graph, PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV)
 
int TSnap::LoadMode (TModeNet &Graph, PTable Table, const TStr &NCol, TStrV &NodeAttrV)
 
template<class PGraphMP >
PGraphMP TSnap::ToGraphMP (PTable Table, const TStr &SrcCol, const TStr &DstCol)
 
template<class PGraphMP >
PGraphMP TSnap::ToGraphMP3 (PTable Table, const TStr &SrcCol, const TStr &DstCol)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP2 (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy)
 
template<class PGraphMP >
PGraphMP TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy)
 

Detailed Description

Table class: Relational table with columnar data storage.

Definition at line 484 of file table.h.

Constructor & Destructor Documentation

TTable::TTable ( )

Definition at line 302 of file table.cpp.

302  : Context(new TTableContext), NumRows(0), NumValidRows(0),
303  FirstValidRow(0), LastValidRow(-1) {}
TTable::TTable ( TTableContext * Context)

Definition at line 305 of file table.cpp.

305  : Context(Context), NumRows(0),
TTable::TTable ( const Schema & S,
TTableContext * Context 
)

Definition at line 308 of file table.cpp.

308  : Context(Context),
310  TInt IntColCnt = 0;
311  TInt FltColCnt = 0;
312  TInt StrColCnt = 0;
313  for (TInt i = 0; i < TableSchema.Len(); i++) {
314  TStr ColName = TableSchema[i].Val1;
315  TAttrType ColType = TableSchema[i].Val2;
316  AddSchemaCol(ColName, ColType);
317  switch (ColType) {
318  case atInt:
319  AddColType(ColName, atInt, IntColCnt);
320  IntColCnt++;
321  break;
322  case atFlt:
323  AddColType(ColName, atFlt, FltColCnt);
324  FltColCnt++;
325  break;
326  case atStr:
327  AddColType(ColName, atStr, StrColCnt);
328  StrColCnt++;
329  break;
330  }
331  }
332  IntCols = TVec<TIntV>(IntColCnt);
333  FltCols = TVec<TFltV>(FltColCnt);
334  StrColMaps = TVec<TIntV>(StrColCnt);
335 }
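
The column-index bookkeeping in this constructor can be sketched without SNAP. The following is a minimal model, not the library's implementation: `AttrType`, `SchemaSketch`, and `BuildColTypeMap` are hypothetical names, and `std::map` stands in for `THash`. It illustrates how each column receives an index that counts only the columns of its own type.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for SNAP's TAttrType and Schema.
enum class AttrType { Int, Flt, Str };
using SchemaSketch = std::vector<std::pair<std::string, AttrType>>;

// Maps each column name to (type, index among columns of that type).
std::map<std::string, std::pair<AttrType, int>>
BuildColTypeMap(const SchemaSketch& schema) {
  std::map<std::string, std::pair<AttrType, int>> colTypeMap;
  int counts[3] = {0, 0, 0};  // running count per type: Int, Flt, Str
  for (const auto& col : schema) {
    // Sketch assumption: column names are unique.
    assert(colTypeMap.count(col.first) == 0);
    int& cnt = counts[static_cast<int>(col.second)];
    colTypeMap[col.first] = std::make_pair(col.second, cnt);
    ++cnt;
  }
  return colTypeMap;
}
```

With a schema of two integer columns separated by a float column, the second integer column gets index 1 and the float column gets index 0, so each index can address a per-type column vector directly.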
TTable::TTable ( TSIn & SIn,
TTableContext * Context 
)

Definition at line 378 of file table.cpp.

378  : Context(Context), NumRows(SIn),
379  NumValidRows(SIn), FirstValidRow(SIn), LastValidRow(SIn), Next(SIn), IntCols(SIn),
380  FltCols(SIn), StrColMaps(SIn) {
381  THash<TStr,TPair<TInt,TInt> > ColTypeIntMap(SIn);
382  GenerateColTypeMap(ColTypeIntMap);
383 }
TTable::TTable ( const THash< TInt, TInt > &  H,
const TStr & Col1,
const TStr & Col2,
TTableContext * Context,
const TBool  IsStrKeys = false 
)

Constructor to build table out of a hash table of int->int.

Definition at line 385 of file table.cpp.

386  : Context(Context), NumRows(H.Len()),
387  NumValidRows(H.Len()), FirstValidRow(0), LastValidRow(H.Len()-1) {
388  TAttrType KeyType = IsStrKeys ? atStr : atInt;
389  AddSchemaCol(Col1, KeyType);
390  AddSchemaCol(Col2, atInt);
391  AddColType(Col1, KeyType, 0);
392  AddColType(Col2, atInt, 1);
393  if (IsStrKeys) {
394  StrColMaps = TVec<TIntV>(1);
395  IntCols = TVec<TIntV>(1);
396  H.GetKeyV(StrColMaps[0]);
397  H.GetDatV(IntCols[0]);
398  } else {
399  IntCols = TVec<TIntV>(2);
400  H.GetKeyV(IntCols[0]);
401  H.GetDatV(IntCols[1]);
402  }
403  Next = TIntV(NumRows);
404  for (TInt i = 0; i < NumRows; i++) {
405  Next[i] = i+1;
406  }
407  Next[NumRows-1] = Last;
408  IsNextDirty = 0;
409  InitIds();
410 }
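
The row-chaining step at the end of this constructor can be modelled in isolation. This is a simplified sketch under stated assumptions, not SNAP code: `TwoColSketch`, `FromMap`, and `kLast` are hypothetical names, and `std::map` (which iterates in key order) stands in for `THash`. The listing writes to index `NumRows-1` unconditionally, so the sketch adds a guard for an empty input.

```cpp
#include <cassert>
#include <map>
#include <vector>

const int kLast = -1;  // mirrors the value of TTable::Last

struct TwoColSketch {
  std::vector<int> keys;  // plays the role of IntCols[0]
  std::vector<int> vals;  // plays the role of IntCols[1]
  std::vector<int> next;  // next[i] is the successor of row i
};

// Copies keys and values into two columns and chains the rows in order.
TwoColSketch FromMap(const std::map<int, int>& h) {
  TwoColSketch t;
  for (const auto& kv : h) {
    t.keys.push_back(kv.first);
    t.vals.push_back(kv.second);
  }
  const int n = static_cast<int>(t.keys.size());
  t.next.resize(n);
  for (int i = 0; i < n; ++i) { t.next[i] = i + 1; }
  if (n > 0) { t.next[n - 1] = kLast; }  // guard added by this sketch
  assert(t.keys.size() == t.vals.size());
  return t;
}
```

For a three-entry map the resulting `next` vector is `{1, 2, kLast}`: every row points to its physical successor and the final row carries the end marker.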
TTable::TTable ( const THash< TInt, TFlt > &  H,
const TStr & Col1,
const TStr & Col2,
TTableContext * Context,
const TBool  IsStrKeys = false 
)

Constructor to build table out of a hash table of int->float.

Definition at line 412 of file table.cpp.

413  : Context(Context),
414  NumRows(H.Len()), NumValidRows(H.Len()), FirstValidRow(0), LastValidRow(H.Len()-1) {
415  TAttrType KeyType = IsStrKeys ? atStr : atInt;
416  AddSchemaCol(Col1, KeyType);
417  AddSchemaCol(Col2, atFlt);
418  AddColType(Col1, KeyType, 0);
419  AddColType(Col2, atFlt, 0);
420  if (IsStrKeys) {
421  StrColMaps = TVec<TIntV>(1);
422  H.GetKeyV(StrColMaps[0]);
423  } else {
424  IntCols = TVec<TIntV>(1);
425  H.GetKeyV(IntCols[0]);
426  }
427  FltCols = TVec<TFltV>(1);
428  H.GetDatV(FltCols[0]);
429  Next = TIntV(NumRows);
430  for (TInt i = 0; i < NumRows; i++) {
431  Next[i] = i+1;
432  }
433  Next[NumRows-1] = Last;
434  IsNextDirty = 0;
435  InitIds();
436 }
TTable::TTable ( const TTable & Table)
inline

Copy constructor.

Definition at line 919 of file table.h.

919  : Context(Table.Context), Sch(Table.Sch),
921  LastValidRow(Table.LastValidRow), Next(Table.Next), IntCols(Table.IntCols),
922  FltCols(Table.FltCols), StrColMaps(Table.StrColMaps), ColTypeMap(Table.ColTypeMap),
925  SrcCol(Table.SrcCol), DstCol(Table.DstCol),
928  IsNextDirty(Table.IsNextDirty) {}
TTable::TTable ( const TTable & Table,
const TIntV & RowIds 
)

Definition at line 438 of file table.cpp.

438  : Context(Table.Context),
439  Sch(Table.Sch), SrcCol(Table.SrcCol), DstCol(Table.DstCol), EdgeAttrV(Table.EdgeAttrV),
442  ColTypeMap = Table.ColTypeMap;
443  IntCols = TVec<TIntV>(Table.IntCols.Len());
444  FltCols = TVec<TFltV>(Table.FltCols.Len());
446  FirstValidRow = 0;
447  LastValidRow = -1;
448  NumRows = 0;
449  NumValidRows = 0;
450  AddSelectedRows(Table, RowIDs);
451  IsNextDirty = 0;
452  InitIds();
453 }

Member Function Documentation

void TTable::AddColType ( const TStr & ColName,
TPair< TAttrType, TInt > ColType 
)
inline, protected

Adds column with name ColName and type ColType to the ColTypeMap.

Definition at line 651 of file table.h.

651  {
652  TStr NColName = NormalizeColName(ColName);
653  ColTypeMap.AddDat(NColName, ColType);
654  }
void TTable::AddColType ( const TStr & ColName,
TAttrType  ColType,
TInt  Index 
)
inline, protected

Adds column with name ColName and type ColType to the ColTypeMap.

Definition at line 656 of file table.h.

656  {
657  TStr NColName = NormalizeColName(ColName);
658  AddColType(NColName, TPair<TAttrType,TInt>(ColType, Index));
659  }
void TTable::AddDstNodeAttr ( const TStr & Attr)
inline

Adds column to be used as dst node attribute of the graph.

Definition at line 1180 of file table.h.

1180 { AddGraphAttribute(Attr, false, false, true); }
void TTable::AddDstNodeAttr ( TStrV & Attrs)
inline

Adds columns to be used as dst node attributes of the graph.

Definition at line 1182 of file table.h.

1182 { AddGraphAttributeV(Attrs, false, false, true); }
void TTable::AddEdgeAttr ( const TStr & Attr)
inline

Adds column to be used as graph edge attribute.

Definition at line 1172 of file table.h.

1172 { AddGraphAttribute(Attr, true, false, false); }
void TTable::AddEdgeAttr ( TStrV & Attrs)
inline

Adds columns to be used as graph edge attributes.

Definition at line 1174 of file table.h.

1174 { AddGraphAttributeV(Attrs, true, false, false); }
void TTable::AddEdgeAttributes ( PNEANet Graph,
int  RowId 
)
inline, protected

Adds attributes of edge corresponding to RowId to the Graph.

Definition at line 3395 of file table.cpp.

3395  {
3396  for (TInt i = 0; i < EdgeAttrV.Len(); i++) {
3397  TStr ColName = EdgeAttrV[i];
3398  TAttrType T = GetColType(ColName);
3399  TInt Index = GetColIdx(ColName);
3400  switch (T) {
3401  case atInt:
3402  Graph->AddIntAttrDatE(RowId, IntCols[Index][RowId], ColName);
3403  break;
3404  case atFlt:
3405  Graph->AddFltAttrDatE(RowId, FltCols[Index][RowId], ColName);
3406  break;
3407  case atStr:
3408  Graph->AddStrAttrDatE(RowId, GetStrVal(Index, RowId), ColName);
3409  break;
3410  }
3411  }
3412 }
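
The type dispatch in this function reads from a different column store for each attribute type, with string cells stored as integers. A minimal sketch of that layout follows. It is an illustrative model only: `ColumnsSketch`, `CellAsText`, and the `pool` vector are hypothetical, and where SNAP actually keeps the strings is not shown in this section.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class AttrType { Int, Flt, Str };  // hypothetical stand-in for TAttrType

struct ColumnsSketch {
  std::vector<std::vector<int>> intCols;      // one vector per integer column
  std::vector<std::vector<double>> fltCols;   // one vector per float column
  std::vector<std::vector<int>> strColMaps;   // string cells stored as ids
  std::vector<std::string> pool;              // hypothetical id -> string table

  // Renders the cell at (type, idx, row) as text, dispatching on the type.
  std::string CellAsText(AttrType type, int idx, int row) const {
    switch (type) {
      case AttrType::Int: return std::to_string(intCols[idx][row]);
      case AttrType::Flt: return std::to_string(fltCols[idx][row]);
      case AttrType::Str: return pool[strColMaps[idx][row]];
    }
    assert(false && "unknown attribute type");
    return std::string();
  }
};
```

The index passed in is the per-type column index, which is why a lookup needs both the column type and the index before it can touch any data.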
void TTable::AddFltCol ( const TStr & ColName)

Adds a float column with name ColName.

Definition at line 4680 of file table.cpp.

4680  {
4681  AddSchemaCol(ColName, atFlt);
4682  FltCols.Add(TFltV(NumRows));
4683  TInt L = FltCols.Len();
4684  AddColType(ColName, atFlt, L-1);
4685 }
void TTable::AddGraphAttribute ( const TStr & Attr,
TBool  IsEdge,
TBool  IsSrc,
TBool  IsDst 
)
protected

Adds names of columns to be used as graph attributes.

Definition at line 985 of file table.cpp.

985  {
986  if (!IsColName(Attr)) { TExcept::Throw(Attr + ": No such column"); }
987  if (IsEdge) { EdgeAttrV.Add(NormalizeColName(Attr)); }
988  if (IsSrc) { SrcNodeAttrV.Add(NormalizeColName(Attr)); }
989  if (IsDst) { DstNodeAttrV.Add(NormalizeColName(Attr)); }
990 }
void TTable::AddGraphAttributeV ( TStrV & Attrs,
TBool  IsEdge,
TBool  IsSrc,
TBool  IsDst 
)
protected

Adds vector of names of columns to be used as graph attributes.

Definition at line 992 of file table.cpp.

992  {
993  for (TInt i = 0; i < Attrs.Len(); i++) {
994  if (!IsColName(Attrs[i])) {
995  TExcept::Throw(Attrs[i] + ": no such column");
996  }
997  }
998  for (TInt i = 0; i < Attrs.Len(); i++) {
999  if (IsEdge) { EdgeAttrV.Add(NormalizeColName(Attrs[i])); }
1000  if (IsSrc) { SrcNodeAttrV.Add(NormalizeColName(Attrs[i])); }
1001  if (IsDst) { DstNodeAttrV.Add(NormalizeColName(Attrs[i])); }
1002  }
1003 }
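
The listing validates every name in a first loop and only then appends in a second loop, so a bad name leaves the attribute vectors untouched. The same two-pass pattern is sketched below with standard containers. `AddAttrsChecked` and its parameters are hypothetical names, and `std::invalid_argument` stands in for `TExcept::Throw`.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Appends attrs to target only if every name is a known column.
void AddAttrsChecked(const std::set<std::string>& columns,
                     const std::vector<std::string>& attrs,
                     std::vector<std::string>& target) {
  // Sketch precondition: appending to the vector being iterated is not allowed.
  assert(&attrs != &target);
  for (const auto& a : attrs) {  // pass 1: validate everything
    if (columns.count(a) == 0) {
      throw std::invalid_argument(a + ": no such column");
    }
  }
  for (const auto& a : attrs) {  // pass 2: mutate only after validation
    target.push_back(a);
  }
}
```

If the check and the append shared one loop, a failure on the second name would leave the first one already added; splitting the loops avoids that partial update.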
TStrV EdgeAttrV
List of columns (attributes) to serve as edge attributes.
Definition: table.h:591
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TStrV SrcNodeAttrV
List of columns (attributes) to serve as source node attributes.
Definition: table.h:592
Definition: dt.h:1134
static TStr NormalizeColName(const TStr &ColName)
Adds suffix to column name if it doesn't exist.
Definition: table.h:530
TStrV DstNodeAttrV
List of columns (attributes) to serve as destination node attributes.
Definition: table.h:593
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
TBool IsColName(const TStr &ColName) const
Definition: table.h:646
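
The listing above checks every name in one loop and only modifies the attribute lists in a second loop, so an unknown column name leaves the table unchanged. A minimal standalone sketch of that validate-then-apply pattern follows. It uses standard containers rather than SNAP types; `AttrLists` and `AddAttrs` are hypothetical names for illustration, and the NormalizeColName step is omitted.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the three attribute lists kept by the table.
struct AttrLists {
  std::vector<std::string> Edge, Src, Dst;
};

// Pass 1 validates every name; pass 2 mutates the lists.
// If any name is unknown, the exception is thrown before anything is modified.
void AddAttrs(const std::set<std::string>& Cols, AttrLists& L,
              const std::vector<std::string>& Attrs,
              bool IsEdge, bool IsSrc, bool IsDst) {
  for (const std::string& A : Attrs) {
    if (Cols.count(A) == 0) { throw std::invalid_argument(A + ": no such column"); }
  }
  for (const std::string& A : Attrs) {
    if (IsEdge) { L.Edge.push_back(A); }
    if (IsSrc) { L.Src.push_back(A); }
    if (IsDst) { L.Dst.push_back(A); }
  }
}
```

Checking first costs a second pass over the names, but it avoids leaving the lists half-updated when one name in the vector is wrong.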
void TTable::AddIdColumn ( const TStr & IdColName)
protected

Adds a column of explicit integer identifiers to the rows.

Definition at line 1900 of file table.cpp.

1900  {
1901  //printf("NumRows: %d\n", NumRows.Val);
1902  TInt IdCol = IntCols.Add();
1903  IntCols[IdCol].Reserve(NumRows, NumRows);
1904  //printf("IdCol Reserved\n");
1905  TInt IdCnt = 0;
1906  RowIdMap.Clr();
1907  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1908  IntCols[IdCol][RI.GetRowIdx()] = IdCnt;
1909  RowIdMap.AddDat(IdCnt, RI.GetRowIdx());
1910  IdCnt++;
1911  }
1912  AddSchemaCol(ColName, atInt);
1913  AddColType(ColName, atInt, IntCols.Len()-1);
1914 }
void AddSchemaCol(const TStr &ColName, TAttrType ColType)
Adds column with name ColName and type ColType to the schema.
Definition: table.h:642
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TRowIterator BegRI() const
Gets iterator to the first valid row of the table.
Definition: table.h:1241
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TRowIterator
Iterator class for TTable rows.
Definition: table.h:330
TIntIntH RowIdMap
Mapping of permanent row ids to physical id.
Definition: table.h:566
TRowIterator EndRI() const
Gets iterator to the last valid row of the table.
Definition: table.h:1243
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
void AddColType(const TStr &ColName, TPair< TAttrType, TInt > ColType)
Adds column with name ColName and type ColType to the ColTypeMap.
Definition: table.h:651
void Clr(const bool &DoDel=true, const int &NoDelLim=-1, const bool &ResetDat=true)
Definition: hash.h:361
void Reserve(const TSizeTy &_MxVals)
Reserves enough memory for the vector to store _MxVals elements.
Definition: ds.h:543
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
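
AddIdColumn numbers the rows in iteration order, and the cross-references note that table iterators follow the order dictated by Next. A minimal sketch of that idea with standard containers is given below. The sentinel `kLast`, the function `AssignIds`, and the use of `-1` for unvisited rows are assumptions made for illustration, not SNAP definitions.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

const int kLast = -1;  // hypothetical sentinel marking the end of the row chain

// Walks the rows in logical order (First, Next[First], ...) and assigns
// ids 0, 1, 2, ... Returns the id column indexed by physical row and fills
// IdToRow with the mapping id -> physical row index.
std::vector<int> AssignIds(const std::vector<int>& Next, int First,
                           std::unordered_map<int, int>& IdToRow) {
  std::vector<int> IdCol(Next.size(), -1);  // -1 marks rows never visited
  IdToRow.clear();
  int Id = 0;
  for (int Row = First; Row != kLast; Row = Next[Row]) {
    assert(Row >= 0 && Row < static_cast<int>(Next.size()));
    IdCol[Row] = Id;
    IdToRow[Id] = Row;
    ++Id;
  }
  return IdCol;
}
```

Because ids follow the logical order rather than the physical layout, the id of a row need not equal its physical index.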
void TTable::AddIntCol ( const TStr & ColName)

Adds an integer column with name ColName.

Definition at line 4673 of file table.cpp.

4673  {
4674  AddSchemaCol(ColName, atInt);
4676  TInt L = IntCols.Len();
4677  AddColType(ColName, atInt, L-1);
4678 }
void AddSchemaCol(const TStr &ColName, TAttrType ColType)
Adds column with name ColName and type ColType to the schema.
Definition: table.h:642
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
void AddColType(const TStr &ColName, TPair< TAttrType, TInt > ColType)
Adds column with name ColName and type ColType to the ColTypeMap.
Definition: table.h:651
TVec< TInt > TIntV
Definition: ds.h:1594
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void TTable::AddJointRow ( const TTable & T1,
const TTable & T2,
TInt  RowIdx1,
TInt  RowIdx2 
)
protected

Adds joint row T1[RowIdx1]<=>T2[RowIdx2].

Definition at line 1957 of file table.cpp.

1957  {
1958  for (TInt i = 0; i < T1.IntCols.Len(); i++) {
1959  IntCols[i].Add(T1.IntCols[i][RowIdx1]);
1960  }
1961  for (TInt i = 0; i < T1.FltCols.Len(); i++) {
1962  FltCols[i].Add(T1.FltCols[i][RowIdx1]);
1963  }
1964  for (TInt i = 0; i < T1.StrColMaps.Len(); i++) {
1965  StrColMaps[i].Add(T1.StrColMaps[i][RowIdx1]);
1966  }
1967  TInt IntOffset = T1.IntCols.Len();
1968  TInt FltOffset = T1.FltCols.Len();
1969  TInt StrOffset = T1.StrColMaps.Len();
1970  for (TInt i = 0; i < T2.IntCols.Len(); i++) {
1971  IntCols[i+IntOffset].Add(T2.IntCols[i][RowIdx2]);
1972  }
1973  for (TInt i = 0; i < T2.FltCols.Len(); i++) {
1974  FltCols[i+FltOffset].Add(T2.FltCols[i][RowIdx2]);
1975  }
1976  for (TInt i = 0; i < T2.StrColMaps.Len(); i++) {
1977  StrColMaps[i+StrOffset].Add(T2.StrColMaps[i][RowIdx2]);
1978  }
1979  TInt IdOffset = IntOffset + T2.IntCols.Len();
1980  NumRows++;
1981  NumValidRows++;
1982  if (!Next.Empty()) {
1983  Next[Next.Len()-1] = NumValidRows-1;
1985  }
1986  Next.Add(Last);
1988  IntCols[IdOffset].Add(NumRows-1);
1989 }
static const TInt Last
Special value for Next vector entry - last row in table.
Definition: table.h:486
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TInt LastValidRow
Physical index of last valid row.
Definition: table.h:554
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
bool Empty() const
Tests whether the vector is empty.
Definition: ds.h:570
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TIntIntH RowIdMap
Mapping of permanent row ids to physical id.
Definition: table.h:566
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
void TTable::AddNJointRowsMP ( const TTable & T1,
const TTable & T2,
const TVec< TIntPrV > &  JointRowIDSet 
)
protected

Adds rows from T1 and T2 to this table in a parallel manner. Used by Join.

Definition at line 4442 of file table.cpp.

4442  {
4443  //double startFn = omp_get_wtime();
4444  int JointTableSize = 0;
4445  TIntV StartOffsets(JointRowIDSet.Len());
4446  for (int i = 0; i < JointRowIDSet.Len(); i++) {
4447  StartOffsets[i] = JointTableSize;
4448  JointTableSize += JointRowIDSet[i].Len();
4449  }
4450  if (JointTableSize == 0) {
4451  TExcept::Throw("Joint table is empty");
4452  }
4453  //double endOffsets = omp_get_wtime();
4454  //printf("Offsets time = %f\n",endOffsets-startFn);
4455  ResizeTable(JointTableSize);
4456  //double endResize = omp_get_wtime();
4457  //printf("Resize time = %f\n",endResize-endOffsets);
4458  NumRows = JointTableSize;
4459  NumValidRows = JointTableSize;
4460  Assert(NumRows <= Next.Len());
4461 
4462  TInt IntOffset = T1.IntCols.Len();
4463  TInt FltOffset = T1.FltCols.Len();
4464  TInt StrOffset = T1.StrColMaps.Len();
4465 
4466  TInt IdOffset = IntOffset + T2.IntCols.Len();
4467  RowIdMap.Clr();
4468  for (TInt IdCnt = 0; IdCnt < JointTableSize; IdCnt++) {
4469  RowIdMap.AddDat(IdCnt, IdCnt);
4470  }
4471 
4472  #pragma omp parallel for schedule(dynamic, CHUNKS_PER_THREAD)
4473  for (int j = 0; j < JointRowIDSet.Len(); j++) {
4474  const TIntPrV& RowIDs = JointRowIDSet[j];
4475  int start = StartOffsets[j];
4476  int NewRows = RowIDs.Len();
4477  if (NewRows == 0) {continue;}
4478  for (TInt r = 0; r < NewRows; r++){
4479  TIntPr CurrRowIdPr = RowIDs[r];
4480  for(TInt i = 0; i < T1.IntCols.Len(); i++){
4481  IntCols[i][start+r] = T1.IntCols[i][CurrRowIdPr.GetVal1()];
4482  }
4483  for(TInt i = 0; i < T1.FltCols.Len(); i++){
4484  FltCols[i][start+r] = T1.FltCols[i][CurrRowIdPr.GetVal1()];
4485  }
4486  for(TInt i = 0; i < T1.StrColMaps.Len(); i++){
4487  StrColMaps[i][start+r] = T1.StrColMaps[i][CurrRowIdPr.GetVal1()];
4488  }
4489  for(TInt i = 0; i < T2.IntCols.Len(); i++){
4490  IntCols[i+IntOffset][start+r] = T2.IntCols[i][CurrRowIdPr.GetVal2()];
4491  }
4492  for(TInt i = 0; i < T2.FltCols.Len(); i++){
4493  FltCols[i+FltOffset][start+r] = T2.FltCols[i][CurrRowIdPr.GetVal2()];
4494  }
4495  for(TInt i = 0; i < T2.StrColMaps.Len(); i++){
4496  StrColMaps[i+StrOffset][start+r] = T2.StrColMaps[i][CurrRowIdPr.GetVal2()];
4497  }
4498  IntCols[IdOffset][start+r] = start+r;
4499  }
4500  for(TInt r = 0; r < NewRows; r++){
4501  Next[start+r] = start+r+1;
4502  }
4503  }
4504  LastValidRow = JointTableSize-1;
4505  Next[LastValidRow] = Last;
4506  //double endIterate = omp_get_wtime();
4507  //printf("Iterate time = %f\n",endIterate-endResize);
4508 }
static const TInt Last
Special value for Next vector entry - last row in table.
Definition: table.h:486
const TVal1 & GetVal1() const
Definition: ds.h:60
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TInt LastValidRow
Physical index of last valid row.
Definition: table.h:554
void ResizeTable(int RowCount)
Resizes the table to hold RowCount rows.
Definition: table.cpp:4330
const TVal2 & GetVal2() const
Definition: ds.h:61
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
#define Assert(Cond)
Definition: bd.h:251
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TIntIntH RowIdMap
Mapping of permanent row ids to physical id.
Definition: table.h:566
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
void Clr(const bool &DoDel=true, const int &NoDelLim=-1, const bool &ResetDat=true)
Definition: hash.h:361
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
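
AddNJointRowsMP first computes a start offset for each chunk of row-id pairs, so that every chunk writes to its own disjoint range of the output and the outer loop can run in parallel. The sketch below shows only that offset scheme, with one column per side and standard containers. `PairVec`, `StartOffsets`, and `FillJoined` are hypothetical names, and the loop is written serially.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<std::pair<int, int> > PairVec;

// Exclusive prefix sum: Offsets[j] is the first output row of chunk j.
// Total receives the overall number of output rows.
std::vector<int> StartOffsets(const std::vector<PairVec>& Chunks, int& Total) {
  std::vector<int> Offsets(Chunks.size());
  Total = 0;
  for (std::size_t j = 0; j < Chunks.size(); ++j) {
    Offsets[j] = Total;
    Total += static_cast<int>(Chunks[j].size());
  }
  return Offsets;
}

// For every pair (a, b), copies Left[a] and Right[b] into the output columns.
// Chunk j writes only to rows [Offsets[j], Offsets[j] + Chunks[j].size()),
// so iterations of the outer loop do not touch the same output rows.
void FillJoined(const std::vector<int>& Left, const std::vector<int>& Right,
                const std::vector<PairVec>& Chunks,
                std::vector<int>& OutLeft, std::vector<int>& OutRight) {
  int Total = 0;
  const std::vector<int> Offsets = StartOffsets(Chunks, Total);
  OutLeft.assign(Total, 0);
  OutRight.assign(Total, 0);
  for (std::size_t j = 0; j < Chunks.size(); ++j) {
    for (std::size_t r = 0; r < Chunks[j].size(); ++r) {
      OutLeft[Offsets[j] + r] = Left[Chunks[j][r].first];
      OutRight[Offsets[j] + r] = Right[Chunks[j][r].second];
    }
  }
}
```

Sizing the output once up front is what removes the need for synchronized appends inside the loop.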
void TTable::AddNodeAttr ( const TStr & Attr)
inline

Handles the common case where src and dst both belong to the same "universe" of entities.

Definition at line 1184 of file table.h.

1184 { AddSrcNodeAttr(Attr); AddDstNodeAttr(Attr); }
void AddDstNodeAttr(const TStr &Attr)
Adds column to be used as dst node attribute of the graph.
Definition: table.h:1180
void AddSrcNodeAttr(const TStr &Attr)
Adds column to be used as src node attribute of the graph.
Definition: table.h:1176
void TTable::AddNodeAttr ( TStrV & Attrs)
inline

Handles the common case where src and dst both belong to the same "universe" of entities.

Definition at line 1186 of file table.h.

1186 { AddSrcNodeAttr(Attrs); AddDstNodeAttr(Attrs); }
void AddDstNodeAttr(const TStr &Attr)
Adds column to be used as dst node attribute of the graph.
Definition: table.h:1180
void AddSrcNodeAttr(const TStr &Attr)
Adds column to be used as src node attribute of the graph.
Definition: table.h:1176
void TTable::AddNodeAttributes ( TInt  NId,
TStrV  NodeAttrV,
TInt  RowId,
THash< TInt, TStrIntVH > &  NodeIntAttrs,
THash< TInt, TStrFltVH > &  NodeFltAttrs,
THash< TInt, TStrStrVH > &  NodeStrAttrs 
)
inline protected

Takes the maps NodeXAttrs as parameters and updates them. Each map has the form Node Id -> (attribute name -> vector of attribute values).

Definition at line 3414 of file table.cpp.

3415  {
3416  for (TInt i = 0; i < NodeAttrV.Len(); i++) {
3417  TStr ColAttr = NodeAttrV[i];
3418  TAttrType CT = GetColType(ColAttr);
3419  int ColId = GetColIdx(ColAttr);
3420  // check if this is a common src-dst attribute
3421  for (TInt i = 0; i < CommonNodeAttrs.Len(); i++) {
3422  if (CommonNodeAttrs[i].Val1 == ColAttr || CommonNodeAttrs[i].Val2 == ColAttr) {
3423  ColAttr = CommonNodeAttrs[i].Val3;
3424  break;
3425  }
3426  }
3427  if (CT == atInt) {
3428  if (!NodeIntAttrs.IsKey(NId)) { NodeIntAttrs.AddKey(NId); }
3429  if (!NodeIntAttrs.GetDat(NId).IsKey(ColAttr)) { NodeIntAttrs.GetDat(NId).AddKey(ColAttr); }
3430  NodeIntAttrs.GetDat(NId).GetDat(ColAttr).Add(IntCols[ColId][RowId]);
3431  } else if (CT == atFlt) {
3432  if (!NodeFltAttrs.IsKey(NId)) { NodeFltAttrs.AddKey(NId); }
3433  if (!NodeFltAttrs.GetDat(NId).IsKey(ColAttr)) { NodeFltAttrs.GetDat(NId).AddKey(ColAttr); }
3434  NodeFltAttrs.GetDat(NId).GetDat(ColAttr).Add(FltCols[ColId][RowId]);
3435  } else {
3436  if (!NodeStrAttrs.IsKey(NId)) { NodeStrAttrs.AddKey(NId); }
3437  if (!NodeStrAttrs.GetDat(NId).IsKey(ColAttr)) { NodeStrAttrs.GetDat(NId).AddKey(ColAttr); }
3438  NodeStrAttrs.GetDat(NId).GetDat(ColAttr).Add(GetStrVal(ColId, RowId));
3439  }
3440  }
3441 }
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
enum TAttrType_ TAttrType
Types for tables, sparse and dense attributes.
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TStrTrV CommonNodeAttrs
List of attribute pairs with values common to source and destination and their common given name...
Definition: table.h:594
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
int AddKey(const TKey &Key)
Definition: hash.h:373
TStr GetStrVal(TInt ColIdx, TInt RowIdx) const
Gets the value in column with id ColIdx at row RowIdx.
Definition: table.h:626
bool IsKey(const TKey &Key) const
Definition: hash.h:258
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
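
The maps updated by AddNodeAttributes are two levels deep: node id, then attribute name, then the vector of values collected for that node. A minimal sketch of the integer case with standard containers is shown below. `NodeIntAttrMap` and `AppendNodeAttr` are hypothetical names, and `std::map::operator[]` stands in for the explicit IsKey/AddKey checks of the listing.

```cpp
#include <map>
#include <string>
#include <vector>

// Node id -> (attribute name -> values collected for that node).
typedef std::map<int, std::map<std::string, std::vector<int> > > NodeIntAttrMap;

// operator[] default-constructs a missing inner map or vector, so no
// separate "does the key exist" test is needed before appending.
void AppendNodeAttr(NodeIntAttrMap& Attrs, int NId,
                    const std::string& Name, int Val) {
  Attrs[NId][Name].push_back(Val);
}
```

A node that appears in several rows accumulates one value per row under the same attribute name.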
void TTable::AddNRows ( int  NewRows,
const TVec< TIntV > &  IntColsP,
const TVec< TFltV > &  FltColsP,
const TVec< TIntV > &  StrColMapsP 
)
protected

Adds NewRows rows from the given vectors for each column type.

Definition at line 4421 of file table.cpp.

4421  {
4422  if (NewRows == 0) { return; }
4423  // this call should be thread-safe
4424  int start = GetEmptyRowsStart(NewRows);
4425  for (TInt r = 0; r < NewRows; r++) {
4426  for (TInt i = 0; i < IntColsP.Len(); i++) {
4427  IntCols[i][start+r] = IntColsP[i][r];
4428  }
4429  for (TInt i = 0; i < FltColsP.Len(); i++) {
4430  FltCols[i][start+r] = FltColsP[i][r];
4431  }
4432  for (TInt i = 0; i < StrColMapsP.Len(); i++) {
4433  StrColMaps[i][start+r] = StrColMapsP[i][r];
4434  }
4435  }
4436  for (TInt r = 0; r < NewRows-1; r++) {
4437  Next[start+r] = start+r+1;
4438  }
4439 }
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
int GetEmptyRowsStart(int NewRows)
Gets the start index to a chunk of empty rows of size NewRows.
Definition: table.cpp:4376
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
void TTable::AddRow ( const TRowIterator & RI)
protected

Adds row corresponding to RI.

Definition at line 4295 of file table.cpp.

4295  {
4296  for (TInt c = 0; c < Sch.Len(); c++) {
4297  TStr ColName = GetSchemaColName(c);
4298  if (ColName == IdColName) { continue; }
4299 
4300  TInt ColIdx = GetColIdx(ColName);
4301 
4302  switch (GetColType(ColName)) {
4303  case atInt:
4304  IntCols[ColIdx].Add(RI.GetIntAttr(ColName));
4305  break;
4306  case atFlt:
4307  FltCols[ColIdx].Add(RI.GetFltAttr(ColName));
4308  break;
4309  case atStr:
4310  StrColMaps[ColIdx].Add(RI.GetStrMapByName(ColName));
4311  break;
4312  }
4313  }
4315 }
TFlt GetFltAttr(TInt ColIdx) const
Returns value of floating point attribute specified by float column index for current row...
Definition: table.cpp:159
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
Schema Sch
Table Schema.
Definition: table.h:549
TInt GetIntAttr(TInt ColIdx) const
Returns value of integer attribute specified by integer column index for current row.
Definition: table.cpp:155
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TStr IdColName
Name of the column associated with (optional) permanent row identifiers.
Definition: table.h:565
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TInt GetStrMapByName(const TStr &Col) const
Returns integer mapping of string attribute specified by attribute name for current row...
Definition: table.cpp:181
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStr GetSchemaColName(TInt Idx) const
Gets name of the column with index Idx in the schema.
Definition: table.h:638
void UpdateTableForNewRow()
Updates table state after adding one or more rows.
Definition: table.cpp:4140
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void TTable::AddRow ( const TIntV & IntVals,
const TFltV & FltVals,
const TStrV & StrVals 
)
protected

Adds row with values corresponding to the given vectors by type.

Definition at line 4317 of file table.cpp.

4317  {
4318  for (TInt c = 0; c < IntVals.Len(); c++) {
4319  IntCols[c].Add(IntVals[c]);
4320  }
4321  for (TInt c = 0; c < FltVals.Len(); c++) {
4322  FltCols[c].Add(FltVals[c]);
4323  }
4324  for (TInt c = 0; c < StrVals.Len(); c++) {
4325  AddStrVal(c, StrVals[c]);
4326  }
4328 }
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
void UpdateTableForNewRow()
Updates table state after adding one or more rows.
Definition: table.cpp:4140
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
void AddStrVal(const TInt &ColIdx, const TStr &Val)
Adds Val in column with id ColIdx.
Definition: table.cpp:971
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void TTable::AddRow ( const TTableRow & Row)
inline

Adds row with values taken from given TTableRow.

Definition at line 1002 of file table.h.

1002 { AddRow(Row.GetIntVals(), Row.GetFltVals(), Row.GetStrVals()); };
TStrV GetStrVals() const
Gets string attributes of this row.
Definition: table.h:253
TFltV GetFltVals() const
Gets float attributes of this row.
Definition: table.h:251
TIntV GetIntVals() const
Gets int attributes of this row.
Definition: table.h:249
void AddRow(const TRowIterator &RI)
Adds row corresponding to RI.
Definition: table.cpp:4295
void TTable::AddSchemaCol ( const TStr & ColName,
TAttrType  ColType 
)
inline protected

Adds column with name ColName and type ColType to the schema.

Definition at line 642 of file table.h.

642  {
643  TStr NColName = NormalizeColName(ColName);
644  Sch.Add(TPair<TStr,TAttrType>(NColName, ColType));
645  }
Schema Sch
Table Schema.
Definition: table.h:549
static TStr NormalizeColName(const TStr &ColName)
Adds the suffix to the column name if it is not already present.
Definition: table.h:530
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void TTable::AddSelectedRows ( const TTable & Table,
const TIntV & RowIDs 
)
protected

Adds rows from Table that correspond to ids in RowIDs.

Definition at line 4399 of file table.cpp.

4399  {
4400  int NewRows = RowIDs.Len();
4401  if (NewRows == 0) { return; }
4402  // this call should be thread-safe
4403  int start = GetEmptyRowsStart(NewRows);
4404  for (TInt r = 0; r < NewRows; r++) {
4405  TInt CurrRowIdx = RowIDs[r];
4406  for (TInt i = 0; i < Table.IntCols.Len(); i++) {
4407  IntCols[i][start+r] = Table.IntCols[i][CurrRowIdx];
4408  }
4409  for (TInt i = 0; i < Table.FltCols.Len(); i++) {
4410  FltCols[i][start+r] = Table.FltCols[i][CurrRowIdx];
4411  }
4412  for (TInt i = 0; i < Table.StrColMaps.Len(); i++) {
4413  StrColMaps[i][start+r] = Table.StrColMaps[i][CurrRowIdx];
4414  }
4415  }
4416  for (TInt r = 0; r < NewRows-1; r++) {
4417  Next[start+r] = start+r+1;
4418  }
4419 }
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
int GetEmptyRowsStart(int NewRows)
Gets the start index to a chunk of empty rows of size NewRows.
Definition: table.cpp:4376
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
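
AddSelectedRows gathers the requested rows into a contiguous range starting at `start` and then links that range in Next. The sketch below reproduces those two loops for a single column with standard containers. `CopySelected` and its parameters are hypothetical, and the caller is assumed to have reserved the destination range already, which is the role GetEmptyRowsStart plays in the listing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Copies Src[RowIds[r]] into Dst[Start + r], then chains the copied rows
// in Next. As in the listing, the second loop stops one short, so the Next
// entry of the last copied row is left untouched.
void CopySelected(const std::vector<int>& Src, const std::vector<int>& RowIds,
                  std::vector<int>& Dst, std::vector<int>& Next,
                  std::size_t Start) {
  const std::size_t N = RowIds.size();
  if (N == 0) { return; }
  assert(Dst.size() == Next.size() && Start + N <= Dst.size());
  for (std::size_t r = 0; r < N; ++r) {
    Dst[Start + r] = Src[RowIds[r]];
  }
  for (std::size_t r = 0; r + 1 < N; ++r) {
    Next[Start + r] = static_cast<int>(Start + r + 1);
  }
}
```

Leaving the last Next entry alone lets the caller decide how the new chunk connects to the rest of the table.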
void TTable::AddSrcNodeAttr ( const TStr & Attr)
inline

Adds column to be used as src node attribute of the graph.

Definition at line 1176 of file table.h.

1176 { AddGraphAttribute(Attr, false, true, false); }
void AddGraphAttribute(const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst)
Adds names of columns to be used as graph attributes.
Definition: table.cpp:985
void TTable::AddSrcNodeAttr ( TStrV & Attrs)
inline

Adds columns to be used as src node attributes of the graph.

Definition at line 1178 of file table.h.

1178 { AddGraphAttributeV(Attrs, false, true, false); }
void AddGraphAttributeV(TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst)
Adds vector of names of columns to be used as graph attributes.
Definition: table.cpp:992
void TTable::AddStrCol ( const TStr & ColName)

Adds a string column with name ColName.

Definition at line 4687 of file table.cpp.

4687  {
4688  AddSchemaCol(ColName, atStr);
4690  TInt L = StrColMaps.Len();
4691  AddColType(ColName, atStr, L-1);
4692 }
void AddSchemaCol(const TStr &ColName, TAttrType ColType)
Adds column with name ColName and type ColType to the schema.
Definition: table.h:642
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
void AddColType(const TStr &ColName, TPair< TAttrType, TInt > ColType)
Adds column with name ColName and type ColType to the ColTypeMap.
Definition: table.h:651
TVec< TInt > TIntV
Definition: ds.h:1594
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void TTable::AddStrVal ( const TInt & ColIdx,
const TStr & Val 
)
protected

Adds Val in column with id ColIdx.

Definition at line 971 of file table.cpp.

971  {
972  TInt KeyId = TInt(Context->StringVals.AddKey(Key));
973  //printf("TTable::AddStrVal2 %d .%s. %d\n", ColIdx.Val, Key.CStr(), KeyId.Val);
974  StrColMaps[ColIdx].Add(KeyId);
975 }
TTableContext * Context
Execution Context.
Definition: table.h:545
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStrHash< TInt, TBigStrPool > StringVals
StringPool - stores string data values and maps them to integers.
Definition: table.h:182
int AddKey(const char *Key)
Definition: hash.h:968
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
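
AddStrVal does not store the string in the column. It adds the string to the context's string pool and stores the returned integer key. A minimal sketch of that interning scheme with standard containers follows. `StringPool` and `AppendStr` are hypothetical stand-ins and do not reproduce the behavior of TStrHash or TBigStrPool.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical string pool: each distinct string gets one integer id.
class StringPool {
 public:
  // Returns the id of S, adding S to the pool the first time it is seen.
  int AddKey(const std::string& S) {
    std::unordered_map<std::string, int>::const_iterator It = Ids_.find(S);
    if (It != Ids_.end()) { return It->second; }
    const int Id = static_cast<int>(Strs_.size());
    Strs_.push_back(S);
    Ids_.emplace(S, Id);
    return Id;
  }
  // Recovers the string for a previously returned id.
  const std::string& GetKey(int Id) const { return Strs_.at(Id); }

 private:
  std::unordered_map<std::string, int> Ids_;
  std::vector<std::string> Strs_;
};

// A string column holds only ids; repeated values share one pool entry.
void AppendStr(StringPool& Pool, std::vector<int>& Col, const std::string& Val) {
  Col.push_back(Pool.AddKey(Val));
}
```

With this layout, string columns can be copied, joined, and grouped as integer columns, and the text is looked up only when a value is read back.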
void TTable::AddStrVal ( const TStr & Col,
const TStr & Val 
)
protected

Adds Val in column with name Col.

Definition at line 977 of file table.cpp.

977  {
978  if (GetColType(Col) != atStr) {
979  TExcept::Throw(Col + " is not a string valued column");
980  }
981  //printf("TTable::AddStrVal1 .%s. .%s.\n", Col.CStr(), Key.CStr());
982  AddStrVal(GetColIdx(Col), Key);
983 }
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
void AddStrVal(const TInt &ColIdx, const TStr &Val)
Adds Val in column with id ColIdx.
Definition: table.cpp:971
void TTable::AddTable ( const TTable & T)
protected

Adds all the rows of the input table. Allows duplicate rows (not a union).

Definition at line 3975 of file table.cpp.

3975  {
3976  //for (TInt c = 0; c < S.Len(); c++) {
3977  // if (S[c] != T.S[c]) { printf("(%s,%d) != (%s,%d)\n", S[c].Val1.CStr(), S[c].Val2, T.S[c].Val1.CStr(), T.S[c].Val2); TExcept::Throw("when adding tables, their schemas must match!"); }
3978  //}
3979  for (TInt c = 0; c < Sch.Len(); c++) {
3980  TStr ColName = GetSchemaColName(c);
3981  TInt ColIdx = GetColIdx(ColName);
3982  TInt TColIdx = ColName == IdColName ? T.GetColIdx(T.IdColName) : T.GetColIdx(ColName);
3983  if (TColIdx < 0) { TExcept::Throw("when adding a table, it must contain all columns of source table!"); }
3984  switch (GetColType(ColName)) {
3985  case atInt:
3986  IntCols[ColIdx].AddV(T.IntCols[TColIdx]);
3987  break;
3988  case atFlt:
3989  FltCols[ColIdx].AddV(T.FltCols[TColIdx]);
3990  break;
3991  case atStr:
3992  StrColMaps[ColIdx].AddV(T.StrColMaps[TColIdx]);
3993  break;
3994  }
3995  }
3996 
3997  TIntV TNext(T.Next);
3998  for (TInt i = 0; i < TNext.Len(); i++) {
3999  if (TNext[i] != Last && TNext[i] != Invalid) { TNext[i] += NumRows; }
4000  }
4001 
4002  Next.AddV(TNext);
4003  // checks if table is empty
4004  if (LastValidRow >= 0) {
4006  }
4008  NumRows += T.NumRows;
4010 }
TInt FirstValidRow
Physical index of first valid row.
Definition: table.h:553
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
static const TInt Last
Special value for Next vector entry - last row in table.
Definition: table.h:486
Schema Sch
Table Schema.
Definition: table.h:549
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TStr IdColName
Name of the column associated with (optional) permanent row identifiers.
Definition: table.h:565
TInt LastValidRow
Physical index of last valid row.
Definition: table.h:554
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
static void Throw(const TStr &MsgStr)
Definition: ut.h:187
TAttrType GetColType(const TStr &ColName) const
Gets type of column ColName.
Definition: table.h:1227
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TStr GetSchemaColName(TInt Idx) const
Gets name of the column with index Idx in the schema.
Definition: table.h:638
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
static const TInt Invalid
Special value for Next vector entry - logically removed row.
Definition: table.h:487
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TSizeTy AddV(const TVec< TVal, TSizeTy > &ValV)
Adds the elements of the vector ValV to the end of the vector.
Definition: ds.h:1110
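
When AddTable appends the other table's Next vector, ordinary entries are physical row indices relative to that table and must be shifted by the number of rows already present, while the Last and Invalid markers are kept as they are. The sketch below shows only that shift. The sentinel values `kLast` and `kInvalid` and the function `AppendNext` are assumptions for illustration, and the sketch assumes the existing Next vector has exactly one entry per existing row.

```cpp
#include <vector>

const int kLast = -1;     // hypothetical sentinel: last row in the chain
const int kInvalid = -2;  // hypothetical sentinel: logically removed row

// Appends Other's chain to Mine. Real indices are shifted by the number of
// rows already in Mine; sentinel entries are copied unchanged.
void AppendNext(std::vector<int>& Mine, const std::vector<int>& Other) {
  const int Shift = static_cast<int>(Mine.size());
  for (int V : Other) {
    Mine.push_back((V == kLast || V == kInvalid) ? V : V + Shift);
  }
}
```

Connecting the end of the existing chain to the first valid row of the appended table is a separate step that this sketch does not cover.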
void TTable::Aggregate ( const TStrV & GroupByAttrs,
TAttrAggr  AggOp,
const TStr & ValAttr,
const TStr & ResAttr,
TBool  Ordered = true 
)

Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as a new attribute ResAttr.
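
In the source listing for this function, the aggregated value of a group is written into the result column for every row of that group, so the table keeps its row count. A minimal sketch of that behavior for a single integer key and a sum follows. `GroupSum` is a hypothetical name, standard containers replace the SNAP hash tables, and the other aggregation operators are not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Groups rows by Key, sums Val within each group, and writes the group sum
// back to every row of the group (one result per input row).
std::vector<int> GroupSum(const std::vector<int>& Key,
                          const std::vector<int>& Val) {
  assert(Key.size() == Val.size());
  std::map<int, std::vector<std::size_t> > Groups;  // key -> row indices
  for (std::size_t r = 0; r < Key.size(); ++r) {
    Groups[Key[r]].push_back(r);
  }
  std::vector<int> Res(Key.size(), 0);
  for (const auto& G : Groups) {
    int Sum = 0;
    for (std::size_t r : G.second) { Sum += Val[r]; }
    for (std::size_t r : G.second) { Res[r] = Sum; }
  }
  return Res;
}
```

Writing the result per row, rather than returning one row per group, makes the aggregate usable as an ordinary column in later selections and joins.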

Definition at line 1585 of file table.cpp.

1586  {
1587 
1588  for (TInt c = 0; c < GroupByAttrs.Len(); c++) {
1589  if (!IsColName(GroupByAttrs[c])) {
1590  TExcept::Throw("no such column " + GroupByAttrs[c]);
1591  }
1592  }
1593 
1594  // double startFn = omp_get_wtime();
1595  TStrV NGroupByAttrs = NormalizeColNameV(GroupByAttrs);
1596  TBool UsePhysicalIds = (GetColIdx(IdColName) < 0);
1597 
1598  THash<TInt,TIntV> GroupByIntMapping;
1599  THash<TFlt,TIntV> GroupByFltMapping;
1600  THash<TInt,TIntV> GroupByStrMapping;
1601  THash<TGroupKey,TIntV> Mapping;
1602 #ifdef GCC_ATOMIC
1603  THashMP<TInt,TIntV> GroupByIntMapping_MP(NumValidRows);
1604  TIntV GroupByIntMPKeys(NumValidRows);
1605 #endif
1606  TInt NumOfGroups = 0;
1607  TInt GroupingCase = 0;
1608 
1609  // check if grouping already exists
1610  GroupStmt Stmt(NGroupByAttrs, Ordered, UsePhysicalIds);
1611  if (GroupMapping.IsKey(Stmt)) {
1612  Mapping = GroupMapping.GetDat(Stmt);
1613  } else{
1614  if(NGroupByAttrs.Len() == 1){
1615  switch(GetColType(NGroupByAttrs[0])){
1616  case atInt:
1617 #ifdef GCC_ATOMIC
1618  if(GetMP()){
1619  GroupByIntColMP(NGroupByAttrs[0], GroupByIntMapping_MP, UsePhysicalIds);
1620  int x = 0;
1621  for(THashMP<TInt,TIntV>::TIter it = GroupByIntMapping_MP.BegI(); it < GroupByIntMapping_MP.EndI(); it++){
1622  GroupByIntMPKeys[x] = it.GetKey();
1623  x++;
1624  /*
1625  printf("%d --> ", it.GetKey().Val);
1626  TIntV& V = it.GetDat();
1627  for(int i = 0; i < V.Len(); i++){
1628  printf(" %d", V[i].Val);
1629  }
1630  printf("\n");
1631  */
1632  }
1633  NumOfGroups = x;
1634  GroupingCase = 4;
1635  //printf("Number of groups: %d\n", NumOfGroups.Val);
1636  break;
1637  }
1638 #endif // GCC_ATOMIC
1639  GroupByIntCol(NGroupByAttrs[0], GroupByIntMapping, TIntV(), true, UsePhysicalIds);
1640  NumOfGroups = GroupByIntMapping.Len();
1641  GroupingCase = 1;
1642  break;
1643  case atFlt:
1644  GroupByFltCol(NGroupByAttrs[0], GroupByFltMapping, TIntV(), true, UsePhysicalIds);
1645  NumOfGroups = GroupByFltMapping.Len();
1646  GroupingCase = 2;
1647  break;
1648  case atStr:
1649  GroupByStrCol(NGroupByAttrs[0], GroupByStrMapping, TIntV(), true, UsePhysicalIds);
1650  NumOfGroups = GroupByStrMapping.Len();
1651  GroupingCase = 3;
1652  break;
1653  }
1654  }
1655  else{
1656  TIntV UniqueVector;
1658  GroupAux(NGroupByAttrs, Mapping_aux, Ordered, "", false, UniqueVector, UsePhysicalIds);
1659  for(THash<TGroupKey, TPair<TInt, TIntV> >::TIter it = Mapping_aux.BegI(); it < Mapping_aux.EndI(); it++){
1660  Mapping.AddDat(it.GetKey(), it.GetDat().Val2);
1661  }
1662  NumOfGroups = Mapping.Len();
1663  }
1664  }
1665 
1666  // double endGroup = omp_get_wtime();
1667  // printf("Group time = %f\n", endGroup-startFn);
1668 
1669  TAttrType T = GetColType(ValAttr);
1670 
1671  // add column corresponding to result attribute type
1672  if (AggOp == aaCount) { AddIntCol(ResAttr); }
1673  else {
1674  if (T == atInt) { AddIntCol(ResAttr); }
1675  else if (T == atFlt) { AddFltCol(ResAttr); }
1676  else {
1677  // Count is the only aggregation operation handled for Str
1678  TExcept::Throw("Invalid aggregation for Str type!");
1679  }
1680  }
1681  TInt ColIdx = GetColIdx(ResAttr);
1682  TInt AggrColIdx = GetColIdx(ValAttr);
1683 
1684  // double endAdd = omp_get_wtime();
1685  // printf("AddCol time = %f\n", endAdd-endGroup);
1686 
1687 #ifdef USE_OPENMP
1688  #pragma omp parallel for schedule(dynamic)
1689 #endif
1690  for (int g = 0; g < NumOfGroups; g++) {
1691  TIntV* GroupRows = NULL;
1692  switch(GroupingCase){
1693  case 0:
1694  GroupRows = & Mapping.GetDat(Mapping.GetKey(g));
1695  break;
1696  case 1:
1697  GroupRows = & GroupByIntMapping.GetDat(GroupByIntMapping.GetKey(g));
1698  break;
1699  case 2:
1700  GroupRows = & GroupByFltMapping.GetDat(GroupByFltMapping.GetKey(g));
1701  break;
1702  case 3:
1703  GroupRows = & GroupByStrMapping.GetDat(GroupByStrMapping.GetKey(g));
1704  break;
1705  case 4:
1706 #ifdef GCC_ATOMIC
1707  GroupRows = & GroupByIntMapping_MP.GetDat(GroupByIntMPKeys[g]);
1708 #endif
1709  break;
1710  }
1711 
1712  // find valid rows of group
1713  /*
1714  TIntV ValidRows;
1715  for (TInt i = 0; i < GroupRows.Len(); i++) {
1716  // TODO: This should not be necessary
1717  if (!RowIdMap.IsKey(GroupRows[i])) { continue; }
1718  TInt RowId = RowIdMap.GetDat(GroupRows[i]);
1719  // GroupRows has physical row indices
1720  if (RowId != Invalid) { ValidRows.Add(RowId); }
1721  }
1722  */
1723  TIntV& ValidRows = *GroupRows;
1724  TInt sz = ValidRows.Len();
1725  if (sz <= 0) continue;
1726  // Count is handled separately (other operations have aggregation policies defined in a template)
1727  if (AggOp == aaCount) {
1728  for (TInt i = 0; i < sz; i++) { IntCols[ColIdx][ValidRows[i]] = sz; }
1729  } else {
1730  // aggregate based on column type
1731  if (T == atInt) {
1732  TIntV V;
1733  for (TInt i = 0; i < sz; i++) { V.Add(IntCols[AggrColIdx][ValidRows[i]]); }
1734  TInt Res = AggregateVector<TInt>(V, AggOp);
1735  if (AggOp == aaMean) { Res = Res / sz; }
1736  for (TInt i = 0; i < sz; i++) { IntCols[ColIdx][ValidRows[i]] = Res; }
1737  } else {
1738  TFltV V;
1739  for (TInt i = 0; i < sz; i++) { V.Add(FltCols[AggrColIdx][ValidRows[i]]); }
1740  TFlt Res = AggregateVector<TFlt>(V, AggOp);
1741  if (AggOp == aaMean) { Res /= sz; }
1742  for (TInt i = 0; i < sz; i++) { FltCols[ColIdx][ValidRows[i]] = Res; }
1743  }
1744  }
1745  }
1746  // double endIter = omp_get_wtime();
1747  // printf("Iter time = %f\n", endIter-endAdd);
1748 }
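The final loop of the listing above computes one aggregate per group and writes it into the result column for every row of that group. The following is a minimal standalone sketch of that pattern, assuming plain `std::vector` and `std::unordered_map` in place of SNAP's `TVec` and `THash`. The function `GroupSumPerRow` is a hypothetical name for illustration and is not part of SNAP.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of SNAP: groups row indices by Keys[i],
// sums Vals over each group, and writes the group sum back to every row.
std::vector<int> GroupSumPerRow(const std::vector<int>& Keys,
                                const std::vector<int>& Vals) {
  assert(Keys.size() == Vals.size());
  // key -> list of row indices that carry this key
  std::unordered_map<int, std::vector<std::size_t> > Groups;
  for (std::size_t Row = 0; Row < Keys.size(); Row++) {
    Groups[Keys[Row]].push_back(Row);
  }
  std::vector<int> Res(Keys.size(), 0);
  for (const auto& Group : Groups) {
    int Sum = 0;
    for (std::size_t Row : Group.second) { Sum += Vals[Row]; }
    // every row of the group receives the same aggregate value
    for (std::size_t Row : Group.second) { Res[Row] = Sum; }
  }
  return Res;
}
```

Because the groups are disjoint, each iteration of the outer loop writes to a different set of rows, which is presumably what allows the listing to run that loop under `#pragma omp parallel for`.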
void TTable::AggregateCols ( const TStrV & AggrAttrs,
TAttrAggr  AggOp,
const TStr & ResAttr
)

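Aggregates the attributes listed in AggrAttrs across columns: for each valid row, the values of those columns are combined according to AggOp and stored in a new column ResAttr. All columns in AggrAttrs must have the same type, and only Int and Flt columns are supported.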

Definition at line 1750 of file table.cpp.

1750  {
1751  TVec<TPair<TAttrType, TInt> > Info;
1752  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1753  Info.Add(GetColTypeMap(AggrAttrs[i]));
1754  if (Info[i].Val1 != Info[0].Val1) {
1755  TExcept::Throw("AggregateCols: Aggregation attributes must have the same type");
1756  }
1757  }
1758 
1759  if (Info[0].Val1 == atInt) {
1760  AddIntCol(ResAttr);
1761  TInt ResIdx = GetColIdx(ResAttr);
1762 
1763  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1764  TInt RowIdx = RI.GetRowIdx();
1765  TIntV V;
1766  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1767  V.Add(IntCols[Info[i].Val2][RowIdx]);
1768  }
1769  IntCols[ResIdx][RowIdx] = AggregateVector<TInt>(V, AggOp);
1770  }
1771  } else if (Info[0].Val1 == atFlt) {
1772  AddFltCol(ResAttr);
1773  TInt ResIdx = GetColIdx(ResAttr);
1774 
1775  for (TRowIterator RI = BegRI(); RI < EndRI(); RI++) {
1776  TInt RowIdx = RI.GetRowIdx();
1777  TFltV V;
1778  for (TInt i = 0; i < AggrAttrs.Len(); i++) {
1779  V.Add(FltCols[Info[i].Val2][RowIdx]);
1780  }
1781  FltCols[ResIdx][RowIdx] = AggregateVector<TFlt>(V, AggOp);
1782  }
1783  } else {
1784  TExcept::Throw("AggregateCols: Only Int and Flt aggregation supported right now");
1785  }
1786 }
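The listing above gathers, for each row, one value from every input column and reduces that small vector to a single result. A minimal standalone sketch of the same row-wise reduction follows, assuming columnar storage as a `std::vector` of columns and a fixed "maximum" policy. `RowWiseMax` is a hypothetical name and is not part of SNAP.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SNAP: Cols[c][r] is the value of
// column c at row r. Returns a new column holding, for each row,
// the maximum over all input columns.
std::vector<double> RowWiseMax(const std::vector<std::vector<double> >& Cols) {
  assert(!Cols.empty());
  const std::size_t NumRows = Cols[0].size();
  std::vector<double> Res(NumRows);
  for (std::size_t Row = 0; Row < NumRows; Row++) {
    // gather this row's value from each column, as the listing does with V
    std::vector<double> V;
    for (const auto& Col : Cols) {
      assert(Col.size() == NumRows);
      V.push_back(Col[Row]);
    }
    Res[Row] = *std::max_element(V.begin(), V.end());
  }
  return Res;
}
```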
template<class T >
T TTable::AggregateVector ( TVec< T > &  V,
TAttrAggr  Policy 
)
protected

Aggregates a vector into a single scalar value according to a policy. Used for choosing an attribute value for a node when the node appears in several records with conflicting attribute values.

Definition at line 1544 of file table.h.

1544  {
1545  switch (Policy) {
1546  case aaMin: {
1547  T Res = V[0];
1548  for (TInt i = 1; i < V.Len(); i++) {
1549  if (V[i] < Res) { Res = V[i]; }
1550  }
1551  return Res;
1552  }
1553  case aaMax: {
1554  T Res = V[0];
1555  for (TInt i = 1; i < V.Len(); i++) {
1556  if (V[i] > Res) { Res = V[i]; }
1557  }
1558  return Res;
1559  }
1560  case aaFirst: {
1561  return V[0];
1562  }
1563  case aaLast:{
1564  return V[V.Len()-1];
1565  }
1566  case aaSum: {
1567  T Res = V[0];
1568  for (TInt i = 1; i < V.Len(); i++) {
1569  Res = Res + V[i];
1570  }
1571  return Res;
1572  }
1573  case aaMean: {
1574  T Res = V[0];
1575  for (TInt i = 1; i < V.Len(); i++) {
1576  Res = Res + V[i];
1577  }
1578  //Res = Res / V.Len(); // TODO: Handle Str case separately?
1579  return Res;
1580  }
1581  case aaMedian: {
1582  V.Sort();
1583  return V[V.Len()/2];
1584  }
1585  case aaCount: {
1586  // NOTE: Code should never reach here
1587  // I had to put this here to avoid a compiler warning.
1588  // Is there a better way to do this?
1589  return V[0];
1590  }
1591  }
1592  // Added to remove a compiler warning.
1593  T ShouldNotComeHere;
1594  return ShouldNotComeHere;
1595 }
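The policy switch above can be sketched outside SNAP as follows. This is an illustrative standalone version under stated assumptions: the `Policy` enum and the `Aggregate` function are hypothetical stand-ins for `TAttrAggr` and `AggregateVector`, only a subset of the policies is covered, and the vector is passed by value so that the median case does not reorder the caller's data (the listing sorts its argument in place).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical policy enum mirroring a few of the aa* values in the listing.
enum class Policy { Min, Max, First, Last, Sum, Median };

// Collapses a non-empty vector into one value according to P.
// V is taken by value because Median sorts it.
int Aggregate(std::vector<int> V, Policy P) {
  assert(!V.empty());
  switch (P) {
    case Policy::Min: return *std::min_element(V.begin(), V.end());
    case Policy::Max: return *std::max_element(V.begin(), V.end());
    case Policy::First: return V.front();
    case Policy::Last: return V.back();
    case Policy::Sum: {
      int Res = 0;
      for (int X : V) { Res += X; }
      return Res;
    }
    case Policy::Median:
      std::sort(V.begin(), V.end());
      return V[V.size() / 2];  // upper median for even sizes, as in the listing
  }
  return V.front();  // unreachable; avoids a missing-return warning
}
```

Two details of the listing are worth keeping in mind when reading it. For aaMean it returns the undivided sum, so the caller is expected to divide by the element count, as the group-wise Aggregate listing does. For aaMedian it returns the element at index Len()/2 of the sorted vector, which is the upper of the two middle elements when the length is even.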
TRowIterator TTable::BegRI ( ) const
inline

Gets iterator to the first valid row of the table.

Definition at line 1241 of file table.h.

1241 { return TRowIterator(FirstValidRow, this);}
TRowIteratorWithRemove TTable::BegRIWR ( )
inline

Gets iterator with remove to the first valid row.

Definition at line 1245 of file table.h.

1245 { return TRowIteratorWithRemove(FirstValidRow, this);}
PNEANet TTable::BuildGraph ( const TIntV & RowIds,
TAttrAggr  AggrPolicy
)
protected

Makes a single pass over the rows in the given row ID set, creating nodes and edges and assigning node and edge attributes.

Definition at line 3445 of file table.cpp.

3445  {
3446  PNEANet Graph = TNEANet::New();
3447 
3448  const TAttrType NodeType = GetColType(SrcCol);
3449  Assert(NodeType == GetColType(DstCol));
3450  const TInt SrcColIdx = GetColIdx(SrcCol);
3451  const TInt DstColIdx = GetColIdx(DstCol);
3452 
3453  // node values - i.e. the unique values of src/dst col
3454  //THashSet<TInt> IntNodeVals; // for both int and string node attr types.
3455  THash<TFlt, TInt> FltNodeVals;
3456 
3457  // node attributes
3458  THash<TInt, TStrIntVH> NodeIntAttrs;
3459  THash<TInt, TStrFltVH> NodeFltAttrs;
3460  THash<TInt, TStrStrVH> NodeStrAttrs;
3461 
3462  // make single pass over all rows in given row id set
3463  for (TVec<TInt>::TIter it = RowIds.BegI(); it < RowIds.EndI(); it++) {
3464  TInt CurrRowIdx = *it;
3465 
3466  // add src and dst nodes to graph if they are not seen earlier
3467  TInt SVal, DVal;
3468  if (NodeType == atFlt) {
3469  TFlt FSVal = FltCols[SrcColIdx][CurrRowIdx];
3470  SVal = CheckAndAddFltNode(Graph, FltNodeVals, FSVal);
3471  TFlt FDVal = FltCols[DstColIdx][CurrRowIdx];
3472  DVal = CheckAndAddFltNode(Graph, FltNodeVals, FDVal);
3473  } else if (NodeType == atInt || NodeType == atStr) {
3474  if (NodeType == atInt) {
3475  SVal = IntCols[SrcColIdx][CurrRowIdx];
3476  DVal = IntCols[DstColIdx][CurrRowIdx];
3477  } else {
3478  SVal = StrColMaps[SrcColIdx][CurrRowIdx];
3479  if (strlen(Context->StringVals.GetKey(SVal)) == 0) { continue; } //illegal value
3480  DVal = StrColMaps[DstColIdx][CurrRowIdx];
3481  if (strlen(Context->StringVals.GetKey(DVal)) == 0) { continue; } //illegal value
3482  }
3483  if (!Graph->IsNode(SVal)) { Graph->AddNode(SVal); }
3484  if (!Graph->IsNode(DVal)) { Graph->AddNode(DVal); }
3485  //CheckAndAddIntNode(Graph, IntNodeVals, SVal);
3486  //CheckAndAddIntNode(Graph, IntNodeVals, DVal);
3487  }
3488 
3489  // add edge and edge attributes
3490  Graph->AddEdge(SVal, DVal, CurrRowIdx);
3491  if (EdgeAttrV.Len() > 0) { AddEdgeAttributes(Graph, CurrRowIdx); }
3492 
3493  // get src and dst node attributes into hashmaps
3494  if (SrcNodeAttrV.Len() > 0) {
3495  AddNodeAttributes(SVal, SrcNodeAttrV, CurrRowIdx, NodeIntAttrs, NodeFltAttrs, NodeStrAttrs);
3496  }
3497  if (DstNodeAttrV.Len() > 0) {
3498  AddNodeAttributes(DVal, DstNodeAttrV, CurrRowIdx, NodeIntAttrs, NodeFltAttrs, NodeStrAttrs);
3499  }
3500  }
3501 
3502  // aggregate node attributes and add to graph
3503  if (SrcNodeAttrV.Len() > 0 || DstNodeAttrV.Len() > 0) {
3504  for (TNEANet::TNodeI NodeI = Graph->BegNI(); NodeI < Graph->EndNI(); NodeI++) {
3505  TInt NId = NodeI.GetId();
3506  if (NodeIntAttrs.IsKey(NId)) {
3507  TStrIntVH IntAttrVals = NodeIntAttrs.GetDat(NId);
3508  for (TStrIntVH::TIter it = IntAttrVals.BegI(); it < IntAttrVals.EndI(); it++) {
3509  TInt AttrVal = AggregateVector<TInt>(it.GetDat(), AggrPolicy);
3510  Graph->AddIntAttrDatN(NId, AttrVal, it.GetKey());
3511  }
3512  }
3513  if (NodeFltAttrs.IsKey(NId)) {
3514  TStrFltVH FltAttrVals = NodeFltAttrs.GetDat(NId);
3515  for (TStrFltVH::TIter it = FltAttrVals.BegI(); it < FltAttrVals.EndI(); it++) {
3516  TFlt AttrVal = AggregateVector<TFlt>(it.GetDat(), AggrPolicy);
3517  Graph->AddFltAttrDatN(NId, AttrVal, it.GetKey());
3518  }
3519  }
3520  if (NodeStrAttrs.IsKey(NId)) {
3521  TStrStrVH StrAttrVals = NodeStrAttrs.GetDat(NId);
3522  for (TStrStrVH::TIter it = StrAttrVals.BegI(); it < StrAttrVals.EndI(); it++) {
3523  TStr AttrVal = AggregateVector<TStr>(it.GetDat(), AggrPolicy);
3524  Graph->AddStrAttrDatN(NId, AttrVal, it.GetKey());
3525  }
3526  }
3527  }
3528  }
3529 
3530  return Graph;
3531 }
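The listing above works in two phases: while scanning rows it collects every attribute value observed for each node, and afterwards it resolves conflicts with the aggregation policy. A minimal standalone sketch of that two-phase idea follows. The `EdgeRow` record, the function names, and the fixed "keep the last value" policy are assumptions made for illustration and are not SNAP types or functions.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical record: one table row describing an edge and one
// attribute value attached to its source node.
struct EdgeRow { int Src; int Dst; int SrcAttr; };

// Phase 1: gather, per source node, every attribute value seen across rows.
std::map<int, std::vector<int> > CollectSrcAttrs(const std::vector<EdgeRow>& Rows) {
  std::map<int, std::vector<int> > Vals;
  for (const EdgeRow& R : Rows) { Vals[R.Src].push_back(R.SrcAttr); }
  return Vals;
}

// Phase 2: resolve conflicts with one fixed policy (keep the last value seen).
std::map<int, int> ResolveLast(const std::map<int, std::vector<int> >& Vals) {
  std::map<int, int> Res;
  for (const auto& KV : Vals) {
    assert(!KV.second.empty());
    Res[KV.first] = KV.second.back();
  }
  return Res;
}
```

Deferring the aggregation until all rows have been seen is what lets policies such as minimum, maximum, or median consider every conflicting value rather than only the most recent one.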
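TTableContext * TTable::ChangeContext ( TTableContext * Context)

Changes the current context. Re-adds every string value of the table to the new context's string pool and updates the string columns to the new IDs. In the listing below, the parameter is named NewContext.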

Definition at line 921 of file table.cpp.

921  {
922  TInt L = Sch.Len();
923 
924 #if 0
925  // print table on the input, iterate over all columns
926  for (TInt i = 0; i < L; i++) {
927  // skip non-string columns
928  if (GetSchemaColType(i) != atStr) {
929  continue;
930  }
931 
932  TInt ColIdx = GetColIdx(GetSchemaColName(i));
933 
934  // iterate over all rows
935  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
936  TInt RowIdx = RowI.GetRowIdx();
937  TInt KeyId = StrColMaps[ColIdx][RowIdx];
938  printf("ChangeContext in %d %d %d .%s.\n",
939  ColIdx.Val, RowIdx.Val, KeyId.Val, GetStrVal(ColIdx, RowIdx).CStr());
940  }
941  }
942 #endif
943 
944  // add strings to the new context, change values
945  // iterate over all columns
946  for (TInt i = 0; i < L; i++) {
947  // skip non-string columns
948  if (GetSchemaColType(i) != atStr) {
949  continue;
950  }
951 
952  TInt ColIdx = GetColIdx(GetSchemaColName(i));
953 
954  // iterate over all rows
955  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
956  TInt RowIdx = RowI.GetRowIdx();
957  // get the string
958  TStr Key = GetStrVal(ColIdx, RowIdx);
959  // add the string to the new context
960  TInt KeyId = TInt(NewContext->StringVals.AddKey(Key));
961  // change the value in the table
962  StrColMaps[ColIdx][RowIdx] = KeyId;
963  }
964  }
965 
966  // set the new context
967  Context = NewContext;
968  return Context;
969 }
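The core of the listing above is re-interning: each stored string ID is resolved to its text in the old pool, added to the new pool, and replaced by the ID the new pool returns. A minimal standalone sketch of that step follows. The `StringPool` struct and the `Rebind` function are hypothetical stand-ins written for this example; they only imitate the role of `TTableContext::StringVals` and are not SNAP code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a string pool: maps strings to dense integer IDs.
struct StringPool {
  std::vector<std::string> Strs;
  std::unordered_map<std::string, int> Ids;
  // Returns the existing ID of S, or assigns the next free ID.
  int AddKey(const std::string& S) {
    auto It = Ids.find(S);
    if (It != Ids.end()) { return It->second; }
    const int Id = static_cast<int>(Strs.size());
    Strs.push_back(S);
    Ids.emplace(S, Id);
    return Id;
  }
  const std::string& GetKey(int Id) const { return Strs[Id]; }
};

// Rewrites a column of IDs from OldPool so that it refers to NewPool instead.
void Rebind(std::vector<int>& Col, const StringPool& OldPool, StringPool& NewPool) {
  for (int& Id : Col) { Id = NewPool.AddKey(OldPool.GetKey(Id)); }
}
```

The IDs generally change during the move, because the new pool may already contain other strings; only the text each row refers to is preserved.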
template<class T >
TInt TTable::CheckAndAddFltNode ( T  Graph,
THash< TFlt, TInt > &  NodeVals,
TFlt  FNodeVal 
)
protected

Checks whether the given FNodeVal has been seen before; if not, adds a new node to Graph and records its ID in the hash map NodeVals. Returns the node ID.

Definition at line 1533 of file table.h.

1533  {
1534  if (!NodeVals.IsKey(FNodeVal)) {
1535  TInt NodeVal = NodeVals.Len();
1536  Graph->AddNode(NodeVal);
1537  NodeVals.AddKey(FNodeVal);
1538  NodeVals.AddDat(FNodeVal, NodeVal);
1539  return NodeVal;
1540  } else { return NodeVals.GetDat(FNodeVal); }
1541 }
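The listing above assigns dense integer node IDs to floating-point values in order of first appearance, using the current size of the map as the next ID. A minimal standalone sketch of that "get or assign" step follows, assuming `std::unordered_map` in place of `THash` and leaving out the graph itself. `GetOrAssignId` is a hypothetical name and is not part of SNAP.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical helper, not part of SNAP: returns the ID already assigned
// to Val, or assigns the next unused ID (the current map size) on first sight.
int GetOrAssignId(std::unordered_map<double, int>& Seen, double Val) {
  auto It = Seen.find(Val);
  if (It != Seen.end()) { return It->second; }
  const int Id = static_cast<int>(Seen.size());
  Seen.emplace(Val, Id);
  return Id;
}
```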
void TTable::CheckAndAddIntNode ( PNEANet  Graph,
THashSet< TInt > &  NodeVals,
TInt  NodeId 
)
inline protected

Checks whether the given NodeId has been seen before; if not, adds it to Graph and to the hash set NodeVals.

Definition at line 3388 of file table.cpp.

3388  {
3389  if (!NodeVals.IsKey(NodeId)) {
3390  Graph->AddNode(NodeId);
3391  NodeVals.AddKey(NodeId);
3392  }
3393 }
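TInt TTable::CheckSortedKeyVal ( TIntV & Key,
TIntV & Val,
TInt  Start,
TInt  End
)
static protected

Returns 0 if the (Key, Val) pairs at positions Start through End are in non-decreasing order according to CompareKeyVal, and 1 otherwise.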

Definition at line 5310 of file table.cpp.

5310  {
5311  TInt j;
5312  for (j = Start; j < End; j++) {
5313  if (CompareKeyVal(Key[j], Val[j], Key[j+1], Val[j+1]) > 0) {
5314  break;
5315  }
5316  }
5317  if (j >= End) { return 0; }
5318  else { return 1; }
5319 }
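The check above walks adjacent pairs and stops at the first pair that is out of order. A minimal standalone sketch follows. The definition of `CompareKeyVal` is not shown in this section, so the `ComparePair` function below assumes a "key first, then value" ordering purely for illustration. Both function names are hypothetical. As in the listing, `End` is the index of the last element to check, because position `j+1` is read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed ordering (not taken from SNAP): compare by key, then by value.
int ComparePair(int K1, int V1, int K2, int V2) {
  if (K1 != K2) { return K1 < K2 ? -1 : 1; }
  if (V1 != V2) { return V1 < V2 ? -1 : 1; }
  return 0;
}

// Returns 0 if positions Start..End (End inclusive) are sorted, 1 otherwise.
int CheckSorted(const std::vector<int>& Key, const std::vector<int>& Val,
                std::size_t Start, std::size_t End) {
  assert(Key.size() == Val.size());
  assert(End < Key.size());
  for (std::size_t j = Start; j < End; j++) {
    if (ComparePair(Key[j], Val[j], Key[j + 1], Val[j + 1]) > 0) { return 1; }
  }
  return 0;
}
```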
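void TTable::Classify ( TPredicate & Predicate,
const TStr & LabelName,
const TInt & PositiveLabel = 1,
const TInt & NegativeLabel = 0
)

Selects the rows that satisfy Predicate without removing any rows, then adds an integer column LabelName holding PositiveLabel on the selected rows and NegativeLabel on the rest.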

Definition at line 2805 of file table.cpp.

2805  {
2806  TIntV SelectedRows;
2807  Select(Predicate, SelectedRows, false);
2808  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
2809 }
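void TTable::ClassifyAtomic ( const TStr & Col1,
const TStr & Col2,
TPredComp  Cmp,
const TStr & LabelName,
const TInt & PositiveLabel = 1,
const TInt & NegativeLabel = 0
)

Selects rows by comparing Col1 with Col2 using Cmp, without removing any rows, then adds an integer column LabelName holding PositiveLabel on the selected rows and NegativeLabel on the rest.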

Definition at line 2866 of file table.cpp.

2867  {
2868  TIntV SelectedRows;
2869  SelectAtomic(Col1, Col2, Cmp, SelectedRows, false);
2870  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
2871 }
template<class T >
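void TTable::ClassifyAtomicConst ( const TStr & Col,
const T &  Val,
TPredComp  Cmp,
const TStr & LabelName,
const TInt & PositiveLabel = 1,
const TInt & NegativeLabel = 0
)
inline

Selects rows by comparing Col with the constant Val using Cmp, without removing rows or building a result table, then adds an integer column LabelName holding PositiveLabel on the selected rows and NegativeLabel on the rest.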

Definition at line 1301 of file table.h.

1302  {
1303  TIntV SelectedRows;
1304  PTable SelectedTable;
1305  SelectAtomicConst(Col, TPrimitive(Val), Cmp, SelectedRows, SelectedTable, false, false);
1306  ClassifyAux(SelectedRows, LabelName, PositiveLabel, NegativeLabel);
1307  }
void TTable::ClassifyAux ( const TIntV & SelectedRows,
const TStr & LabelName,
const TInt & PositiveLabel = 1,
const TInt & NegativeLabel = 0
)
protected

Adds a label attribute with positive labels on selected rows and negative labels on the rest.

Definition at line 4694 of file table.cpp.

4694  {
4695  AddSchemaCol(LabelName, atInt);
4696  TInt LabelColIdx = IntCols.Len();
4697  AddColType(LabelName, atInt, LabelColIdx);
4698  IntCols.Add(TIntV(NumRows));
4699  for (TInt i = 0; i < NumRows; i++) {
4700  IntCols[LabelColIdx][i] = NegativeLabel;
4701  }
4702  for (TInt i = 0; i < SelectedRows.Len(); i++) {
4703  IntCols[LabelColIdx][SelectedRows[i]] = PositiveLabel;
4704  }
4705 }
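The labeling step above first fills the whole label column with the negative label and then overwrites the selected rows with the positive label. A minimal standalone sketch follows, assuming a plain `std::vector<int>` as the label column. `MakeLabels` is a hypothetical name and is not part of SNAP.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of SNAP: builds a label column of NumRows
// entries, positive on the selected row indices and negative elsewhere.
std::vector<int> MakeLabels(std::size_t NumRows,
                            const std::vector<std::size_t>& SelectedRows,
                            int PositiveLabel = 1, int NegativeLabel = 0) {
  // default every row to the negative label
  std::vector<int> Labels(NumRows, NegativeLabel);
  // then mark the selected rows
  for (std::size_t Row : SelectedRows) {
    assert(Row < NumRows);
    Labels[Row] = PositiveLabel;
  }
  return Labels;
}
```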
void TTable::ColAdd ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Performs columnwise addition. See TTable::ColGenericOp.

Definition at line 4816 of file table.cpp.

4816  {
4817  ColGenericOp(Attr1, Attr2, ResultAttrName, aoAdd);
4818 }
void TTable::ColAdd ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & ResAttr = "",
TBool  AddToFirstTable = true
)

Performs columnwise addition with column of given table.

Definition at line 4949 of file table.cpp.

4950  {
4951  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoAdd, AddToFirstTable);
4952 }
void TTable::ColAdd ( const TStr & Attr1,
const TFlt & Num,
const TStr & ResultAttrName = "",
const TBool  floatCast = false
)

Performs addition of column values and given Num.

Definition at line 5063 of file table.cpp.

5063  {
5064  ColGenericOp(Attr1, Num, ResultAttrName, aoAdd, floatCast);
5065 }
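void TTable::ColConcat ( const TStr & Attr1,
const TStr & Attr2,
const TStr & Sep = "",
const TStr & ResAttr = ""
)

Concatenates two string columns row by row, inserting Sep between the two values. The result is stored in a new column ResAttr, or overwrites Attr1 when ResAttr is empty.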

Definition at line 5083 of file table.cpp.

5083  {
5084  // check if attributes are valid
5085  if (!IsAttr(Attr1)) TExcept::Throw("No attribute present: " + Attr1);
5086  if (!IsAttr(Attr2)) TExcept::Throw("No attribute present: " + Attr2);
5087 
5088  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);
5089  TPair<TAttrType, TInt> Info2 = GetColTypeMap(Attr2);
5090 
5091  if (Info1.Val1 != atStr || Info2.Val1 != atStr) {
5092  TExcept::Throw("Only string columns supported in concat.");
5093  }
5094 
5095  // source column indices
5096  TInt ColIdx1 = Info1.Val2;
5097  TInt ColIdx2 = Info2.Val2;
5098 
5099  // destination column index
5100  TInt ColIdx3 = ColIdx1;
5101 
5102  // Create empty result column with type that of first attribute
5103  if (ResAttr != "") {
5104  AddStrCol(ResAttr);
5105  ColIdx3 = GetColIdx(ResAttr);
5106  }
5107 
5108  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
5109  TStr CurVal1 = RowI.GetStrAttr(ColIdx1);
5110  TStr CurVal2 = RowI.GetStrAttr(ColIdx2);
5111  TStr NewVal = CurVal1 + Sep + CurVal2;
5112  TInt Key = TInt(Context->StringVals.AddKey(NewVal));
5113  StrColMaps[ColIdx3][RowI.GetRowIdx()] = Key;
5114  }
5115 }
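The row loop above builds each new value as the first string, the separator, and the second string. A minimal standalone sketch of that row-wise concatenation follows, assuming columns stored directly as `std::vector<std::string>` rather than as IDs into a string pool. `ConcatCols` is a hypothetical name and is not part of SNAP.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of SNAP: returns a new column whose
// entry at each row is Col1[row] + Sep + Col2[row].
std::vector<std::string> ConcatCols(const std::vector<std::string>& Col1,
                                    const std::vector<std::string>& Col2,
                                    const std::string& Sep) {
  assert(Col1.size() == Col2.size());
  std::vector<std::string> Res;
  Res.reserve(Col1.size());
  for (std::size_t Row = 0; Row < Col1.size(); Row++) {
    Res.push_back(Col1[Row] + Sep + Col2[Row]);
  }
  return Res;
}
```

In the listing, the concatenated string is additionally registered in the context's string pool, and the resulting ID is what gets stored in the column.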
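void TTable::ColConcat ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & Sep = "",
const TStr & ResAttr = "",
TBool  AddToFirstTable = true
)

Concatenates a string column of this table with a string column of the given table, row by row, inserting Sep between the two values. Both tables must have the same number of valid rows. The result goes to this table when AddToFirstTable is true and to Table otherwise, either in a new column ResAttr or, when ResAttr is empty, over the corresponding source column.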

Definition at line 5117 of file table.cpp.

{
  // check if attributes are valid
  if (!IsAttr(Attr1)) { TExcept::Throw("No attribute present: " + Attr1); }
  if (!Table.IsAttr(Attr2)) { TExcept::Throw("No attribute present: " + Attr2); }

  if (NumValidRows != Table.NumValidRows) {
    TExcept::Throw("Tables do not have equal number of rows");
  }

  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);
  TPair<TAttrType, TInt> Info2 = Table.GetColTypeMap(Attr2);

  if (Info1.Val1 != atStr || Info2.Val1 != atStr) {
    TExcept::Throw("Only string columns supported in concat.");
  }

  // source column indices
  TInt ColIdx1 = Info1.Val2;
  TInt ColIdx2 = Info2.Val2;

  // destination column index
  TInt ColIdx3 = ColIdx1;

  if (!AddToFirstTable) {
    ColIdx3 = ColIdx2;
  }

  // Create empty result column in appropriate table with type that of first attribute
  if (ResAttr != "") {
    if (AddToFirstTable) {
      AddStrCol(ResAttr);
      ColIdx3 = GetColIdx(ResAttr);
    }
    else {
      Table.AddStrCol(ResAttr);
      ColIdx3 = Table.GetColIdx(ResAttr);
    }
  }

  TRowIterator RI1, RI2;

  RI1 = BegRI();
  RI2 = Table.BegRI();

  while (RI1 < EndRI() && RI2 < Table.EndRI()) {
    TStr CurVal1 = RI1.GetStrAttr(ColIdx1);
    TStr CurVal2 = RI2.GetStrAttr(ColIdx2);
    TStr NewVal = CurVal1 + Sep + CurVal2;
    TInt Key = TInt(Context->StringVals.AddKey(NewVal));
    if (AddToFirstTable) {
      StrColMaps[ColIdx3][RI1.GetRowIdx()] = Key;
    }
    else {
      Table.StrColMaps[ColIdx3][RI2.GetRowIdx()] = Key;
    }
    RI1++;
    RI2++;
  }

  if (RI1 != EndRI() || RI2 != Table.EndRI()) {
    TExcept::Throw("ColGenericOp: Iteration error");
  }
}
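The listing never stores the concatenated text in the column itself: each new string is interned in the shared string pool and only its integer key is written to the result column. The following is a minimal, standard-library-only sketch of that idea. StringPool, Get and ConcatColumns are hypothetical names introduced for illustration and are not part of SNAP; only the AddKey name is borrowed from the listing, and its behaviour here is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the shared string pool: maps each distinct
// string to a stable integer key (the role AddKey plays in the listing).
class StringPool {
public:
  int AddKey(const std::string& s) {
    auto it = ids_.find(s);
    if (it != ids_.end()) { return it->second; }  // already interned
    int id = static_cast<int>(strs_.size());
    strs_.push_back(s);
    ids_.emplace(s, id);
    return id;
  }
  const std::string& Get(int id) const { return strs_.at(id); }
private:
  std::vector<std::string> strs_;
  std::unordered_map<std::string, int> ids_;
};

// Row-wise concatenation of two key columns; returns the new key column.
std::vector<int> ConcatColumns(StringPool& pool,
                               const std::vector<int>& col1,
                               const std::vector<int>& col2,
                               const std::string& sep) {
  assert(col1.size() == col2.size());  // mirrors the equal-row-count check
  std::vector<int> out;
  out.reserve(col1.size());
  for (std::size_t i = 0; i < col1.size(); ++i) {
    // Build the new string first, then intern it and keep only the key.
    out.push_back(pool.AddKey(pool.Get(col1[i]) + sep + pool.Get(col2[i])));
  }
  return out;
}
```

Because equal strings share one key, a column with many repeated values costs one integer per row plus one pool entry per distinct string.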
TBool IsAttr(const TStr &Attr)
Checks if Attr is an attribute of this table schema.
Definition: table.cpp:4628
TStr GetStrAttr(TInt ColIdx) const
Returns value of string attribute specified by string column index for current row.
Definition: table.cpp:163
TInt GetRowIdx() const
Gets the id of the row pointed to by this iterator.
Definition: table.cpp:151
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
void TTable::ColConcatConst ( const TStr & Attr1,
const TStr & Val,
const TStr & Sep = "",
const TStr & ResAttr = ""
)

Concatenates column values with given string value.

Definition at line 5182 of file table.cpp.

{
  // check if attribute is valid
  if (!IsAttr(Attr1)) { TExcept::Throw("No attribute present: " + Attr1); }

  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);

  if (Info1.Val1 != atStr) {
    TExcept::Throw("Only string columns supported in concat.");
  }

  // source column index
  TInt ColIdx1 = Info1.Val2;

  // destination column index
  TInt ColIdx2 = ColIdx1;

  // Create empty result column with type that of first attribute
  if (ResAttr != "") {
    AddStrCol(ResAttr);
    ColIdx2 = GetColIdx(ResAttr);
  }

  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
    TStr CurVal = RowI.GetStrAttr(ColIdx1);
    TStr NewVal = CurVal + Sep + Val;
    TInt Key = TInt(Context->StringVals.AddKey(NewVal));
    StrColMaps[ColIdx2][RowI.GetRowIdx()] = Key;
  }
}
void TTable::ColDiv ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Performs columnwise division. See TTable::ColGenericOp.

Definition at line 4828 of file table.cpp.

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoDiv);
}
void TTable::ColDiv ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & ResultAttrName = "",
TBool AddToFirstTable = true
)

Performs columnwise division with column of given table.

Definition at line 4964 of file table.cpp.

{
  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoDiv, AddToFirstTable);
}
void TTable::ColDiv ( const TStr & Attr1,
const TFlt & Num,
const TStr & ResultAttrName = "",
const TBool floatCast = false
)

Divides each value of column Attr1 by Num.

Definition at line 5075 of file table.cpp.

{
  ColGenericOp(Attr1, Num, ResultAttrName, aoDiv, floatCast);
}
void TTable::ColGenericOp ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResAttr,
TArithOp op
)

Performs columnwise arithmetic operation.

Performs Attr1 OP Attr2 and stores the result in Attr1. If ResAttr != "", the result is stored in a new column ResAttr instead.

Definition at line 4752 of file table.cpp.

{
  // check if attributes are valid
  if (!IsAttr(Attr1)) TExcept::Throw("No attribute present: " + Attr1);
  if (!IsAttr(Attr2)) TExcept::Throw("No attribute present: " + Attr2);
  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);
  TPair<TAttrType, TInt> Info2 = GetColTypeMap(Attr2);
  TAttrType Arg1Type = Info1.Val1;
  TAttrType Arg2Type = Info2.Val1;
  if (Arg1Type == atStr || Arg2Type == atStr) {
    TExcept::Throw("Only numeric columns supported in arithmetic operations.");
  }
  if(Arg1Type == atInt && Arg2Type == atFlt && ResAttr == ""){
    TExcept::Throw("Trying to write float values to an existing int-typed column");
  }
  // source column indices
  TInt ColIdx1 = Info1.Val2;
  TInt ColIdx2 = Info2.Val2;

  // destination column index
  TInt ColIdx3 = ColIdx1;
  // Create empty result column with type that of first attribute
  if (ResAttr != "") {
    if (Arg1Type == atInt && Arg2Type == atInt) {
      AddIntCol(ResAttr);
    }
    else {
      AddFltCol(ResAttr);
    }
    ColIdx3 = GetColIdx(ResAttr);
  }
#ifdef USE_OPENMP
  if(GetMP()){
    ColGenericOpMP(ColIdx1, ColIdx2, Arg1Type, Arg2Type, ColIdx3, op);
    return;
  }
#endif //USE_OPENMP
  TAttrType ResType = atFlt;
  if(Arg1Type == atInt && Arg2Type == atInt){ printf("hooray!\n"); ResType = atInt;}
  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
    //printf("%d %d %d %d\n", ColIdx1.Val, ColIdx2.Val, ColIdx3.Val, RowI.GetRowIdx().Val);
    if(ResType == atInt){
      TInt V1 = RowI.GetIntAttr(ColIdx1);
      TInt V2 = RowI.GetIntAttr(ColIdx2);
      if (op == aoAdd) { IntCols[ColIdx3][RowI.GetRowIdx()] = V1 + V2; }
      if (op == aoSub) { IntCols[ColIdx3][RowI.GetRowIdx()] = V1 - V2; }
      if (op == aoMul) { IntCols[ColIdx3][RowI.GetRowIdx()] = V1 * V2; }
      if (op == aoDiv) { IntCols[ColIdx3][RowI.GetRowIdx()] = V1 / V2; }
      if (op == aoMod) { IntCols[ColIdx3][RowI.GetRowIdx()] = V1 % V2; }
      if (op == aoMin) { IntCols[ColIdx3][RowI.GetRowIdx()] = (V1 < V2) ? V1 : V2;}
      if (op == aoMax) { IntCols[ColIdx3][RowI.GetRowIdx()] = (V1 > V2) ? V1 : V2;}
    } else{
      TFlt V1 = (Arg1Type == atInt) ? (TFlt)RowI.GetIntAttr(ColIdx1) : RowI.GetFltAttr(ColIdx1);
      TFlt V2 = (Arg2Type == atInt) ? (TFlt)RowI.GetIntAttr(ColIdx2) : RowI.GetFltAttr(ColIdx2);
      if (op == aoAdd) { FltCols[ColIdx3][RowI.GetRowIdx()] = V1 + V2; }
      if (op == aoSub) { FltCols[ColIdx3][RowI.GetRowIdx()] = V1 - V2; }
      if (op == aoMul) { FltCols[ColIdx3][RowI.GetRowIdx()] = V1 * V2; }
      if (op == aoDiv) { FltCols[ColIdx3][RowI.GetRowIdx()] = V1 / V2; }
      if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
      if (op == aoMin) { FltCols[ColIdx3][RowI.GetRowIdx()] = (V1 < V2) ? V1 : V2;}
      if (op == aoMax) { FltCols[ColIdx3][RowI.GetRowIdx()] = (V1 > V2) ? V1 : V2;}
    }
  }
}
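The type checks at the top of the listing amount to a small decision rule for the result column. The sketch below restates that rule in isolation, using only the standard library. AttrType and ResultType are hypothetical names that mirror atInt, atFlt and atStr for illustration; they are not SNAP declarations, and std::invalid_argument stands in for TExcept::Throw.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

enum class AttrType { Int, Flt, Str };  // hypothetical mirror of atInt/atFlt/atStr

// Decides the result column type for "Attr1 OP Attr2", following the checks
// in the listing above. resAttr is empty for an in-place update of Attr1.
AttrType ResultType(AttrType a1, AttrType a2, const std::string& resAttr) {
  if (a1 == AttrType::Str || a2 == AttrType::Str) {
    throw std::invalid_argument("Only numeric columns supported in arithmetic operations.");
  }
  // An int column cannot hold the float result of int OP float in place.
  if (a1 == AttrType::Int && a2 == AttrType::Flt && resAttr.empty()) {
    throw std::invalid_argument("Trying to write float values to an existing int-typed column");
  }
  // Only int OP int stays integral; every other numeric mix is computed as float.
  return (a1 == AttrType::Int && a2 == AttrType::Int) ? AttrType::Int : AttrType::Flt;
}
```

One consequence worth keeping in mind: with two integer columns the division path uses integer division, so the fractional part of each quotient is discarded.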
enum TAttrType_ TAttrType
Types for tables, sparse and dense attributes.
void AddIntCol(const TStr &ColName)
Adds an integer column with name ColName.
Definition: table.cpp:4673
static TInt GetMP()
Definition: table.h:527
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
void AddFltCol(const TStr &ColName)
Adds a float column with name ColName.
Definition: table.cpp:4680
void ColGenericOpMP(TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op)
Definition: table.cpp:4708
void TTable::ColGenericOp ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & ResAttr,
TArithOp op,
TBool AddToFirstTable
)

Performs columnwise arithmetic operation with column of given table.

Definition at line 4844 of file table.cpp.

{
  // check if attributes are valid
  if (!IsAttr(Attr1)) { TExcept::Throw("No attribute present: " + Attr1); }
  if (!Table.IsAttr(Attr2)) { TExcept::Throw("No attribute present: " + Attr2); }

  if (NumValidRows != Table.NumValidRows) {
    TExcept::Throw("Tables do not have equal number of rows");
  }

  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);
  TPair<TAttrType, TInt> Info2 = Table.GetColTypeMap(Attr2);
  TAttrType Arg1Type = Info1.Val1;
  TAttrType Arg2Type = Info2.Val1;
  if (Info1.Val1 == atStr || Info2.Val1 == atStr) {
    TExcept::Throw("Only numeric columns supported in arithmetic operations.");
  }
  if(Arg1Type == atInt && Arg2Type == atFlt && ResAttr == ""){
    TExcept::Throw("Trying to write float values to an existing int-typed column");
  }
  // source column indices
  TInt ColIdx1 = Info1.Val2;
  TInt ColIdx2 = Info2.Val2;

  // destination column index
  TInt ColIdx3 = AddToFirstTable ? ColIdx1 : ColIdx2;

  // Create empty result column in appropriate table with type that of first attribute
  if (ResAttr != "") {
    if (AddToFirstTable) {
      if (Arg1Type == atInt && Arg2Type == atInt) {
        AddIntCol(ResAttr);
      } else {
        AddFltCol(ResAttr);
      }
      ColIdx3 = GetColIdx(ResAttr);
    }
    else {
      if (Arg1Type == atInt && Arg2Type == atInt) {
        Table.AddIntCol(ResAttr);
      } else {
        Table.AddFltCol(ResAttr);
      }
      ColIdx3 = Table.GetColIdx(ResAttr);
    }
  }

  /*
  #ifdef USE_OPENMP
  if(GetMP()){
    ColGenericOpMP(Table, AddToFirstTable, ColIdx1, ColIdx2, Arg1Type, Arg2Type, ColIdx3, op);
    return;
  }
  #endif //USE_OPENMP
  */

  TRowIterator RI1, RI2;
  RI1 = BegRI();
  RI2 = Table.BegRI();
  TAttrType ResType = atFlt;
  if(Arg1Type == atInt && Arg2Type == atInt){ ResType = atInt;}
  while (RI1 < EndRI() && RI2 < Table.EndRI()) {
    if (ResType == atInt) {
      TInt V1 = RI1.GetIntAttr(ColIdx1);
      TInt V2 = RI2.GetIntAttr(ColIdx2);
      if (AddToFirstTable) {
        if (op == aoAdd) { IntCols[ColIdx3][RI1.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { IntCols[ColIdx3][RI1.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { IntCols[ColIdx3][RI1.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { IntCols[ColIdx3][RI1.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { IntCols[ColIdx3][RI1.GetRowIdx()] = V1 % V2; }
      }
      else {
        if (op == aoAdd) { Table.IntCols[ColIdx3][RI2.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { Table.IntCols[ColIdx3][RI2.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { Table.IntCols[ColIdx3][RI2.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { Table.IntCols[ColIdx3][RI2.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { Table.IntCols[ColIdx3][RI2.GetRowIdx()] = V1 % V2; }
      }
    } else {
      TFlt V1 = (Arg1Type == atInt) ? (TFlt)RI1.GetIntAttr(ColIdx1) : RI2.GetFltAttr(ColIdx1);
      TFlt V2 = (Arg2Type == atInt) ? (TFlt)RI1.GetIntAttr(ColIdx2) : RI2.GetFltAttr(ColIdx2);
      if (AddToFirstTable) {
        if (op == aoAdd) { FltCols[ColIdx3][RI1.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { FltCols[ColIdx3][RI1.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { FltCols[ColIdx3][RI1.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { FltCols[ColIdx3][RI1.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
      } else {
        if (op == aoAdd) { Table.FltCols[ColIdx3][RI2.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { Table.FltCols[ColIdx3][RI2.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { Table.FltCols[ColIdx3][RI2.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { Table.FltCols[ColIdx3][RI2.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
      }
    }
    RI1++;
    RI2++;
  }

  if (RI1 != EndRI() || RI2 != Table.EndRI()) {
    TExcept::Throw("ColGenericOp: Iteration error");
  }
}
TFlt GetFltAttr(TInt ColIdx) const
Returns value of floating point attribute specified by float column index for current row...
Definition: table.cpp:159
TInt GetIntAttr(TInt ColIdx) const
Returns value of integer attribute specified by integer column index for current row.
Definition: table.cpp:155
void TTable::ColGenericOp ( const TStr & Attr1,
const TFlt & Num,
const TStr & ResAttr,
TArithOp op,
const TBool floatCast
)

Applies arithmetic operation op to each value of column Attr1 and the given Num.

Definition at line 4975 of file table.cpp.

{
  // check if attribute is valid
  if (!IsAttr(Attr1)) { TExcept::Throw("No attribute present: " + Attr1); }

  TPair<TAttrType, TInt> Info1 = GetColTypeMap(Attr1);
  TAttrType ArgType = Info1.Val1;
  if (ArgType == atStr) {
    TExcept::Throw("Only numeric columns supported in arithmetic operations.");
  }
  // source column index
  TInt ColIdx1 = Info1.Val2;
  // destination column index
  TInt ColIdx2 = ColIdx1;

  // Create empty result column with type that of first attribute
  TBool shouldCast = floatCast;
  if (ResAttr != "") {
    if ((ArgType == atInt) & !shouldCast) {
      AddIntCol(ResAttr);
    } else {
      AddFltCol(ResAttr);
    }
    ColIdx2 = GetColIdx(ResAttr);
  } else {
    // Cannot change type of existing attribute
    shouldCast = false;
  }

  #ifdef USE_OPENMP
  if(GetMP()){
    ColGenericOpMP(ColIdx1, ColIdx2, ArgType, Num, op, shouldCast);
    return;
  }
  #endif //USE_OPENMP

  for (TRowIterator RowI = BegRI(); RowI < EndRI(); RowI++) {
    if ((ArgType == atInt) && !shouldCast) {
      TInt CurVal = RowI.GetIntAttr(ColIdx1);
      TInt Val = static_cast<int>(Num);
      if (op == aoAdd) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal + Val; }
      if (op == aoSub) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal - Val; }
      if (op == aoMul) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal * Val; }
      if (op == aoDiv) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal / Val; }
      if (op == aoMod) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal % Val; }
    }
    else {
      TFlt CurVal = (ArgType == atFlt) ? RowI.GetFltAttr(ColIdx1) : (TFlt) RowI.GetIntAttr(ColIdx1);
      if (op == aoAdd) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal + Num; }
      if (op == aoSub) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal - Num; }
      if (op == aoMul) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal * Num; }
      if (op == aoDiv) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal / Num; }
      if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
    }
  }
}
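The effect of floatCast on an integer column is easy to miss in the listing: without it, Num is truncated to an int before the operation, and the operation itself is done in integer arithmetic. The listing also resets the flag when ResAttr is empty, so it only matters when a new result column is requested. The sketch below models the two division paths with plain vectors. DivIntColumn is a hypothetical helper written for illustration, not a SNAP function, and it returns doubles in both cases only so that the two results can be compared side by side.

```cpp
#include <cassert>
#include <vector>

// Divides every value of an integer column by num, modelling the two paths
// of the scalar overload: without floatCast the operand is truncated to int
// and integer division is used; with floatCast the division is done in double.
std::vector<double> DivIntColumn(const std::vector<int>& col, double num, bool floatCast) {
  std::vector<double> out;
  out.reserve(col.size());
  for (int v : col) {
    if (!floatCast) {
      int val = static_cast<int>(num);  // e.g. 2.5 is truncated to 2
      assert(val != 0);                 // integer division by zero is undefined
      out.push_back(v / val);           // integer division, fraction discarded
    } else {
      out.push_back(static_cast<double>(v) / num);
    }
  }
  return out;
}
```

For the column {5, 7} and num = 2.5, the uncast path divides by 2 and yields 2 and 3, while the cast path yields 2.0 and roughly 2.8. A Num with magnitude below 1 truncates to 0 on the uncast path, which the assertion in this sketch rejects.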
void TTable::ColGenericOpMP ( TInt  ArgColIdx1,
TInt  ArgColIdx2,
TAttrType  ArgType1,
TAttrType  ArgType2,
TInt  ResColIdx,
TArithOp  op 
)

Definition at line 4708 of file table.cpp.

{
  TAttrType ResType = atFlt;
  if(ArgType1 == atInt && ArgType2 == atInt){ ResType = atInt;}
  TIntPrV Partitions;
  GetPartitionRanges(Partitions, omp_get_max_threads()*CHUNKS_PER_THREAD);
  TInt PartitionSize = Partitions[0].GetVal2()-Partitions[0].GetVal1()+1;
  #pragma omp parallel for schedule(dynamic, CHUNKS_PER_THREAD)
  for (int i = 0; i < Partitions.Len(); i++){
    TRowIterator RowI(Partitions[i].GetVal1(), this);
    TRowIterator EndI(Partitions[i].GetVal2(), this);
    while(RowI < EndI){
      if(ResType == atInt){
        TInt V1 = RowI.GetIntAttr(ArgColIdx1);
        TInt V2 = RowI.GetIntAttr(ArgColIdx2);
        if (op == aoAdd) { IntCols[ResColIdx][RowI.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { IntCols[ResColIdx][RowI.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { IntCols[ResColIdx][RowI.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { IntCols[ResColIdx][RowI.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { IntCols[ResColIdx][RowI.GetRowIdx()] = V1 % V2; }
        if (op == aoMin) { IntCols[ResColIdx][RowI.GetRowIdx()] = (V1 < V2) ? V1 : V2;}
        if (op == aoMax) { IntCols[ResColIdx][RowI.GetRowIdx()] = (V1 > V2) ? V1 : V2;}
      } else{
        TFlt V1 = (ArgType1 == atInt) ? (TFlt)RowI.GetIntAttr(ArgColIdx1) : RowI.GetFltAttr(ArgColIdx1);
        TFlt V2 = (ArgType2 == atInt) ? (TFlt)RowI.GetIntAttr(ArgColIdx2) : RowI.GetFltAttr(ArgColIdx2);
        if (op == aoAdd) { FltCols[ResColIdx][RowI.GetRowIdx()] = V1 + V2; }
        if (op == aoSub) { FltCols[ResColIdx][RowI.GetRowIdx()] = V1 - V2; }
        if (op == aoMul) { FltCols[ResColIdx][RowI.GetRowIdx()] = V1 * V2; }
        if (op == aoDiv) { FltCols[ResColIdx][RowI.GetRowIdx()] = V1 / V2; }
        if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
        if (op == aoMin) { FltCols[ResColIdx][RowI.GetRowIdx()] = (V1 < V2) ? V1 : V2;}
        if (op == aoMax) { FltCols[ResColIdx][RowI.GetRowIdx()] = (V1 > V2) ? V1 : V2;}
      }
      RowI++;
    }
  }
}
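The multi-threaded variant follows a common pattern: split the rows into more partitions than threads, then let an OpenMP loop hand partitions out dynamically so that each output element is written by exactly one thread. A minimal sketch of that pattern over plain vectors might look as follows. PartitionRanges and AddColumns are hypothetical helpers, dense indices stand in for SNAP's row iterators, and the half-open ranges are a choice made here rather than a statement about what GetPartitionRanges returns. If the code is compiled without OpenMP support the pragma is ignored and the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits [0, n) into at most numParts contiguous half-open ranges.
std::vector<std::pair<std::size_t, std::size_t>> PartitionRanges(std::size_t n, std::size_t numParts) {
  assert(numParts > 0);
  std::vector<std::pair<std::size_t, std::size_t>> parts;
  std::size_t chunk = (n + numParts - 1) / numParts;  // ceiling division
  for (std::size_t beg = 0; beg < n; beg += chunk) {
    std::size_t end = (beg + chunk < n) ? beg + chunk : n;
    parts.emplace_back(beg, end);
  }
  return parts;
}

// Element-wise a[i] + b[i], processing one partition per loop iteration.
std::vector<double> AddColumns(const std::vector<double>& a,
                               const std::vector<double>& b,
                               std::size_t numParts) {
  assert(a.size() == b.size());
  std::vector<double> out(a.size());
  const auto parts = PartitionRanges(a.size(), numParts);
  const int numRanges = static_cast<int>(parts.size());
  #pragma omp parallel for schedule(dynamic)
  for (int p = 0; p < numRanges; ++p) {
    for (std::size_t i = parts[p].first; i < parts[p].second; ++i) {
      out[i] = a[i] + b[i];  // each index belongs to exactly one partition
    }
  }
  return out;
}
```

Using several partitions per thread, as the listing does with CHUNKS_PER_THREAD, gives the dynamic scheduler room to balance load when some partitions take longer than others.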
void GetPartitionRanges(TIntPrV &Partitions, TInt NumPartitions) const
Partitions the table into NumPartitions and populate Partitions with the ranges.
Definition: table.cpp:1177
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
void TTable::ColGenericOpMP ( const TInt & ColIdx1,
const TInt & ColIdx2,
TAttrType ArgType,
const TFlt & Num,
TArithOp op,
TBool ShouldCast
)

Definition at line 5032 of file table.cpp.

{
  TIntPrV Partitions;
  GetPartitionRanges(Partitions, omp_get_max_threads()*CHUNKS_PER_THREAD);
  TInt PartitionSize = Partitions[0].GetVal2()-Partitions[0].GetVal1()+1;
  #pragma omp parallel for schedule(dynamic, CHUNKS_PER_THREAD)
  for (int i = 0; i < Partitions.Len(); i++){
    TRowIterator RowI(Partitions[i].GetVal1(), this);
    TRowIterator EndI(Partitions[i].GetVal2(), this);
    while(RowI < EndI){
      if ((ArgType == atInt) && !ShouldCast) {
        TInt CurVal = RowI.GetIntAttr(ColIdx1);
        TInt Val = static_cast<int>(Num);
        if (op == aoAdd) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal + Val; }
        if (op == aoSub) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal - Val; }
        if (op == aoMul) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal * Val; }
        if (op == aoDiv) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal / Val; }
        if (op == aoMod) { IntCols[ColIdx2][RowI.GetRowIdx()] = CurVal % Val; }
      } else {
        TFlt CurVal = (ArgType == atFlt) ? RowI.GetFltAttr(ColIdx1) : (TFlt) RowI.GetIntAttr(ColIdx1);
        if (op == aoAdd) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal + Num; }
        if (op == aoSub) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal - Num; }
        if (op == aoMul) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal * Num; }
        if (op == aoDiv) { FltCols[ColIdx2][RowI.GetRowIdx()] = CurVal / Num; }
        if (op == aoMod) { TExcept::Throw("Cannot find modulo for float columns"); }
      }
      RowI++;
    }
  }
}
void TTable::ColMax ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Computes the row-wise maximum of two columns. See TTable::ColGenericOp.

Definition at line 4840 of file table.cpp.

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoMax);
}
void TTable::ColMin ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Computes the row-wise minimum of two columns. See TTable::ColGenericOp.

Definition at line 4836 of file table.cpp.

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoMin);
}
void TTable::ColMod ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Performs columnwise modulus. See TTable::ColGenericOp.

Definition at line 4832 of file table.cpp.

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoMod);
}
void TTable::ColMod ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & ResultAttrName = "",
TBool AddToFirstTable = true
)

Performs columnwise modulus with column of given table.

Definition at line 4969 of file table.cpp.

{
  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoMod, AddToFirstTable);
}
void TTable::ColMod ( const TStr & Attr1,
const TFlt & Num,
const TStr & ResultAttrName = "",
const TBool floatCast = false
)

Computes each value of column Attr1 modulo Num.

Definition at line 5079 of file table.cpp.

{
  ColGenericOp(Attr1, Num, ResultAttrName, aoMod, floatCast);
}
void TTable::ColMul ( const TStr & Attr1,
const TStr & Attr2,
const TStr & ResultAttrName = ""
)

Performs columnwise multiplication. See TTable::ColGenericOp.

Definition at line 4824 of file table.cpp.

{
  ColGenericOp(Attr1, Attr2, ResultAttrName, aoMul);
}
void TTable::ColMul ( const TStr & Attr1,
TTable & Table,
const TStr & Attr2,
const TStr & ResultAttrName = "",
TBool AddToFirstTable = true
)

Performs columnwise multiplication with column of given table.

Definition at line 4959 of file table.cpp.

{
  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoMul, AddToFirstTable);
}
void TTable::ColMul ( const TStr & Attr1,
const TFlt & Num,
const TStr & ResultAttrName = "",
const TBool floatCast = false
)

Multiplies each value of column Attr1 by Num.

Definition at line 5071 of file table.cpp.

{
  ColGenericOp(Attr1, Num, ResultAttrName, aoMul, floatCast);
}
void TTable::ColSub ( const TStr Attr1,
const TStr Attr2,
const TStr ResultAttrName = "" 
)

Performs columnwise subtraction. See TTable::ColGenericOp.

Definition at line 4820 of file table.cpp.

4820  {
4821  ColGenericOp(Attr1, Attr2, ResultAttrName, aoSub);
4822 }
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752
void TTable::ColSub ( const TStr Attr1,
TTable Table,
const TStr Attr2,
const TStr ResAttr = "",
TBool  AddToFirstTable = true 
)

Performs columnwise subtraction with column of given table.

Definition at line 4954 of file table.cpp.

4955  {
4956  ColGenericOp(Attr1, Table, Attr2, ResultAttrName, aoSub, AddToFirstTable);
4957 }
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752
void TTable::ColSub ( const TStr Attr1,
const TFlt Num,
const TStr ResultAttrName = "",
const TBool  floatCast = false 
)

Performs subtraction of column values and given Num.

Definition at line 5067 of file table.cpp.

5067  {
5068  ColGenericOp(Attr1, Num, ResultAttrName, aoSub, floatCast);
5069 }
void ColGenericOp(const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op)
Performs columnwise arithmetic operation.
Definition: table.cpp:4752
TInt TTable::CompareKeyVal ( const TInt K1,
const TInt V1,
const TInt K2,
const TInt V2 
)
static protected

Definition at line 5297 of file table.cpp.

5297  {
5298  // if (K1 == K2) {
5299  // if (V1 < V2) { return -1; }
5300  // else if (V1 > V2) { return 1; }
5301  // else return 0;
5302  // }
5303  // if (K1 < K2) { return -1; }
5304  // else { return 1; }
5305 
5306  if (K1 == K2) { return V1 - V2; }
5307  else { return K1 - K2; }
5308 }
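The listing keeps two variants of the same (key, value) ordering: the active subtraction-based one and a commented-out branch-based one. The standalone sketch below puts them side by side on plain int. CompareKeyValSub, CompareKeyValBranch and Sign are hypothetical names. Both variants agree in sign, but the subtraction form returns an arbitrary magnitude and can overflow when the operands are far apart, while the branch form always returns -1, 0 or 1.

```cpp
#include <cassert>

// Subtraction-based comparison, mirroring the active code above.
// Only the sign of the result is meaningful.
int CompareKeyValSub(int K1, int V1, int K2, int V2) {
  if (K1 == K2) { return V1 - V2; }
  return K1 - K2;
}

// Branch-based comparison, following the commented-out variant above.
// Always returns -1, 0 or 1 and performs no arithmetic on the operands.
int CompareKeyValBranch(int K1, int V1, int K2, int V2) {
  if (K1 == K2) {
    if (V1 < V2) { return -1; }
    if (V1 > V2) { return 1; }
    return 0;
  }
  return (K1 < K2) ? -1 : 1;
}

// Reduces any comparison result to its sign.
int Sign(int X) { return (X > 0) - (X < 0); }
```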
TInt TTable::CompareRows ( TInt  R1,
TInt  R2,
const TAttrType &  CompareByType,
const TInt &  CompareByIndex,
TBool  Asc = true 
)
inline protected

Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).

Definition at line 3064 of file table.cpp.

3064  {
3065  //printf("comparing rows %d %d by %s\n", R1.Val, R2.Val, CompareBy.CStr());
3066  switch (CompareByType) {
3067  case atInt:{
3068  if (IntCols[CompareByIndex][R1] > IntCols[CompareByIndex][R2]) { return (Asc ? 1 : -1); }
3069  if (IntCols[CompareByIndex][R1] < IntCols[CompareByIndex][R2]) { return (Asc ? -1 : 1); }
3070  return 0;
3071  }
3072  case atFlt:{
3073  if (FltCols[CompareByIndex][R1] > FltCols[CompareByIndex][R2]) { return (Asc ? 1 : -1); }
3074  if (FltCols[CompareByIndex][R1] < FltCols[CompareByIndex][R2]) { return (Asc ? -1 : 1); }
3075  return 0;
3076  }
3077  case atStr:{
3078  TStr S1 = GetStrVal(CompareByIndex, R1);
3079  TStr S2 = GetStrVal(CompareByIndex, R2);
3080  int CmpRes = strcmp(S1.CStr(), S2.CStr());
3081  return (Asc ? CmpRes : -CmpRes);
3082  }
3083  }
3084  // code should not come here, added to remove a compiler warning
3085  return 0;
3086 }
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TStr GetStrVal(TInt ColIdx, TInt RowIdx) const
Gets the value in column with id ColIdx at row RowIdx.
Definition: table.h:626
char * CStr()
Definition: dt.h:476
TInt TTable::CompareRows ( TInt  R1,
TInt  R2,
const TVec< TAttrType > &  CompareByTypes,
const TIntV CompareByIndices,
TBool  Asc = true 
)
inline protected

Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).

Definition at line 3088 of file table.cpp.

3088  {
3089  for (TInt i = 0; i < CompareByTypes.Len(); i++) {
3090  TInt res = CompareRows(R1, R2, CompareByTypes[i], CompareByIndices[i], Asc);
3091  if (res != 0) { return res; }
3092  }
3093  return 0;
3094 }
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TInt CompareRows(TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true)
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strc...
Definition: table.cpp:3064
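The two CompareRows overloads together give a lexicographic, multi-column row ordering: the first column on which two rows differ decides the result. A minimal standalone sketch under simplifying assumptions follows. Column, CompareRowsOne and CompareRowsMany are hypothetical names, and every column is an integer vector, whereas the listings above also handle float and string columns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature column type: one value per row.
using Column = std::vector<int>;

// Compares rows R1 and R2 on a single column. The sign is flipped when
// Asc is false, as in the single-column overload above.
int CompareRowsOne(const Column& Col, std::size_t R1, std::size_t R2,
                   bool Asc = true) {
  if (Col[R1] > Col[R2]) { return Asc ? 1 : -1; }
  if (Col[R1] < Col[R2]) { return Asc ? -1 : 1; }
  return 0;
}

// Compares on several columns in priority order. The first column on
// which the rows differ decides; 0 means the rows tie on every column.
int CompareRowsMany(const std::vector<Column>& Cols,
                    const std::vector<std::size_t>& ByIdx,
                    std::size_t R1, std::size_t R2, bool Asc = true) {
  for (std::size_t Idx : ByIdx) {
    const int Res = CompareRowsOne(Cols[Idx], R1, R2, Asc);
    if (Res != 0) { return Res; }
  }
  return 0;
}
```

To use such a three-way result with std::sort, wrap it in a predicate that returns true when the result is negative.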
void TTable::ConcatTable ( const PTable T)
inline protected

Appends all rows of T to this table and recalculates indices.

Definition at line 683 of file table.h.

683 {AddTable(*T); Reindex(); }
void Reindex()
Reinitializes row ids.
Definition: table.cpp:1889
void AddTable(const TTable &T)
Adds all the rows of the input table. Allows duplicate rows (not a union).
Definition: table.cpp:3975
void TTable::Count ( const TStr CountColName,
const TStr Col 
)

Counts number of unique elements.

Counts the number of appearances of each distinct value in column CountColName and records the results in column Col.

Definition at line 1802 of file table.cpp.

1802  {
1803  TStrV GroupByAttrs;
1804  GroupByAttrs.Add(CountColName);
1805  Aggregate(GroupByAttrs, aaCount, "", Col);
1806 }
void Aggregate(const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true)
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Result are stored as new at...
Definition: table.cpp:1585
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
Definition: table.h:257
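Count is a group-by on one column followed by a count aggregate. The standalone sketch below models that on a plain vector. CountPerRow is a hypothetical name, and the sketch assumes the result column holds, for every row, the size of that row's group; the listings above do not confirm how Aggregate lays out its output.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// For each row, returns how many rows share that row's value in Col.
std::vector<int> CountPerRow(const std::vector<int>& Col) {
  // First pass: tally how often each distinct value occurs.
  std::unordered_map<int, int> Freq;
  for (int V : Col) { ++Freq[V]; }

  // Second pass: write the group size next to every row.
  std::vector<int> Counts;
  Counts.reserve(Col.size());
  for (int V : Col) { Counts.push_back(Freq[V]); }
  return Counts;
}
```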
void TTable::Defrag ( )

Releases memory of deleted rows, and defrags.

Also updates meta-data, as row indices have changed. Needs some liveness analysis of columns.

Definition at line 3311 of file table.cpp.

3311  {
3312  TInt FreeIndex = 0;
3313  TIntV Mapping; // Mapping[old_index] = new_index/invalid
3314 
3315  TInt IdColIdx = GetColIdx(IdColName);
3316 
3317  for (TInt i = 0; i < Next.Len(); i++) {
3318  if (Next[i] != TTable::Invalid) {
3319  // "first row" properly set beforehand
3320  if (FreeIndex == 0) {
3321  Assert (i == FirstValidRow);
3322  FirstValidRow = 0;
3323  }
3324 
3325  if (Next[i] != Last) {
3326  Next[FreeIndex] = FreeIndex + 1;
3327  Mapping.Add(FreeIndex);
3328  } else {
3329  Next[FreeIndex] = Last;
3330  LastValidRow = FreeIndex;
3331  Mapping.Add(Last);
3332  }
3333 
3334  RowIdMap.AddDat(IntCols[IdColIdx][i], FreeIndex);
3335 
3336  for (TInt j = 0; j < IntCols.Len(); j++) {
3337  IntCols[j][FreeIndex] = IntCols[j][i];
3338  }
3339  for (TInt j = 0; j < FltCols.Len(); j++) {
3340  FltCols[j][FreeIndex] = FltCols[j][i];
3341  }
3342  for (TInt j = 0; j < StrColMaps.Len(); j++) {
3343  StrColMaps[j][FreeIndex] = StrColMaps[j][i];
3344  }
3345 
3346  FreeIndex++;
3347  } else {
3348  NumRows--;
3349  Mapping.Add(TTable::Invalid);
3350  }
3351  }
3352 
3353  // should match, or bug somewhere
3355 }
TInt FirstValidRow
Physical index of first valid row.
Definition: table.h:553
TInt GetColIdx(const TStr &ColName) const
Gets index of column ColName among columns of the same type in the schema.
Definition: table.h:1013
static const TInt Last
Special value for Next vector entry - last row in table.
Definition: table.h:486
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
TStr IdColName
Name of the integer column that holds the permanent row ids.
Definition: table.h:565
TInt LastValidRow
Physical index of last valid row.
Definition: table.h:554
TVec< TIntV > IntCols
Data columns of integer attributes.
Definition: table.h:558
#define Assert(Cond)
Definition: bd.h:251
TVec< TIntV > StrColMaps
Data columns of integer mappings of string attributes.
Definition: table.h:560
TIntIntH RowIdMap
Mapping of permanent row ids to physical id.
Definition: table.h:566
TVec< TFltV > FltCols
Data columns of floating point attributes.
Definition: table.h:559
TIntV Next
A vector describing the logical order of the rows.
Definition: table.h:555
TInt NumRows
Number of rows in the table (valid and invalid).
Definition: table.h:551
static const TInt Invalid
Special value for Next vector entry - logically removed row.
Definition: table.h:487
TInt NumValidRows
Number of valid rows in the table (i.e. rows that were not logically removed).
Definition: table.h:552
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
TDat & AddDat(const TKey &Key)
Definition: hash.h:238
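The core of Defrag is a single compaction pass: rows marked Invalid in Next are skipped, surviving rows slide down to the lowest free physical index, and Next is rewritten so each row points at its physical successor. A minimal standalone sketch of that pass follows. MiniTable, DefragRows, kLast and kInvalid are hypothetical, and the sentinel values are assumed, since the numeric values of TTable::Last and TTable::Invalid are not shown above. Unlike the listing, the sketch has a single data column, maps the last row to its new index, and trims the vectors afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sentinel values for the Next vector.
const int kLast = -1;     // row is the last one in logical order
const int kInvalid = -2;  // row was logically removed

struct MiniTable {
  std::vector<int> Next;  // Next[i]: successor of row i, kLast or kInvalid
  std::vector<int> Col;   // a single data column
  int NumRows = 0;        // number of physical rows
};

// Moves valid rows to the front and relinks them in physical order.
// Returns Mapping[old index] = new index, or kInvalid for removed rows.
// Assumes valid rows are already linked in increasing physical order.
std::vector<int> DefragRows(MiniTable& T) {
  std::vector<int> Mapping;
  std::size_t Free = 0;  // next free physical slot; never exceeds i
  for (std::size_t i = 0; i < T.Next.size(); ++i) {
    if (T.Next[i] == kInvalid) {
      Mapping.push_back(kInvalid);
      continue;
    }
    // Read the old link before slot Free (possibly equal to i) is overwritten.
    const bool IsLast = (T.Next[i] == kLast);
    T.Col[Free] = T.Col[i];
    T.Next[Free] = IsLast ? kLast : static_cast<int>(Free) + 1;
    Mapping.push_back(static_cast<int>(Free));
    ++Free;
  }
  // Drop the now unused tail.
  T.Next.resize(Free);
  T.Col.resize(Free);
  T.NumRows = static_cast<int>(Free);
  return Mapping;
}
```

For a table with logical order 0, 2, 3 and rows 1 and 4 removed, the pass leaves three rows whose Next entries are 1, 2 and kLast.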
void TTable::DelColType ( const TStr ColName)
inline protected

Removes the column with name ColName from the ColTypeMap.

Definition at line 661 of file table.h.

661  {
662  TStr NColName = NormalizeColName(ColName);
663  ColTypeMap.DelKey(NColName);
664  }
THash< TStr, TPair< TAttrType, TInt > > ColTypeMap
Definition: table.h:564
void DelKey(const TKey &Key)
Definition: hash.h:404
static TStr NormalizeColName(const TStr &ColName)
Adds suffix to column name if it doesn't exist.
Definition: table.h:530
TStr TTable::DenormalizeColName ( const TStr ColName) const
protected

Removes suffix from column name if it exists.

Definition at line 4648 of file table.cpp.

4648  {
4649  TStr DColName = ColName;
4650  if (DColName.Len() == 0) { return DColName; }
4651  if (DColName.GetCh(0) == '_') { return DColName; }
4652  if (DColName.GetCh(DColName.Len()-2) == '-') {
4653  DColName = DColName.GetSubStr(0,DColName.Len()-3);
4654  }
4655  TInt Conflicts = 0;
4656  for (TInt i = 0; i < Sch.Len(); i++) {
4657  if (DColName == Sch[i].Val1.GetSubStr(0,