SNAP Library 6.0, Developer Reference  2020-12-09 16:24:20
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
TAGMFast Class Reference

Community detection with AGM. Sparse AGM-fast with coordinate ascent. More...

#include <agmfast.h>

Collaboration diagram for TAGMFast:

Public Member Functions

 TAGMFast (const PUNGraph &GraphPt, const int &InitComs, const int RndSeed=0)
 
void SetGraph (const PUNGraph &GraphPt)
 
void SetRegCoef (const double _RegCoef)
 
double GetRegCoef ()
 
void RandomInit (const int InitComs)
 
void NeighborComInit (const int InitComs)
 
void SetCmtyVV (const TVec< TIntV > &CmtyVV)
 
double Likelihood (const bool DoParallel=false)
 
double LikelihoodForRow (const int UID)
 
double LikelihoodForRow (const int UID, const TIntFltH &FU)
 
int MLENewton (const double &Thres, const int &MaxIter, const TStr &PlotNm=TStr())
 Newton method: DEPRECATED. More...
 
void GradientForRow (const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
 
double GradientForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
double HessianForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
double LikelihoodForOneVar (const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
 
void GetCmtyVV (TVec< TIntV > &CmtyVV)
 
void GetCmtyVV (TVec< TIntV > &CmtyVV, const double Thres, const int MinSz=3)
 extract community affiliation from F_uc More...
 
int FindComsByCV (TIntV &ComsV, const double HOFrac=0.2, const int NumThreads=20, const TStr &PlotLFNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
 
int FindComsByCV (const int NumThreads, const int MaxComs, const int MinComs, const int DivComs, const TStr &OutFNm, const double StepAlpha=0.3, const double StepBeta=0.3)
 estimate number of communities using cross validation More...
 
double LikelihoodHoldOut (const bool DoParallel=false)
 
double GetStepSizeByLineSearch (const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
 
int MLEGradAscent (const double &Thres, const int &MaxIter, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
 
int MLEGradAscentParallel (const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
 
int MLEGradAscentParallel (const double &Thres, const int &MaxIter, const int ChunkNum, const TStr &PlotNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
 
void Save (TSOut &SOut)
 
void Load (TSIn &SIn, const int &RndSeed=0)
 
double GetCom (const int &NID, const int &CID)
 
void AddCom (const int &NID, const int &CID, const double &Val)
 
void DelCom (const int &NID, const int &CID)
 
double DotProduct (const TIntFltH &UV, const TIntFltH &VV)
 
double DotProduct (const int &UID, const int &VID)
 
double Prediction (const TIntFltH &FU, const TIntFltH &FV)
 
double Prediction (const int &UID, const int &VID)
 
double Sum (const TIntFltH &UV)
 
double Norm2 (const TIntFltH &UV)
 

Public Attributes

TVec< TIntSetHOVIDSV
 
TFlt MinVal
 
TFlt MaxVal
 
TFlt NegWgt
 
TFlt PNoCom
 
TBool DoParallel
 

Private Attributes

PUNGraph G
 
TVec< TIntFltHF
 
TRnd Rnd
 
TIntV NIDV
 
TFlt RegCoef
 
TFltV SumFV
 
TBool NodesOk
 
TInt NumComs
 

Detailed Description

Community detection with AGM. Sparse AGM-fast with coordinate ascent.

Definition at line 7 of file agmfast.h.

Constructor & Destructor Documentation

TAGMFast::TAGMFast ( const PUNGraph GraphPt,
const int &  InitComs,
const int  RndSeed = 0 
)
inline

Definition at line 25 of file agmfast.h.

References RandomInit(), and SetGraph().

25  : Rnd(RndSeed), RegCoef(0),
26  NodesOk(true), MinVal(0.0), MaxVal(1000.0), NegWgt(1.0) { SetGraph(GraphPt); RandomInit(InitComs); }
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
TBool NodesOk
Definition: agmfast.h:15
void SetGraph(const PUNGraph &GraphPt)
Definition: agmfast.cpp:146
TFlt RegCoef
Definition: agmfast.h:13
TRnd Rnd
Definition: agmfast.h:11
TFlt MaxVal
Definition: agmfast.h:20
void RandomInit(const int InitComs)
Definition: agmfast.cpp:38

Here is the call graph for this function:

Member Function Documentation

void TAGMFast::AddCom ( const int &  NID,
const int &  CID,
const double &  Val 
)
inline

Definition at line 64 of file agmfast.h.

References TVec< TVal, TSizeTy >::GetDat().

Referenced by MLEGradAscent(), MLENewton(), NeighborComInit(), RandomInit(), and SetCmtyVV().

64  {
65  if (F[NID].IsKey(CID)) {
66  SumFV[CID] -= F[NID].GetDat(CID);
67  }
68  F[NID].AddDat(CID) = Val;
69  SumFV[CID] += Val;
70  }
TVec< TIntFltH > F
Definition: agmfast.h:10
TFltV SumFV
Definition: agmfast.h:14

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::DelCom ( const int &  NID,
const int &  CID 
)
inline

Definition at line 72 of file agmfast.h.

References TVec< TVal, TSizeTy >::GetDat().

Referenced by MLEGradAscent(), and MLENewton().

72  {
73  if (F[NID].IsKey(CID)) {
74  SumFV[CID] -= F[NID].GetDat(CID);
75  F[NID].DelKey(CID);
76  }
77  }
TVec< TIntFltH > F
Definition: agmfast.h:10
TFltV SumFV
Definition: agmfast.h:14

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::DotProduct ( const TIntFltH UV,
const TIntFltH VV 
)
inline

Definition at line 78 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), THash< TKey, TDat, THashFunc >::EndI(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::IsKey(), and THash< TKey, TDat, THashFunc >::Len().

Referenced by DotProduct(), GetStepSizeByLineSearch(), LikelihoodForRow(), MLENewton(), and Prediction().

78  {
79  double DP = 0;
80  if (UV.Len() > VV.Len()) {
81  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
82  if (VV.IsKey(HI.GetKey())) {
83  DP += VV.GetDat(HI.GetKey()) * HI.GetDat();
84  }
85  }
86  } else {
87  for (TIntFltH::TIter HI = VV.BegI(); HI < VV.EndI(); HI++) {
88  if (UV.IsKey(HI.GetKey())) {
89  DP += UV.GetDat(HI.GetKey()) * HI.GetDat();
90  }
91  }
92  }
93  return DP;
94  }
TIter BegI() const
Definition: hash.h:213
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
TIter EndI() const
Definition: hash.h:218
bool IsKey(const TKey &Key) const
Definition: hash.h:258
int Len() const
Definition: hash.h:228

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::DotProduct ( const int &  UID,
const int &  VID 
)
inline

Definition at line 95 of file agmfast.h.

References DotProduct().

95  {
96  return DotProduct(F[UID], F[VID]);
97  }
TVec< TIntFltH > F
Definition: agmfast.h:10
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78

Here is the call graph for this function:

int TAGMFast::FindComsByCV ( TIntV ComsV,
const double  HOFrac = 0.2,
const int  NumThreads = 20,
const TStr PlotLFNm = TStr(),
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 554 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), TUNGraph::BegEI(), TStr::Empty(), TUNGraph::EndEI(), G, TVec< TVal, TSizeTy >::Gen(), TUNGraph::GetEdges(), TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TRnd::GetUniDevInt(), HOVIDSV, TVec< TVal, TSizeTy >::Last(), TVec< TVal, TSizeTy >::Len(), Likelihood(), LikelihoodHoldOut(), MLEGradAscent(), MLEGradAscentParallel(), TFlt::Mn, NeighborComInit(), TGnuPlot::PlotValV(), RandomInit(), Rnd, and TMath::Round().

Referenced by FindComsByCV().

554  {
555  if (ComsV.Len() == 0) {
556  int MaxComs = G->GetNodes() / 5;
557  ComsV.Add(2);
558  while(ComsV.Last() < MaxComs) { ComsV.Add(ComsV.Last() * 2); }
559  }
560  TIntPrV EdgeV(G->GetEdges(), 0);
561  for (TUNGraph::TEdgeI EI = G->BegEI(); EI < G->EndEI(); EI++) {
562  EdgeV.Add(TIntPr(EI.GetSrcNId(), EI.GetDstNId()));
563  }
564  EdgeV.Shuffle(Rnd);
565  int MaxIterCV = 3;
566 
567  TVec<TVec<TIntSet> > HoldOutSets(MaxIterCV);
568  if (EdgeV.Len() > 50) { //if edges are many enough, use CV
569  printf("generating hold out set\n");
570  TIntV NIdV1, NIdV2;
571  G->GetNIdV(NIdV1);
572  G->GetNIdV(NIdV2);
573  for (int IterCV = 0; IterCV < MaxIterCV; IterCV++) {
574  // generate holdout sets
575  HoldOutSets[IterCV].Gen(G->GetNodes());
576  const int HOTotal = int(HOFrac * G->GetNodes() * (G->GetNodes() - 1) / 2.0);
577  int HOCnt = 0;
578  int HOEdges = (int) TMath::Round(HOFrac * G->GetEdges());
579  printf("holding out %d edges...\n", HOEdges);
580  for (int he = 0; he < (int) HOEdges; he++) {
581  HoldOutSets[IterCV][EdgeV[he].Val1].AddKey(EdgeV[he].Val2);
582  HoldOutSets[IterCV][EdgeV[he].Val2].AddKey(EdgeV[he].Val1);
583  HOCnt++;
584  }
585  printf("%d Edges hold out\n", HOCnt);
586  while(HOCnt++ < HOTotal) {
587  int SrcNID = Rnd.GetUniDevInt(G->GetNodes());
588  int DstNID = Rnd.GetUniDevInt(G->GetNodes());
589  HoldOutSets[IterCV][SrcNID].AddKey(DstNID);
590  HoldOutSets[IterCV][DstNID].AddKey(SrcNID);
591  }
592  }
593  printf("hold out set generated\n");
594  }
595 
596  TFltV HOLV(ComsV.Len());
597  TIntFltPrV ComsLV;
598  for (int c = 0; c < ComsV.Len(); c++) {
599  const int Coms = ComsV[c];
600  printf("Try number of Coms:%d\n", Coms);
601  NeighborComInit(Coms);
602  printf("Initialized\n");
603 
604  if (EdgeV.Len() > 50) { //if edges are many enough, use CV
605  for (int IterCV = 0; IterCV < MaxIterCV; IterCV++) {
606  HOVIDSV = HoldOutSets[IterCV];
607 
608  if (NumThreads == 1) {
609  printf("MLE without parallelization begins\n");
610  MLEGradAscent(0.05, 10 * G->GetNodes(), "", StepAlpha, StepBeta);
611  } else {
612  printf("MLE with parallelization begins\n");
613  MLEGradAscentParallel(0.05, 100, NumThreads, "", StepAlpha, StepBeta);
614  }
615  double HOL = LikelihoodHoldOut();
616  HOL = HOL < 0? HOL: TFlt::Mn;
617  HOLV[c] += HOL;
618  }
619  }
620  else {
621  HOVIDSV.Gen(G->GetNodes());
622  MLEGradAscent(0.0001, 100 * G->GetNodes(), "");
623  double BIC = 2 * Likelihood() - (double) G->GetNodes() * Coms * 2.0 * log ( (double) G->GetNodes());
624  HOLV[c] = BIC;
625  }
626  }
627  int EstComs = 2;
628  double MaxL = TFlt::Mn;
629  printf("\n");
630  for (int c = 0; c < ComsV.Len(); c++) {
631  ComsLV.Add(TIntFltPr(ComsV[c].Val, HOLV[c].Val));
632  printf("%d(%f)\t", ComsV[c].Val, HOLV[c].Val);
633  if (MaxL < HOLV[c]) {
634  MaxL = HOLV[c];
635  EstComs = ComsV[c];
636  }
637  }
638  printf("\n");
639  RandomInit(EstComs);
640  HOVIDSV.Gen(G->GetNodes());
641  if (! PlotLFNm.Empty()) {
642  TGnuPlot::PlotValV(ComsLV, PlotLFNm, "hold-out likelihood", "communities", "likelihood");
643  }
644  return EstComs;
645 }
Edge iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:121
TPair< TInt, TInt > TIntPr
Definition: ds.h:83
PUNGraph G
Definition: agmfast.h:9
TEdgeI EndEI() const
Returns an iterator referring to the past-the-end edge in the graph.
Definition: graph.h:280
int MLEGradAscentParallel(const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:761
double LikelihoodHoldOut(const bool DoParallel=false)
Definition: agmfast.cpp:647
void NeighborComInit(const int InitComs)
Definition: agmfast.cpp:60
void GetNIdV(TIntV &NIdV) const
Gets a vector IDs of all nodes in the graph.
Definition: graph.cpp:153
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:82
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
int MLEGradAscent(const double &Thres, const int &MaxIter, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:690
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
static double Round(const double &Val)
Definition: xmath.h:16
TEdgeI BegEI() const
Returns an iterator referring to the first edge in the graph.
Definition: graph.h:278
const TVal & Last() const
Returns a reference to the last element of the vector.
Definition: ds.h:579
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:491
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39
void RandomInit(const int InitComs)
Definition: agmfast.cpp:38
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
static void PlotValV(const TVec< TVal1 > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:398
static const double Mn
Definition: dt.h:1390

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::FindComsByCV ( const int  NumThreads,
const int  MaxComs,
const int  MinComs,
const int  DivComs,
const TStr OutFNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.3 
)

estimate number of communities using cross validation

Definition at line 541 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), FindComsByCV(), TVec< TVal, TSizeTy >::Last(), TVec< TVal, TSizeTy >::Len(), TMath::Log(), and TInt::Val.

541  {
542  double ComsGap = exp(TMath::Log((double) MaxComs / (double) MinComs) / (double) DivComs);
543  TIntV ComsV;
544  ComsV.Add(MinComs);
545  while (ComsV.Len() < DivComs) {
546  int NewComs = int(ComsV.Last() * ComsGap);
547  if (NewComs == ComsV.Last().Val) { NewComs++; }
548  ComsV.Add(NewComs);
549  }
550  if (ComsV.Last() < MaxComs) { ComsV.Add(MaxComs); }
551  return FindComsByCV(ComsV, 0.1, NumThreads, OutFNm + ".CV.likelihood", StepAlpha, StepBeta);
552 }
int Val
Definition: dt.h:1139
int FindComsByCV(TIntV &ComsV, const double HOFrac=0.2, const int NumThreads=20, const TStr &PlotLFNm=TStr(), const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:554
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
const TVal & Last() const
Returns a reference to the last element of the vector.
Definition: ds.h:579
static double Log(const double &Val)
Definition: xmath.h:14
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602

Here is the call graph for this function:

void TAGMFast::GetCmtyVV ( TVec< TIntV > &  CmtyVV)

Definition at line 508 of file agmfast.cpp.

References G, TUNGraph::GetEdges(), and TUNGraph::GetNodes().

508  {
509  GetCmtyVV(CmtyVV, sqrt(2.0 * (double) G->GetEdges() / G->GetNodes() / G->GetNodes()), 3);
510 }
PUNGraph G
Definition: agmfast.h:9
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:82
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
void GetCmtyVV(TVec< TIntV > &CmtyVV)
Definition: agmfast.cpp:508

Here is the call graph for this function:

void TAGMFast::GetCmtyVV ( TVec< TIntV > &  CmtyVV,
const double  Thres,
const int  MinSz = 3 
)

extract community affiliation from F_uc

Definition at line 513 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::AddDat(), F, TVec< TVal, TSizeTy >::Gen(), GetCom(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), IAssert, TVec< TVal, TSizeTy >::Len(), NIDV, NodesOk, NumComs, THash< TKey, TDat, THashFunc >::SortByDat(), and SumFV.

513  {
514  CmtyVV.Gen(NumComs, 0);
515  TIntFltH CIDSumFH(NumComs);
516  for (int c = 0; c < SumFV.Len(); c++) {
517  CIDSumFH.AddDat(c, SumFV[c]);
518  }
519  CIDSumFH.SortByDat(false);
520  for (int c = 0; c < NumComs; c++) {
521  int CID = CIDSumFH.GetKey(c);
522  TIntFltH NIDFucH(F.Len() / 10);
523  TIntV CmtyV;
524  IAssert(SumFV[CID] == CIDSumFH.GetDat(CID));
525  if (SumFV[CID] < Thres) { continue; }
526  for (int u = 0; u < F.Len(); u++) {
527  int NID = u;
528  if (! NodesOk) { NID = NIDV[u]; }
529  if (GetCom(u, CID) >= Thres) { NIDFucH.AddDat(NID, GetCom(u, CID)); }
530  }
531  NIDFucH.SortByDat(false);
532  NIDFucH.GetKeyV(CmtyV);
533  if (CmtyV.Len() >= MinSz) { CmtyVV.Add(CmtyV); }
534  }
535  if ( NumComs != CmtyVV.Len()) {
536  printf("Community vector generated. %d communities are ommitted\n", NumComs.Val - CmtyVV.Len());
537  }
538 }
#define IAssert(Cond)
Definition: bd.h:262
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TBool NodesOk
Definition: agmfast.h:15
TVec< TIntFltH > F
Definition: agmfast.h:10
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:838
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TIntV NIDV
Definition: agmfast.h:12
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602

Here is the call graph for this function:

double TAGMFast::GetCom ( const int &  NID,
const int &  CID 
)
inline

Definition at line 57 of file agmfast.h.

References TVec< TVal, TSizeTy >::GetDat().

Referenced by GetCmtyVV(), GetStepSizeByLineSearch(), GradientForOneVar(), GradientForRow(), LikelihoodForOneVar(), LikelihoodForRow(), MLEGradAscent(), and MLENewton().

57  {
58  if (F[NID].IsKey(CID)) {
59  return F[NID].GetDat(CID);
60  } else {
61  return 0.0;
62  }
63  }
TVec< TIntFltH > F
Definition: agmfast.h:10

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::GetRegCoef ( )
inline

Definition at line 29 of file agmfast.h.

References RegCoef.

29 { return RegCoef; }
TFlt RegCoef
Definition: agmfast.h:13
double TAGMFast::GetStepSizeByLineSearch ( const int  UID,
const TIntFltH DeltaV,
const TIntFltH GradV,
const double &  Alpha,
const double &  Beta,
const int  MaxIter = 10 
)

Definition at line 665 of file agmfast.cpp.

References DotProduct(), GetCom(), THash< TKey, TDat, THashFunc >::GetDat(), THash< TKey, TDat, THashFunc >::GetKey(), THash< TKey, TDat, THashFunc >::Len(), LikelihoodForRow(), MaxVal, and MinVal.

Referenced by MLEGradAscent(), and MLEGradAscentParallel().

665  {
666  double StepSize = 1.0;
667  double InitLikelihood = LikelihoodForRow(UID);
668  TIntFltH NewVarV(DeltaV.Len());
669  for(int iter = 0; iter < MaxIter; iter++) {
670  for (int i = 0; i < DeltaV.Len(); i++){
671  int CID = DeltaV.GetKey(i);
672  double NewVal = GetCom(UID, CID) + StepSize * DeltaV.GetDat(CID);
673  if (NewVal < MinVal) { NewVal = MinVal; }
674  if (NewVal > MaxVal) { NewVal = MaxVal; }
675  NewVarV.AddDat(CID, NewVal);
676  }
677  if (LikelihoodForRow(UID, NewVarV) < InitLikelihood + Alpha * StepSize * DotProduct(GradV, DeltaV)) {
678  StepSize *= Beta;
679  } else {
680  break;
681  }
682  if (iter == MaxIter - 1) {
683  StepSize = 0.0;
684  break;
685  }
686  }
687  return StepSize;
688 }
TFlt MinVal
Definition: agmfast.h:19
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193
TFlt MaxVal
Definition: agmfast.h:20
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78
int Len() const
Definition: hash.h:228
const TKey & GetKey(const int &KeyId) const
Definition: hash.h:252

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::GradientForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 339 of file agmfast.cpp.

References F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, NegWgt, RegCoef, and SumFV.

Referenced by MLENewton().

339  {
340  TUNGraph::TNodeI UI = G->GetNI(UID);
341  double Grad = 0.0, PNoEdge;
342  int VID = 0;
343  for (int e = 0; e < UI.GetDeg(); e++) {
344  VID = UI.GetNbrNId(e);
345  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
346  if (! F[VID].IsKey(CID)) { continue; }
347  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
348  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
349  //PNoEdge = PNoEdge >= 1.0 - PNoCom? 1 - PNoCom: PNoEdge;
350  Grad += ((PNoEdge * F[VID].GetDat(CID)) / (1.0 - PNoEdge) + NegWgt * F[VID].GetDat(CID));
351  }
352  Grad -= NegWgt * (SumFV[CID] - GetCom(UID, CID));
353  //add regularization
354  if (RegCoef > 0.0) { //L1
355  Grad -= RegCoef;
356  }
357  if (RegCoef < 0.0) { //L2
358  Grad += 2 * RegCoef * Val;
359  }
360 
361  return Grad;
362 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::GradientForRow ( const int  UID,
TIntFltH GradU,
const TIntSet CIDSet 
)

Definition at line 250 of file agmfast.cpp.

References THash< TKey, TDat, THashFunc >::AddDat(), DoParallel, G, THash< TKey, TDat, THashFunc >::Gen(), TVec< TVal, TSizeTy >::Gen(), GetCom(), TUNGraph::TNodeI::GetDeg(), THashSet< TKey, THashFunc >::GetKey(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), THashSet< TKey, THashFunc >::Len(), NegWgt, Prediction(), RegCoef, and SumFV.

Referenced by MLEGradAscent(), and MLEGradAscentParallel().

250  {
251  GradU.Gen(CIDSet.Len());
252 
253  TFltV HOSumFV; //adjust for Fv of v hold out
254  if (HOVIDSV[UID].Len() > 0) {
255  HOSumFV.Gen(SumFV.Len());
256 
257  for (int e = 0; e < HOVIDSV[UID].Len(); e++) {
258  for (int c = 0; c < SumFV.Len(); c++) {
259  HOSumFV[c] += GetCom(HOVIDSV[UID][e], c);
260  }
261  }
262  }
263 
264  TUNGraph::TNodeI NI = G->GetNI(UID);
265  int Deg = NI.GetDeg();
266  TFltV PredV(Deg), GradV(CIDSet.Len());
267  TIntV CIDV(CIDSet.Len());
268  if (DoParallel && Deg + CIDSet.Len() > 10) {
269 #pragma omp parallel for schedule(static, 1)
270  for (int e = 0; e < Deg; e++) {
271  if (NI.GetNbrNId(e) == UID) { continue; }
272  if (HOVIDSV[UID].IsKey(NI.GetNbrNId(e))) { continue; }
273  PredV[e] = Prediction(UID, NI.GetNbrNId(e));
274  }
275 
276 #pragma omp parallel for schedule(static, 1)
277  for (int c = 0; c < CIDSet.Len(); c++) {
278  int CID = CIDSet.GetKey(c);
279  double Val = 0.0;
280  for (int e = 0; e < Deg; e++) {
281  int VID = NI.GetNbrNId(e);
282  if (VID == UID) { continue; }
283  if (HOVIDSV[UID].IsKey(VID)) { continue; }
284  Val += PredV[e] * GetCom(VID, CID) / (1.0 - PredV[e]) + NegWgt * GetCom(VID, CID);
285  }
286  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[CID].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
287  Val -= NegWgt * (SumFV[CID] - HOSum - GetCom(UID, CID));
288  CIDV[c] = CID;
289  GradV[c] = Val;
290  }
291  }
292  else {
293  for (int e = 0; e < Deg; e++) {
294  if (NI.GetNbrNId(e) == UID) { continue; }
295  if (HOVIDSV[UID].IsKey(NI.GetNbrNId(e))) { continue; }
296  PredV[e] = Prediction(UID, NI.GetNbrNId(e));
297  }
298 
299  for (int c = 0; c < CIDSet.Len(); c++) {
300  int CID = CIDSet.GetKey(c);
301  double Val = 0.0;
302  for (int e = 0; e < Deg; e++) {
303  int VID = NI.GetNbrNId(e);
304  if (VID == UID) { continue; }
305  if (HOVIDSV[UID].IsKey(VID)) { continue; }
306  Val += PredV[e] * GetCom(VID, CID) / (1.0 - PredV[e]) + NegWgt * GetCom(VID, CID);
307  }
308  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[CID].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
309  Val -= NegWgt * (SumFV[CID] - HOSum - GetCom(UID, CID));
310  CIDV[c] = CID;
311  GradV[c] = Val;
312  }
313  }
314  //add regularization
315  if (RegCoef > 0.0) { //L1
316  for (int c = 0; c < GradV.Len(); c++) {
317  GradV[c] -= RegCoef;
318  }
319  }
320  if (RegCoef < 0.0) { //L2
321  for (int c = 0; c < GradV.Len(); c++) {
322  GradV[c] += 2 * RegCoef * GetCom(UID, CIDV[c]);
323  }
324  }
325 
326 
327  for (int c = 0; c < GradV.Len(); c++) {
328  if (GetCom(UID, CIDV[c]) == 0.0 && GradV[c] < 0.0) { continue; }
329  if (fabs(GradV[c]) < 0.0001) { continue; }
330  GradU.AddDat(CIDV[c], GradV[c]);
331  }
332  for (int c = 0; c < GradU.Len(); c++) {
333  if (GradU[c] >= 10) { GradU[c] = 10; }
334  if (GradU[c] <= -10) { GradU[c] = -10; }
335  IAssert(GradU[c] >= -10);
336  }
337 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
const TKey & GetKey(const int &KeyId) const
Definition: shash.h:1141
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
void Gen(const int &ExpectVals)
Definition: hash.h:222
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TFltV SumFV
Definition: agmfast.h:14
int Len() const
Definition: shash.h:1121
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
int Len() const
Definition: hash.h:228
TDat & AddDat(const TKey &Key)
Definition: hash.h:238

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::HessianForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 394 of file agmfast.cpp.

References F, G, TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, and RegCoef.

Referenced by MLENewton().

394  {
395  TUNGraph::TNodeI UI = G->GetNI(UID);
396  double H = 0.0, PNoEdge;
397  int VID = 0;
398  for (int e = 0; e < UI.GetDeg(); e++) {
399  VID = UI.GetNbrNId(e);
400  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
401  if (! F[VID].IsKey(CID)) { continue; }
402  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
403  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
404  //PNoEdge = PNoEdge == 1.0? 1 - PNoCom: PNoEdge;
405  H += (- PNoEdge * F[VID].GetDat(CID) * F[VID].GetDat(CID)) / (1.0 - PNoEdge) / (1.0 - PNoEdge);
406  }
407  //add regularization
408  if (RegCoef < 0.0) { //L2
409  H += 2 * RegCoef;
410  }
411  IAssert (H <= 0.0);
412  return H;
413 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Likelihood ( const bool  DoParallel = false)

Definition at line 173 of file agmfast.cpp.

References F, and LikelihoodForRow().

Referenced by FindComsByCV(), MLEGradAscent(), MLEGradAscentParallel(), and MLENewton().

173  {
174  TExeTm ExeTm;
175  double L = 0.0;
176  if (_DoParallel) {
177  #pragma omp parallel for
178  for (int u = 0; u < F.Len(); u++) {
179  double LU = LikelihoodForRow(u);
180  #pragma omp atomic
181  L += LU;
182  }
183  }
184  else {
185  for (int u = 0; u < F.Len(); u++) {
186  double LU = LikelihoodForRow(u);
187  L += LU;
188  }
189  }
190  return L;
191 }
Definition: tm.h:355
TVec< TIntFltH > F
Definition: agmfast.h:10
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::LikelihoodForOneVar ( const TFltV AlphaKV,
const int  UID,
const int  CID,
const double &  Val 
)

Definition at line 364 of file agmfast.cpp.

References F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, IAssert, NegWgt, RegCoef, and SumFV.

364  {
365  TUNGraph::TNodeI UI = G->GetNI(UID);
366  double L = 0.0, PNoEdge;
367  int VID = 0;
368  for (int e = 0; e < UI.GetDeg(); e++) {
369  VID = UI.GetNbrNId(e);
370  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
371  if (! F[VID].IsKey(CID)) {
372  PNoEdge = AlphaKV[e];
373  } else {
374  PNoEdge = AlphaKV[e] * exp (- F[VID].GetDat(CID) * Val);
375  }
376  IAssert(PNoEdge <= 1.0 && PNoEdge >= 0.0);
377  //PNoEdge = PNoEdge >= 1.0 - PNoCom? 1 - PNoCom: PNoEdge;
378  L += log(1.0 - PNoEdge) + NegWgt * GetCom(VID, CID) * Val;
379 
380  // += ((PNoEdge * F[VID].GetDat(CID)) / (1.0 - PNoEdge) + NegWgt * F[VID].GetDat(CID));
381  }
382  L -= NegWgt * (SumFV[CID] - GetCom(UID, CID)) * Val;
383  //add regularization
384  if (RegCoef > 0.0) { //L1
385  L -= RegCoef * Val;
386  }
387  if (RegCoef < 0.0) { //L2
388  L += RegCoef * Val * Val;
389  }
390 
391  return L;
392 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111

Here is the call graph for this function:

double TAGMFast::LikelihoodForRow ( const int  UID)

Definition at line 193 of file agmfast.cpp.

References F.

Referenced by GetStepSizeByLineSearch(), and Likelihood().

193  {
194  return LikelihoodForRow(UID, F[UID]);
195 }
TVec< TIntFltH > F
Definition: agmfast.h:10
double LikelihoodForRow(const int UID)
Definition: agmfast.cpp:193

Here is the caller graph for this function:

double TAGMFast::LikelihoodForRow ( const int  UID,
const TIntFltH FU 
)

Definition at line 198 of file agmfast.cpp.

References THash< TKey, TDat, THashFunc >::BegI(), DoParallel, DotProduct(), THash< TKey, TDat, THashFunc >::EndI(), F, G, TVec< TVal, TSizeTy >::Gen(), GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), HOVIDSV, TVec< TVal, TSizeTy >::Len(), NegWgt, Norm2(), Prediction(), RegCoef, Sum(), and SumFV.

198  {
199  double L = 0.0;
200  TFltV HOSumFV; //adjust for Fv of v hold out
201  if (HOVIDSV[UID].Len() > 0) {
202  HOSumFV.Gen(SumFV.Len());
203 
204  for (int e = 0; e < HOVIDSV[UID].Len(); e++) {
205  for (int c = 0; c < SumFV.Len(); c++) {
206  HOSumFV[c] += GetCom(HOVIDSV[UID][e], c);
207  }
208  }
209  }
210 
211  TUNGraph::TNodeI NI = G->GetNI(UID);
212  if (DoParallel && NI.GetDeg() > 10) {
213 #pragma omp parallel for schedule(static, 1)
214  for (int e = 0; e < NI.GetDeg(); e++) {
215  int v = NI.GetNbrNId(e);
216  if (v == UID) { continue; }
217  if (HOVIDSV[UID].IsKey(v)) { continue; }
218  double LU = log (1.0 - Prediction(FU, F[v])) + NegWgt * DotProduct(FU, F[v]);
219 #pragma omp atomic
220  L += LU;
221  }
222  for (TIntFltH::TIter HI = FU.BegI(); HI < FU.EndI(); HI++) {
223  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[HI.GetKey()].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
224  double LU = NegWgt * (SumFV[HI.GetKey()] - HOSum - GetCom(UID, HI.GetKey())) * HI.GetDat();
225  L -= LU;
226  }
227  } else {
228  for (int e = 0; e < NI.GetDeg(); e++) {
229  int v = NI.GetNbrNId(e);
230  if (v == UID) { continue; }
231  if (HOVIDSV[UID].IsKey(v)) { continue; }
232  L += log (1.0 - Prediction(FU, F[v])) + NegWgt * DotProduct(FU, F[v]);
233  }
234  for (TIntFltH::TIter HI = FU.BegI(); HI < FU.EndI(); HI++) {
235  double HOSum = HOVIDSV[UID].Len() > 0? HOSumFV[HI.GetKey()].Val: 0.0;//subtract Hold out pairs only if hold out pairs exist
236  L -= NegWgt * (SumFV[HI.GetKey()] - HOSum - GetCom(UID, HI.GetKey())) * HI.GetDat();
237  }
238  }
239  //add regularization
240  if (RegCoef > 0.0) { //L1
241  L -= RegCoef * Sum(FU);
242  }
243  if (RegCoef < 0.0) { //L2
244  L += RegCoef * Norm2(FU);
245  }
246 
247  return L;
248 }
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TIter BegI() const
Definition: hash.h:213
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
TIter EndI() const
Definition: hash.h:218
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
double Sum(const TIntFltH &UV)
Definition: agmfast.h:106
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78

Here is the call graph for this function:

double TAGMFast::LikelihoodHoldOut ( const bool  DoParallel = false)

Definition at line 647 of file agmfast.cpp.

References G, HOVIDSV, TUNGraph::IsEdge(), NegWgt, and Prediction().

Referenced by FindComsByCV().

647  {
648  double L = 0.0;
649  for (int u = 0; u < HOVIDSV.Len(); u++) {
650  for (int e = 0; e < HOVIDSV[u].Len(); e++) {
651  int VID = HOVIDSV[u][e];
652  if (VID == u) { continue; }
653  double Pred = Prediction(u, VID);
654  if (G->IsEdge(u, VID)) {
655  L += log(1.0 - Pred);
656  }
657  else {
658  L += NegWgt * log(Pred);
659  }
660  }
661  }
662  return L;
663 }
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool IsEdge(const int &SrcNId, const int &DstNId) const
Tests whether an edge between node IDs SrcNId and DstNId exists in the graph.
Definition: graph.cpp:137

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::Load ( TSIn SIn,
const int &  RndSeed = 0 
)

Definition at line 22 of file agmfast.cpp.

References F, G, HOVIDSV, TUNGraph::Load(), TVec< TVal, TSizeTy >::Load(), TBool::Load(), TInt::Load(), TFlt::Load(), MaxVal, MinVal, NegWgt, NIDV, NodesOk, NumComs, PNoCom, TRnd::PutSeed(), RegCoef, Rnd, and SumFV.

22  {
23  G->Load(SIn);
24  F.Load(SIn);
25  NIDV.Load(SIn);
26  RegCoef.Load(SIn);
27  SumFV.Load(SIn);
28  NodesOk.Load(SIn);
29  MinVal.Load(SIn);
30  MaxVal.Load(SIn);
31  NegWgt.Load(SIn);
32  NumComs.Load(SIn);
33  HOVIDSV.Load(SIn);
34  PNoCom.Load(SIn);
35  Rnd.PutSeed(RndSeed);
36 }
PUNGraph G
Definition: agmfast.h:9
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
void Load(TSIn &SIn)
Definition: dt.h:994
void Load(TSIn &SIn)
Definition: ds.h:946
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
TVec< TIntFltH > F
Definition: agmfast.h:10
void Load(TSIn &SIn)
Definition: dt.h:1152
TInt NumComs
Definition: agmfast.h:16
static PUNGraph Load(TSIn &SIn)
Static constructor that loads the graph from a stream SIn and returns a pointer to it...
Definition: graph.h:178
TFltV SumFV
Definition: agmfast.h:14
void PutSeed(const int &_Seed)
Definition: dt.cpp:18
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
TIntV NIDV
Definition: agmfast.h:12
void Load(TSIn &SIn)
Definition: dt.h:1405
TRnd Rnd
Definition: agmfast.h:11
TFlt MaxVal
Definition: agmfast.h:20

Here is the call graph for this function:

int TAGMFast::MLEGradAscent ( const double &  Thres,
const int &  MaxIter,
const TStr PlotNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 690 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), AddCom(), THash< TKey, TDat, THashFunc >::BegI(), DelCom(), TStr::Empty(), THash< TKey, TDat, THashFunc >::EndI(), F, G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), GetStepSizeByLineSearch(), TExeTm::GetTmStr(), GradientForRow(), HOVIDSV, IAssert, Likelihood(), TFlt::Mn, Norm2(), TGnuPlot::PlotValV(), and Rnd.

Referenced by FindComsByCV().

690  {
691  time_t InitTime = time(NULL);
692  TExeTm ExeTm, CheckTm;
693  int iter = 0, PrevIter = 0;
694  TIntFltPrV IterLV;
695  TUNGraph::TNodeI UI;
696  double PrevL = TFlt::Mn, CurL = 0.0;
697  TIntV NIdxV(F.Len(), 0);
698  for (int i = 0; i < F.Len(); i++) { NIdxV.Add(i); }
699  IAssert(NIdxV.Len() == F.Len());
700  TIntFltH GradV;
701  while(iter < MaxIter) {
702  NIdxV.Shuffle(Rnd);
703  for (int ui = 0; ui < F.Len(); ui++, iter++) {
704  int u = NIdxV[ui]; //
705  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
706  UI = G->GetNI(u);
707  TIntSet CIDSet(5 * UI.GetDeg());
708  for (int e = 0; e < UI.GetDeg(); e++) {
709  if (HOVIDSV[u].IsKey(UI.GetNbrNId(e))) { continue; }
710  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
711  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
712  CIDSet.AddKey(CI.GetKey());
713  }
714  }
715  for (TIntFltH::TIter CI = F[u].BegI(); CI < F[u].EndI(); CI++) { //remove the community membership which U does not share with its neighbors
716  if (! CIDSet.IsKey(CI.GetKey())) {
717  DelCom(u, CI.GetKey());
718  }
719  }
720  if (CIDSet.Empty()) { continue; }
721  GradientForRow(u, GradV, CIDSet);
722  if (Norm2(GradV) < 1e-4) { continue; }
723  double LearnRate = GetStepSizeByLineSearch(u, GradV, GradV, StepAlpha, StepBeta);
724  if (LearnRate == 0.0) { continue; }
725  for (int ci = 0; ci < GradV.Len(); ci++) {
726  int CID = GradV.GetKey(ci);
727  double Change = LearnRate * GradV.GetDat(CID);
728  double NewFuc = GetCom(u, CID) + Change;
729  if (NewFuc <= 0.0) {
730  DelCom(u, CID);
731  } else {
732  AddCom(u, CID, NewFuc);
733  }
734  }
735  if (! PlotNm.Empty() && (iter + 1) % G->GetNodes() == 0) {
736  IterLV.Add(TIntFltPr(iter, Likelihood(false)));
737  }
738  }
739  printf("\r%d iterations (%f) [%lu sec]", iter, CurL, time(NULL) - InitTime);
740  fflush(stdout);
741  if (iter - PrevIter >= 2 * G->GetNodes() && iter > 10000) {
742  PrevIter = iter;
743  CurL = Likelihood();
744  if (PrevL > TFlt::Mn && ! PlotNm.Empty()) {
745  printf("\r%d iterations, Likelihood: %f, Diff: %f", iter, CurL, CurL - PrevL);
746  }
747  fflush(stdout);
748  if (CurL - PrevL <= Thres * fabs(PrevL)) { break; }
749  else { PrevL = CurL; }
750  }
751 
752  }
753  printf("\n");
754  printf("MLE for Lambda completed with %d iterations(%s)\n", iter, ExeTm.GetTmStr());
755  if (! PlotNm.Empty()) {
756  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
757  }
758  return iter;
759 }
#define IAssert(Cond)
Definition: bd.h:262
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
Definition: tm.h:355
void DelCom(const int &NID, const int &CID)
Definition: agmfast.h:72
TIter BegI() const
Definition: hash.h:213
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
TIter EndI() const
Definition: hash.h:218
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
const char * GetTmStr() const
Definition: tm.h:370
TVec< TIntFltH > F
Definition: agmfast.h:10
double GetStepSizeByLineSearch(const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
Definition: agmfast.cpp:665
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:491
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
void GradientForRow(const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
Definition: agmfast.cpp:250
static void PlotValV(const TVec< TVal1 > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:398
static const double Mn
Definition: dt.h:1390
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:430

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::MLEGradAscentParallel ( const double &  Thres,
const int &  MaxIter,
const int  ChunkNum,
const int  ChunkSize,
const TStr PlotNm,
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)

Definition at line 761 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), THash< TKey, TDat, THashFunc >::BegI(), TStr::Empty(), THashKeyDatI< TKey, TDat >::EndI, THash< TKey, TDat, THashFunc >::EndI(), F, G, TSecTm::GetAbsSecs(), TSecTm::GetCurTm(), THash< TKey, TDat, THashFunc >::GetDat(), TVec< TVal, TSizeTy >::GetDat(), TUNGraph::TNodeI::GetDeg(), THash< TKey, TDat, THashFunc >::GetKey(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), GetStepSizeByLineSearch(), TExeTm::GetTmStr(), GradientForRow(), HOVIDSV, IAssert, THash< TKey, TDat, THashFunc >::Len(), TVec< TVal, TSizeTy >::Len(), Likelihood(), TFlt::Mn, Norm2(), TGnuPlot::PlotValV(), TVec< TVal, TSizeTy >::PutAll(), Rnd, and SumFV.

Referenced by FindComsByCV(), and MLEGradAscentParallel().

761  {
762  //parallel
763  time_t InitTime = time(NULL);
764  uint64 StartTm = TSecTm::GetCurTm().GetAbsSecs();
765  TExeTm ExeTm, CheckTm;
766  double PrevL = Likelihood(true);
767  TIntFltPrV IterLV;
768  int PrevIter = 0;
769  int iter = 0;
770  TIntV NIdxV(F.Len(), 0);
771  for (int i = 0; i < F.Len(); i++) { NIdxV.Add(i); }
772  TIntV NIDOPTV(F.Len()); //check if a node needs optimization or not 1: does not require optimization
773  NIDOPTV.PutAll(0);
774  TVec<TIntFltH> NewF(ChunkNum * ChunkSize);
775  TIntV NewNIDV(ChunkNum * ChunkSize);
776  for (iter = 0; iter < MaxIter; iter++) {
777  NIdxV.Clr(false);
778  for (int i = 0; i < F.Len(); i++) {
779  if (NIDOPTV[i] == 0) { NIdxV.Add(i); }
780  }
781  IAssert (NIdxV.Len() <= F.Len());
782  NIdxV.Shuffle(Rnd);
783  // compute gradient for chunk of nodes
784 #pragma omp parallel for schedule(static, 1)
785  for (int TIdx = 0; TIdx < ChunkNum; TIdx++) {
786  TIntFltH GradV;
787  for (int ui = TIdx * ChunkSize; ui < (TIdx + 1) * ChunkSize; ui++) {
788  NewNIDV[ui] = -1;
789  if (ui > NIdxV.Len()) { continue; }
790  int u = NIdxV[ui]; //
791  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
792  TUNGraph::TNodeI UI = G->GetNI(u);
793  TIntSet CIDSet(5 * UI.GetDeg());
794  TIntFltH CurFU = F[u];
795  for (int e = 0; e < UI.GetDeg(); e++) {
796  if (HOVIDSV[u].IsKey(UI.GetNbrNId(e))) { continue; }
797  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
798  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
799  CIDSet.AddKey(CI.GetKey());
800  }
801  }
802  if (CIDSet.Empty()) {
803  CurFU.Clr();
804  }
805  else {
806  for (TIntFltH::TIter CI = CurFU.BegI(); CI < CurFU.EndI(); CI++) { //remove the community membership which U does not share with its neighbors
807  if (! CIDSet.IsKey(CI.GetKey())) {
808  CurFU.DelIfKey(CI.GetKey());
809  }
810  }
811  GradientForRow(u, GradV, CIDSet);
812  if (Norm2(GradV) < 1e-4) { NIDOPTV[u] = 1; continue; }
813  double LearnRate = GetStepSizeByLineSearch(u, GradV, GradV, StepAlpha, StepBeta, 5);
814  if (LearnRate == 0.0) { NewNIDV[ui] = -2; continue; }
815  for (int ci = 0; ci < GradV.Len(); ci++) {
816  int CID = GradV.GetKey(ci);
817  double Change = LearnRate * GradV.GetDat(CID);
818  double NewFuc = CurFU.IsKey(CID)? CurFU.GetDat(CID) + Change : Change;
819  if (NewFuc <= 0.0) {
820  CurFU.DelIfKey(CID);
821  } else {
822  CurFU.AddDat(CID) = NewFuc;
823  }
824  }
825  CurFU.Defrag();
826  }
827  //store changes
828  NewF[ui] = CurFU;
829  NewNIDV[ui] = u;
830  }
831  }
832  int NumNoChangeGrad = 0;
833  int NumNoChangeStepSize = 0;
834  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
835  int NewNID = NewNIDV[ui];
836  if (NewNID == -1) { NumNoChangeGrad++; continue; }
837  if (NewNID == -2) { NumNoChangeStepSize++; continue; }
838  for (TIntFltH::TIter CI = F[NewNID].BegI(); CI < F[NewNID].EndI(); CI++) {
839  SumFV[CI.GetKey()] -= CI.GetDat();
840  }
841  }
842 #pragma omp parallel for
843  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
844  int NewNID = NewNIDV[ui];
845  if (NewNID < 0) { continue; }
846  F[NewNID] = NewF[ui];
847  }
848  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
849  int NewNID = NewNIDV[ui];
850  if (NewNID < 0) { continue; }
851  for (TIntFltH::TIter CI = F[NewNID].BegI(); CI < F[NewNID].EndI(); CI++) {
852  SumFV[CI.GetKey()] += CI.GetDat();
853  }
854  }
855  // update the nodes who are optimal
856  for (int ui = 0; ui < NewNIDV.Len(); ui++) {
857  int NewNID = NewNIDV[ui];
858  if (NewNID < 0) { continue; }
859  TUNGraph::TNodeI UI = G->GetNI(NewNID);
860  NIDOPTV[NewNID] = 0;
861  for (int e = 0; e < UI.GetDeg(); e++) {
862  NIDOPTV[UI.GetNbrNId(e)] = 0;
863  }
864  }
865  int OPTCnt = 0;
866  for (int i = 0; i < NIDOPTV.Len(); i++) { if (NIDOPTV[i] == 1) { OPTCnt++; } }
867  if (! PlotNm.Empty()) {
868  printf("\r%d iterations [%s] %d secs", iter * ChunkSize * ChunkNum, ExeTm.GetTmStr(), int(TSecTm::GetCurTm().GetAbsSecs() - StartTm));
869  if (PrevL > TFlt::Mn) { printf(" (%f) %d g %d s %d OPT", PrevL, NumNoChangeGrad, NumNoChangeStepSize, OPTCnt); }
870  fflush(stdout);
871  }
872  if ((iter - PrevIter) * ChunkSize * ChunkNum >= G->GetNodes()) {
873  PrevIter = iter;
874  double CurL = Likelihood(true);
875  IterLV.Add(TIntFltPr(iter * ChunkSize * ChunkNum, CurL));
876  printf("\r%d iterations, Likelihood: %f, Diff: %f [%d secs]", iter, CurL, CurL - PrevL, int(time(NULL) - InitTime));
877  fflush(stdout);
878  if (CurL - PrevL <= Thres * fabs(PrevL)) {
879  break;
880  }
881  else {
882  PrevL = CurL;
883  }
884  }
885  }
886  if (! PlotNm.Empty()) {
887  printf("\nMLE completed with %d iterations(%d secs)\n", iter, int(TSecTm::GetCurTm().GetAbsSecs() - StartTm));
888  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
889  } else {
890  printf("\rMLE completed with %d iterations(%d secs)", iter, int(time(NULL) - InitTime));
891  fflush(stdout);
892  }
893  return iter;
894 }
#define IAssert(Cond)
Definition: bd.h:262
double Norm2(const TIntFltH &UV)
Definition: agmfast.h:113
PUNGraph G
Definition: agmfast.h:9
Definition: tm.h:355
TIter BegI() const
Definition: hash.h:213
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
const TDat & GetDat(const TKey &Key) const
Definition: hash.h:262
TIter EndI() const
Definition: hash.h:218
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
const char * GetTmStr() const
Definition: tm.h:370
TVec< TIntFltH > F
Definition: agmfast.h:10
double GetStepSizeByLineSearch(const int UID, const TIntFltH &DeltaV, const TIntFltH &GradV, const double &Alpha, const double &Beta, const int MaxIter=10)
Definition: agmfast.cpp:665
void PutAll(const TVal &Val)
Sets all elements of the vector to value Val.
Definition: ds.h:1229
unsigned long long uint64
Definition: bd.h:38
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:838
uint GetAbsSecs() const
Definition: tm.h:150
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:491
TRnd Rnd
Definition: agmfast.h:11
static TSecTm GetCurTm()
Definition: tm.cpp:697
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
int Len() const
Definition: hash.h:228
void GradientForRow(const int UID, TIntFltH &GradU, const TIntSet &CIDSet)
Definition: agmfast.cpp:250
THKeyDat * EndI
Definition: hash.h:54
static void PlotValV(const TVec< TVal1 > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:398
const TKey & GetKey(const int &KeyId) const
Definition: hash.h:252
static const double Mn
Definition: dt.h:1390
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:430

Here is the call graph for this function:

Here is the caller graph for this function:

int TAGMFast::MLEGradAscentParallel ( const double &  Thres,
const int &  MaxIter,
const int  ChunkNum,
const TStr PlotNm = TStr(),
const double  StepAlpha = 0.3,
const double  StepBeta = 0.1 
)
inline

Definition at line 49 of file agmfast.h.

References TUNGraph::GetNodes(), and MLEGradAscentParallel().

49  {
50  int ChunkSize = G->GetNodes() / 10 / ChunkNum;
51  if (ChunkSize == 0) { ChunkSize = 1; }
52  return MLEGradAscentParallel(Thres, MaxIter, ChunkNum, ChunkSize, PlotNm, StepAlpha, StepBeta);
53  }
PUNGraph G
Definition: agmfast.h:9
int MLEGradAscentParallel(const double &Thres, const int &MaxIter, const int ChunkNum, const int ChunkSize, const TStr &PlotNm, const double StepAlpha=0.3, const double StepBeta=0.1)
Definition: agmfast.cpp:761
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192

Here is the call graph for this function:

int TAGMFast::MLENewton ( const double &  Thres,
const int &  MaxIter,
const TStr PlotNm = TStr() 
)

Newton method: DEPRECATED.

Definition at line 416 of file agmfast.cpp.

References TVec< TVal, TSizeTy >::Add(), AddCom(), THashSet< TKey, THashFunc >::AddKey(), THash< TKey, TDat, THashFunc >::BegI(), THashSet< TKey, THashFunc >::BegI(), DelCom(), DotProduct(), TStr::Empty(), THashSet< TKey, THashFunc >::Empty(), THash< TKey, TDat, THashFunc >::EndI(), THashSet< TKey, THashFunc >::EndI(), F, TStr::Fmt(), G, GetCom(), TUNGraph::TNodeI::GetDeg(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TExeTm::GetTmStr(), GradientForOneVar(), HessianForOneVar(), HOVIDSV, IAssertR, THashSet< TKey, THashFunc >::IsKey(), Likelihood(), TFlt::Mn, TGnuPlot::PlotValV(), PNoCom, Rnd, TVec< TVal, TSizeTy >::Shuffle(), and TFlt::Val.

416  {
417  TExeTm ExeTm;
418  int iter = 0, PrevIter = 0;
419  TIntFltPrV IterLV;
420  double PrevL = TFlt::Mn, CurL;
421  TUNGraph::TNodeI UI;
422  TIntV NIdxV;
423  G->GetNIdV(NIdxV);
424  int CID, UID, NewtonIter;
425  double Fuc;
426  //double PrevFuc;
427  double Grad, H;
428  while(iter < MaxIter) {
429  NIdxV.Shuffle(Rnd);
430  for (int ui = 0; ui < F.Len(); ui++, iter++) {
431  if (! PlotNm.Empty() && iter % G->GetNodes() == 0) {
432  IterLV.Add(TIntFltPr(iter, Likelihood(false)));
433  }
434  UID = NIdxV[ui];
435  //find set of candidate c (we only need to consider c to which a neighbor of u belongs to)
436  TIntSet CIDSet;
437  UI = G->GetNI(UID);
438  if (UI.GetDeg() == 0) { //if the node is isolated, clear its membership and skip
439  if (! F[UID].Empty()) { F[UID].Clr(); }
440  continue;
441  }
442  for (int e = 0; e < UI.GetDeg(); e++) {
443  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
444  TIntFltH& NbhCIDH = F[UI.GetNbrNId(e)];
445  for (TIntFltH::TIter CI = NbhCIDH.BegI(); CI < NbhCIDH.EndI(); CI++) {
446  CIDSet.AddKey(CI.GetKey());
447  }
448  }
449  for (TIntFltH::TIter CI = F[UID].BegI(); CI < F[UID].EndI(); CI++) { //remove the community membership which U does not share with its neighbors
450  if (! CIDSet.IsKey(CI.GetKey())) {
451  DelCom(UID, CI.GetKey());
452  }
453  }
454  if (CIDSet.Empty()) { continue; }
455  for (TIntSet::TIter CI = CIDSet.BegI(); CI < CIDSet.EndI(); CI++) {
456  CID = CI.GetKey();
457  //optimize for UID, CID
458  //compute constants
459  TFltV AlphaKV(UI.GetDeg());
460  for (int e = 0; e < UI.GetDeg(); e++) {
461  if (HOVIDSV[UID].IsKey(UI.GetNbrNId(e))) { continue; }
462  AlphaKV[e] = (1 - PNoCom) * exp(- DotProduct(UID, UI.GetNbrNId(e)) + GetCom(UI.GetNbrNId(e), CID) * GetCom(UID, CID));
463  IAssertR(AlphaKV[e] <= 1.0, TStr::Fmt("AlphaKV=%f, %f, %f", AlphaKV[e].Val, PNoCom.Val, GetCom(UI.GetNbrNId(e), CID)));
464  }
465  Fuc = GetCom(UID, CID);
466  //PrevFuc = Fuc;
467  Grad = GradientForOneVar(AlphaKV, UID, CID, Fuc), H = 0.0;
468  if (Grad <= 1e-3 && Grad >= -0.1) { continue; }
469  NewtonIter = 0;
470  while (NewtonIter++ < 10) {
471  Grad = GradientForOneVar(AlphaKV, UID, CID, Fuc), H = 0.0;
472  H = HessianForOneVar(AlphaKV, UID, CID, Fuc);
473  if (Fuc == 0.0 && Grad <= 0.0) { Grad = 0.0; }
474  if (fabs(Grad) < 1e-3) { break; }
475  if (H == 0.0) { Fuc = 0.0; break; }
476  double NewtonStep = - Grad / H;
477  if (NewtonStep < -0.5) { NewtonStep = - 0.5; }
478  Fuc += NewtonStep;
479  if (Fuc < 0.0) { Fuc = 0.0; }
480  }
481  if (Fuc == 0.0) {
482  DelCom(UID, CID);
483  }
484  else {
485  AddCom(UID, CID, Fuc);
486  }
487  }
488  }
489  if (iter - PrevIter >= 2 * G->GetNodes() && iter > 10000) {
490  PrevIter = iter;
491  CurL = Likelihood();
492  if (PrevL > TFlt::Mn && ! PlotNm.Empty()) {
493  printf("\r%d iterations, Likelihood: %f, Diff: %f", iter, CurL, CurL - PrevL);
494  }
495  fflush(stdout);
496  if (CurL - PrevL <= Thres * fabs(PrevL)) { break; }
497  else { PrevL = CurL; }
498  }
499 
500  }
501  if (! PlotNm.Empty()) {
502  printf("\nMLE for Lambda completed with %d iterations(%s)\n", iter, ExeTm.GetTmStr());
503  TGnuPlot::PlotValV(IterLV, PlotNm + ".likelihood_Q");
504  }
505  return iter;
506 }
PUNGraph G
Definition: agmfast.h:9
#define IAssertR(Cond, Reason)
Definition: bd.h:265
Definition: tm.h:355
double GradientForOneVar(const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
Definition: agmfast.cpp:339
TIter BegI() const
Definition: shash.h:1105
void GetNIdV(TIntV &NIdV) const
Gets a vector IDs of all nodes in the graph.
Definition: graph.cpp:153
void DelCom(const int &NID, const int &CID)
Definition: agmfast.h:72
TIter BegI() const
Definition: hash.h:213
double Val
Definition: dt.h:1388
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
TIter EndI() const
Definition: hash.h:218
bool IsKey(const TKey &Key) const
Definition: shash.h:1148
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
double GetCom(const int &NID, const int &CID)
Definition: agmfast.h:57
TFlt PNoCom
Definition: agmfast.h:22
TPair< TInt, TFlt > TIntFltPr
Definition: ds.h:87
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
const char * GetTmStr() const
Definition: tm.h:370
bool Empty() const
Definition: shash.h:1120
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
int AddKey(const TKey &Key)
Definition: shash.h:1254
TIter EndI() const
Definition: shash.h:1112
double HessianForOneVar(const TFltV &AlphaKV, const int UID, const int CID, const double &Val)
Definition: agmfast.cpp:394
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
bool Empty() const
Definition: dt.h:491
static TStr Fmt(const char *FmtStr,...)
Definition: dt.cpp:1599
void Shuffle(TRnd &Rnd)
Randomly shuffles the elements of the vector.
Definition: ds.h:1335
TRnd Rnd
Definition: agmfast.h:11
double Likelihood(const bool DoParallel=false)
Definition: agmfast.cpp:173
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TSizeTy Add()
Adds a new element at the end of the vector, after its current last element.
Definition: ds.h:602
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78
static void PlotValV(const TVec< TVal1 > &ValV, const TStr &OutFNmPref, const TStr &Desc="", const TStr &XLabel="", const TStr &YLabel="", const TGpScaleTy &ScaleTy=gpsAuto, const bool &PowerFit=false, const TGpSeriesTy &SeriesTy=gpwLinesPoints)
Definition: gnuplot.h:398
static const double Mn
Definition: dt.h:1390
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:430

Here is the call graph for this function:

void TAGMFast::NeighborComInit ( const int  InitComs)

Definition at line 60 of file agmfast.cpp.

References AddCom(), F, G, TVec< TVal, TSizeTy >::Gen(), TAGMUtil::GetConductance(), TUNGraph::TNodeI::GetDeg(), TUNGraph::GetEdges(), TAGMUtil::GetNbhCom(), TUNGraph::TNodeI::GetNbrNId(), TUNGraph::GetNI(), TUNGraph::GetNodes(), TExeTm::GetTmStr(), TRnd::GetUniDev(), TRnd::GetUniDevInt(), IAssert, TVec< TVal, TSizeTy >::Len(), NumComs, Rnd, and SumFV.

Referenced by FindComsByCV().

60  {
61  //initialize with best neighborhood communities (Gleich et.al. KDD'12)
62  F.Gen(G->GetNodes());
63  SumFV.Gen(InitComs);
64  NumComs = InitComs;
65  const int Edges = G->GetEdges();
66  //TIntFltH NCPhiH(F.Len());
67  TFltIntPrV NIdPhiV(F.Len(), 0);
68  TIntSet InvalidNIDS(F.Len());
69  TIntV ChosenNIDV(InitComs, 0); //FOR DEBUG
70  TExeTm RunTm;
71  //compute conductance of neighborhood community
72  for (int u = 0; u < F.Len(); u++) {
73  TIntSet NBCmty(G->GetNI(u).GetDeg() + 1);
74  double Phi;
75  if (G->GetNI(u).GetDeg() < 5) { //do not include nodes with too few degree
76  Phi = 1.0;
77  } else {
78  TAGMUtil::GetNbhCom(G, u, NBCmty);
79  IAssert(NBCmty.Len() == G->GetNI(u).GetDeg() + 1);
80  Phi = TAGMUtil::GetConductance(G, NBCmty, Edges);
81  }
82  //NCPhiH.AddDat(u, Phi);
83  NIdPhiV.Add(TFltIntPr(Phi, u));
84  }
85  NIdPhiV.Sort(true);
86  printf("conductance computation completed [%s]\n", RunTm.GetTmStr());
87  fflush(stdout);
88  //choose nodes with local minimum in conductance
89  int CurCID = 0;
90  for (int ui = 0; ui < NIdPhiV.Len(); ui++) {
91  int UID = NIdPhiV[ui].Val2;
92  fflush(stdout);
93  if (InvalidNIDS.IsKey(UID)) { continue; }
94  ChosenNIDV.Add(UID); //FOR DEBUG
95  //add the node and its neighbors to the current community
96  AddCom(UID, CurCID, 1.0);
97  TUNGraph::TNodeI NI = G->GetNI(UID);
98  fflush(stdout);
99  for (int e = 0; e < NI.GetDeg(); e++) {
100  AddCom(NI.GetNbrNId(e), CurCID, 1.0);
101  }
102  //exclude its neighbors from the next considerations
103  for (int e = 0; e < NI.GetDeg(); e++) {
104  InvalidNIDS.AddKey(NI.GetNbrNId(e));
105  }
106  CurCID++;
107  fflush(stdout);
108  if (CurCID >= NumComs) { break; }
109  }
110  if (NumComs > CurCID) {
111  printf("%d communities needed to fill randomly\n", NumComs - CurCID);
112  }
113  //assign a member to zero-member community (if any)
114  for (int c = 0; c < SumFV.Len(); c++) {
115  if (SumFV[c] == 0.0) {
116  int ComSz = 10;
117  for (int u = 0; u < ComSz; u++) {
118  int UID = Rnd.GetUniDevInt(G->GetNodes());
119  AddCom(UID, c, Rnd.GetUniDev());
120  }
121  }
122  }
123 }
#define IAssert(Cond)
Definition: bd.h:262
static void GetNbhCom(const PUNGraph &Graph, const int NID, TIntSet &NBCmtyS)
Definition: agm.cpp:706
PUNGraph G
Definition: agmfast.h:9
TPair< TFlt, TInt > TFltIntPr
Definition: ds.h:97
Definition: tm.h:355
int GetEdges() const
Returns the number of edges in the graph.
Definition: graph.cpp:82
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
Node iterator. Only forward iteration (operator++) is supported.
Definition: graph.h:68
static double GetConductance(const PUNGraph &Graph, const TIntSet &CmtyS, const int Edges)
Definition: agm.cpp:683
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
const char * GetTmStr() const
Definition: tm.h:370
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TRnd Rnd
Definition: agmfast.h:11
double GetUniDev()
Definition: dt.h:30
int GetNbrNId(const int &NodeN) const
Returns ID of NodeN-th neighboring node.
Definition: graph.h:111
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39
Vector is a sequence TVal objects representing an array that can change in size.
Definition: ds.h:430

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Norm2 ( const TIntFltH UV)
inline

Definition at line 113 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), and THash< TKey, TDat, THashFunc >::EndI().

Referenced by LikelihoodForRow(), MLEGradAscent(), and MLEGradAscentParallel().

113  {
114  double N = 0.0;
115  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
116  N += HI.GetDat() * HI.GetDat();
117  }
118  return N;
119  }
TIter BegI() const
Definition: hash.h:213
TIter EndI() const
Definition: hash.h:218

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Prediction ( const TIntFltH FU,
const TIntFltH FV 
)
inline

Definition at line 98 of file agmfast.h.

References DotProduct(), TStr::Fmt(), and IAssertR.

Referenced by GradientForRow(), LikelihoodForRow(), LikelihoodHoldOut(), and Prediction().

98  {
99  double DP = log (1.0 / (1.0 - PNoCom)) + DotProduct(FU, FV);
100  IAssertR(DP > 0.0, TStr::Fmt("DP: %f", DP));
101  return exp(- DP);
102  }
#define IAssertR(Cond, Reason)
Definition: bd.h:265
TFlt PNoCom
Definition: agmfast.h:22
static TStr Fmt(const char *FmtStr,...)
Definition: dt.cpp:1599
double DotProduct(const TIntFltH &UV, const TIntFltH &VV)
Definition: agmfast.h:78

Here is the call graph for this function:

Here is the caller graph for this function:

double TAGMFast::Prediction ( const int &  UID,
const int &  VID 
)
inline

Definition at line 103 of file agmfast.h.

References Prediction().

103  {
104  return Prediction(F[UID], F[VID]);
105  }
double Prediction(const TIntFltH &FU, const TIntFltH &FV)
Definition: agmfast.h:98
TVec< TIntFltH > F
Definition: agmfast.h:10

Here is the call graph for this function:

void TAGMFast::RandomInit ( const int  InitComs)

Definition at line 38 of file agmfast.cpp.

References AddCom(), F, G, TVec< TVal, TSizeTy >::Gen(), TUNGraph::TNodeI::GetDeg(), TUNGraph::GetNI(), TUNGraph::GetNodes(), TRnd::GetUniDev(), TRnd::GetUniDevInt(), TVec< TVal, TSizeTy >::Len(), NumComs, Rnd, and SumFV.

Referenced by FindComsByCV(), and TAGMFast().

38  {
39  F.Gen(G->GetNodes());
40  SumFV.Gen(InitComs);
41  NumComs = InitComs;
42  for (int u = 0; u < F.Len(); u++) {
43  //assign to just one community
44  int Mem = G->GetNI(u).GetDeg();
45  if (Mem > 10) { Mem = 10; }
46  for (int c = 0; c < Mem; c++) {
47  int CID = Rnd.GetUniDevInt(InitComs);
48  AddCom(u, CID, Rnd.GetUniDev());
49  }
50  }
51  //assign a member to zero-member community (if any)
52  for (int c = 0; c < SumFV.Len(); c++) {
53  if (SumFV[c] == 0.0) {
54  int UID = Rnd.GetUniDevInt(G->GetNodes());
55  AddCom(UID, c, Rnd.GetUniDev());
56  }
57  }
58 }
PUNGraph G
Definition: agmfast.h:9
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
int GetDeg() const
Returns degree of the current node.
Definition: graph.h:90
TVec< TIntFltH > F
Definition: agmfast.h:10
TNodeI GetNI(const int &NId) const
Returns an iterator referring to the node of ID NId in the graph.
Definition: graph.h:241
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TRnd Rnd
Definition: agmfast.h:11
double GetUniDev()
Definition: dt.h:30
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
int GetUniDevInt(const int &Range=0)
Definition: dt.cpp:39

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::Save ( TSOut SOut)

Definition at line 7 of file agmfast.cpp.

References F, G, HOVIDSV, MaxVal, MinVal, NegWgt, NIDV, NodesOk, NumComs, PNoCom, RegCoef, TUNGraph::Save(), TVec< TVal, TSizeTy >::Save(), TBool::Save(), TInt::Save(), TFlt::Save(), and SumFV.

7  {
8  G->Save(SOut);
9  F.Save(SOut);
10  NIDV.Save(SOut);
11  RegCoef.Save(SOut);
12  SumFV.Save(SOut);
13  NodesOk.Save(SOut);
14  MinVal.Save(SOut);
15  MaxVal.Save(SOut);
16  NegWgt.Save(SOut);
17  NumComs.Save(SOut);
18  HOVIDSV.Save(SOut);
19  PNoCom.Save(SOut);
20 }
PUNGraph G
Definition: agmfast.h:9
TFlt MinVal
Definition: agmfast.h:19
TFlt NegWgt
Definition: agmfast.h:21
void Save(TSOut &SOut) const
Definition: dt.h:1153
void Save(TSOut &SOut) const
Definition: dt.h:995
void Save(TSOut &SOut) const
Saves the graph to a (binary) stream SOut.
Definition: graph.h:170
void Save(TSOut &SOut) const
Definition: ds.h:954
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
TVec< TIntFltH > F
Definition: agmfast.h:10
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TFlt RegCoef
Definition: agmfast.h:13
TIntV NIDV
Definition: agmfast.h:12
TFlt MaxVal
Definition: agmfast.h:20
void Save(TSOut &SOut) const
Definition: dt.h:1402

Here is the call graph for this function:

void TAGMFast::SetCmtyVV ( const TVec< TIntV > &  CmtyVV)

Definition at line 125 of file agmfast.cpp.

References AddCom(), THash< TKey, TDat, THashFunc >::AddDat(), F, G, TVec< TVal, TSizeTy >::Gen(), TVec< TVal, TSizeTy >::GetDat(), TUNGraph::GetNodes(), TUNGraph::IsNode(), TVec< TVal, TSizeTy >::Len(), NIDV, NodesOk, NumComs, and SumFV.

125  {
126  F.Gen(G->GetNodes());
127  SumFV.Gen(CmtyVV.Len());
128  NumComs = CmtyVV.Len();
129  TIntH NIDIdxH(NIDV.Len());
130  if (! NodesOk) {
131  for (int u = 0; u < NIDV.Len(); u++) {
132  NIDIdxH.AddDat(NIDV[u], u);
133  }
134  }
135  for (int c = 0; c < CmtyVV.Len(); c++) {
136  for (int u = 0; u < CmtyVV[c].Len(); u++) {
137  int UID = CmtyVV[c][u];
138  if (! NodesOk) { UID = NIDIdxH.GetDat(UID); }
139  if (G->IsNode(UID)) {
140  AddCom(UID, c, 1.0);
141  }
142  }
143  }
144 }
PUNGraph G
Definition: agmfast.h:9
TSizeTy Len() const
Returns the number of elements in the vector.
Definition: ds.h:575
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
TBool NodesOk
Definition: agmfast.h:15
TVec< TIntFltH > F
Definition: agmfast.h:10
const TVal & GetDat(const TVal &Val) const
Returns reference to the first occurrence of element Val.
Definition: ds.h:838
TInt NumComs
Definition: agmfast.h:16
TFltV SumFV
Definition: agmfast.h:14
TIntV NIDV
Definition: agmfast.h:12
bool IsNode(const int &NId) const
Tests whether ID NId is a node.
Definition: graph.h:235
void Gen(const TSizeTy &_Vals)
Constructs a vector (an array) of _Vals elements.
Definition: ds.h:523
void AddCom(const int &NID, const int &CID, const double &Val)
Definition: agmfast.h:64
TDat & AddDat(const TKey &Key)
Definition: hash.h:238

Here is the call graph for this function:

void TAGMFast::SetGraph ( const PUNGraph GraphPt)

Definition at line 146 of file agmfast.cpp.

References TSnap::DelSelfEdges(), DoParallel, G, TUNGraph::GetNIdV(), TUNGraph::GetNodes(), TSnap::GetSubGraph(), HOVIDSV, IAssert, TUNGraph::IsNode(), TFlt::Mx, NegWgt, NIDV, NodesOk, and PNoCom.

Referenced by TAGMFast().

146  {
147  G = GraphPt;
148  HOVIDSV.Gen(G->GetNodes());
149  NodesOk = true;
150  GraphPt->GetNIdV(NIDV);
151  // check that nodes IDs are {0,1,..,Nodes-1}
152  for (int nid = 0; nid < GraphPt->GetNodes(); nid++) {
153  if (! GraphPt->IsNode(nid)) {
154  NodesOk = false;
155  break;
156  }
157  }
158  if (! NodesOk) {
159  printf("rearrage nodes\n");
160  G = TSnap::GetSubGraph(GraphPt, NIDV, true);
161  for (int nid = 0; nid < G->GetNodes(); nid++) {
162  IAssert(G->IsNode(nid));
163  }
164  }
166 
167  PNoCom = 1.0 / (double) G->GetNodes();
168  DoParallel = false;
169  if (1.0 / PNoCom > sqrt(TFlt::Mx)) { PNoCom = 0.99 / sqrt(TFlt::Mx); } // to prevent overflow
170  NegWgt = 1.0;
171 }
#define IAssert(Cond)
Definition: bd.h:262
PUNGraph G
Definition: agmfast.h:9
TFlt NegWgt
Definition: agmfast.h:21
TBool DoParallel
Definition: agmfast.h:23
static const double Mx
Definition: dt.h:1391
int GetNodes() const
Returns the number of nodes in the graph.
Definition: graph.h:192
TBool NodesOk
Definition: agmfast.h:15
TFlt PNoCom
Definition: agmfast.h:22
PUNGraph GetSubGraph(const PUNGraph &Graph, const TIntV &NIdV, const bool &RenumberNodes)
Returns an induced subgraph of an undirected graph Graph with NIdV nodes with an optional node renumb...
Definition: subgraph.cpp:7
TVec< TIntSet > HOVIDSV
Definition: agmfast.h:18
TIntV NIDV
Definition: agmfast.h:12
bool IsNode(const int &NId) const
Tests whether ID NId is a node.
Definition: graph.h:235
void DelSelfEdges(const PGraph &Graph)
Removes all the self-edges from the graph.
Definition: alg.h:419

Here is the call graph for this function:

Here is the caller graph for this function:

void TAGMFast::SetRegCoef ( const double  _RegCoef)
inline

Definition at line 28 of file agmfast.h.

28 { RegCoef = _RegCoef; }
TFlt RegCoef
Definition: agmfast.h:13
double TAGMFast::Sum ( const TIntFltH UV)
inline

Definition at line 106 of file agmfast.h.

References THash< TKey, TDat, THashFunc >::BegI(), and THash< TKey, TDat, THashFunc >::EndI().

Referenced by LikelihoodForRow().

106  {
107  double N = 0.0;
108  for (TIntFltH::TIter HI = UV.BegI(); HI < UV.EndI(); HI++) {
109  N += HI.GetDat();
110  }
111  return N;
112  }
TIter BegI() const
Definition: hash.h:213
TIter EndI() const
Definition: hash.h:218

Here is the call graph for this function:

Here is the caller graph for this function:

Member Data Documentation

TBool TAGMFast::DoParallel

Definition at line 23 of file agmfast.h.

Referenced by GradientForRow(), LikelihoodForRow(), and SetGraph().

TFlt TAGMFast::MaxVal

Definition at line 20 of file agmfast.h.

Referenced by GetStepSizeByLineSearch(), Load(), and Save().

TFlt TAGMFast::MinVal

Definition at line 19 of file agmfast.h.

Referenced by GetStepSizeByLineSearch(), Load(), and Save().

TIntV TAGMFast::NIDV
private

Definition at line 12 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), Save(), SetCmtyVV(), and SetGraph().

TBool TAGMFast::NodesOk
private

Definition at line 15 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), Save(), SetCmtyVV(), and SetGraph().

TInt TAGMFast::NumComs
private

Definition at line 16 of file agmfast.h.

Referenced by GetCmtyVV(), Load(), NeighborComInit(), RandomInit(), Save(), and SetCmtyVV().

TFlt TAGMFast::PNoCom

Definition at line 22 of file agmfast.h.

Referenced by Load(), MLENewton(), Save(), and SetGraph().

TFlt TAGMFast::RegCoef
private
TRnd TAGMFast::Rnd
private

The documentation for this class was generated from the following files: