The Enron email communication network covers all the email communication within a dataset of around half a million emails. The data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation. Nodes of the network are email addresses; if address i sent at least one email to address j, the graph contains an undirected edge between i and j. Note that non-Enron email addresses act as sinks and sources in the network, as we only observe their communication with the Enron email addresses.
The Enron email data was originally released by William Cohen at CMU.
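The construction rule above (one undirected edge per pair of addresses that exchanged at least one email) can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the dataset: the `build_undirected_graph` helper, the sample addresses, and the choice to skip self-addressed mail are all assumptions made for the example.

```python
from collections import defaultdict


def build_undirected_graph(messages):
    """Map each address to the set of addresses it exchanged mail with.

    `messages` is an iterable of (sender, recipient) pairs (hypothetical input format).
    """
    adjacency = defaultdict(set)
    for sender, recipient in messages:
        # Touch both keys so every address becomes a node, even with no edges.
        adjacency[sender]
        adjacency[recipient]
        if sender == recipient:
            continue  # assumption: self-addressed mail creates no edge
        # One email in either direction is enough for an undirected edge.
        adjacency[sender].add(recipient)
        adjacency[recipient].add(sender)
    return dict(adjacency)


# Made-up sample records, for illustration only.
messages = [
    ("alice@enron.com", "bob@enron.com"),
    ("alice@enron.com", "bob@enron.com"),    # repeated email, still one edge
    ("bob@enron.com", "alice@enron.com"),    # reverse direction, same edge
    ("carol@example.org", "alice@enron.com"),  # non-Enron address
]
graph = build_undirected_graph(messages)
# Each undirected edge appears in two neighbour sets, so halve the total.
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
print(len(graph), edge_count)
```

Storing neighbours in sets means repeated or reciprocal emails collapse into a single edge, which matches the "at least one email" rule.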
Dataset statistics | |
---|---|
Nodes | 36692 |
Edges | 183831 |
Nodes in largest WCC | 33696 (0.918) |
Edges in largest WCC | 180811 (0.984) |
Nodes in largest SCC | 33696 (0.918) |
Edges in largest SCC | 180811 (0.984) |
Average clustering coefficient | 0.4970 |
Number of triangles | 727044 |
Fraction of closed triangles | 0.03015 |
Diameter (longest shortest path) | 11 |
90-percentile effective diameter | 4.8 |
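
To make some of the statistics above concrete, here is a minimal sketch that computes the largest connected component, the number of triangles, and the average clustering coefficient on a small made-up graph. The function names are illustrative, and the conventions are assumptions: nodes with fewer than two neighbours contribute 0 to the average, and only nodes that appear in some edge are counted. The published figures may rely on different conventions, and this quadratic-per-node approach is meant for toy inputs rather than the full network.

```python
from collections import deque
from itertools import combinations


def adjacency_from_edges(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def largest_component(adj):
    """Return the node set of the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adj[node] - component:
                component.add(neighbour)
                queue.append(neighbour)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def closed_neighbour_pairs(adj, node):
    """Count pairs of neighbours of `node` that are themselves connected."""
    return sum(1 for a, b in combinations(adj[node], 2) if b in adj[a])


def triangle_count(adj):
    """Each triangle is seen once from each of its three corners."""
    return sum(closed_neighbour_pairs(adj, n) for n in adj) // 3


def average_clustering(adj):
    """Mean local clustering coefficient; degree < 2 counts as 0 (assumption)."""
    total = 0.0
    for node in adj:
        k = len(adj[node])
        if k >= 2:
            total += closed_neighbour_pairs(adj, node) / (k * (k - 1) / 2)
    return total / len(adj)


# Toy graph: one triangle (a, b, c), a pendant node d, and a separate pair e-f.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "f")]
adj = adjacency_from_edges(edges)
component = largest_component(adj)
print(len(component), len(component) / len(adj))  # 4 of 6 nodes
print(triangle_count(adj))                        # 1
print(average_clustering(adj))                    # 7/18, about 0.389
```
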
File | Description |
---|---|
email-Enron.txt.gz | Enron email network |
Enron email data | Complete Enron email dataset (includes full email message text and attachments) |
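
A minimal sketch for reading the edge list, under assumptions about the file layout that are not stated on this page: that `email-Enron.txt.gz` is a gzip-compressed text file with one whitespace-separated pair of integer node ids per line and `#`-prefixed comment lines. The helper names `parse_edge_lines` and `load_edges` are illustrative. Because it is not stated whether each undirected edge is listed once or in both directions, the sketch stores unordered pairs so that either layout yields the same edge set.

```python
import gzip


def parse_edge_lines(lines):
    """Parse 'u v' lines into a set of unordered (min, max) node-id pairs."""
    edges = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # assumption: '#' marks a comment or header line
        u, v = (int(token) for token in line.split()[:2])
        # Normalise the order so (u, v) and (v, u) count as one undirected edge.
        edges.add((min(u, v), max(u, v)))
    return edges


def load_edges(path):
    """Read a gzip-compressed edge list from `path`."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return parse_edge_lines(handle)


# Made-up sample lines in the assumed format, for illustration only.
sample = ["# FromNodeId\tToNodeId", "0\t1", "1\t0", "1\t2"]
sample_edges = parse_edge_lines(sample)
print(sorted(sample_edges))  # [(0, 1), (1, 2)]
```

With the real file, `load_edges("email-Enron.txt.gz")` would return the edge set, whose size can then be compared with the edge count in the statistics table.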