[Bigbang-dev] Analyze Senders and consolidating email senders
npdoty at ischool.berkeley.edu
Thu May 5 17:42:15 PDT 2016
Hi BigBang dev,
I've been returning to this project and trying to bring the code on my machine up to date with the subsequent changes to BigBang, in particular the Analyze Senders notebook.
This pull request (using changes from Niels and some fixes of my own) restores the functionality for generating a matrix of similarities, using the new `from_header_distance` function. The notebook walks through computing this similarity, visualizing it with a color map, finding a cutoff for similarities, and consolidating senders.
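To make the matrix-and-cutoff step concrete, here is a minimal sketch of the general idea. It is not the notebook's code: `header_similarity` is a hypothetical stand-in built on the standard library's `difflib`, used only because the exact behavior of `from_header_distance` is not described here, and the helper names and the cutoff value are assumptions for illustration.

```python
from difflib import SequenceMatcher


def header_similarity(a, b):
    """Hypothetical stand-in similarity in [0, 1] between two From headers.

    This is NOT BigBang's from_header_distance; it is a simple
    case-insensitive string ratio used only for illustration.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_matrix(senders):
    """Build a symmetric matrix of pairwise similarities."""
    n = len(senders)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score = header_similarity(senders[i], senders[j])
            matrix[i][j] = score
            matrix[j][i] = score  # mirror so the matrix is symmetric
    return matrix


def pairs_above_cutoff(senders, matrix, cutoff):
    """Return the sender pairs whose similarity is at least `cutoff`."""
    n = len(senders)
    return [
        (senders[i], senders[j])
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] >= cutoff
    ]
```

With real data the cutoff would be chosen by inspecting the matrix (for example with the color map), since too low a value merges distinct people and too high a value merges nobody.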
However, I also see that Seb was working on a separate function, `resolve_sender_entities`, that does this using some graph functionality. When I ran that function on my test mailing list, it didn't seem to consolidate anything. Maybe I'm misunderstanding how it works, but it would be great to know, especially if it computes similarities more accurately or faster.
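For comparison, one common way to consolidate senders with a graph is to treat each sender as a node, add an edge for every pair judged similar, and merge each connected component into one entity. The sketch below illustrates that approach only; it is an assumption about what graph-based consolidation could look like, not the implementation of `resolve_sender_entities`, and the function name `consolidate` is hypothetical.

```python
def consolidate(senders, similar_pairs):
    """Group senders into connected components using union-find.

    `similar_pairs` is a list of (sender, sender) edges. Both the function
    name and this input format are assumptions made for illustration.
    """
    parent = {s: s for s in senders}

    def find(s):
        # Walk up to the root, shortening the path along the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = {}
    for s in senders:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())
```

In this sketch, if no pair is passed in as similar, every sender stays in its own group, so the output looks as if nothing was consolidated.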