[Bigbang-dev] Analyze Senders and consolidating email senders

Nick Doty npdoty at ischool.berkeley.edu
Thu May 5 17:42:15 PDT 2016


Hi BigBang dev,

I've been returning to this project and trying to bring the code on my machine up to date with the recent changes to BigBang, in particular the Analyze Senders notebook.

This pull request (using changes from Niels and some fixes of my own) restores the functionality for generating a matrix of similarities, using the new `from_header_distance` function. The notebook walks through computing these similarities, visualizing the matrix with a color map, finding a similarity cutoff, and consolidating senders.

https://github.com/sbenthall/bigbang/pull/242
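
For readers who haven't opened the notebook, here is a minimal sketch of the matrix-and-cutoff idea. It is not the code from the pull request: `header_distance` is a hypothetical stand-in built on Python's standard `difflib`, not BigBang's `from_header_distance`, and the sample From headers and the cutoff value are made up for illustration.

```python
from difflib import SequenceMatcher


def header_distance(a, b):
    """Hypothetical stand-in for a From-header distance.

    Returns 0.0 for headers that match ignoring case and grows
    toward 1.0 as the two strings share less text.
    """
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def distance_matrix(senders):
    """Build a symmetric matrix of pairwise distances between senders."""
    n = len(senders)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = header_distance(senders[i], senders[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


def pairs_below_cutoff(senders, matrix, cutoff):
    """List the sender pairs whose distance is under the cutoff."""
    n = len(senders)
    return [
        (senders[i], senders[j])
        for i in range(n)
        for j in range(i + 1, n)
        if matrix[i][j] < cutoff
    ]


# Made-up sample headers; the cutoff of 0.1 is an arbitrary illustration.
senders = [
    "Nick Doty <npdoty at ischool.berkeley.edu>",
    "nick doty <npdoty at ischool.berkeley.edu>",
    "Someone Else <someone at example.org>",
]
matrix = distance_matrix(senders)
close_pairs = pairs_below_cutoff(senders, matrix, cutoff=0.1)
```

The nested-list matrix can be handed to a plotting library to draw the color map, which is one way to eyeball where a reasonable cutoff might sit.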

I also see that Seb was working on a separate function, `resolve_sender_entities`, that does this with some graph functionality. When I ran that function on my test mailing list, though, it didn't seem to consolidate anything. Maybe I'm misunderstanding how it works, but it would be great to know, especially if it produces more accurate similarity calculations or runs faster.
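
To make the graph idea concrete, here is a rough sketch of one generic way to consolidate senders as connected components. This is an assumption about the general technique only, not a description of how `resolve_sender_entities` is implemented; the function name `consolidate` and the sample data are hypothetical.

```python
def consolidate(senders, close_pairs):
    """Group senders into connected components.

    Each sender is a node and each pair in close_pairs is an edge,
    so senders linked directly or through a chain end up together.
    Implemented with a small union-find structure.
    """
    parent = {s: s for s in senders}

    def find(s):
        # Walk up to the root, shortening the path along the way.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in close_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for s in senders:
        groups.setdefault(find(s), []).append(s)
    return list(groups.values())


# Hypothetical data: "a"-"b" and "b"-"c" are judged similar, "d" is not.
names = ["a", "b", "c", "d"]
merged = consolidate(names, [("a", "b"), ("b", "c")])
unmerged = consolidate(names, [])
```

In this sketch, an empty list of pairs leaves every sender in its own group, so a run that consolidates nothing means no pair passed whatever similarity test produced the edges.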

Thanks,
Nick