Hi BigBang dev,
I've been returning to this project and trying to bring the code on my machine up to
date with the changes made to BigBang since, in particular the Analyze Senders notebook.
This pull request (using changes from Niels and some fixes of my own) restores the
functionality for generating a matrix of similarities, using the new from_header_distance
function. The notebook walks through computing these similarities, visualizing them with
a color map, finding a cutoff for similarity, and consolidating senders.
https://github.com/sbenthall/bigbang/pull/242
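For anyone who hasn't opened the notebook, the workflow is roughly the following. This is
a standalone sketch, not the notebook's code: it uses difflib's SequenceMatcher as a
stand-in for from_header_distance (whose actual behavior isn't reproduced here), and the
helper names, the cutoff value, and the example addresses are made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; NOT BigBang's from_header_distance."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_matrix(senders):
    """Symmetric matrix of pairwise similarities between From headers."""
    n = len(senders)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Compute each pair once and mirror it across the diagonal.
            matrix[i][j] = matrix[j][i] = similarity(senders[i], senders[j])
    return matrix


def consolidate(senders, matrix, cutoff):
    """Group senders whose similarity is at or above the cutoff (union-find)."""
    parent = list(range(len(senders)))

    def find(i):
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(senders)):
        for j in range(i + 1, len(senders)):
            if matrix[i][j] >= cutoff:
                parent[find(j)] = find(i)

    groups = {}
    for i, sender in enumerate(senders):
        groups.setdefault(find(i), []).append(sender)
    return list(groups.values())


# Hypothetical From headers, only for illustration.
senders = [
    "Jane Doe <jane@example.org>",
    "jane doe <jane@example.org>",
    "Bob Smith <bob@example.com>",
]
groups = consolidate(senders, similarity_matrix(senders), cutoff=0.9)
```

Here the two spellings of the first sender end up in one group and the third sender stays
on its own. The cutoff of 0.9 is arbitrary; in practice you would pick it by looking at
the matrix, for example as a color map via matplotlib's imshow.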
I also see that Seb was working on a separate function to do this with some graph
functionality, in `resolve_sender_entities`. When I ran that function on my test mailing
list, though, it didn't seem to consolidate anything. Maybe I'm misunderstanding
how this function works, but it would be great to know how it's meant to be used,
especially if it gives more accurate similarity calculations or computes them faster.
Thanks,
Nick