[Bigbang-dev] Default ICANN data in BigBang package?

Sebastian Benthall sbenthall at gmail.com
Tue Aug 16 11:44:25 PDT 2016


Generally it's good not to have all the data one works with checked into
version control.

Actually, *no* data is currently checked into version control. When you
install BigBang, you have to run the collect_mail scripts before you can
get anything out of the notebooks.

If there's a project that uses BigBang for extensive analysis of data from
a single source, then it's probably best to keep that as a fork and have it
update from the core repository.

What I'm wondering now is whether all, some, or none of the Summer School
notebooks should make it in as is. Currently there are many near-duplicate
notebooks in the examples/ directory, along with a lot of other stuff from
previous uses of the software.

Some hard work that's going to need to happen soon is pruning and
standardizing the stuff in that directory. Along the way we should come up
with code quality guidelines and standards for new notebooks.

On Tue, Aug 16, 2016 at 11:23 AM, Niels ten Oever <niels at article19.org>
wrote:

> Hi Sebastian,
>
> We can include the ICANN data, and soon we should also be able to
> introduce IETF data :)
>
> Cheers,
>
> Niels
>
>
> Niels ten Oever
> Head of Digital
>
> Article 19
> www.article19.org
>
> PGP fingerprint    8D9F C567 BEE4 A431 56C4
>                    678B 08B5 A0F2 636D 68E9
>
> On 08/16/2016 12:31 PM, Sebastian Benthall wrote:
> > Many of the new notebooks from the DMI Summer School are designed to
> > work with a subset of ICANN email data having to do with human rights.
> >
> > Ideally, what gets included in the core BigBang repository is easy for
> > people to get started with. That's why all the other notebooks have used
> > just a few SciPy mailing lists.
> >
> > I'm wondering whether we should include the ICANN data in the core
> > BigBang repository.
> >
> > I don't think there's a privacy issue with that, though maybe somebody
> > else might have a reason to object.
> >
> > It would also be a strong signal that BigBang is now intended to be used
> > to analyze Internet governance, not just open source communities.
> >
> > Thoughts?
> >
> > - s
>
>