I hope this email finds you all very well!
In preparation for the IETF hackathon I have spent three days
downloading ~33 GB of IETF mailinglist archives from
I will bring it on a disk to the hackathon in London, but am also gonna
make it available as a zipfile from a server, which should allow for
much quicker download. Will share the IP address here soon.
I will also put an archive of all ICANN mailing lists there.
I am still looking for a way to create CSVs from the archives so that
they can be used directly in BigBang (see the discussion below with
Sebastian). All input appreciated!
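
One possible approach, as a minimal sketch only: if the downloaded archives are in mbox format (an assumption; the format is not stated here), the Python 3 standard library can turn each file into a CSV. The function names mbox_to_csv and body_text are hypothetical, and the column names are a guess, not a confirmed match for what BigBang expects. Messages are written one row at a time, so a whole archive does not need to fit in memory.

```python
import csv
import mailbox

# Assumed column layout; adjust to whatever the consuming tool expects.
FIELDS = ["Message-ID", "From", "Subject", "Date", "In-Reply-To", "Body"]


def body_text(msg):
    """Return the first text/plain part of a message, decoded leniently."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            return payload.decode(charset, errors="replace")
        except LookupError:
            # The header named a charset Python does not know.
            return payload.decode("utf-8", errors="replace")
    return ""


def mbox_to_csv(mbox_path, csv_path):
    """Write one CSV row per message in the mbox; return the message count."""
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        # Explicit UTF-8 avoids implicit ASCII encoding of non-ASCII text.
        with open(csv_path, "w", newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            writer.writerow(FIELDS)
            for msg in box:
                row = [str(msg.get(name, "")) for name in FIELDS[:-1]]
                row.append(body_text(msg))
                writer.writerow(row)
                count += 1
    finally:
        box.close()
    return count
```

Calling mbox_to_csv once per downloaded list file would give one CSV per list, which keeps each output small enough to load on its own.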
-------- Forwarded Message --------
Subject: Re: quick q
Date: Wed, 14 Mar 2018 22:03:36 +0100
From: Niels ten Oever <niels(a)article19.org>
To: Sebastian Benthall <sbenthall(a)gmail.com>
Not sure if I am correctly doing what you're suggesting, but:
On 03/14/2018 08:20 PM, Sebastian Benthall wrote:
> You may have trouble getting all 33 gigs into memory at the same time.
> I've never tried that.
> Have you tried creating an Archive object for just one group, as is
> illustrated in the example notebooks?
If I use for instance:
$ python2 bin/collect_mail.py -u
I get endless chardet errors and it ends in:
DEBUG:chardet.charsetprober:windows-1255 Hebrew confidence = 0.0
tzinfo.utcoffset() returned 1440; must be in -1439 .. 1439
'ascii' codec can't encode character u'\xe4' in position 1084: ordinal
not in range(128)
Can't export data. Aborting.
So this was not a reliable way to get all the mailing lists for the
hackathon, which is why I used wget to get them.
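
The two fatal errors in the log above look like a Date header whose timezone offset is out of range (1440 minutes is a full 24 hours) and non-ASCII text being encoded as ASCII under Python 2. That reading is an assumption, not a confirmed diagnosis. As a minimal Python 3 sketch, a hypothetical helper safe_date can parse dates defensively so that one malformed header does not abort a whole export; writing output files with an explicit encoding="utf-8" addresses the second error.

```python
from email.utils import parsedate_to_datetime


def safe_date(header_value):
    """Return an ISO 8601 string for a Date header, or "" if it is unusable."""
    if not header_value:
        return ""
    try:
        return parsedate_to_datetime(header_value).isoformat()
    except (TypeError, ValueError, OverflowError):
        # Unparseable text, or an offset outside the range datetime accepts.
        return ""
```

Rows with an empty date can then be kept and reviewed later instead of stopping the run.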
So now I am looking for a way to make them easily usable for the
participants, but am not sure how to do this.
I am not sure which example notebook you meant; all the ones I looked
through either need a CSV or try to download the list
> I believe that when it creates one from raw email it will generate a
> .CSV file of the same data for you.
> On Mar 14, 2018 1:41 PM, "Niels ten Oever" <niels(a)article19.org> wrote:
> In other words, over the past three days I downloaded all these:
> And now I would like to import them into BigBang, but I am not sure what
> command to use.
> When I try to use the notebooks, they ask for CSVs.
> Niels ten Oever
> Article 19
> www.article19.org
> PGP fingerprint 2458 0B70 5C4A FD8A 9488
> 643A 0ED8 3F3A 468A C8B3
> On 03/14/2018 06:31 PM, Niels ten Oever wrote:
> > Hiya Seb,
> > All good? I have a quick question. Do you know how I can import
> > mailing lists that I have already downloaded? In other words, how do I
> > create CSVs of the 33 GB of mailing lists I just harvested :)
> > Hope all is well! I think I will be churning on this stuff this night,
> > so maybe expect some mails later ;) xx
> > ~n.,