<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
Yo's-<br>
<br>
Interesting. A couple of things to keep in mind (speaking from more
than casual knowledge here):<br>
<br>
One can start with unclassified material, apply analysis, and
produce an output that can itself become classified material. This
is how OSINT works (open-source intelligence, where "open source"
means "not classified sources"). So it is possible that some of
what Wikileaks has produced may be classifiable material. If it's
deemed classified, it may or may not be advisable to access it, even
if it is publicly available (for example, if one ever applies for
certain kinds of jobs for which one might be subject to extensive
screening questions). In my experience one should either err on the
side of caution or give thought to how one might defend one's
actions if need be. <br>
<br>
General rule in intel: Collection is easy, post-processing is
tedious, and analysis is hard. Wikileaks appears to have collected
the material lawfully from released archives, and post-processed
it. What Eddan is calling for here is to add software functionality
that simplifies the analytic task. Adding that functionality to
software used for analyzing material shouldn't raise any
controversy in and of itself, especially if the software is useful
for tasks other than analyzing leaked data dumps. <br>
<br>
Analysis requires a certain amount of training and a certain
mindset. <br>
<br>
By way of training, there is extensive unclassified published
literature we can add to our library if anyone's interested. Some
of it is technique-specific, most of it is general but still useful
to read by way of acquiring certain ways of thinking about data.
Anyone else here with a cog sci background may find themselves amused
at the degree to which the US Gov is behind the times in that area:
surely we can do better, and we might consider a project to improve
upon gov-recommended techniques. <br>
<br>
The intel mindset is a personality trait that may not be easy to
acquire, but one approximation is a combination of above-average
pattern sense (roughly equivalent to mild paranoia;-), the ability
to doubt your own hypotheses and conclusions, the ability to
empathize with others (such as the subjects of one's inquiries, or
at least to approximate an understanding of their personality
traits), and the habit of relentlessly applying Occam and other
reasoning tools to sort the wheat from the chaff. There is a
tendency, which must be overcome, to get stuck in a groove of either
overestimating or underestimating apparent patterns in the data.
Getting the balance right is very difficult, but it can be refined
with training. <br>
<br>
And of course one needs to adopt a scientific attitude of
objectivity, or at least non-prejudice, about one's subject matter;
for example, one can't go into an exercise with predetermined notions
about the motives of individuals, etc. (In other words, don't jump
to conclusions based on the names "Nixon" and "Kissinger.")<br>
<br>
Very often it turns out that the key to an analytic exercise is not
something obvious in and of itself, such as ferreting out a damning
quote from a subject, but rather something that emerges in the
relationships between two or more pieces of data each of which is
unremarkable. <br>
<br>
Anyway, if anyone's interested, we can discuss further. Though,
this week I'm majorly busy with work.<br>
<br>
laters-<br>
<br>
-G.<br>
<br>
<br>
=====<br>
<br>
<br>
<br>
<div class="moz-cite-prefix">On 13-04-08-Mon 2:53 AM, Eddan wrote:<br>
</div>
<blockquote
cite="mid:CAMvNwqHdgoCG9uAgbSnMuaptwF75Qr0yCuh0wUP2=e1rdS8KWg@mail.gmail.com"
type="cite">
<div dir="ltr">
<div style="">now this is really dangerous... Add metadata
search functionality to Wikileaks-released excessively
classified diplomatic cables. Make this comprehensible to
people who aren't foreign policy geeks and the Arab Spring
will have been just warm-up practice. </div>
<div><br>
</div>
<div style="">it seems like it would be very useful to better
understand what Wikileaks means by their reverse engineering
of the US gov metadata. it would be _monumental_ to further
enhance this treasure trove with some natural language search
and more sophisticated pattern recognition. Sudo-Leaks,
anyone?</div>
<div><br>
</div>
<div style="">[excerpt from</div>
<a moz-do-not-send="true"
href="http://wikileaks.org/plusd/about/">http://wikileaks.org/plusd/about/</a>]
<div><br>
</div>
<h1
style="margin:0px;padding:4px;background-color:rgb(142,209,224);font-size:large;color:rgb(0,0,0);font-family:sans-serif"><a
moz-do-not-send="true"
href="http://wikileaks.org/plusd/about/#tkc" id="tkc"
style="text-decoration:none;color:rgb(0,0,0)">The Kissinger
Cables</a></h1>
<p style="color:rgb(0,0,0);font-family:sans-serif">The Kissinger
Cables comprise more than 1.7 million US diplomatic records
for the period 1973 to 1976. Dating from January 1, 1973 to
December 31, 1976 they cover a variety of diplomatic traffic
including cables, intelligence reports and congressional
correspondence. They include more than 320,000 originally
classified records, including 286,000 full US diplomatic
cables. There are more than 12,000 documents with the
sensitive handling restriction "NODIS", 'no distribution', and
more than 9,000 labelled "Eyes Only". Full cables originally
classed as "SECRET" total more than 61,000 and "CONFIDENTIAL"
more than 250,000.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">The records
were reviewed by the United States Department of State's
systematic 25-year declassification process. At review, the
records were assessed and either declassified or kept
classified with some or all of the metadata records
declassified. Both sets of records were then subject to an
additional review by the National Archives and Records
Administration (NARA). Once believed to be releasable, they
were placed as individual PDFs at the National Archives as
part of their Central Foreign Policy Files collection. Despite
the review process supposedly assessing documents after 25
years there are no diplomatic records later than 1976. The
formal declassification and review process of these extremely
valuable historical documents is therefore currently running
12 years late.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">The form in
which these documents were at NARA was 1.7 million individual
PDFs. To prepare these documents for integration into the
PlusD collection, WikiLeaks obtained and reverse-engineered
all 1.7 million PDFs and performed a detailed analysis of
individual fields, developed sophisticated technical systems
to deal with the complex and voluminous data and corrected a
great many errors introduced by NARA, the State Department or
its diplomats, for example harmonizing the many different ways
in which departments, capitals and people's names were
spelled. All our corrective work is referenced and available
from the links in the individual field descriptions on the
PlusD text search interface: <a moz-do-not-send="true"
href="https://search.wikileaks.org/plusd"
style="text-decoration:none;color:rgb(33,107,124)">https://search.wikileaks.org/plusd</a>.
For more information on what WikiLeaks did to prepare the
Kissinger Cables please see <a moz-do-not-send="true"
href="http://wikileaks.org/plusd/about/#ptk"
style="text-decoration:none;color:rgb(33,107,124)">here</a>.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">Not all
records from the period 1973-1976 have been obtained. NARA
claims diplomatic records for the period 1973 to 1976 chosen
for content deletion were of an ephemeral character. These
records were identified by the "TAGS" that were attached to
them. TAGS ("Traffic Analysis by Geography and Subject")
refers to the content tagging system implemented by the
Department of State for its central foreign policy files in
1973. There are geographic, organization and subject TAGS.
This system was developed to standardise search terms for
departmental uses and was not static - TAGS were added and
deleted as necessary over time. At review, all cables that
only contained "temporary" TAGS, such as embassy logistical or
staffing requests, were permanently destroyed.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">Tens of
thousands of documents were irreversibly corrupted in this
data set due to technical errors when the documents were moved
as computer systems were upgraded, or so the US Department of
State claims. This caused the content of the document to be
lost, though the metadata is still available. These are often
noted by an error message in the content of the document. The
documents lost in this manner are most documents from the
following periods:</p>
<ul
style="color:rgb(0,0,0);font-family:sans-serif;font-size:12px">
<li>December 1, 1975 to December 15, 1975</li>
<li>March 8, 1976 to April 2, 1976</li>
<li>May 25, 1976 to July 1, 1976</li>
</ul>
<p style="color:rgb(0,0,0);font-family:sans-serif">
You can see the absence of these weeks by constructing a
Timegraph of "TAGS" as this term occurs in the content of
nearly every document: <a moz-do-not-send="true"
href="https://search.wikileaks.org/plusd/graph"
style="text-decoration:none;color:rgb(33,107,124)">https://search.wikileaks.org/plusd/graph</a></p>
<p style="color:rgb(0,0,0);font-family:sans-serif">Top Secret
documents are also not available. During a migration of
records the Department of State printed out all Top Secret
documents for "preservation purposes" and the electronic
versions were destroyed permanently. These documents now only
exist as hardcopies and so are unavailable online in any form,
even if declassified.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">The documents
not deleted either remained classified (or were deemed
unreleasable for other reasons), or were declassified and
publicly released. For the former, a "withdrawal card" was
provided giving some limited metadata about the document, the
fields of which that were decided as releasable vary from
document to document. This metadata provides some information
about the document, for example the date and destination, that
can be used for research purposes and also allows a detailed
FOIA request to be made for the document. These FOIA requests
can be directed to NARA's Special Access and FOIA staff. For
more information about this, please see their online guide <a
moz-do-not-send="true"
href="http://www.archives.gov/foia/foia-guide.html"
style="text-decoration:none;color:rgb(33,107,124)">here</a>.
You will need the document number and the To and From
information.</p>
<p style="color:rgb(0,0,0);font-family:sans-serif">There are
nine different "Types" of document included in the Kissinger
Cables. The majority are of type "TE" - telegram (cable),
which are official diplomatic messages sent between embassies
and the US Secretary of State conveying official information
about policy proposals and implementation, program activities,
or personnel and diplomatic post operations. From 1973 onwards
diplomatic cables were mostly electronic, therefore most
cables made releasable include the body (content) of the
cable. However, the other types of documents are paper
records, including airgrams and diplomatic notes. These are
stored on microfilm (from 1974 onwards, as the Department of
State did not microfilm documents until then) and so were not
released with the full content of the documents, even if
marked for public release. Although the body of the message is
not available online the full index (metadata) is provided for
those "P-reel" documents that were marked for release. Even
though the whole document has not been digitised the metadata
is still useful for research purposes and the documents can be
requested under the Freedom of Information Act. For those
documents on P-reel that were not declassified and released a
P-reel "withdrawal card" is provided giving limited metadata.
To access P-reel documents that have a withdrawal card you
should follow the same FOIA procedure as for Telegram
withdrawal cards. For the content of P-reel documents which
have been released, the process depends slightly on which year
the document you are requesting was created, but all requests
should be directed to: <a moz-do-not-send="true"
href="mailto:archives2reference@nara.gov">archives2reference@nara.gov</a>.</p>
</div>
</blockquote>
<br>
</body>
</html>