Saturday, November 21, 2009
Ok, having spent a fair bit of time sifting through the files referenced in my original Ticker on this subject, I have some additional observations.
Please note - in my "past life" I ran an ISP, and am a qualified expert in these matters. I write spam filtering software commercially and have since 1995, being the author of the first ISP-centered spam interdiction package. As such when it comes to issues like Internet mail transport I can easily speak to what is supposed to be present - and what is not.
Further, I want to note that my interest in this has absolutely nothing to do with the underlying claim - "Is Man-Made Global Warming Real?" Rather, my interest in this is whether or not the alleged scientific process has been followed - or subverted.
There is one axiom that I believe we can all agree on:
The climate is always in flux - that is, it is always changing. It has done so over the millions of years in the past, and will in the millions of years in the future.
Science is the process by which we take a question and:
Form a hypothesis.
Design an experiment to test that hypothesis.
Perform the experiment and collect the data thus generated.
Analyze the resulting data.
Form a conclusion from the data thus collected.
That's "The Scientific Method."
To the extent that method is corrupted on purpose one does not have science. To the extent that it is corrupted out of necessity (e.g. missing data that one requires, and thus one "guesses") this is accepted provided one discloses one's guess and how it was derived - that is, provided there is no material concealment.
In the "Big Science World" the check and balance on concealment - and outright fraud - is peer review and post-publication duplication. To be able to duplicate the results claimed, however, the algorithms, code, methods and data sets must be made publicly available so that anyone who desires to do so can validate the claimed experimental results.
In the spirit of science, I will note that I fully expect others to try to validate (or dispute) my observations below. As such you can find the original archive at Wikileaks should you decide you would like to do so, and I encourage all other independent investigation.
Now, on to the observations, after spending an evening and morning with the data (and no, I haven't gone through it all yet - there's a hell of a lot here folks.)
There are apparently 1073 emails, each with a sequence number but those numbers are not sequential. That is, there are a lot of sequence numbers missing. However, the dates in the files appear to be ordinal (that is, increasing from earliest to latest) with the last entry being November 12th of this year.
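For anyone who wants to repeat that check, here is a minimal Python sketch of the two tests described above: look for gaps in the sequence numbers, and confirm the dates never go backwards. The `records` list is made-up sample data, not the archive's real numbering, and the function names are my own for illustration.

```python
from datetime import datetime


def missing_numbers(seq_numbers):
    """Return the sequence numbers absent between the smallest and largest seen."""
    present = set(seq_numbers)
    return [n for n in range(min(present), max(present) + 1) if n not in present]


def is_non_decreasing(dates):
    """True if each date is the same as or later than the one before it."""
    return all(a <= b for a, b in zip(dates, dates[1:]))


# Hypothetical (sequence number, date) pairs standing in for the real files.
records = [
    (1, datetime(1996, 3, 7)),
    (2, datetime(1996, 3, 9)),
    (5, datetime(1997, 1, 2)),
    (9, datetime(2009, 11, 12)),
]

seqs = [s for s, _ in records]
dates = [d for _, d in records]

print(missing_numbers(seqs))     # [3, 4, 6, 7, 8]
print(is_non_decreasing(dates))  # True
```

Gaps together with strictly ordered dates is the pattern you would expect from a selection out of a larger numbered set, rather than from a complete capture.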
This strongly implies this is a partial data set intercept of email from some point. The same person does not appear as a "to" or "from" in each email (although there is a lot of commonality), which belies the general idea that this was someone's "saved storage" - at least at first blush.
The intercept, wherever it happened, does not appear to have been done at the system or transport level. Specifically, the "Received:" and "Message-ID:" lines that are part of all internet-transported email are missing. This strongly implies that wherever these emails came from, they were saved/stored by one or more user(s) and were not an automated process that was maintaining archival (or forensic) logs.
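The header check is easy to reproduce. Below is a minimal sketch using Python's standard `email` module; the two sample messages and their addresses are invented for illustration, and `missing_transport_headers` is a name I made up, not part of any tool used on the archive.

```python
from email import message_from_string

# Headers normally present on mail that actually crossed the Internet.
TRANSPORT_HEADERS = ("Received", "Message-ID")


def missing_transport_headers(raw_message):
    """Return the transport-level headers that the message does not carry."""
    msg = message_from_string(raw_message)
    return [h for h in TRANSPORT_HEADERS if msg.get(h) is None]


# Hypothetical user-saved copy: only the headers a mail client displays.
saved_copy = (
    "From: alice@example.org\n"
    "To: bob@example.org\n"
    "Subject: hypothetical sample\n"
    "Date: Thu, 12 Nov 2009 10:00:00 +0000\n"
    "\n"
    "Body text.\n"
)

# The same message as it might look with its transport headers intact.
transported = (
    "Received: from mail.example.org by mx.example.net; "
    "Thu, 12 Nov 2009 10:00:05 +0000\n"
    "Message-ID: <1234@example.org>\n"
) + saved_copy

print(missing_transport_headers(saved_copy))   # ['Received', 'Message-ID']
print(missing_transport_headers(transported))  # []
```

Header lookup in the `email` module is case-insensitive, so variants such as `Message-Id:` are still found.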
The emails themselves, however, look authentic. That is, the formatting is consistent with character mode operation in many of the messages (Unix) and Windows or Mac format programs in others. The quoting is consistent - and correct for the time period in question. Attachments are missing, again implying that this is someone's "saved copy" and NOT from a system-level stream. The early emails contain a fairly significant number of messages that are consistent with the user being on a character-mode terminal (e.g. ELM, MUTT or similar on a Unix system), including the quoting and line formatting. A shift toward "desktop email programs" - that is, messages that appear more and more to come from programs such as Eudora, Thunderbird, Outlook and the like - is also apparent as time goes on.
My conclusions on the email data set itself are that this is very likely to be either (1) someone's "private email" storage of things they wanted to save, or (2) it was a working directory of someone who was in the process of putting forward a response to an FOI request or internal inquiry of some sort. The messages are not the entire email stream to or from any specific set of users, but rather are a set selected in some fashion - either by the person saving them as "important" or by someone collating messages for the purpose of responding to some sort of request. The majority of the messages themselves are what appear to be ordinary and reasonable discourse between scientists and researchers with an occasional "revealing glance" at the various defensive (and offensive!) approaches to those who question their premise and conclusions. Wikileaks concurs with the latter assessment.
In short, I see nothing in that data set that implies that the messages have been tampered with, but there is also no reasonable way to prove their provenance as the necessary information to do so (routing and message-id information) is missing. A well-placed FOI request should resolve that problem, if anyone is particularly interested in doing so.
The data sets included in the archive are also interesting. Again, a reasonably-detailed look through them shows nothing implying that they have been tampered with, and they include data and computer code (source program code) from a wide variety of time periods. It appears authentic.
Comments within, however, disclose an extraordinary amount of extrapolation and "curve fitting" (that is, fitting results to data, not the other way around as it should be) that appears to have been going on in the process of so-called "analysis." Worse, there are plenty of comments that make clear that the researchers are literally making things up as they go along - many of the data sets are claimed to be incomplete, inaccurate in terms of their time frames vs. what is claimed in the headers and titles, and containing junk values.
There is some real trouble here, in that if you're not sure what you've got (that is, you're not sure what the data is!) or worse, you're knowingly missing pieces that you need to perform an analysis, what are you "analyzing"?
Worse, there are comments in the files that make clear that there are observations that are outside of what has been published - and worse, some of those observations are ten times outside the alleged "resolution" of claimed results. Uh, that's a major problem, and goes back to what I have repeatedly said about so-called "climate science" for a very long time, specifically (from Musings):
It is, however, entirely possible that we will find that indeed man is responsible for some of the warming that is taking place, but that this contribution is extremely small - say, 5%. That is, if the global temperature is due to rise by 10 degrees F in the next 100 years, we are responsible for only 0.5F of that rise! Thus, were we to completely cut off CO2 emissions, we'd STILL see a 9.5F rise in temperature. Obviously, if this is the case, then the data does not support taking any sort of drastic action at this time.
The problem with the current political-speak coming from these so-called "scientists" is that it contains no real data and no ranges of uncertainty on their alleged measurements.
That's not science folks - it's politics - and we must, as a nation and people, refuse to be cowed by bald claims without the presence of facts behind them.
I have long argued that the major problem with so-called "published papers" on global warming is that it is rare to find measurement uncertainties reported in the alleged findings, and competing studies have cited wildly different values for the same thing (e.g. atmospheric CO2 emitted by man per year.)
I believe we can now deduce why those uncertainties are missing - they are not being carried through the computational process as is required for any scientific calculation and this omission is in fact intentional.
This is, quite literally, first-semester college physics (or chemistry, or any other "hard" science.) If you turn in an answer to the question "How long is that ruler?" that reads "12 inches" you get a zero.
The scientist says "12 inches +/- 0.1 inch", reflecting the limits of his measurement. The carrying through of uncertainties is essential to hard science, as only from that process can one compute the statistical bands of probability that the result reported is actually the result in the real world.
Uncertainties in measurement are additive in the worst case - that is, if I measure two rulers and each is reported as "12 inches +/- 0.1 inch" then the total length of the two rulers is 24 inches +/- 0.2 inch - because it is possible that both errors were on the same side.
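The ruler arithmetic can be written out in a few lines of Python. This is only a sketch, and the function names are mine. The first function is the rule stated above: add the uncertainties directly, which gives the worst-case bound. The second is the root-sum-square combination usually taught alongside it for independent random errors; I include it for comparison only, since it rests on an assumption (independence) that the text above does not make.

```python
import math


def add_worst_case(*measurements):
    """Sum (value, uncertainty) pairs, adding the uncertainties linearly."""
    total = sum(value for value, _ in measurements)
    bound = sum(unc for _, unc in measurements)
    return total, bound


def add_in_quadrature(*measurements):
    """Sum (value, uncertainty) pairs, assuming independent random errors."""
    total = sum(value for value, _ in measurements)
    sigma = math.sqrt(sum(unc * unc for _, unc in measurements))
    return total, sigma


ruler = (12.0, 0.1)  # 12 inches +/- 0.1 inch

print(add_worst_case(ruler, ruler))     # (24.0, 0.2)
print(add_in_quadrature(ruler, ruler))  # 24.0 with an uncertainty of roughly 0.14
```

Either way the point stands: the combined figure carries an uncertainty, and it is larger than that of either input.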
When one performs complex mathematical functions on input data uncertainties must also be carried through the mathematical functions. Without that we know nothing about the quality of the result - it is entirely possible, given data with enough noise in it, to produce what looks like a perfectly valid answer but have it be absolute trash and of no value at all.
The only way to know if that is possible is for all measurements to be reported with their uncertainties attached, and for all uncertainties to be carried through all computational processes.
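To make "carried through" concrete, here is a minimal sketch of first-order (linearized) error propagation, the standard textbook approach for a function of several measured inputs. It assumes the input errors are independent and small relative to the values, and it estimates the slopes numerically. The `propagate` helper and the `area` example are hypothetical illustrations, not anything taken from the archive's code.

```python
import math


def propagate(f, values, sigmas, rel_step=1e-6):
    """Return (f(values), uncertainty) using first-order error propagation.

    Assumes independent inputs: variance = sum((df/dx_i * sigma_i) ** 2).
    Partial derivatives are estimated with central differences.
    """
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * (abs(x) or 1.0)  # step scaled to the input's size
        up = list(values)
        up[i] = x + h
        down = list(values)
        down[i] = x - h
        slope = (f(*up) - f(*down)) / (2 * h)
        variance += (slope * s) ** 2
    return f(*values), math.sqrt(variance)


def area(length, width):
    """Area of a rectangle, used here as a stand-in for a real computation."""
    return length * width


# Two sides, each measured as 12 inches +/- 0.1 inch.
value, sigma = propagate(area, [12.0, 12.0], [0.1, 0.1])
print(f"{value:.1f} +/- {sigma:.1f} square inches")  # 144.0 +/- 1.7 square inches
```

A result reported as plain "144" hides the fact that the inputs only support it to within a couple of square inches, and the gap grows quickly as the computation gets more complex or the inputs get noisier.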
It is quite clear, from the data sets I have looked at, that this is simply not being done. Instead computations are being "fudged" to fit data to expected, previously claimed results, and/or data sets that do not fit with either previously-published numbers or desired outcomes are simply discarded or modified. Here's just one example from the comments in the files:
ARGH. Just went back to check on synthetic production. Apparently - I have no memory of this at all - we're not doing observed rain days! It's all synthetic from 1990 onwards. So I'm going to need conditionals in the update program to handle that. And separate gridding before 1989. And what TF happens to station counts?
OH F**K THIS. It's Sunday evening, I've worked all weekend, and just when I thought it was done I'm hitting yet another problem that's based on the hopeless state of our databases. There is no uniform data integrity, it's just a catalogue of issues that continues to grow as they're found.
This, by the way, is exactly the (intentional) "error" that was made by the "ratings agencies" and banks when it came to securitized debt that had "less than fully-verified income and assets" as a component. Uncertainties on the reported income and assets were never determined from experimental sampling and carried through the computational process. If they had been then the outcomes that we have actually seen would have been predicted within the range of possible outcomes for this debt. Instead, the issued securities were rated "AAA" because the agencies did not apply an uncertainty to each of the alleged reported numbers. That's what happens when you ignore the scientific method - you put garbage into a computation, you get garbage back out and it is impossible for an outside observer to detect that you did so because you refuse to give him the uncertainties associated with your claimed "measurements"!
Some of the guys working on this stuff appear to be genuinely trying to clean up other people's trash. But trash in produces trash out, and if you can't successfully defend the statistical integrity of the data going into your computational models you have nothing.
This leaves me with one final question: since we have emails now apparently documenting an attempt to "paper over" temperature decreases in recent years, and we also have claims of "lost" data, one wonders - was the data really lost, or was it intentionally deleted or withheld from other researchers who asked for it, as providing it would show that measurement uncertainties were not carried through computationally - and if they were, the claimed results in the so-called "peer reviewed" paper would be impossible to validate?
Without hard proof of whatever answer is propounded to that question we as the people of this planet must insist on a full stop for all purported "climate amelioration" efforts, as there is every possibility that the entirety of this so-called science in fact proves exactly nothing, except that the so-called "researchers" have added much CO2 to the atmosphere producing the electricity required to power their computers!
Extraordinary claims require extraordinary proof, and from the released set of data that proof is, quite simply, not present and accounted for.
All material herein Copyright 2007/2009 Karl Denninger. All rights reserved.