However, imagine this situation:
[...]
Good examples of relevant questions. I would suggest that data is treated as singular or plural depending on context. If you are talking about many individual points, for instance, then 'are' would be appropriate. But if you're talking about a bunch of points en bloc then 'is' would make more sense.
Well, you draw the right conclusion from a wrong premise. (See also my earlier reply above.)
Tomorokoshi presented a wrong argument that stems from a logical fallacy. I shall present it in a more formal way so that it is clearer.
Let D = {d_1, ..., d_n} be the data set we are talking about. Tomorokoshi's argument first talks about the individual d_i's and concludes that it is ok to say "these data are" because there are many of them (items 1 and 2 of the argument). So far, so good.
Then, in item 3, his argument shifts the subject of discussion from the individual d_i's to the set D. D is, of course, distinct from the individual d_i's; this follows from the mathematical definition of sets. Now the error follows: the argument implies that it is ok to refer to the individual d_i's with "this data is" because there is exactly one D. That conclusion is wrong.
Because not all people are interested in mathematical arguments, let me use a non-mathematical analogy (but please note that it is only an analogy for illustration): I cannot say "there is 100 apple on this tree" just because there is only one tree. I have to say "there are 100 apples on this tree". Of course, it would be correct to say "there is a tree", and it would be wrong to say "there are tree". You see, we just have to be clear about what we are talking about: either the individual apples or the tree itself. The rest follows from that.
It's the same with data: Are we talking about D (the data set itself) or the individual records, the d_i's? It's that simple. The rest follows from that decision, at least logically. Of course, you may choose to think illogically.
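For anyone who prefers code to set notation, here is a minimal Python sketch of the same distinction. The record values and the helper `verb_for` are made up for illustration; it only demonstrates the counting step of the argument and is not meant as a grammar rule.

```python
# Made-up values standing in for the individual records d_1 ... d_n.
records = [3.1, 4.7, 5.0, 6.2, 8.9]
D = frozenset(records)  # the data set D itself

def verb_for(count: int) -> str:
    """Pick the verb from how many things the subject refers to."""
    return "is" if count == 1 else "are"

# Subject = the individual d_i's: there are n of them.
print(f"these data {verb_for(len(D))}")

# Subject = the set D: there is exactly one of it.
print(f"this data set {verb_for(len([D]))}")

# D is not one of its own elements, so counting D says nothing about the d_i's.
assert D not in D
```

The verb depends only on which subject gets counted, D or the d_i's. Item 3 counts D and then applies the result to the d_i's.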
In item 5 and item 6, complete and utter confusion follows. What is stated there has nothing to do whatsoever with the subject matter.
There is no clarity in the argument; everything gets mixed up. The fact that it is wrong is actually not the problem, but it throws multiple smoke grenades, and *that* is a problem. Talking about encryption using "a very secure algorithm", polls, and losing half of a spreadsheet has nothing to do with the subject; it only obscures the matter and bedazzles the reader. If the argument had used plain logic instead of talking about spreadsheets, polls, and fancy encryption, everything would have been in plain sight, and maybe the error would not even have been made, because it would have been so simple to see.