"Data" is actually a mass noun. And the ODE says:
usage: In Latin, data is the plural of datum and, historically and in specialized scientific fields, it is also treated as a plural in English, taking a plural verb, as in the data were collected and classified. In modern non-scientific use, however, it is generally not treated as a plural. Instead, it is treated as a mass noun, similar to a word like information, which takes a singular verb. Sentences such as data was collected over a number of years are now widely accepted in standard English.
So data is. Sorry
Yes, I use, "This data is..." even though I'm fully aware that grammarian and scientist pedants will use, endorse, and cajole the use of, "These data are...".
However, imagine this situation:
1. A scientist collects a large volume of data points into a spreadsheet.
2. At this point is it referred to as "This data is..." or as "These data are..."?
3. The spreadsheet is encrypted using a very secure algorithm.
4. At this point is it referred to as "This data is..." or as "These data are..."?
5. Somewhere in the middle of the file one byte is changed. Now the spreadsheet can't be decrypted.
6. At this point is it referred to as "This data is..." or as "These data are..."? No data points are available, as all the data points were formed into a unitary item.
7. Working backwards, apply the same reasoning to all versions of the spreadsheet.
Using the term, "These data are..." carries an implication that some aspect of the data could be removed, yielding much the same result.
For instance, I poll 1000 people about the use of, "These data are..." compared to "This data is...". The results are collected into a spreadsheet, and sorted by response with the first 500 endorsing "This data is...", and the second 500 endorsing "These data are...". The spreadsheet gets corrupted, losing the second half. A statistical review of the spreadsheet would then reveal that 100% of respondents endorsed, "This data is...".
The point being that the data set is a unitary collection, otherwise one is corrupting the data set. Therefore, "This data is...".