I just updated two corpus files (the second and third) because they contained invalid XML encoding information: they declared utf8 instead of UTF-8.
Unfortunately, XMLStarlet, which I used to pretty-print the XML, accepted utf8 as a valid value and wrote it straight into the file. This did not affect everybody, but at least one person was stumped enough to email me, so I felt it should be fixed.
If you would rather not redownload the file and can open large XML files (e.g. in Vim), you can fix the problem yourself: it is easy to spot, right at the end of the first line.
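For anyone who would rather not open a large file in an editor at all, the same fix can be scripted. The following is only a rough sketch, not something I ran against the corpus itself: it assumes the first line is an XML declaration containing encoding="utf8" (single or double quotes), and the function name fix_xml_encoding is made up for illustration. It rewrites only that first line and streams the rest of the file through unchanged.

```python
import os
import re
import shutil
import tempfile

# Assumption: the declaration contains encoding="utf8" or encoding='utf8'.
_BAD_ENCODING = re.compile(rb"""encoding=(["'])utf8\1""", re.IGNORECASE)


def fix_xml_encoding(path):
    """Replace encoding="utf8" with encoding="UTF-8" on the first line of `path`.

    Returns True if the file was changed, False if nothing needed fixing.
    (Hypothetical helper, written for illustration only.)
    """
    with open(path, "rb") as src:
        first_line = src.readline()
        fixed_line, count = _BAD_ENCODING.subn(
            rb"encoding=\g<1>UTF-8\g<1>", first_line, count=1
        )
        if count == 0:
            return False
        # Write into a temporary file next to the original, then swap it in,
        # so the large file is never held in memory.
        directory = os.path.dirname(os.path.abspath(path))
        with tempfile.NamedTemporaryFile("wb", dir=directory, delete=False) as dst:
            dst.write(fixed_line)
            shutil.copyfileobj(src, dst)  # copy the remainder unchanged
            tmp_name = dst.name
    shutil.copymode(path, tmp_name)  # keep the original file permissions
    os.replace(tmp_name, path)
    return True
```

Calling fix_xml_encoding("corpus.xml") (the file name is a placeholder) returns True if the declaration was rewritten and False if the first line did not contain the faulty value, so running it on an already corrected file does no harm.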