For example, references to "UTC" (Coordinated Universal Time, in practice equivalent to GMT, Greenwich Mean Time) or to "CET" (Central European Time, which usually implicitly assumes that the viewing software knows about daylight saving time) can be problematic, as can the order of XML items ("category" coming before "description" may generate errors).
Enter parameters:
CleanXMLTV INPUTFILE --output OUTPUTFILE [--cet [cetoffset]] [--kenmerk] [--tba] [--talpa]
With only the parameters INPUTFILE and OUTPUTFILE CleanXMLTV:
If you enter the --cet parameter, a GMT/UTC offset is added to every timecode that does not have one. If you also enter a cetoffset, that number of hours is added to or subtracted from the CET offset.
Example: if the offset of a timecode for a programme in July is missing, and you use --cet, CleanXMLTV will add "+0200".
The benefit is that CleanXMLTV can calculate the proper daylight saving time offset. If you are in London, and you use the parameter "--cet -1", a timecode in July with a missing GMT/UTC offset will receive "+0100". If you are in New York you might want to try "--cet -6", and in Athens "--cet +1".
If timecodes contain a UTC or CET reference, CleanXMLTV always converts it to a numeric GMT/UTC offset. That is, "UTC" is converted to "+0000" and "CET" is converted to "+0100" or "+0200" according to the rules for calculating the daylight saving time offset. If there already is a numeric offset (e.g. "+0100"), CleanXMLTV changes nothing.
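The offset rules above can be sketched in a few lines of Python. This is only an illustration, not CleanXMLTV's own source: the function names are made up, timecodes are assumed to have the 14-digit form YYYYMMDDhhmmss, and it is an assumption that the cetoffset applies only to timecodes that had no reference at all.

```python
from datetime import datetime, timedelta

def last_sunday(year, month):
    """Last Sunday of the given month (month must be 1..11 here)."""
    last_day = datetime(year, month + 1, 1) - timedelta(days=1)
    # weekday(): Monday=0 ... Sunday=6, so step back to the nearest Sunday
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)

def cet_hours(local):
    """1 for CET (winter), 2 for CEST (summer), for a Central European local time."""
    start = last_sunday(local.year, 3).replace(hour=2)
    end = last_sunday(local.year, 10).replace(hour=3)
    return 2 if start <= local < end else 1

def normalise(timecode, cetoffset=0):
    """Mimic the behaviour described for --cet [cetoffset] (hypothetical helper)."""
    stamp, _, zone = timecode.strip().partition(" ")
    zone = zone.strip().upper()
    if zone == "UTC":
        return stamp + " +0000"
    if zone and zone != "CET":
        return timecode.strip()          # numeric offset already present: unchanged
    local = datetime.strptime(stamp[:14], "%Y%m%d%H%M%S")
    hours = cet_hours(local)
    if not zone:                         # assumption: cetoffset only for bare timecodes
        hours += cetoffset
    sign = "+" if hours >= 0 else "-"
    return "%s %s%02d00" % (stamp, sign, abs(hours))
```

With this sketch, `normalise("20050715203000")` yields the "+0200" of the July example, and passing `-1` as the second argument gives the London result.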
If you use the excellent grabber TVgids_to_XML (for Dutch channels; not written by me, but by Amontillado), the XML data may contain items called "Kenmerken". Since there is no equivalent of this in the standard XMLTV data description, you can add the parameter "--kenmerk". The information under "Kenmerken" will then be copied to the end of "<desc>".
If you get data from the RadioTimes (tv_grab_uk_rt), schedules may be so far in the future that the data are not yet known. RadioTimes then lists "To be announced" or "Movie to be announced". If you use GB-PVR, your database will keep these data, and NOT overwrite them when the right data come in. To prevent this, use the option --tba. CleanXMLTV will then not copy these XMLTV entries: if the XMLTV data contains "to be announced" somewhere in the beginning of the title, the whole record will be left out.
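As a rough illustration of what --tba does, here is a small Python sketch (not the tool's actual code). The function name is hypothetical, and for simplicity it matches "to be announced" anywhere in the title, whereas CleanXMLTV only looks near the beginning; the exact rule is not documented here.

```python
import xml.etree.ElementTree as ET

def drop_tba(xml_text):
    """Return the parsed XMLTV tree without 'to be announced' programmes."""
    root = ET.fromstring(xml_text)
    for prog in root.findall("programme"):      # findall returns a list, safe to remove from
        title = prog.findtext("title", default="")
        if "to be announced" in title.lower():  # assumption: case-insensitive substring match
            root.remove(prog)
    return root
```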
For further development: in the past, Nickelodeon and Talpa shared the same channel in the Netherlands, but the guides were not combined in the same XMLTV channel. In one channel there would be a long block from 18:00 hours until 2:00 hours that only said "Talpa". This complicated merging the two data files for this channel. With the "hidden" feature [--talpa], an XML title element that contains only "talpa" (in any combination of upper- and lowercase) is removed.
**** Update 14-1-2008
If the XML tv guide file contains "script" codes, it will not process with XMLTV software. Therefore CleanXMLTV now removes the scripts.
**** Update 21-6-2007
Added cleaning of html links (suddenly appeared in TVGids.nl and had no end tag, causing xmltv errors)
**** Update 18-7-2005
Added --tba functionality
**** Update 9-9-2005
Removed bug that converted codes like "&amp;" to "&"
Added Talpa-block removal
**** Update 4-10-2005
"Talpa" block is now recognised no matter if upper- or lowercase
Enter parameters:
RemoveOld DIRECTORY --age GENERAL_MAX_AGE --mask [FILEMASK, f.e. *.xml] --range [A-B, f.e. 1-3] --rangeage [RANGE_MAX_AGE]
RemoveOld removes all files older than GENERAL_MAX_AGE days. The age in days is rounded at the 12-hour mark: for example, if "age" is 0, files that are 11 hours old will not be deleted. You can set a filemask. If you set a range (e.g. --range 1-2), you can set a second maximum age in "rangeage". If you use --range 1-3 --rangeage 4, the alphabetically 1st, 2nd and 3rd files in the directory must be younger than 4 days, or they will be deleted.
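The selection rule can be sketched as follows in Python. This is a guess at the logic, not RemoveOld's source: the function names are invented, ages are passed in as hours so that nothing on disk is touched, and it is an assumption that the alphabetical positions are counted after the filemask is applied and that a file is deleted only when its rounded age exceeds the limit.

```python
from fnmatch import fnmatch

def rounded_days(age_hours):
    """Round an age in hours to whole days, switching over at the 12-hour mark."""
    return int(age_hours / 24 + 0.5)

def files_to_remove(files, age, mask="*", file_range=None, rangeage=None):
    """files maps file name -> age in hours; returns the names that would be deleted."""
    names = sorted(name for name in files if fnmatch(name, mask))
    doomed = []
    for position, name in enumerate(names, start=1):
        limit = age
        in_range = file_range is not None and file_range[0] <= position <= file_range[1]
        if in_range and rangeage is not None:
            limit = rangeage                 # second maximum age for the ranged files
        if rounded_days(files[name]) > limit:
            doomed.append(name)
    return doomed
```

For the 11-hour example, `files_to_remove({"a.xml": 11, "b.xml": 13}, 0, "*.xml")` leaves a.xml alone and selects only b.xml.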
This is helpful in combination with the XMLTV tool tv_grep. With tv_grep you can generate a subset of an XML file. However, tv_grep does not change the TV channels list. If you manually delete the channels that do not apply from the XML file and save the resulting file, you can use ReplChanList to replace the channel list in one XMLTV file with the list from another one.
ReplChanList INPUTFILE --output OUTPUTFILE --chanlist CHANNELFILE
Be careful to use only a CHANNELFILE whose channels have the proper IDs. Otherwise the information from the TV guide may not be picked up.
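In Python, the idea behind ReplChanList could look roughly like this. It is a sketch under assumptions, not the actual implementation: the function name is hypothetical and both inputs are taken as XML strings with "channel" elements directly under the root.

```python
import xml.etree.ElementTree as ET

def replace_channel_list(guide_xml, channels_xml):
    """Swap the channel elements of the guide for those of the channel file."""
    guide = ET.fromstring(guide_xml)
    donor = ET.fromstring(channels_xml)
    for old in guide.findall("channel"):
        guide.remove(old)                    # drop the original channel list
    for index, channel in enumerate(donor.findall("channel")):
        guide.insert(index, channel)         # new list goes first, before the programmes
    return guide
```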
ReplChanID INPUTFILE --from ID1 --to ID2 [--display DISP] --output OUTPUTFILE
ReplChanID replaces the channel ID ID1 with ID2 and writes the result to OUTPUTFILE. If specified, DISP is written as the new display name for that channel.
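A minimal Python sketch of such an ID replacement is given below. It is not the tool's code: the function name is made up, and it is an assumption that the "channel" attribute of the programme entries is renamed along with the channel's own "id".

```python
import xml.etree.ElementTree as ET

def replace_channel_id(xml_text, old_id, new_id, display=None):
    """Rename channel old_id to new_id, optionally setting a new display name."""
    root = ET.fromstring(xml_text)
    for channel in root.findall("channel"):
        if channel.get("id") == old_id:
            channel.set("id", new_id)
            if display is not None:
                name = channel.find("display-name")
                if name is None:
                    name = ET.SubElement(channel, "display-name")
                name.text = display
    for programme in root.findall("programme"):
        if programme.get("channel") == old_id:   # assumption: programmes follow the rename
            programme.set("channel", new_id)
    return root
```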
Convert_Latin1_UTF8 INPUTFILE --output OUTPUTFILE [--ToFormat UTF-8|Latin1]
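The conversion itself is simple to illustrate in Python. This sketch is an assumption about what the tool does, based only on its name and the --ToFormat option. The function name is hypothetical, and the sketch does not update any encoding named in the XML declaration.

```python
def convert(data, to_format="UTF-8"):
    """Re-encode raw bytes between Latin-1 and UTF-8 (direction chosen by to_format)."""
    if to_format.upper() in ("UTF-8", "UTF8"):
        return data.decode("latin-1").encode("utf-8")
    return data.decode("utf-8").encode("latin-1")
```

For example, the Latin-1 byte 0xE9 ("é") becomes the two UTF-8 bytes 0xC3 0xA9, and the reverse direction turns them back into one byte.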
Technobabble
The tools are all pretty straightforward. One routine I went looking for myself determines the start and end of Daylight Saving Time. In Central Europe it starts at 2 a.m. on the last Sunday of March, and ends at 3 a.m. on the last Sunday of October. I developed the following routine myself, since I could not find one on the Web. If you are looking for this: come and steal it. Donations are welcome, however.
uses DateUtils;
...
var
  timezonechange, recordingtime: TDateTime;
  weeksinmonth: Integer;
...
{Get year, month, day, hour, minute and second information from the "start time" in the XML file}
recordingtime := EncodeDateTime(dtyear, dtmonth, dtday, dthour, dtminute, dtsecond, 0);
{Try the 5th Sunday of March first; step back to the 4th if there is none}
weeksinmonth := 5;
while not TryEncodeDayOfWeekInMonth(dtyear, 3, weeksinmonth, DaySunday, timezonechange) do
  weeksinmonth := weeksinmonth - 1;
timezonechange := RecodeHour(timezonechange, 2);
{Now we have start of Daylight Saving Time}
The "TryEncodeDayOfWeekInMonth" function first tries the fifth Sunday (the DaySunday constant comes for free with Delphi!) of the month March (=3, third month of the year) and puts the time code in "timezonechange". If there is no 5th Sunday, it will try the 4th Sunday. Once successful, the routine sets the time to 2 a.m. "timezonechange" now holds the exact time code for the change to Daylight Saving Time.
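For readers without Delphi, the same "try the fifth Sunday, then the fourth" idea can be sketched in Python. This is a loose translation for illustration, not the routine used in the tools: the helper names are invented, and the end of Daylight Saving Time (last Sunday of October, 3 a.m.) is added along the same lines.

```python
from datetime import datetime, timedelta

def nth_sunday(year, month, n):
    """Return the n-th Sunday of the month, or None if the month has no such Sunday."""
    first = datetime(year, month, 1)
    # weekday(): Monday=0 ... Sunday=6
    first_sunday = first + timedelta(days=(6 - first.weekday()) % 7)
    candidate = first_sunday + timedelta(weeks=n - 1)
    return candidate if candidate.month == month else None

def last_sunday(year, month):
    """Try the 5th Sunday first and fall back to the 4th, as in the Delphi loop."""
    n = 5
    while nth_sunday(year, month, n) is None:
        n -= 1
    return nth_sunday(year, month, n)

def dst_window(year):
    """Start (2 a.m.) and end (3 a.m.) of Daylight Saving Time, in local time."""
    return (last_sunday(year, 3).replace(hour=2),
            last_sunday(year, 10).replace(hour=3))
```

Calling `dst_window(2008)` gives 30 March 2008, 02:00 and 26 October 2008, 03:00.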
Joost Smits, jsmits@prize.nl
Updated: 14 January 2008
All rights reserved.