EPG Autocompleter

Vicky and I have been thinking about how we can make it easier for people to use social media while watching TV even when they don’t have, say, MythTV and an iPhone and a bunch of custom software available.

Often it’s tiny things that make people decide to do one thing or another, or to say something or not. From this point of view, one of the small impediments to saying something online about a TV programme is having to look up some information about it. For very popular programmes such as The Apprentice this doesn’t matter: people can ‘shout at the TV’ using Twitter, those who are watching will know what they mean, and if enough people talk about the same incident or person, trending topics will pick it up.

A trickier case is when you’re watching something slightly obscure or timeshifted. I’ve often watched TV on the strength of Twitter recommendations (like @article_dan). Our thought was that if recommending a programme were slightly easier and slightly more structured, it could be more automatable and perhaps also happen more often.

An oddity about TV compared to the world of the Web is that TV is so geographically restricted and particular to a region or country. Geographical restriction on TV doesn’t seem to be going away – web-based TV services such as Hulu, iPlayer and Joost are geographically restricted because market segmentation by region is one of the ways content owners make money.

This seems to be reflected in social media usage for TV too. At the recent W3C Technical Plenary (TPAC) panel on Web and TV, I asked if any other countries were seeing as much Twitter activity as the UK (including tags at the starts of some programmes), but no-one had seen anything like this example:

where seven of the top ten UK trending topics are related to a single TV programme.

However, social media is not geographically restricted in this way – I have lots of Twitter friends who are not in the UK. I therefore consider it good manners, when talking about something and particularly when recommending it, to provide some way for people to find out more about it. Most people won’t care, but some people might look. It’s the diversity of Twitter networks that makes adding a link to a tweet such a common pattern: common context cannot be assumed.

So given these two aspects of the problem, we have made a simple autocompleter for TV that looks up the URL where available. It works only for UK TV for the current day, and the URL lookup currently works only for BBC content. It does however pull out some basic text about the programme such as title, channel and time, which should help identify programmes without URLs. Atlas API and TVPixie are useful here for finding or creating URLs for non-BBC TV programmes: we will try to integrate those.

How it works

Every day, it takes the title, channel, description, start time and channel number from a MythTV dump and puts them in a database table. A very simple crawler then runs, grabs a fixed list of URLs containing BBC schedule data as JSON, and puts that data in another table. This amounts to a few hundred records in each.
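The crawl step might look roughly like the sketch below. It is a minimal Python illustration rather than the actual code: the function names, the User-Agent string and the injectable fetch/sleep parameters are all assumptions made for the example. It identifies itself and waits a second between requests, and it deliberately does not parse the BBC JSON, since the shape of that data is not described here.

```python
import json
import time
from urllib.request import Request, urlopen

# Hypothetical identification string; replace with your own contact details.
USER_AGENT = "epg-autocompleter-sketch (contact: you@example.org)"


def fetch_json(url):
    """Fetch one URL and decode its body as JSON, identifying ourselves."""
    req = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(req, timeout=30) as resp:
        return json.load(resp)


def crawl(urls, fetch=fetch_json, delay=1.0, sleep=time.sleep):
    """Fetch a fixed list of URLs in order, pausing between requests.

    `fetch` and `sleep` are parameters so the loop can be exercised
    without touching the network.
    """
    results = {}
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay)  # be polite: at least `delay` seconds between requests
        results[url] = fetch(url)
    return results
```

Each decoded document would then be turned into rows (pid, start time, channel name) for the second table.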

We use the jQuery autocomplete plugin with a remote backend to show it working. The remote backend is a Ruby servlet that runs the following SQL query:

select todays_epg.title, crid, channel,
       ABS(TIMEDIFF(NOW(), todays_epg.start)) as foo,
       todays_epg.start, pid
from todays_epg
left join pid_data
  on (todays_epg.start = pid_data.start
      and todays_epg.channel = pid_data.dvb_service_title)
where match(todays_epg.title) against ('news')
order by foo, chan_num
limit 10;

This looks complicated but it’s just:

  • Selecting pid as well as title, channel, date etc when and only when it’s available (i.e. just for BBC content, matching against channel and start time)
  • Ordering firstly by time (nearness to now) and then by channel number (to make the more obscure channels come further down the list).
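To see the join and the ordering in isolation, here is a small sketch using Python’s built-in sqlite3 module rather than MySQL. Several things are assumptions for the sake of a self-contained example: the rows, crids and pids are made up, a plain LIKE stands in for MySQL’s full-text match, and ‘now’ is passed in as a parameter instead of calling NOW().

```python
import sqlite3


def build_demo_db():
    """Create the two tables in memory with a few made-up rows."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE todays_epg (title TEXT, crid TEXT, channel TEXT,
                                 start TEXT, chan_num INTEGER);
        CREATE TABLE pid_data (pid TEXT, start TEXT, dvb_service_title TEXT);
    """)
    db.executemany("INSERT INTO todays_epg VALUES (?,?,?,?,?)", [
        ("BBC News at Six", "crid://example/1", "BBC ONE", "2010-01-15 18:00:00", 1),
        ("Channel 4 News", "crid://example/2", "Channel 4", "2010-01-15 19:00:00", 4),
        ("Newsnight", "crid://example/3", "BBC TWO", "2010-01-15 22:30:00", 2),
        ("EastEnders", "crid://example/4", "BBC ONE", "2010-01-15 20:00:00", 1),
    ])
    # Only BBC programmes have a pid, keyed by start time and channel name.
    db.executemany("INSERT INTO pid_data VALUES (?,?,?)", [
        ("example_pid_six", "2010-01-15 18:00:00", "BBC ONE"),
        ("example_pid_nn", "2010-01-15 22:30:00", "BBC TWO"),
    ])
    return db


QUERY = """
SELECT e.title, e.channel, e.start, p.pid
FROM todays_epg AS e
LEFT JOIN pid_data AS p
  ON e.start = p.start AND e.channel = p.dvb_service_title
WHERE e.title LIKE '%' || ? || '%'
ORDER BY ABS(CAST(strftime('%s', e.start) AS INTEGER)
           - CAST(strftime('%s', ?) AS INTEGER)),
         e.chan_num
LIMIT 10
"""


def complete(db, term, now):
    """Return up to ten matches, nearest to `now` first, then by channel number."""
    return db.execute(QUERY, (term, now)).fetchall()


if __name__ == "__main__":
    for row in complete(build_demo_db(), "news", "2010-01-15 18:30:00"):
        print(row)
```

With ‘now’ at 18:30, the two programmes that are half an hour away tie on time, so the lower channel number comes first, and the non-BBC row comes back with no pid.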

I use the same data to make available what’s on any particular channel right now, as it was trivial:

select title,crid,channel,start from todays_epg where channel='BBC ONE' and start < (NOW()) order by chan_num;

and a crid to pid resolver, as it was also trivial.
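Both lookups are small enough to sketch together. This is an illustration under assumptions, not the servlet code: it uses Python’s sqlite3 with made-up rows, the function names are hypothetical, and ‘what’s on now’ is interpreted as the most recently started programme on the channel, since the table holds start times only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE todays_epg (title TEXT, crid TEXT, channel TEXT, start TEXT);
CREATE TABLE pid_data (pid TEXT, start TEXT, dvb_service_title TEXT);
INSERT INTO todays_epg VALUES
  ('Programme A', 'crid://example/a', 'BBC ONE', '2010-01-15 18:00:00'),
  ('Programme B', 'crid://example/b', 'BBC ONE', '2010-01-15 19:00:00'),
  ('Programme C', 'crid://example/c', 'Channel 4', '2010-01-15 18:30:00');
INSERT INTO pid_data VALUES
  ('example_pid_a', '2010-01-15 18:00:00', 'BBC ONE');
""")


def now_on(db, channel, now):
    """Most recently started programme on `channel` at time `now`, or None."""
    return db.execute(
        "SELECT title, crid, channel, start FROM todays_epg "
        "WHERE channel = ? AND start <= ? ORDER BY start DESC LIMIT 1",
        (channel, now),
    ).fetchone()


def crid_to_pid(db, crid):
    """Resolve a crid to a pid via start time and channel; None if no pid is known."""
    row = db.execute(
        "SELECT p.pid FROM todays_epg AS e JOIN pid_data AS p "
        "ON e.start = p.start AND e.channel = p.dvb_service_title "
        "WHERE e.crid = ?",
        (crid,),
    ).fetchone()
    return row[0] if row else None
```

Timestamps are stored as ISO-style strings here, so comparing them as text gives the same order as comparing them as times.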

Here they are in action.

A few problems / issues

  • Hardcoded URLs: there doesn’t seem to be a list of all BBC channel schedule URLs anywhere, so that’s hardcoded in. They are below if you need them (if you do use that list, be nice – identify yourself and leave at least one second between requests).
  • Hardcoded channel names: there doesn’t seem to be a straightforward mapping between the channel names you get from DVB-T EIT and the ones used on the BBC site, so I’ve hand mapped them (see below)
  • Timezones: I’ve not figured out how to make NOW() in MySQL reflect a timezone. It seems to always use UTC, so we’ll get ordering problems during BST.
  • I had to disable the stoplist in MySQL, as it was ignoring words like ‘new’. TV programme titles often contain terms that are excluded by MySQL’s stoplist.
  • I don’t think this would work with something like Sphinx. It’s nearly fast enough as is anyway, but the query is complex enough that I’m guessing (though I haven’t checked) that Sphinx couldn’t handle the ordering correctly.
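One possible workaround for the timezone issue is to compute ‘now’ in the application and pass it to the query as a parameter instead of relying on NOW(). The sketch below is an assumption-laden illustration: it assumes the EPG start times are stored in UK local time, the function names are hypothetical, and it hand-codes the BST rule (last Sunday of March to last Sunday of October, switching at 01:00 UTC) so that it has no dependency on a timezone database.

```python
from datetime import datetime, timedelta, timezone


def last_sunday_0100_utc(year, month):
    """01:00 UTC on the last Sunday of a 31-day month (March or October)."""
    d = datetime(year, month, 31, 1, 0, tzinfo=timezone.utc)
    # weekday(): Monday is 0 and Sunday is 6, so step back to the Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)


def uk_local_now(utc_now):
    """Convert an aware UTC datetime to naive UK local time (GMT or BST)."""
    bst_start = last_sunday_0100_utc(utc_now.year, 3)
    bst_end = last_sunday_0100_utc(utc_now.year, 10)
    offset = timedelta(hours=1) if bst_start <= utc_now < bst_end else timedelta(0)
    return (utc_now + offset).replace(tzinfo=None)


if __name__ == "__main__":
    print(uk_local_now(datetime.now(timezone.utc)))
```

In practice a timezone library (for example Python’s zoneinfo with the ‘Europe/London’ zone) would usually be the better choice; the hand-written rule is only there to keep the example dependency-free.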

All BBC schedule URLs


BBC channel name mappings

mysql> select distinct service, service_title, dvb_service_title from pid_data;
| service          | service_title                 | dvb_service_title |
| cbbc             | CBBC                          | CBBC Channel      |
| cbeebies         | CBeebies                      | CBeebies          |
| bbctwo           | BBC Two                       | BBC TWO           |
| worldservice     | BBC World Service             | BBC World Sv.     |
| parliament       | BBC Parliament                | BBC Parliament    |
| 6music           | BBC 6 Music                   | BBC 6 Music       |
| radio1           | BBC Radio 1                   | BBC Radio 1       |
| bbcnews          | BBC News Channel              | BBC NEWS          |
| 5live            | BBC Radio 5 live              | BBC R5L           |
| radio2           | BBC Radio 2                   | BBC Radio 2       |
| bbcone           | BBC One                       | BBC ONE           |
| 1xtra            | BBC 1Xtra                     | BBC 1Xtra         |
| 5livesportsextra | BBC Radio 5 live sports extra | BBC R5SX          |
| radio4           | BBC Radio 4                   | BBC Radio 4       |
| bbcfour          | BBC Four                      | BBC FOUR          |
| bbchd            | BBC HD                        | BBC HD            |
| bbcthree         | BBC Three                     | BBC THREE         |
| asiannetwork     | BBC Asian Network             | BBC Asian Net.    |
| radio7           | BBC Radio 7                   | BBC Radio 7       |
19 rows in set (0.00 sec)
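
If you want the hand mapping outside the database, the same rows can be written as a simple lookup from DVB name to BBC service key. The dictionary and function names are just illustrative; the values are copied from the table above.

```python
# DVB-T EIT channel name -> BBC service key, copied from the table above.
DVB_TO_BBC_SERVICE = {
    "CBBC Channel": "cbbc",
    "CBeebies": "cbeebies",
    "BBC TWO": "bbctwo",
    "BBC World Sv.": "worldservice",
    "BBC Parliament": "parliament",
    "BBC 6 Music": "6music",
    "BBC Radio 1": "radio1",
    "BBC NEWS": "bbcnews",
    "BBC R5L": "5live",
    "BBC Radio 2": "radio2",
    "BBC ONE": "bbcone",
    "BBC 1Xtra": "1xtra",
    "BBC R5SX": "5livesportsextra",
    "BBC Radio 4": "radio4",
    "BBC FOUR": "bbcfour",
    "BBC HD": "bbchd",
    "BBC THREE": "bbcthree",
    "BBC Asian Net.": "asiannetwork",
    "BBC Radio 7": "radio7",
}


def bbc_service_for(dvb_name):
    """Return the BBC service key for a DVB channel name, or None if it isn't a BBC channel."""
    return DVB_TO_BBC_SERVICE.get(dvb_name)
```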


The code is on GitHub.
