3/02/2013

Getting information from website into Google Calendar

Lectio2Cal

Introduction

In Denmark, almost all gymnasiums (upper secondary schools) use a web service called Lectio to organise class timetables. Its biggest drawback is that it offers no way to sync with a personal calendar. All the timetables are publicly available at http://www.lectio.dk by choosing the school and class.
The chosen programming language for this project is Python.

Solution

Downloading the website

To get the information from the web page into a calendar, the first step is to download the page so it can be taken apart. For the download, I use cURL with the URL as the only parameter.
 import subprocess
 # Fetch the timetable page for one class and one week with cURL.
 url = "http://www.lectio.dk/lectio/523/SkemaNy.aspx?type=stamklasse&klasseid=2243069721&week=092013"
 subprocess.call(["curl", url])
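
The same download can also be done without an external cURL binary. The following is a minimal sketch, assuming Python 3 and its standard urllib module rather than the cURL call used in this project. The helper names build_url and download_week are hypothetical, and reading the week parameter as a two-digit week number followed by the year is only inferred from the example URL above.

```python
import urllib.parse
import urllib.request

BASE = "http://www.lectio.dk/lectio/{school}/SkemaNy.aspx"


def build_url(school, class_id, week, year):
    """Build a timetable URL (hypothetical helper).

    Assumes the 'week' query value is WWYYYY, as in the example '092013'.
    """
    query = urllib.parse.urlencode({
        "type": "stamklasse",
        "klasseid": class_id,
        "week": "{:02d}{}".format(week, year),
    })
    return BASE.format(school=school) + "?" + query


def download_week(url, path):
    """Fetch the page and save the raw bytes to 'path'."""
    with urllib.request.urlopen(url) as response:
        data = response.read()
    with open(path, "wb") as out:
        out.write(data)
```

Calling build_url(523, 2243069721, 9, 2013) reproduces the URL used above, and download_week(url, temp_path) would then save the page for the parsing step.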


Parsing the website

The parsing is complicated and specific to Lectio's HTML, so it probably can't be reused and is not worth explaining in detail. If you are really interested, look in the source code.
In short, I open the previously downloaded website and send it through a predefined parsing algorithm.
 parser = MyHTMLParser()
 # Read the downloaded page; the file is closed automatically afterwards.
 with open(temp_path, "r") as searchfile:
     htmlstr = searchfile.read()
 parser.feed(htmlstr)

I use HTMLParser, a module bundled with Python, to find the right HTML tags and to search, cut and format the information. The result is a 2D array (a list of lists) holding everything from date, teacher and subject to homework for all the classes that week.
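
To give an idea of the technique without reproducing the Lectio-specific details, here is a minimal sketch, assuming Python 3 (where the module is called html.parser). The markup it parses is invented for illustration and is not Lectio's real HTML. The class name LessonParser, the "lesson" CSS class and the semicolon-separated cell format are likewise hypothetical.

```python
from html.parser import HTMLParser


class LessonParser(HTMLParser):
    """Collect the text of every <td class="lesson"> cell as one row."""

    def __init__(self):
        super().__init__()
        self.rows = []        # the resulting 2D list
        self._inside = False  # True while inside a lesson cell
        self._buffer = []     # text chunks of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == "lesson":
            self._inside = True
            self._buffer = []

    def handle_data(self, data):
        if self._inside:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._inside:
            self._inside = False
            text = "".join(self._buffer)
            # Invented cell format: fields separated by ';'
            self.rows.append([field.strip() for field in text.split(";")])


# Invented sample markup, only to exercise the parser.
sample = (
    '<table><tr>'
    '<td class="lesson">2013-02-25; Math; JD; Read chapter 3</td>'
    '<td class="other">ignored</td>'
    '<td class="lesson">2013-02-26; Physics; AB; </td>'
    '</tr></table>'
)
parser = LessonParser()
parser.feed(sample)
parser.close()
```

After feeding the sample, parser.rows holds one inner list per lesson cell, and cells with any other class are skipped.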


Uploading to calendar

For the calendar, I chose Google Calendar. The only reason for this is to get a solution that carries over to smartphones, tablets and other systems. If I had chosen a solution where I created and imported a .csv file or something similar, importing to post-PC devices would have been much more complicated.
I started off by downloading the Python sample from Google's website and removing everything I didn't see any use for. I kept some of the functions that I might use later. The most important function is _InsertSingleEvent(), which, as the name suggests, uploads a single event to the calendar.
   def _InsertSingleEvent(self,title,content,where,start_time,end_time):  

Now the last task is just to go through every class in the 2D array and pass it to the function above.
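
The loop could look roughly like this. It is only a sketch, assuming each row has the layout [date, start, end, subject, teacher, room, homework], which is a hypothetical layout and not necessarily the one used in the source code, and assuming the upload function accepts start and end times as strings. The fake_insert function stands in for _InsertSingleEvent so that the example runs without a Google account.

```python
def upload_week(rows, insert_event):
    """Call insert_event once per lesson row and return the number uploaded.

    Assumed (hypothetical) row layout:
    [date, start, end, subject, teacher, room, homework]
    """
    count = 0
    for date, start, end, subject, teacher, room, homework in rows:
        insert_event(
            title="{} ({})".format(subject, teacher),
            content=homework,
            where=room,
            start_time="{}T{}:00".format(date, start),
            end_time="{}T{}:00".format(date, end),
        )
        count += 1
    return count


# Stand-in for _InsertSingleEvent that only records what it receives.
uploaded = []


def fake_insert(title, content, where, start_time, end_time):
    uploaded.append((title, where, start_time, end_time, content))


# Invented sample row, only to exercise the loop.
rows = [
    ["2013-02-25", "08:15", "09:45", "Math", "JD", "Room 12", "Read chapter 3"],
]
total = upload_week(rows, fake_insert)
```

Because the keyword names match the signature shown above, the bound _InsertSingleEvent method of the calendar class could be passed in place of fake_insert.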


Conclusion

The software can download one week at a time from Lectio, parse it and upload it to Google Calendar. It does have some flaws:
- It always uploads to the main calendar, so you can't separate the timetable from other events.
- If the algorithm is run twice, the events will be duplicated, not updated.


Source code

Windows - Requires cURL to be located in the same folder.
Mac OS X - It should run without problems.
Linux - Not tested
