In the “Skype 3.X discussion” public chat, there was a discussion about having a plugin that could extract all your information (profile, buddy list, history list, chat logs, etc.) from Skype and save it in a neutral format such as RSS. For chat, it was mentioned that Pamela and Skylook record chat sessions, but the original need came from a Mac user, and these two plugins are only available under Windows. How complicated is it to write a plugin that would extract, say, chat messages from Skype? Can it be made portable across all desktop platforms supported by Skype?
To write simple plugins quickly and easily, and to get a portable result, I recommend using Python and the excellent Public API wrapper Skype4Py. The requirements are simple: only Python and Skype4Py are needed. Under Linux, Python is most likely already installed; for Windows and Mac it can be obtained from the Python web site. Skype4Py can be downloaded from the Skype4Py site.
The little script I will write uses Skype’s API and Skype4Py to get all the chats and, for each chat, print every message. The Skype4Py reference manual has all the documentation we need.
Before sending any command to Skype, we need to initialize Skype4Py and connect to the Skype client. We can optionally set the name under which the plugin will appear in the API authorization request window (if not set, ‘Skype4Py’ is used).
import Skype4Py
skype = Skype4Py.Skype()
skype.FriendlyName = 'Extract_chat_history'
skype.Attach()
Then we need to get the list of all Chat objects. With the text API this would be done using the “SEARCH CHATS” command. The corresponding command in Skype4Py is hidden behind the Chats property of the ISkype class. This property has a list of IChat objects, containing all chats the user has had:
chats = skype.Chats
The IChat class has the Messages property, representing all messages of this chat, as a list of IChatMessage objects. Printing the message bodies of all chats then becomes trivial:
for c in chats:
    for m in c.Messages:
        print m.Body
If you have a lot of chats, like I do, this script will take quite a long time to run. Therefore, for testing I recommend printing only a few chats, which in Python takes only 4 extra characters:
for c in chats[:2]: # print only 2 chats
The script must be run from the command line, as it prints the messages on the standard output. To get the messages in a text file, just redirect the output to a file (‘python extract_chat_history.py > history.txt’).
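One caveat with redirection: under Python 2, printing non-ASCII text to a redirected standard output can raise a UnicodeEncodeError. A possible workaround is to write the file directly with an explicit encoding. The sketch below assumes the message bodies are Unicode strings; write_messages and save_history are hypothetical helper names, not part of Skype4Py.

```python
import io

def write_messages(bodies, out):
    # Write one message body per line to any text file-like object.
    for body in bodies:
        out.write(body + u"\n")

def save_history(bodies, path):
    # Open the target file as UTF-8 text so that accented characters
    # and non-Latin scripts are stored without encoding errors.
    with io.open(path, "w", encoding="utf-8") as out:
        write_messages(bodies, out)
```

With the real script, the bodies would be collected from m.Body in the loops above and passed to save_history together with a file name such as history.txt.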
So far we only print the body of each message. To make the output really useful, we need to add the name of the sender (FromDisplayName property of IChatMessage) and the time of the message (Timestamp property). We can also print additional information about the chat itself, such as the topic (Topic property of IChat). Getting the member list is just as easy (MemberObjects or Members properties of IChat).
When running the script, we notice that the message bodies are not sorted by time. This is easy to fix with the list sort method and a custom comparison function:
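As a rough illustration of that richer output, here is a minimal sketch of a formatting helper. FakeMessage is a hypothetical stand-in for IChatMessage so that the example runs without Skype, and format_message is an illustrative name. It assumes Timestamp holds seconds since the epoch and renders it in UTC.

```python
import time
from collections import namedtuple

# Hypothetical stand-in for IChatMessage, exposing only the three
# properties used here, so the sketch runs without a Skype client.
FakeMessage = namedtuple("FakeMessage", "FromDisplayName Timestamp Body")

def format_message(m):
    # Assumption: Timestamp is seconds since the epoch (shown in UTC).
    when = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(m.Timestamp))
    return u"[%s] %s: %s" % (when, m.FromDisplayName, m.Body)

sample = FakeMessage(u"Alice", 0.0, u"Hello")
line = format_message(sample)
```

In the extraction loop, the formatted line would be printed in place of m.Body alone.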
for c in chats:
    # c.Messages is a tuple; to be able to sort it,
    # convert it into a list
    msg_list = list(c.Messages)
    # Sorting based on timestamps, with custom compare fct
    def message_timestamp_cmp(x, y):
        return int(x.Timestamp - y.Timestamp)
    # why are timestamps float and not int in Skype4Py?
    msg_list.sort(message_timestamp_cmp)
    for m in msg_list:
        print m.Body
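An alternative worth considering is to sort with a key function instead of a comparison function. Converting the difference to int treats two messages less than a second apart as equal, whereas comparing the float timestamps directly keeps their order. The sketch below uses FakeMessage, a hypothetical stand-in for IChatMessage, and an illustrative helper name, sorted_by_time.

```python
from collections import namedtuple

# Hypothetical stand-in for IChatMessage with only the fields used here.
FakeMessage = namedtuple("FakeMessage", "Timestamp Body")

def sorted_by_time(messages):
    # sorted() accepts any iterable (including a tuple) and returns a
    # new list, so no explicit list() conversion is needed.
    return sorted(messages, key=lambda m: m.Timestamp)

sample = (
    FakeMessage(30.5, u"third"),
    FakeMessage(10.0, u"first"),
    FakeMessage(30.2, u"second"),
)
ordered_bodies = [m.Body for m in sorted_by_time(sample)]
```

Here the two messages at 30.2 and 30.5 seconds end up in the right order even though they are less than one second apart.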
Another possible annoyance is that for each buddy, there are many single chats instead of just one. If we wish, we can group them during the extraction (for example, search all IChat objects that have only 2 members and group them by their buddy Skype name).
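The grouping idea could be sketched as follows. FakeUser and FakeChat are hypothetical stand-ins for the Skype4Py objects, and the sketch assumes that each member exposes its Skype name through a Handle property and that the current user's Skype name is known (my_handle); both should be checked against the Skype4Py reference manual.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-ins so the sketch runs without a Skype client.
FakeUser = namedtuple("FakeUser", "Handle")
FakeChat = namedtuple("FakeChat", "Name Members")

def group_by_buddy(chats, my_handle):
    # Map each buddy's Skype name to the list of one-to-one chats
    # held with that buddy; chats with more than 2 members are skipped.
    groups = defaultdict(list)
    for chat in chats:
        handles = [u.Handle for u in chat.Members]
        others = [h for h in handles if h != my_handle]
        if len(handles) == 2 and len(others) == 1:
            groups[others[0]].append(chat)
    return dict(groups)

me, alice, bob = FakeUser(u"me"), FakeUser(u"alice"), FakeUser(u"bob")
chats = [
    FakeChat(u"c1", [me, alice]),
    FakeChat(u"c2", [me, bob]),
    FakeChat(u"c3", [alice, me]),
    FakeChat(u"c4", [me, alice, bob]),  # group chat, ignored
]
groups = group_by_buddy(chats, u"me")
```

The messages of all chats in one group can then be concatenated and sorted by timestamp to get a single history per buddy.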
One can see how easy it is to write a plugin that extracts data from Skype using Skype4Py, in a portable way. More examples are available on my web site.