March 19, 2003
Driving win32 GUIs with Python, part 1

I'm doing a lot of this at the moment, so I thought I'd jot down some of what I've learnt. I certainly couldn't find any of it on the 'net.

I'm not going to post everything at once - I'll do it a piece at a time.

OK, so you are trying to automate some process, and one of the applications involved doesn't want to play. No COM automation, no command line, no nothing. You are going to have to drive its GUI.

One possibility (the first that I tried) is to use something like AutoIT. This (free) automation tool can drive a lot of things, and a COM control is provided, so you can script it all in Python. Problem is, if you push it too hard, the rough edges start to show. I find I need to reset the control (using the Init() method) all the time, or it stops recognising windows. One of the applications I'm working with doesn't play by the Windows rules, and so not everything works. Lastly, it's inherently limited - you cannot, for example, use it to get the text from an edit control. For all these reasons, I frequently have to roll my own.

To run this lot, you'll need Python, and Mark Hammond's win32 stuff. I'm using version 2.2.2 and build 152 respectively, on NT 4 service pack 6. If any of this works (or doesn't) on other versions, I'd be grateful to know.

First step - before you can do anything with a GUI component, you'll have to find it. Bit of terminology for you - every Windows GUI component is a window, whatever it is: text area, application window, scroll bar, button, they are all windows. And each of them has a window handle (often called hwnd). Windows often live in other windows, and can contain yet more windows themselves, so we will end up getting all nasty and recursive. We'll start out by getting a list of the top level windows.

First, we'll import some stuff. We'll need all these modules eventually:

import win32api
import win32con
import win32gui

We are going to use the win32gui.EnumWindows() function to get our top level window information. This is one of those nasty functions which wants a callback function passed to it (consult the docs if you are really bored). So here's one I made earlier:

def windowEnumerationHandler(hwnd, resultList):
    '''Pass to win32gui.EnumWindows() to generate list of window handle, window text tuples.'''
    resultList.append((hwnd, win32gui.GetWindowText(hwnd)))

We can pass this, along with a list to hold the results, into win32gui.EnumWindows(), like so:

topWindows = []
win32gui.EnumWindows(windowEnumerationHandler, topWindows)
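If the callback business seems opaque, the sketch below mimics it in plain Python, so it runs anywhere. Note that fakeEnumWindows(), fakeGetWindowText() and the FAKE_WINDOWS table are stand-ins made up for illustration; they are not part of win32gui. The real EnumWindows() does the same basic thing: it calls your function once per top level window, handing over the handle and whatever extra object you passed in.

```python
# Stand-in data: handle -> window text. Hypothetical values, for illustration only.
FAKE_WINDOWS = {1001: 'Untitled - Notepad', 1002: '', 1003: 'Calculator'}

def fakeGetWindowText(hwnd):
    '''Stand-in for win32gui.GetWindowText().'''
    return FAKE_WINDOWS[hwnd]

def fakeEnumWindows(callback, extra):
    '''Stand-in for win32gui.EnumWindows(): call callback once per window.'''
    for hwnd in sorted(FAKE_WINDOWS):
        callback(hwnd, extra)

def fakeEnumerationHandler(hwnd, resultList):
    '''Same shape as windowEnumerationHandler, but using the stand-in lookup.'''
    resultList.append((hwnd, fakeGetWindowText(hwnd)))

fakeTopWindows = []
fakeEnumWindows(fakeEnumerationHandler, fakeTopWindows)
print(fakeTopWindows)
```

The list you pass in is the same list the callback appends to, which is why nothing needs to be returned.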

Our topWindows list now contains an entry for each of the top level windows, probably quite a lot of them. Go on, print it out. Each entry has the window's handle and the window text, which we'll use to decide which top level window we are going to dig into. Once we have this, we'll use win32gui.EnumChildWindows() to burrow down to find whichever control we want to do something with.
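Picking out the window you want is then just a matter of searching that list. Here's a minimal sketch, assuming a case-insensitive substring match on the window text is good enough for your application. findTopWindows() and the sample data are names invented here for illustration; on a real machine you'd pass in the topWindows list instead.

```python
def findTopWindows(windows, wantedText):
    '''Return handles of windows whose text contains wantedText (case-insensitive).

    windows is a list of (hwnd, windowText) tuples, as built by
    windowEnumerationHandler.'''
    wantedText = wantedText.lower()
    return [hwnd for hwnd, windowText in windows
            if wantedText in windowText.lower()]

# Hypothetical sample data standing in for a real topWindows list.
sampleWindows = [(1001, 'Untitled - Notepad'), (1002, ''), (1003, 'Calculator')]
print(findTopWindows(sampleWindows, 'notepad'))
```

It returns a list rather than a single handle because window text isn't guaranteed to be unique; if you get more than one match back, you'll have to decide which one you meant.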

But that's enough for one day...

Update: See the whole series: Driving win32 GUIs with Python, part 1, Driving win32 GUIs with Python, part 2, Driving win32 GUIs with Python, part 3, Driving win32 GUIs with Python, part 4 and 7 hours, one line of code.

Posted to Python by Simon Brunning at March 19, 2003 01:50 PM