Module Iterators, as defined in
pkgutil.py, aren’t really part of the mess that has been imposed on us by PEP-302 and its follow-on attempts to rationalize the loading process, but they’re used by so many different libraries that when we talk about creating a new general class of importers, we have to talk about iterators.
Iterators, after all, are why I started down this project in the first place. It was Django’s inability to find heterogeneously defined modules that I set out to fix.
Iterators are defined in the
pkgutil module; their entire purpose is, given some kind of reference to an archive, to be able to list the contents of that archive, and to recursively descend into that archive if it happens to be a tree-like structure.
When you call
pkgutil.iter_modules(path, prefix), you get back an iterator over all the modules within that path or, if no path is supplied, over all the top-level modules found on the paths in
sys.path. As I pointed out in my last post, the paths in
sys.path aren’t necessarily paths on the filesystem or, if they are, they’re not necessarily directory paths. All that matters is that for each path, a
path_hook exists that can return a Finder, and that Finder has a method for listing the contents of the path found.
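Here is a minimal sketch of that call against an ordinary directory, assuming Python 3. The directory contents and the names alpha_mod, beta_pkg and the "demo." prefix are made up for illustration. Note that the path argument is a list of path entries, not a single string.

```python
import os
import pkgutil
import tempfile

# Build a throwaway directory holding one module and one package.
# (alpha_mod and beta_pkg are illustrative names, not from the post.)
root = tempfile.mkdtemp()
with open(os.path.join(root, "alpha_mod.py"), "w"):
    pass
os.mkdir(os.path.join(root, "beta_pkg"))
with open(os.path.join(root, "beta_pkg", "__init__.py"), "w"):
    pass

# Each item unpacks as (finder, name, ispkg); the prefix is prepended to name.
found = sorted(
    (name, ispkg)
    for _finder, name, ispkg in pkgutil.iter_modules([root], prefix="demo.")
)
print(found)
```

The Finder that comes back as the first element of each item is whatever the path_hook for that entry produced; for a plain directory that is normally a FileFinder.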
In Python 2,
pkgutil depends upon Finders (those things we said were attached to
path_hooks) to have a special function called
iter_modules; if a Finder has one, that function is used to list the contents of the “path”.
In Python 3, the
functools.singledispatch tool is used to differentiate between different Finders; once a Finder has been identified,
singledispatch is used to find a corresponding resource iterator for that Finder. The iterator doesn’t necessarily have to be a method on the Finder, although the default has a classmethod that is its finder.
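As a rough sketch of what that registration looks like in Python 3: pkgutil exposes the dispatching function as iter_importer_modules, which is not part of the documented public API, so treat this as an assumption about current CPython rather than a guarantee. DemoFinder and iter_demo_modules are hypothetical names invented for the example.

```python
import pkgutil


class DemoFinder:
    """Hypothetical Finder; stands in for whatever a path_hook returns."""

    def __init__(self, names):
        self.names = names


def iter_demo_modules(finder, prefix=""):
    # A plain function, not a method on the Finder. It yields
    # (name, ispkg) pairs, the shape pkgutil expects from an iterator.
    for name in finder.names:
        yield prefix + name, False


# Assumption: iter_importer_modules is a functools.singledispatch function,
# so it has a register() method keyed on the Finder's class.
pkgutil.iter_importer_modules.register(DemoFinder, iter_demo_modules)

listed = list(pkgutil.iter_importer_modules(DemoFinder(["one", "two"]), "demo."))
print(listed)
```

Because dispatch is keyed on the class of the first argument, the Finder itself never needs to know that an iterator exists for it.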
An iterator is pretty straightforward; once you know the “path” (resource identifier) and the Finder for that path, you can call a function that checks for the presence of modules. In the case of FileFinder, that function is a combination of
isdir/isfile to check for
dir/__init__ pairs indicating a submodule.
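The check described above can be sketched roughly as follows. This is a simplification of my own, not the code pkgutil actually runs: looks_like_package is a hypothetical helper, and it only considers __init__.py, which is exactly the kind of hard-coded suffix assumption a heterogeneous importer has to get around.

```python
import os
import tempfile


def looks_like_package(parent, name):
    """Return True if parent/name is a directory containing __init__.py."""
    candidate = os.path.join(parent, name)
    return os.path.isdir(candidate) and os.path.isfile(
        os.path.join(candidate, "__init__.py")
    )


# Small demonstration in a throwaway directory (names are illustrative).
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "real_pkg"))
with open(os.path.join(root, "real_pkg", "__init__.py"), "w"):
    pass
os.mkdir(os.path.join(root, "just_a_dir"))

is_pkg = looks_like_package(root, "real_pkg")
is_not_pkg = looks_like_package(root, "just_a_dir")
print(is_pkg, is_not_pkg)
```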
For our purposes, of course, we had to provide a path_hook that eclipses the existing path_hook, and we had to provide a Finder that was more precisely ours than the inherited base
FileFinder, so that single dispatch would find our iterator before it found
FileFinder’s and still work correctly.
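To illustrate why a more specific Finder class matters, here is a small sketch assuming Python 3 and assuming that pkgutil.iter_importer_modules is a singledispatch function with FileFinder already registered (true in the CPython versions I have looked at, but not a documented contract). PolyFinder and iter_poly_modules are hypothetical names.

```python
import pkgutil
from importlib.machinery import FileFinder


class PolyFinder(FileFinder):
    """Hypothetical FileFinder subclass standing in for a custom Finder."""


def iter_poly_modules(finder, prefix=""):
    # Placeholder iterator; a real one would list the path's contents.
    yield prefix + "from_poly_iterator", False


pkgutil.iter_importer_modules.register(PolyFinder, iter_poly_modules)

# singledispatch picks the implementation registered for the most
# specific class in the argument's MRO.
chosen = pkgutil.iter_importer_modules.dispatch(PolyFinder)
fallback = pkgutil.iter_importer_modules.dispatch(FileFinder)
print(chosen.__name__, fallback.__name__)
```

Plain FileFinder instances keep using the stock iterator, so registering the subclass does not disturb the default behaviour.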
There is one other module I have to worry about:
modulefinder. It’s not used often, it’s not used by Django or any of the other major tools that I usually use, and it’s never been covered by Python Module of the Week. That doesn’t mean that its hard-coding of the ‘.py’ suffix isn’t problematic. I’m just not sure what to do about it at this point.