When Python encounters an import statement, the interpreter follows a specific search order to locate modules:
- Check if the module is already cached in sys.modules
- If not cached, check built-in modules
- If not found in built-in modules, search paths in sys.path:
  - Current script’s directory first
  - System default paths (Python installation directory, standard library, etc.)
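The search order above can be approximated with a small helper. This is only a sketch: the function name locate is made up for illustration, it handles top-level module names only, and the real import machinery works through the finders in sys.meta_path rather than through explicit checks like these.

```python
import importlib.util
import sys


def locate(name):
    """Report where a top-level module name would be resolved from."""
    # Step 1: modules that were already imported live in the cache.
    if name in sys.modules:
        return "cached in sys.modules"
    # Step 2: modules compiled into the interpreter itself.
    if name in sys.builtin_module_names:
        return "built-in module"
    # Step 3: ask the import system to search (this includes sys.path).
    spec = importlib.util.find_spec(name)
    if spec is None:
        return "not found"
    return f"found via search path: {spec.origin}"
```

Calling locate('sys') reports the cache, because sys is always loaded at interpreter start-up, while a name that exists nowhere comes back as not found.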
Absolute vs. Relative Imports
Let’s start with a simple example: two files moduleA.py and moduleB.py in the same directory.
moduleA.py:
def foo():
    print('moduleA foo()')
moduleB.py:
from moduleA import foo
def bar():
    foo()
if __name__ == '__main__':
    bar()
Running python moduleB.py outputs moduleA foo() as expected.
This can also be written as:
import moduleA
def bar():
    moduleA.foo()
if __name__ == '__main__':
    bar()
Both approaches work identically.
Now let’s move both files into a packageA/subpackageA directory:
packageA/
└──subpackageA/
  ├──moduleA.py
  └──moduleB.py
Running python packageA/subpackageA/moduleB.py from the root directory still works fine.
However, if we create main.py in the root directory:
from packageA.subpackageA.moduleB import bar
if __name__ == '__main__':
    bar()
Running python main.py fails with:
Traceback (most recent call last):
  File "main.py", line 1, in <module>
    from packageA.subpackageA.moduleB import bar
  File "/path/to/packageA/subpackageA/moduleB.py", line 1, in <module>
    from moduleA import foo
ModuleNotFoundError: No module named 'moduleA'
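This happens because from moduleA import foo is an absolute import of a top-level module named moduleA. When main.py is run, Python puts main.py’s directory at the front of sys.path and looks for moduleA there; it does not descend into packageA/subpackageA, where the file actually resides.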
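To see which directory is searched first, it helps to print sys.path[0] from the script being run. The sketch below writes a throwaway script into a temporary directory and launches it from that directory's root. The helper name first_search_dir is made up for illustration, and the result assumes Python's default settings (no -P flag and no PYTHONSAFEPATH).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def first_search_dir(script_relpath):
    """Run a throwaway script and return sys.path[0] relative to the launch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        script = root / script_relpath
        script.parent.mkdir(parents=True, exist_ok=True)
        script.write_text("import sys\nprint(sys.path[0])\n")
        # Launch from the root, the same way the scripts in the text are run.
        result = subprocess.run(
            [sys.executable, script_relpath],
            cwd=root, capture_output=True, text=True, check=True,
        )
        searched = (root / result.stdout.strip()).resolve()
        return searched.relative_to(root).as_posix()
```

With this helper, first_search_dir('main.py') gives '.', the root itself, while first_search_dir('packageA/subpackageA/moduleB.py') gives 'packageA/subpackageA'. The first search location follows the script, not the directory the command was typed in.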
There are two solutions:
1. Absolute Import
Modify the import in moduleB.py:
from packageA.subpackageA.moduleA import foo
This works when running from main.py, but breaks when running moduleB.py directly:
Traceback (most recent call last):
  File "packageA/subpackageA/moduleB.py", line 1, in <module>
    from packageA.subpackageA.moduleA import foo
ModuleNotFoundError: No module named 'packageA'
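When moduleB.py is run directly, the first entry in sys.path is the script’s own directory, packageA/subpackageA (not the current working directory), and that directory doesn’t contain the packageA package.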
2. Relative Import
Modify the import in moduleB.py:
from .moduleA import foo
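The dot (.) refers to the current package (subpackageA). This works from any entry point that imports moduleB as part of its package, such as main.py; running moduleB.py directly as a script is a separate case, covered below.
You can also use .. to refer to the parent package. Let’s create packageA/subpackageB/moduleC.py: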
from ..subpackageA.moduleA import foo
def baz():
    foo()
Directory structure:
main.py
packageA/
├──subpackageA/
│ ├──moduleA.py
│ └──moduleB.py
└──subpackageB/
  └──moduleC.py
In main.py:
from packageA.subpackageB.moduleC import baz
if __name__ == '__main__':
    baz()
This outputs moduleA foo() correctly. Python resolves the relative import in moduleC.py to the absolute module name packageA.subpackageA.moduleA.
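The resolution can be observed directly. As a rough sketch, the code below rebuilds a trimmed-down copy of the layout in a temporary directory, imports moduleC the way main.py would, and inspects the names Python assigned. The helper name names_seen_by_module_c is made up for illustration, and moduleC.py is reduced to its import line.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def names_seen_by_module_c():
    """Import packageA.subpackageB.moduleC from a temp tree and report the resolved names."""
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "packageA"
        (pkg / "subpackageA").mkdir(parents=True)
        (pkg / "subpackageB").mkdir()
        (pkg / "subpackageA" / "moduleA.py").write_text(
            "def foo():\n    print('moduleA foo()')\n"
        )
        (pkg / "subpackageB" / "moduleC.py").write_text(
            "from ..subpackageA.moduleA import foo\n"
        )
        # Mimic running main.py from the root: the root is the first search location.
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module_c = importlib.import_module("packageA.subpackageB.moduleC")
            return module_c.__package__, module_c.foo.__module__
        finally:
            # Undo the changes so the demo leaves no trace in this process.
            sys.path.remove(root)
            for name in [n for n in sys.modules
                         if n == "packageA" or n.startswith("packageA.")]:
                del sys.modules[name]
```

The function returns ('packageA.subpackageB', 'packageA.subpackageA.moduleA'): the two dots are counted from moduleC's package, and the imported foo reports the fully qualified module it came from.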
Package Scope Pitfall
Let’s create packageA/submain.py:
from subpackageB.moduleC import baz
if __name__ == '__main__':
    baz()
Directory structure:
main.py
packageA/
├──submain.py
├──subpackageA/
│ ├──moduleA.py
│ └──moduleB.py
└──subpackageB/
  └──moduleC.py
Running python packageA/submain.py fails:
Traceback (most recent call last):
  File "/path/to/packageA/submain.py", line 1, in <module>
    from subpackageB.moduleC import baz
  File "/path/to/packageA/subpackageB/moduleC.py", line 1, in <module>
    from ..subpackageA.moduleA import foo
ImportError: attempted relative import beyond top-level package
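The issue is that relative imports are resolved against the importing module’s package name and cannot climb above its top-level package. When submain.py is run, the packageA directory itself is the first entry in sys.path, so subpackageB.moduleC is imported with subpackageB as its top-level package. The .. in moduleC.py would have to reach one level above that, which Python refuses.
The top-level package is determined the same way whether the module is loaded with from ... import ... or import ... as ...; the relative form itself, however, can only be written as from ... import ....
Using an absolute import in moduleC.py (from subpackageA.moduleA import foo) solves this problem for submain.py, although that form no longer resolves when the entry point is main.py.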
In our earlier main.py example, packageA was the top-level package, allowing moduleC.py to access subpackageA within the same top-level package.
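The difference between the two entry points comes down to the package name moduleC.py ends up with. As a minimal sketch, importlib.util.resolve_name applies the same dot-counting rule to a package name passed in by hand. The wrapper try_resolve is made up for illustration.

```python
from importlib.util import resolve_name


def try_resolve(relative_name, package):
    """Return the absolute module name, or the error text if resolution fails."""
    try:
        return resolve_name(relative_name, package)
    except (ImportError, ValueError) as exc:  # ValueError is raised before Python 3.9
        return f"error: {exc}"


# Entry point main.py: moduleC's package is packageA.subpackageB.
via_main = try_resolve("..subpackageA.moduleA", "packageA.subpackageB")

# Entry point packageA/submain.py: moduleC's package is just subpackageB.
via_submain = try_resolve("..subpackageA.moduleA", "subpackageB")

print(via_main)
print(via_submain)
```

The first call yields packageA.subpackageA.moduleA. The second has no parent left to climb to, so it comes back as an error string.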
This explains another common issue. If moduleB.py contains:
from .moduleA import foo
def bar():
    foo()
if __name__ == '__main__':
    bar()
Running python packageA/subpackageA/moduleB.py directly fails:
Traceback (most recent call last):
  File "packageA/subpackageA/moduleB.py", line 1, in <module>
    from .moduleA import foo
ImportError: attempted relative import with no known parent package
When run as a script, moduleB.py doesn’t belong to any package, so relative imports have no reference point.
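One way to give moduleB.py a package when launching it is the interpreter's -m switch, which runs a module by its dotted name instead of by file path. The sketch below is not part of the original walkthrough: it rebuilds the two files in a temporary directory and runs moduleB both ways from the root. The helper name run_both_ways is made up for illustration, and default interpreter settings are assumed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

MODULE_A = "def foo():\n    print('moduleA foo()')\n"
MODULE_B = (
    "from .moduleA import foo\n"
    "def bar():\n"
    "    foo()\n"
    "if __name__ == '__main__':\n"
    "    bar()\n"
)


def run_both_ways():
    """Run moduleB.py as a plain script and as a module; return both results."""
    with tempfile.TemporaryDirectory() as root:
        sub = Path(root) / "packageA" / "subpackageA"
        sub.mkdir(parents=True)
        (sub / "moduleA.py").write_text(MODULE_A)
        (sub / "moduleB.py").write_text(MODULE_B)
        # By file path: moduleB is loaded as __main__ with no parent package.
        as_script = subprocess.run(
            [sys.executable, "packageA/subpackageA/moduleB.py"],
            cwd=root, capture_output=True, text=True,
        )
        # By dotted name: moduleB keeps its place inside packageA.subpackageA.
        as_module = subprocess.run(
            [sys.executable, "-m", "packageA.subpackageA.moduleB"],
            cwd=root, capture_output=True, text=True,
        )
        return as_script, as_module
```

The script run ends with the ImportError shown above, while the -m run prints moduleA foo(). The equivalent command line, issued from the root directory, is python -m packageA.subpackageA.moduleB.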
__init__.py Gotchas
Since Python 3.3, __init__.py files are optional: a directory without one can still be imported as a package (a so-called namespace package), and any .py file on the search path can be imported as a module.
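The import system does tell the two kinds of package apart. The sketch below creates one directory with an __init__.py and one without, then asks the path finder how it would load each. The directory names plain_dir and regular_pkg and the helper name describe_packages are made up for illustration.

```python
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def describe_packages():
    """Return the spec origins for a directory without and with __init__.py."""
    with tempfile.TemporaryDirectory() as root:
        # A bare directory: importable, but only as a namespace package.
        (Path(root) / "plain_dir").mkdir()
        # A regular package: the directory contains an __init__.py file.
        regular = Path(root) / "regular_pkg"
        regular.mkdir()
        (regular / "__init__.py").write_text("")
        namespace_spec = PathFinder.find_spec("plain_dir", [root])
        regular_spec = PathFinder.find_spec("regular_pkg", [root])
        return namespace_spec.origin, Path(regular_spec.origin).name
```

On Python 3.7 and later the function returns (None, '__init__.py'): the regular package is backed by a file that can be executed, while the bare directory has no file of its own behind it.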
However, __init__.py files still serve important purposes. When a package is imported for the first time, Python automatically executes its __init__.py file.
Example directory structure:
main.py
packageA/
└──__init__.py
__init__.py contains:
print('packageA __init__.py')
In main.py:
import packageA
Running python main.py outputs packageA __init__.py.
Even with multiple imports:
import packageA
import packageA
import packageA
import packageA
The output is still packageA __init__.py only once, due to Python’s module caching mechanism mentioned at the beginning.
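The run-once behaviour can be checked without creating files by hand. As a rough sketch, the code below writes a throwaway package whose __init__.py prints a line, imports it several times, and counts how often the line appears. The package name demo_init_once_pkg and the helper name count_init_runs are made up for illustration.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def count_init_runs(times=4):
    """Import a throwaway package `times` times and count __init__.py executions."""
    name = "demo_init_once_pkg"
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / name
        pkg.mkdir()
        (pkg / "__init__.py").write_text(f"print('{name} __init__.py')\n")
        sys.path.insert(0, root)
        captured = io.StringIO()
        try:
            importlib.invalidate_caches()
            with contextlib.redirect_stdout(captured):
                for _ in range(times):
                    # Only the first call executes __init__.py; later calls
                    # return the module object cached in sys.modules.
                    importlib.import_module(name)
        finally:
            # Undo the changes so the demo leaves no trace in this process.
            sys.path.remove(root)
            sys.modules.pop(name, None)
        return captured.getvalue().count("__init__.py")
```

count_init_runs() returns 1 however many times the import is repeated, matching the single line of output seen above.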