This PEP changes the way packages influence Python’s startup process.
Until now this has been controlled through legacy .pth files, which the
site.py file parses and executes during interpreter startup. Such files are
used to extend sys.path and to execute package initialization code before
control is passed to the first line of user code.
This PEP proposes:

* Replacing import lines in .pth files with entry point specifications
  (i.e. pkg.mod:callable) in .start files.
* A matching .start file disables import line processing in the matched
  .pth file.
* The sys.path extension functionality of .pth files is retained.

Support for import lines in .pth files will be gradually removed:

* import line processing in .pth files will remain, except in the
  presence of a matching .start file.
* import lines in .pth files will be silently ignored.
* import lines in .pth files.

Python’s .pth files (processed by Lib/site.py at startup)
support two functions:
* Extending sys.path – Lines in this file (excluding comments
  and lines that start with import) name directories to be
  appended to sys.path. Relative paths are implicitly anchored at
  the site-packages directory.
* Executing code – Lines starting with import (or import\t) are
  executed immediately by passing the source string to exec().

While there are valid use cases for both, the import line feature is the
most problematic because:
* A line starting with import can be extended by separating multiple
  statements with a semicolon. As long as all the code to be
  executed appears on the same line, it all gets executed when the
  .pth file is processed.
* import lines are executed using exec() during interpreter
  startup, which opens a broad attack surface.
* Startup code is expressed implicitly through import lines rather than
  by explicitly declaring entry points.

This PEP proposes the following:
* Keep the <name>.pth file format, but deprecate import line
  processing for three years, after which such lines will be disallowed.
* Keep the sys.path extension feature of <name>.pth files
  unchanged. Specifically, absolute paths are used verbatim while relative
  paths are anchored at the directory in which the .pth file is located.
* A new file format named <name>.start is added which names entry points
  conforming to the “colon-form” of pkgutil.resolve_name() arguments.
* During the deprecation period, the presence of a <name>.start file
  matching a <name>.pth file disables the execution of import lines in
  the <name>.pth file in favor of entry points from the <name>.start
  file. This provides a migration path straddling Python versions which
  support this PEP and earlier versions which do not. In this case, warnings
  about import lines are not printed.

During the deprecation period, for any <name>.pth file without a
matching <name>.start file, the processing of the former is unchanged,
although a warning about import lines is issued when the -v (verbose)
flag is given to Python.
After the deprecation period, import lines in <name>.pth files are
ignored and a warning is issued, regardless of whether there is a matching
<name>.start file or not.
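
The rules above for import lines can be summarized as a small decision function. This is only an illustrative sketch: the function name and its three boolean parameters are hypothetical and not part of site.py, and the warning behaviour after the deprecation period is written here as the text above states it.

```python
def import_line_action(has_matching_start, after_deprecation, verbose):
    """Return (execute, warn) for the import lines of one <name>.pth file.

    Hypothetical helper that mirrors the prose rules only.
    """
    if after_deprecation:
        # After the deprecation period: ignored, and a warning is issued
        # whether or not a matching <name>.start file exists.
        return (False, True)
    if has_matching_start:
        # A matching <name>.start file disables the import lines and
        # suppresses the warning.
        return (False, False)
    # No matching <name>.start file: processing is unchanged, and the
    # warning is only shown under -v.
    return (True, verbose)
```

For example, `import_line_action(True, False, True)` yields `(False, False)`: during the deprecation period a straddling package sees neither execution of its import lines nor a warning.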
See the How to Teach This section for specific migration guidelines.
Both <name>.pth and <name>.start files are processed by the
site.py module, just like current .pth files. This means that
disabling site.py processing with -S disables processing of
both files.
site.py start up code is divided into these explicit phases:

1. Find all <name>.pth files (see File Naming and Discovery for additional
   details) and sort them in alphabetical order by filename.
2. Parse the <name>.pth files in sorted order, keeping a global list of
   all path extensions, preserving order file-by-file and then by entry
   appearance. Duplicates are ignored.
3. Collect the import lines found from <name>.pth files. Processing of
   these lines is deferred until after <name>.start file scanning.
4. Extend sys.path in the global preserved order.
5. Find all <name>.start files (see File Naming and Discovery for
   additional details) and sort them in alphabetical order by filename.

   For any <name>.start that matches a previously scanned <name>.pth
   file, discard all import lines from those matched <name>.pth files.
   See the How to Teach This section for more details and rationale.
6. Parse the <name>.start files in sorted order, keeping a global list of
   all entry points, preserving order file-by-file and then by entry
   appearance. Duplicates are not ignored.
7. For each entry point, use pkgutil.resolve_name() to resolve the entry
   point into a callable. Call the entry point with no arguments; any
   return value is discarded. The resolved object is not tested for
   callability before it is called (and thus any TypeError that might
   result is reported).

In both <name>.pth files and <name>.start files, comment lines
(i.e. lines beginning with # as the first non-whitespace character) and
blank lines are ignored. Any other parsing error causes the line to be
ignored.
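
The parsing phases can be sketched as follows. This is a minimal illustration, not the reference implementation: the function names are hypothetical, file contents are passed in as a mapping of filename to text instead of being read from disk, "matching" is assumed to mean an identical <name> prefix, and sorted() stands in for the alphabetical filename ordering.

```python
def _significant_lines(text):
    """Yield stripped lines, skipping blank lines and # comments."""
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            yield line


def collect(pth_files, start_files):
    """Return (path_entries, import_lines, entry_points).

    pth_files and start_files map filenames to file contents.
    """
    path_entries = []
    pending_imports = {}  # <name> prefix -> import lines from <name>.pth
    for filename in sorted(pth_files):
        imports = []
        for line in _significant_lines(pth_files[filename]):
            if line.startswith(("import ", "import\t")):
                imports.append(line)        # deferred, not executed here
            elif line not in path_entries:  # duplicate paths are ignored
                path_entries.append(line)
        pending_imports[filename[:-len(".pth")]] = imports

    entry_points = []
    for filename in sorted(start_files):
        # A matching <name>.start discards the import lines of <name>.pth.
        pending_imports.pop(filename[:-len(".start")], None)
        # Entry points are kept in order and are not de-duplicated.
        entry_points.extend(_significant_lines(start_files[filename]))

    import_lines = [ln for lines in pending_imports.values() for ln in lines]
    return path_entries, import_lines, entry_points
```

Nothing is executed while collecting, so a caller sees the complete set of path extensions, surviving import lines, and entry points before any of them take effect.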
pkgutil.resolve_name() is used to resolve an entry point specification
into a callable object. However, in Python 3.14 and earlier, this function
accepts two forms, described in the documentation with pseudo-regular
expressions:

* W(.W)* - no colon form
* W(.W)*:(W(.W)*)? - colon form with optional callable suffix

This PEP proposes to allow only the pkg.mod:callable form of entry points,
and requires that the callable be specified. See the open issues for further
discussion.
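
One way to picture the restriction is a check in front of pkgutil.resolve_name(). The sketch below is an assumption about how such a check could look, not how site.py or pkgutil implement it; resolve_entry_point is a hypothetical name, and the pattern treats W as a Python-identifier-like word.

```python
import pkgutil
import re

# W: a word that does not start with a digit.
_W = r"(?!\d)\w+"
# Strict colon form: W(.W)*:(W(.W)*), with the part after the colon required.
_STRICT_FORM = re.compile(rf"{_W}(\.{_W})*:{_W}(\.{_W})*")


def resolve_entry_point(spec):
    """Resolve spec only if it is in the pkg.mod:callable form."""
    if _STRICT_FORM.fullmatch(spec) is None:
        raise ValueError(f"not in pkg.mod:callable form: {spec!r}")
    return pkgutil.resolve_name(spec)
```

With this check, "json:dumps" resolves to the json.dumps function, while "json", "json:" and "json.dumps" are rejected before any import is attempted.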
* The files are named <name>.pth and <name>.start. The <name> prefix is
  arbitrary and need not match the package name or each other, although all
  else being equal, it is recommended that they do match the package name
  for clarity. The interpreter does not enforce any constraints on the
  prefix.
* <name>.start files live in the same site-packages directories where
  <name>.pth files are found today. The <name>.pth location stays the
  same.
* The discovery rules for <name>.start files are the same as with
  <name>.pth files today. File names that start with a single .
  (e.g. .start) and files with OS-level hidden attributes (UF_HIDDEN,
  FILE_ATTRIBUTE_HIDDEN) are excluded.

During parsing, errors are generally skipped and only reported when the -v
(verbose) flag is given to Python. Unlike with .pth files currently,
processing does not abort for the entire file when an error is encountered.
* If a <name>.pth or <name>.start file cannot be opened or read, it
  is skipped and processing continues to the next file.

During execution, errors are printed to sys.stderr and processing
continues.
* A sys.path extension directory pointing to an invalid or nonexistent
  path is ignored and processing continues to the next path entry.

<name>.start files MUST be encoded with utf-8-sig,
i.e. UTF-8 with optional byte-order mark.
<name>.pth files SHOULD also be utf-8-sig encoded.
Currently, decoding <name>.pth files falls back to the current locale if
not encoded with utf-8-sig, but this PEP deprecates that support for 5
years, after which <name>.pth files MUST be encoded with utf-8-sig
as well.
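
The encoding rules can be illustrated with a small decoding helper. This is a sketch under assumptions: decode_startup_file and its allow_locale_fallback parameter are hypothetical names, and locale.getpreferredencoding(False) stands in for "the current locale".

```python
import locale


def decode_startup_file(data, *, allow_locale_fallback):
    """Decode the raw bytes of a <name>.pth or <name>.start file.

    Pass allow_locale_fallback=True for <name>.pth files during the
    deprecation period, and False for <name>.start files.
    """
    try:
        # utf-8-sig accepts plain UTF-8 and strips an optional BOM.
        return data.decode("utf-8-sig")
    except UnicodeDecodeError:
        if not allow_locale_fallback:
            raise
        # Legacy behaviour for <name>.pth files only.
        return data.decode(locale.getpreferredencoding(False))
```

A <name>.start file that begins with a byte-order mark decodes to the same text as one without it, so entry point lines never carry a stray BOM character.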
The introduction of 2-phase processing of .pth and .start files gives
us the ability to implement future improvements, where some global site policy
can be applied, providing finer grained control over both sys.path
extension and entry point execution. One could imagine that after parsing, a
policy could be applied to either allow or deny path extensions or entry
points based on a number of different criteria, such as the <name> prefix
used to specify the extension, the path locations, or the modules in which the
entry points are defined.
This PEP deliberately leaves the design of such a policy mechanism to a future specification.
A previous iteration of this PEP proposed the use of a unified
<name>.site.toml file with tables to specify metadata, a list of path
extensions, and a list of entry points. While the PEP author and several
discussion participants liked this structured approach, a number of detractors
expressed the opinion that TOML files were overkill for this proposal. The
PEP author believes that the processing overhead of TOML files was negligible
and that the structured approach was useful for readability and future
extensibility. Detractors countered with YAGNI.
The two-file approach is a simple evolutionary improvement over the previous
.pth file process. The first improvement is the deprecation and removal
of arbitrary code execution through exec() of import lines. Such
lines are a wide attack vector that even the exec() standard library
documentation strongly warns against. Replacing these lines with the narrower
invocation of entry point functions inside modules indirectly reduces the
attack vector because functions inside modules are easier to audit, both by
humans and automatic vulnerability scanners.
The second improvement is splitting sys.path extension from entry point
specification into two files, each purpose-built for the exact use case it
supports. There’s no co-mingling of purposes, no possible interleaving of
effects, more readable file formats, and clear processing rules. sys.path
extensions are processed first, setting up the ability to import modules, and
then entry points are processed. It’s unambiguously clear which file format
supports which use case.
The third improvement is the 2-phase approach to processing these files.
Parsing errors can be reported early and need not terminate the further
processing of lines in each file. The transition from processing to execution
for both sys.path extension and entry point invocation gives us a chance
(in the future) to design and implement global policies for
explicitly controlling which path extensions and entry points are allowed (and
by implication, deemed safe), without resorting to the heavy hammer of
disabling site.py processing completely.
All valid sys.path extensions in all <name>.pth files found are
processed before any entry points in <name>.start files are called.
This is to ensure that all sys.path modifications required to import entry
point modules are applied first.
Entry points are not de-duplicated, regardless of whether they’re defined
multiple times in the same <name>.start file or across more than one
<name>.start file. This means that if an entry point appears more than
once it will get called more than once. Unlike with the de-duplication of
sys.path entries (where the appearance of a directory path later on
sys.path than its duplicate has no effect), users could – however
unlikely – actually want multiple invocations of their entry points. This
also avoids the complexity of defining de-duplicating entry point semantics
across independently-authored <name>.start files.
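
The ordering and de-duplication rules can be seen in a sketch of the execution phase. It is illustrative only: run_startup is a hypothetical name, the inputs are assumed to be the already-parsed lists of path extensions and entry points, the existence check for paths uses os.path.exists() as an assumption, and legacy import lines are left out.

```python
import os
import pkgutil
import sys
import traceback


def run_startup(path_entries, entry_points):
    """Apply all path extensions, then call every entry point in order."""
    # Every sys.path extension is applied before any entry point runs.
    for entry in path_entries:
        if os.path.exists(entry) and entry not in sys.path:
            sys.path.append(entry)

    # No de-duplication: an entry point listed twice is called twice.
    for spec in entry_points:
        try:
            pkgutil.resolve_name(spec)()  # no arguments, result discarded
        except Exception:
            # Errors go to sys.stderr and processing continues.
            print(f"Error in entry point {spec!r}:", file=sys.stderr)
            traceback.print_exc()
```

Because the path loop finishes first, an entry point in one package's <name>.start file can import a module that only becomes importable through another package's <name>.pth file.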
This PEP proposes a 3 year deprecation period for processing of import
lines inside .pth files. sys.path extensions in .pth files remain
unchanged.
There should always be a simple migration strategy for any packages which
utilize the import line arbitrary code execution feature of current
.pth files. They can simply move the code into a callable inside an
importable module inside the package, and then name this callable in an entry
point specification inside a <name>.start file.
This PEP makes it easier to audit code execution paths during interpreter startup.

* Separating sys.path extension from code execution into two separate
  files means that you can tell by listing the files in the site-dir, exactly
  where arbitrary code execution occurs.
* It replaces exec() with entry point execution, which is more constrained
  and auditable.
* The package.module:callable syntax limits execution to callables within
  importable modules.

The overall pre-start code execution attack surface is not eliminated by this
PEP. A malicious package can still cause arbitrary code execution via entry
points, but the mechanism proposed in this PEP is more structured, auditable,
and amenable to future policy controls.
The site module documentation will be updated to describe the operation
and best practices for <name>.pth and <name>.start files. The
following migration guidelines for package authors will be included:
* If your package currently ships a <name>.pth file, analyze whether you
  are using it for sys.path extension or start up code execution. You can
  keep all the sys.path extension lines unchanged.
* If you are using the import lines feature, create a
  callable (taking zero arguments) within an importable module inside your
  package. Name these as pkg.mod:callable entry points in a matching
  <name>.start file.
* If your package must also support older Pythons, change the import lines
  in your <name>.pth to use the following form:

      import pkg.mod; pkg.mod.callable()

  This way, older Pythons will execute these import lines, and newer
  Pythons will ignore them, using the <name>.start file instead. In both
  cases the same code is effectively used, so while there’s some
  duplication, it is minimal.
* Once support for older Pythons is dropped, remove the import lines from
  your <name>.pth file.

Non-normatively, build tools may want to emit a warning if a package includes
both a <name>.pth file and a <name>.start file where the former
includes import lines that don’t match lines in the latter.
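
A build tool could implement the check described above along these lines. This is a hedged sketch: both function names are hypothetical, and an import line is assumed to "match" only when it is textually identical to the recommended `import pkg.mod; pkg.mod.callable()` form derived from a line in the <name>.start file.

```python
def import_line_for(entry_point):
    """Return the straddling import line for a pkg.mod:callable entry point."""
    module, _, attr = entry_point.partition(":")
    return f"import {module}; {module}.{attr}()"


def unmatched_import_lines(pth_text, start_text):
    """Return import lines in pth_text with no counterpart in start_text."""
    expected = {
        import_line_for(line.strip())
        for line in start_text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }
    return [
        line.strip()
        for line in pth_text.splitlines()
        if line.startswith(("import ", "import\t"))
        and line.strip() not in expected
    ]
```

An empty result means that older Pythons (via the <name>.pth import lines) and newer Pythons (via the <name>.start entry points) end up calling the same callables.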
The reference implementation supports the current version of this PEP.
.pth files and leave the import lines alone

<name>.site.toml files

<package>.pth convention that tools already understand.

<name>.pth processing order. Priority could be addressed by a future
site-wide policy configuration file, not per-package metadata.

pkgutil.resolve_name(), i.e. specifically requiring the pkg.mod:callable
syntax. This is because we don’t want to encourage code execution by direct
import side-effect (i.e. functionality at module scope level).

Assuming this restriction is acceptable, how this is implemented is an open
question. site.py could enforce it directly, but then that sort of
defeats the purpose of using pkgutil.resolve_name(). The PEP author’s
preference would be to add an optional keyword-only argument to
pkgutil.resolve_name(), i.e. strict defaulting to False for
backward compatibility. strict=True would narrow the acceptable inputs
to effectively W(.W)*:(W(.W)*), namely, rejecting the older, non-colon
form, and making the callable after the colon required.
site.addpackage() is an undocumented function that processes a single
.pth file. It is not listed in site.__all__, not covered in
Doc/library/site.rst, and has no stability guarantees. A GitHub code search
found roughly half a dozen third-party projects calling it directly, mostly
with the pattern site.addpackage(dir, "apps.pth", set()), all of which can
be replaced by site.addsitedir(dir).

During the work on the reference implementation, addpackage() became a
thin wrapper around the new internal pipeline for processing .pth and
.start files. Maintaining it as a separate function adds complexity
for no documented use case. This function should be deprecated, with the
suggestion that users migrate to addsitedir() instead, a documented
public API.
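
The suggested migration can be tried out in a throwaway directory. The sketch below is only a demonstration: the directory layout and the demo_addsitedir name are made up for illustration. Note that site.addsitedir() also adds the directory itself to sys.path and processes every .pth file it finds there, not just one named file.

```python
import os
import site
import sys
import tempfile


def demo_addsitedir():
    """Show site.addsitedir() picking up a path listed in apps.pth."""
    saved_path = list(sys.path)
    try:
        with tempfile.TemporaryDirectory() as sitedir:
            apps = os.path.join(sitedir, "apps")
            os.mkdir(apps)
            with open(os.path.join(sitedir, "apps.pth"), "w",
                      encoding="utf-8") as f:
                f.write("apps\n")  # relative path, anchored at sitedir

            # Previously: site.addpackage(sitedir, "apps.pth", set())
            site.addsitedir(sitedir)
            return os.path.abspath(apps) in sys.path
    finally:
        sys.path[:] = saved_path  # undo the demonstration's changes
```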
import lines in .pth files. Packages can straddle
without warnings because the presence of a matching .start file
disables warnings for import lines in .pth files during the
deprecation period. However, as currently written, warnings will be
re-enabled at the end of the deprecation period, so at that point there
isn’t a way to straddle without warnings.

The preferred solution is to simply hide all warnings for import lines
in .pth files behind the -v (verbose) flag, either for a full 5 year
period (keeping the 3 year processing deprecation timeline), or
indefinitely.
Should -X options provide fine-grained control over error
reporting or entry point execution?

._pth files because the purpose and behavior is completely different,
despite the similar name.

19-Apr-2026
* <name>.start and <name>.pth files.
* Warnings for import lines in <name>.pth files with no matching
  <name>.start file are only issued when -v (verbose) is given.
* import lines in <name>.pth files where there is a matching
  <name>.start file are ignored.
* site.addpackage() deprecation, and extending the suppression of import
  line warnings in .pth files unless -v is given, for an additional two
  years.
* Packaging Topic.
* <name>.pth file format and the addition of the <name>.start file for
  entry point specification. The <name>.site.toml file from the previous
  version is removed.
* import lines in .pth files is proposed.

The PEP author thanks Paul Moore for the constructive and pleasant
conversation leading to the compromise from the first draft of this proposal
to its current form. Thanks also go to Emma Smith and Brett Cannon for their
feedback and encouragement.
This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.