Fix flash attention version check. #2910
Conversation
Signed-off-by: Björn Buschkämper <bjoern.buschkaemper@gmail.com>
Greptile Summary

This PR fixes a bug where FA4 pre-release versions like 4.0.0b9 sort below 4.0.0 under PEP 440 and were mis-detected.

Confidence Score: 5/5 — Safe to merge. Targeted one-liner fix with no side effects on stable or pre-release FA2/FA3/FA4 paths. The change is minimal and correct. No files require special attention.
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[flash_attention_backend version] --> B{is not None?}
    B -- No --> C[use_flash_attn_4 = False\nuse_flash_attn_3 = False]
    B -- Yes --> D{.major == 4?}
    D -- Yes --> E[use_flash_attn_4 = True]
    D -- No --> F{.major == 3?}
    F -- Yes --> G[use_flash_attn_3 = True]
    F -- No --> H[FA2 or unknown path]
    style E fill:#90EE90
    style G fill:#90EE90
```
Reviews (1): Last reviewed commit: "Fix flash attention version check."
LGTM
Pull request overview
Fixes FlashAttention backend selection when using FA4 prereleases that sort below 4.0.0 under PEP 440, preventing FA3/FA4 API selection from becoming inconsistent.
Changes:
- Switch FA3/FA4 selection logic to use the package major version rather than a `< 4.0.0` range check.
- Add clarifying inline comments explaining the prerelease sorting pitfall.
```python
use_flash_attn_4 = (
    flash_attention_backend is not None and flash_attention_backend.major == 4
)
use_flash_attn_3 = (
    flash_attention_backend is not None and flash_attention_backend.major == 3
)
```
flash_attention_backend is a packaging.version.Version (PkgVersion) instance. That type is not guaranteed to expose a .major attribute across packaging versions; relying on it can raise AttributeError at runtime. Prefer using the stable API (flash_attention_backend.release[0] / flash_attention_backend.release[:1]) to derive the major version (with a safe default when release is empty).
Suggested change:

```diff
-use_flash_attn_4 = (
-    flash_attention_backend is not None and flash_attention_backend.major == 4
-)
-use_flash_attn_3 = (
-    flash_attention_backend is not None and flash_attention_backend.major == 3
-)
+flash_attention_backend_major = (
+    flash_attention_backend.release[0]
+    if flash_attention_backend is not None and flash_attention_backend.release
+    else None
+)
+use_flash_attn_4 = flash_attention_backend_major == 4
+use_flash_attn_3 = flash_attention_backend_major == 3
```
```python
# FA4 prereleases such as 4.0.0b8 sort below 4.0.0, so key off the major
# version instead of a stable-version range check when selecting the API.
use_flash_attn_4 = (
    flash_attention_backend is not None and flash_attention_backend.major == 4
)
use_flash_attn_3 = (
    flash_attention_backend is not None and flash_attention_backend.major == 3
)
```
This change fixes a subtle prerelease ordering issue (e.g., 4.0.0b9 sorting below 4.0.0) that previously enabled both FA3 and FA4 code paths. There should be a regression test that exercises this selection logic with a prerelease FA4 version string to prevent future regressions (ideally without requiring flash-attn-4 to be installed, e.g., by unit-testing a small helper that maps PkgVersion -> selected API).
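Such a test could look like the following sketch. The helper name `select_flash_attn_api` is hypothetical (the actual selection logic lives inline in `backends.py`), and the `packaging` library is assumed to be installed:

```python
from packaging.version import Version


def select_flash_attn_api(backend_version):
    """Hypothetical helper mirroring the selection logic in backends.py.

    Maps a flash-attn version (or None) to (use_flash_attn_4, use_flash_attn_3),
    keyed off the major component so prereleases resolve correctly.
    """
    major = (
        backend_version.release[0]
        if backend_version is not None and backend_version.release
        else None
    )
    return major == 4, major == 3


# Regression case: 4.0.0b9 sorts below 4.0.0 under PEP 440 but must
# still select FA4 only, never FA3.
assert Version("4.0.0b9") < Version("4.0.0")
assert select_flash_attn_api(Version("4.0.0b9")) == (True, False)
assert select_flash_attn_api(Version("3.1.2")) == (False, True)
assert select_flash_attn_api(None) == (False, False)
```

This requires no flash-attn installation at all, since it exercises only the version-to-API mapping.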
/te-ci pytorch
Signed-off-by: Björn Buschkämper <bjoern.buschkaemper@gmail.com>
Description
The current FA4 beta release 4.0.0b9 sorts below version 4.0.0 according to PEP 440, causing 4.0.0b9 to incorrectly enable both `use_flash_attn_3` and `use_flash_attn_4`.

Related to #2432
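The ordering pitfall can be demonstrated directly with the `packaging` library (assumed installed), which implements PEP 440 comparison:

```python
from packaging.version import Version

beta = Version("4.0.0b9")

# Under PEP 440, a prerelease sorts below its corresponding final release,
# so a `< Version("4.0.0")` range check misclassifies 4.0.0b9 as pre-FA4.
assert beta < Version("4.0.0")
assert beta.is_prerelease

# The release tuple ignores the prerelease suffix, so the major version
# component is a reliable discriminator for API selection.
assert beta.release[0] == 4
```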
Type of change
Changes
Please list the changes introduced in this PR:
- `backends.py`: checking flash attention major version directly now

Checklist: