Refactor: Decouple Core Transformer Blocks #1852
base: main
Conversation
Force-pushed from c7e43e0 to 2e56b95
Thanks for refactoring @parambole. Can you explain the circular dependency you mentioned in the description?
Quant = quantizations.AqtQuantization
...
class DecoderLayer(nn.Module):
Is there a more descriptive file name that could be used here instead of 'blocks.py'? Maybe naming this file decoders.py? We would then need to create a separate encoders.py file for the VisionEncoder.
I agree with naming this decoders.py and moving class VisionEncoder to encoders.py.
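In case a sketch helps the discussion, the suggested split would look roughly like this (file names from the comments above; the contents shown are placeholders, not actual code):

```python
# Hypothetical layout under the suggested naming (placeholders only):
#
# MaxText/layers/decoders.py
#   class DecoderLayer(nn.Module): ...
#   class Decoder(nn.Module): ...
#
# MaxText/layers/encoders.py
#   class VisionEncoder(nn.Module): ...
```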
Could you select one model and run a quick perf test with the same configs, main branch vs. your branch? I see the functional tests are passing; just make sure the breakdown into separate files does not hurt performance.
TL;DR

What: This PR refactors the core DecoderLayer and its related components out of layers/models.py and into a new, foundational file: layers/blocks.py.

Why: To improve the overall code architecture and break potential circular dependencies. This is a necessary prerequisite for adding new, complex modules that also need access to these core building blocks.

How: By creating layers/blocks.py to house the fundamental Decoder and DecoderLayer classes. Higher-level files like models.py and other future modules now import these components from a single location.
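For illustration, the "single location" import in models.py (and future modules) would look roughly like this; the exact import path is an assumption based on the file layout described below:

```python
# layers/models.py -- hypothetical import after the refactor; DecoderLayer
# and Decoder are consumed here but no longer defined here.
from MaxText.layers.blocks import Decoder, DecoderLayer
```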
Detailed Description

This pull request introduces a structural refactoring to improve modularity and maintainability.
The primary change is the creation of MaxText/layers/blocks.py, which now serves as the source for the fundamental building blocks of the Transformer architecture, such as:

- DecoderLayer
- Decoder

Previously, these classes were located in layers/models.py, which created tight coupling; as we add more features, this coupling would lead to circular import dependencies. By decoupling these core components, we establish a clearer hierarchy in the codebase, where high-level modules can depend on these "building blocks" without depending on each other.
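For orientation, here is a minimal sketch of the kind of module blocks.py now centralizes. The class bodies, fields, and sublayer choices are illustrative assumptions, not the PR's actual implementation (the real MaxText modules take a config object, quantization settings, and so on):

```python
# MaxText/layers/blocks.py -- illustrative sketch only; everything below is
# an assumption about shape, not the PR's actual code.
from flax import linen as nn


class DecoderLayer(nn.Module):
  """One Transformer block: attention and MLP, each with a residual."""
  emb_dim: int
  num_heads: int

  @nn.compact
  def __call__(self, x):
    # Self-attention sublayer with pre-LayerNorm and a residual connection.
    h = nn.LayerNorm()(x)
    h = nn.SelfAttention(num_heads=self.num_heads)(h)
    x = x + h
    # MLP sublayer with pre-LayerNorm and a residual connection.
    h = nn.LayerNorm()(x)
    h = nn.Dense(4 * self.emb_dim)(h)
    h = nn.relu(h)
    h = nn.Dense(self.emb_dim)(h)
    return x + h


class Decoder(nn.Module):
  """A stack of DecoderLayer blocks; higher-level models import this."""
  num_layers: int
  emb_dim: int
  num_heads: int

  @nn.compact
  def __call__(self, x):
    for _ in range(self.num_layers):
      x = DecoderLayer(emb_dim=self.emb_dim, num_heads=self.num_heads)(x)
    return x
```

With this layout, models.py and any future module import Decoder and DecoderLayer from one place instead of from each other, which is what removes the potential import cycle.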
Checklist
Before submitting this PR, please make sure (put X in square brackets):