Skip to content

Support CPU32 version of the 68000 instruction set #10

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

arvi18
Copy link

@arvi18 arvi18 commented Apr 26, 2025

See NationalSecurityAgency#1244

The "CPU32" iteration of the 68000 instruction set deprecates a few instructions and adds new table-lookup instructions. I I suspect this is not 100% complete, but it does a fine job on the firmware that I'm studying, so I don't plan to invest more time in it myself.

This is largely based on the work that @dzidaV8 began when that issue was opened. I could not have done this without that head start.

Summary by CodeRabbit

  • New Features
    • Added support for the Motorola 32-bit CPU32 variant in the 68000 processor family, including new instruction decoding, addressing modes, and CPU32-specific instructions.
  • Enhancements
    • Extended language definitions and manifest entries to recognize and utilize the new CPU32 processor variant within the tool.
  • Documentation
    • Updated descriptions and mappings for CPU32 in processor language files.

@arvi18
Copy link
Author

arvi18 commented Apr 26, 2025

This PR fails to build; certification.manifest needs an entry for CPU32.slaspec.

> Task :68000:ip FAILED

FAILURE: Build failed with an exception.

* Where:
Script 'C:\Users\bsa3\source\ghidra\gradle\support\ip.gradle' line: 125

* What went wrong:
Execution failed for task ':68000:ip'.
> No IP found for C:\Users\bsa3\source\ghidra\Ghidra\Processors\68000\data\languages\CPU32.slaspec in module: C:\Users\bsa3\source\ghidra\Ghidra\Processors\68000. Expression: (ip != null). Values: ip = null

@arvi18
Copy link
Author

arvi18 commented Apr 26, 2025

Sorry for the slow turnaround on this, but I've added the missing entry, and I get a clean local build.

@arvi18
Copy link
Author

arvi18 commented Apr 26, 2025

I'm still working on getting these pcodetests to run/pass. I've been running into a few issues. The first is on my end - gcc keeps inserting bfffo instructions which cpu32 doesn't know how to process.

The second is that the cpu32 pcodetests won't even run without at least one modification. The pcodeop countleadingzeroes needs to be moved out of the #ifndef CPU32 block and should be put around line 345 with the rest of the pcodeop definitions.

After that I'm still getting some failures that I'm looking into to see what the resolution should be.

@arvi18
Copy link
Author

arvi18 commented Apr 26, 2025

I've manually rebased @LegacyNsfw's PR (so far without taking care of the comments) into a branch on my fork at backerman@fafd1bb. It mostly fails to apply because some previously unimplemented 68k instructions have since been implemented.

Copy link

coderabbitai bot commented Apr 26, 2025

Walkthrough

The changes introduce support for the Motorola CPU32 variant within Ghidra's 68000 processor module. This is achieved by updating the manifest and language definition files to register the new CPU32 variant, creating a new specification file for CPU32, and extending the SLEIGH specification to add CPU32-specific instruction decoding, addressing modes, and pcode operations. Several instructions are conditionally included or excluded based on the CPU32 macro, and new instruction patterns and tokens are defined to support CPU32's unique features such as table lookup and interpolation instructions.

Changes

File(s) Change Summary
Ghidra/Processors/68000/certification.manifest Added a new entry for the CPU32.slaspec language specification file to the manifest, registering it within the 68000 processor module.
Ghidra/Processors/68000/data/languages/68000.ldefs Added a new <language> XML element defining the CPU32 variant of the 68000 processor, specifying its attributes, description, compiler spec, and external name mapping for IDA-PRO.
Ghidra/Processors/68000/data/languages/CPU32.slaspec Created a new specification file for CPU32, defining the CPU32 and MC68332 macros and including the base 68000.sinc file.
Ghidra/Processors/68000/data/languages/68000.sinc Introduced CPU32-specific support in the SLEIGH specification: new tokens, pcode operations, instruction patterns, addressing mode constructors, and conditionally included/excluded instructions. Added decoding and semantics for table lookup and interpolation instructions, and new instructions such as lpstop and bgnd for CPU32.

Sequence Diagram(s)

Loading
sequenceDiagram
    participant Disassembler as Ghidra Disassembler
    participant CPU32Spec as CPU32.slaspec & 68000.sinc
    participant User as User

    User->>Disassembler: Select CPU32 processor variant
    Disassembler->>CPU32Spec: Load CPU32.slaspec (defines CPU32 macros, includes 68000.sinc)
    CPU32Spec->>Disassembler: Provide CPU32-specific instruction definitions
    Disassembler->>User: Disassemble code using CPU32 decoding, tokens, and pcode

Poem

In Ghidra’s warren, new code takes its place,
CPU32 hops in with elegant grace.
Sleigh rides with tokens, new patterns in tow,
Table lookups and interpolations now glow.
With macros and manifests, support is complete—
A rabbit’s delight, this update’s a treat!
🐇✨


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @arvi18, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request adds support for the CPU32 version of the 68000 instruction set to Ghidra. According to the PR description, this implementation is largely based on the work that @dzidaV8 began. The CPU32 iteration deprecates a few instructions and adds new table-lookup instructions. The changes involve modifying the .ldefs file to include the new CPU32 language definition, adding a new .slaspec file for CPU32, and modifying the .sinc file to include CPU32-specific definitions and instruction implementations. Several instructions are conditionally disabled for non-CPU32 architectures.

Highlights

  • New Language Definition: A new language definition for CPU32 is added to 68000.ldefs, specifying the processor, endianness, size, variant, version, and associated files.
  • CPU32-Specific Definitions: The 68000.sinc file is modified to include CPU32-specific definitions, such as TBL_ADDR_MODES and new token definitions for TBL instructions.
  • Conditional Compilation: Many instructions are now conditionally compiled based on whether CPU32 is defined, effectively disabling them for other 68000 architectures.
  • Table Lookup Instructions: The pull request introduces new table lookup and interpolate instructions specific to CPU32, including definitions for addressing modes, signedness, and rounding.
  • New slaspec file: A new CPU32.slaspec file is added, which includes the 68000.sinc file and defines CPU32 and MC68332.

Changelog

Click here to see the changelog
  • Ghidra/Processors/68000/certification.manifest
    • Added data/languages/CPU32.slaspec to the manifest file.
  • Ghidra/Processors/68000/data/languages/68000.ldefs
    • Added a new language definition for the CPU32 variant of the 68000 processor.
  • Ghidra/Processors/68000/data/languages/68000.sinc
    • Added conditional compilation blocks (@ifdef CPU32) to define CPU32-specific addressing modes and token definitions.
    • Added token definitions for the TBL instructions.
    • Added attach variables conditionally for CPU32.
    • Conditional compilation to exclude bitfield parameters when CPU32 is defined.
    • Conditional compilation to exclude several instructions when CPU32 is defined, including bfchg, bftst, callm, chk, cpBcc, cpDBcc, cpGEN, cpRESTORE, cpSAVE, cpScc, cpTRAPcc, pack, rtm, and unpk.
    • Modified the stop instruction and added lpstop and bgnd instructions, conditionally compiled for CPU32.
    • Added table lookup instruction definitions and related effective address constructors, conditionally compiled for CPU32.
  • Ghidra/Processors/68000/data/languages/CPU32.slaspec
    • Created a new .slaspec file for the CPU32 processor, including 68000.sinc.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


In Ghidra's realm, a new CPU,
CPU32, for you and me.
Table lookups, a coder's delight,
Reverse engineering, day and night.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for the CPU32 version of the 68000 instruction set. The changes include adding a new language definition, updating the instruction set specification, and modifying the token definitions. Overall, the changes seem well-structured and address the issue outlined in the description. However, there are a few areas that could benefit from further review and clarification.

Summary of Findings

  • Conditional Compilation: The extensive use of @ifdef CPU32 and @ifndef CPU32 suggests a need to carefully verify that the code behaves as expected for both CPU32 and non-CPU32 configurations. Testing both scenarios is crucial.
  • TODO Comments: Several TODO comments indicate incomplete implementations. It's important to either address these before merging or create separate issues to track them.
  • Addressing Mode Constructors: The comments about tbl_mode=4 and tbl_mode=5 constructors not being implemented for TBL instructions indicate a potential area of incompleteness that needs to be addressed for full CPU32 support.

Merge Readiness

The pull request introduces significant functionality by adding support for the CPU32 instruction set. However, due to the presence of TODO comments and conditional compilation, thorough testing and verification are essential before merging. I am unable to directly approve this pull request, and recommend that other reviewers carefully examine the code and ensure that all critical and high severity issues are addressed before merging. Given the number of unimplemented features and conditional compilation blocks, it's advisable to postpone merging until these aspects are fully resolved and tested.

Comment on lines +55 to +56
@ifdef CPU32
@define TBL_ADDR_MODES "(tbl_mode=2 | tbl_mode=4 | tbl_mode=5 | tbl_mode=6 | tbl_mode=7)" # Addressing modes for tblxx instructions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Consider adding a comment explaining why tbl_mode=3 is skipped in the definition of TBL_ADDR_MODES.

@define TBL_ADDR_MODES "(tbl_mode=2 | tbl_mode=4 | tbl_mode=5 | tbl_mode=6 | tbl_mode=7)" # Addressing modes for tblxx instructions (tbl_mode=3 skipped)

Comment on lines +887 to 888
@ifndef CPU32
:callm "#"^d8,e2l is opbig=6 & op67=3 & $(CTL_ADDR_MODES); d8; e2l [ savmod2=savmod1; regtsan=regtfan; ] unimpl

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Since callm is unimplemented, consider adding a define pcodeop unimplemented; and using that instead of unimpl directly. This would provide a consistent way to mark unimplemented instructions.

define pcodeop unimplemented;
::callm "#"^d8,e2l         is opbig=6 & op67=3 & $(CTL_ADDR_MODES); d8; e2l          [ savmod2=savmod1; regtsan=regtfan; ] unimplemented

Comment on lines +1672 to +1676
#TODO: implement interrupt mask → EBI; STOP
:lpstop "#"^d16 is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16 { SR = d16; }

#TODO: implement: if (background mode enabled) then enter Background Mode else Format/Vector offset → –(SSP); PC → –(SSP); SR → –(SSP); (Vector) → PC
:bgnd is opbig=0x4A & d8base=0xFA { }

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

These TODO comments should be converted to issues in the issue tracker, and the issue number should be referenced in the comment. This will help track the remaining work and prevent these from being forgotten.

#TODO: GH-1234 implement interrupt mask → EBI; STOP
::lpstop "#"^d16         is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16                { SR = d16; }

#TODO: GH-5678 implement: if (background mode enabled) then enter Background Mode else Format/Vector offset  → –(SSP); PC → –(SSP); SR  → –(SSP); (Vector) → PC
::bgnd                is opbig=0x4A & d8base=0xFA { }

@@ -1680,6 +1737,96 @@ ptestLevel: "#"^mregn is mregn { export *[const]:1 mregn; }
:tas eab is (opbig=0x4a & op67=3 & $(DAT_ALTER_ADDR_MODES))... & eab { logflags(); resflags(eab); eab = eab | 0x80; }
@endif # COLDFIRE

@ifdef CPU32

# TODO: tbl_mode=4 and tbl_mode=5 constructors

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Consider creating issues for these TODO items and referencing them here.

# TODO: GH-9101 tbl_mode=4 and tbl_mode=5 constructors

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
Ghidra/Processors/68000/data/languages/CPU32.slaspec (1)

3-6: Minimal spec is fine, but consider documenting intent

The file just stubs two empty macros and includes 68000.sinc.
Even a single-line comment that clarifies why the macros are intentionally empty (i.e. they are purely feature flags) would help future readers.

 @define CPU32 ""
 @define MC68332 ""
+# NOTE: These macros are used only as conditional
+#       compilation flags inside 68000.sinc – no value needed.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2abfd4e and 6223f0b.

📒 Files selected for processing (4)
  • Ghidra/Processors/68000/certification.manifest (1 hunks)
  • Ghidra/Processors/68000/data/languages/68000.ldefs (1 hunks)
  • Ghidra/Processors/68000/data/languages/68000.sinc (14 hunks)
  • Ghidra/Processors/68000/data/languages/CPU32.slaspec (1 hunks)
🔇 Additional comments (2)
Ghidra/Processors/68000/certification.manifest (1)

14-14: Double-check certification status flag

Other .slaspec lines are occasionally tagged reviewed.
If the CPU32 spec has already passed IP review you may want the same marker to avoid future build-pipeline noise.

-data/languages/CPU32.slaspec||GHIDRA||||END|
+data/languages/CPU32.slaspec||GHIDRA||reviewed||END|
Ghidra/Processors/68000/data/languages/68000.sinc (1)

164-181: Field-name reuse across tokens may collide in SLEIGH – compile-check

The new 32-bit tbl_instrA token introduces fields like tbl_opbig, tbl_op67, etc.
If any of those names already exist in other tokens, the SLEIGH compiler will error (“duplicate field name”). I couldn’t find tbl_op67 elsewhere, but please run sleigh –a to be safe.

Comment on lines +67 to +79
<language processor="68000"
endian="big"
size="32"
variant="CPU32"
version="1.1"
slafile="CPU32.sla"
processorspec="68000.pspec"
manualindexfile="../manuals/68000.idx"
id="68000:BE:32:CPU32">
<description>Motorola 32-bit CPU32</description>
<compiler name="default" spec="68000.cspec" id="default"/>
<external_name tool="IDA-PRO" name="CPU32"/>
</language>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Missing gnu external-name will break tool-chain selection

All other 68k variants expose a gnu mapping (e.g. m68k:68020).
Without one, Ghidra cannot auto-select the language for ELF/obj files that advertise m68k:cpu32, and the importer will fall back to heuristics.

     <description>Motorola 32-bit CPU32</description>
     <compiler name="default" spec="68000.cspec" id="default"/>
+    <external_name tool="gnu" name="m68k:cpu32"/>
     <external_name tool="IDA-PRO" name="CPU32"/>

Comment on lines +1744 to +1753
tbl_eal: (tbl_regan) is tbl_mode=2 & tbl_regan { export *:4 tbl_regan; }
# tbl_eal: -(tbl_regan) is tbl_mode=4 & tbl_regan { tbl_regan = tbl_regan - 4; export *:4 tbl_regan; }
# tbl_eal: (d16,tbl_regan) is tbl_mode=5 & tbl_regan; d16 { local tmp = tbl_regan + d16; export *:4 tmp; }
tbl_eal: (extw) is tbl_mode=6 & tbl_regan; extw [ regtfan = tbl_regan; pcmode = 0; ] { build extw; export *:4 extw; }
tbl_eal: (d16,PC) is PC & tbl_mode=7 & tbl_regan=2; d16 { tmp:4 = inst_start + 2 + d16; export *:4 tmp; }
tbl_eal: (extw) is tbl_mode=7 & tbl_regan=3; extw [ pcmode=1; ] { build extw; export *:4 extw; }
tbl_eal: (d16)".w" is tbl_mode=7 & tbl_regan=0; d16 { export *:4 d16; }
tbl_eal: (d32)".l" is tbl_mode=7 & tbl_regan=1; d32 { export *:4 d32; }
tbl_eal: "#"^d32 is tbl_mode=7 & tbl_regan=4; d32 { export *[const]:4 d32; }

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

TBL ea constructors for modes 4/5 are left commented out

The comment says “TODO”, but the table-lookup instructions will raise Bad effective-address for pre-decrement or displacement modes.

If those modes are architecturally invalid for CPU32, delete the commented rules and tighten TBL_ADDR_MODES; otherwise implement them before users rely on these opcodes.

Comment on lines +1672 to +1677
#TODO: implement interrupt mask → EBI; STOP
:lpstop "#"^d16 is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16 { SR = d16; }

#TODO: implement: if (background mode enabled) then enter Background Mode else Format/Vector offset → –(SSP); PC → –(SSP); SR → –(SSP); (Vector) → PC
:bgnd is opbig=0x4A & d8base=0xFA { }
@endif # CPU32
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

lpstop / bgnd have no semantics – decoding only

Both instructions are flagged TODO but currently emit no p-code.
This means analysis will treat them as NOPs, hiding control-flow changes (e.g. low-power stop).

At minimum, emit a trap-like p-code stub so control-flow is suspended:

-:lpstop "#"^d16            is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16              { SR = d16; }
+:lpstop "#"^d16            is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16              { ##cpu32_lpstop(); }
 
-:bgnd               is opbig=0x4A & d8base=0xFA { }
+:bgnd               is opbig=0x4A & d8base=0xFA { ##cpu32_bgnd(); }

Where ##cpu32_lpstop / ##cpu32_bgnd are dummy p-codeops declared near the other define pcodeop lines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#TODO: implement interrupt mask → EBI; STOP
:lpstop "#"^d16 is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16 { SR = d16; }
#TODO: implement: if (background mode enabled) then enter Background Mode else Format/Vector offset → –(SSP); PC → –(SSP); SR → –(SSP); (Vector) → PC
:bgnd is opbig=0x4A & d8base=0xFA { }
@endif # CPU32
#TODO: implement interrupt mask → EBI; STOP
:lpstop "#"^d16 is opbig=0xf8 & d8base=0x00; opbig=0x01 & d8base=0xC0; d16 { ##cpu32_lpstop(); }
#TODO: implement: if (background mode enabled) then enter Background Mode else Format/Vector offset → –(SSP); PC → –(SSP); SR → –(SSP); (Vector) → PC
:bgnd is opbig=0x4A & d8base=0xFA { ##cpu32_bgnd(); }
@endif # CPU32

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants