
Using AI tools for AROS development

Last updated on 1 month ago
CoolCat5000 (Member)
Posted 1 month ago
Hi again,
I’m not against a clean git history, but deadwood mentioned going back through the git history to find out why something was done a certain way. I saw one person keeping ADR documentation in sync with the git history, somehow generated/managed with AI.

I don’t know exactly how it was made, but it covered exactly the scenario mentioned, in an even more human-friendly way.

Again, there are lots of AI-based resources/tools emerging and I can’t track or test them all, but every single aspect of how things used to be done is being researched for an AI version, along with some novel experiments aimed at giving you a more and more human-friendly understanding of the code.

That said, it’s an ongoing thing; not all of it will succeed, and a lot of it only makes sense with some degree of local AI (no token $) to be practical.

Logs and documentation have gone up another level: they are a new kind of code, because they can also be consumed by agents.

I myself have started producing many more “document” artifacts that I never used (and never will use) myself. Anyway, I’m commenting because I saw that exact scenario, with an apparently impressive result.

Regards,
Edited by CoolCat5000 on 15-05-2026 22:32, 1 month ago
terminills (Member)
Posted 1 month ago

deadwood wrote:

@deadwood - Hi All,

Thanks for sharing your experiences and workflows. That's what I was hoping this thread would become. It's really great to see our community embracing AI-assisted development.

@terminills, @kaffeine

One thing I noticed in your workflows is that you use git to store a diary, that is, you store all failed attempts and allow the AI to learn from them. It's a novel approach I didn't think of. It definitely makes sense during the development phase to avoid the AI going into endless loops.

At some point, though, it is time to merge your work to the mainline, and here I would suggest a different approach:

As long as the repository is still to be usable by humans, the number of changes needed to implement a feature needs to be minimized. I'm talking from experience: I often need to go back several years into git history and, looking at the changes a person made, try to understand WHY the changes were made.

Generally the chain of changes is:

A->B->C

However if you commit failed attempts, the chain becomes:

A->B1->B2->B3->B4->B...... -> C

The more such in-between changes there are, the harder it is for a human to understand them, and they bring no value, because most of the time it's the AI doing "shotgun debugging", trying different approaches until one of them works.

So, my ask is: when merging to mainline, keep the history and code human-readable. Don't force a human to use an AI to explain the history of changes to him because of another AI's iterative bugfixing process. ;)


That was always my plan. It makes no sense to flood the main repo with clutter, but for local development it makes total sense.
CoolCat5000 (Member)
Posted 1 month ago
Hi,
Nice to see this git insight; I must educate myself on it.

I said I wouldn’t share resources that I hadn’t tested myself, but I think this one can give some perspective on AI coding. It is much less AROS-specific, but maybe someone will find it insightful (and it somewhat illustrates my point of view about discovery and management).

https://github.co...MAD-METHOD

Regards,
kaffeine (Junior Member)
Posted 1 month ago
I totally agree. Using Git as a diary should be strictly for local development with AI tools, just to track failed attempts and avoid loops. But that messy history shouldn't reach the mainline.

Upstream commits must be clean, squashed, and strictly human-readable—no debugging noise or dead ends. It's essentially two layers: a messy 'AI diary' for local exploration, and a curated history for future maintainers.

'Dirty workshop, clean mainline.' Thanks for highlighting this boundary; I'm making it a permanent rule in my workflow!
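
A minimal sketch of the "dirty workshop, clean mainline" boundary, assuming the diary lives on a local branch. The branch names and the helper function are hypothetical; the helper only builds the git command lines (`git switch`, `git merge --squash`, `git commit`) and does not run anything.

```python
from typing import List


def squash_merge_plan(diary_branch: str, mainline: str, message: str) -> List[List[str]]:
    """Return the git commands that fold a messy diary branch into one commit.

    Nothing is executed here; the plan can be reviewed and then run by hand
    or passed to subprocess.run() one command at a time.
    """
    return [
        ["git", "switch", mainline],
        # --squash stages the combined changes without creating a merge commit,
        # so the individual diary commits do not appear in the mainline history.
        ["git", "merge", "--squash", diary_branch],
        ["git", "commit", "-m", message],
    ]


if __name__ == "__main__":
    plan = squash_merge_plan(
        "ai-diary/sshd-port",  # hypothetical local diary branch
        "main",                # hypothetical mainline branch name
        "sshd: add AROS port",
    )
    for cmd in plan:
        print(" ".join(cmd))
```

The diary branch itself stays untouched, so the failed attempts remain available locally for the AI while the mainline only receives the single curated commit.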
deadwood (AROS Dev)
Posted 1 month ago
Hi All,

Thanks for sharing your experiences and workflows. That's what I was hoping this thread would become. It's really great to see our community embracing AI-assisted development.

@terminills, @kaffeine

One thing I noticed in your workflows is that you use git to store a diary, that is, you store all failed attempts and allow the AI to learn from them. It's a novel approach I didn't think of. It definitely makes sense during the development phase to avoid the AI going into endless loops.

At some point, though, it is time to merge your work to the mainline, and here I would suggest a different approach:

As long as the repository is still to be usable by humans, the number of changes needed to implement a feature needs to be minimized. I'm talking from experience: I often need to go back several years into git history and, looking at the changes a person made, try to understand WHY the changes were made.

Generally the chain of changes is:

A->B->C

However if you commit failed attempts, the chain becomes:

A->B1->B2->B3->B4->B...... -> C

The more such in-between changes there are, the harder it is for a human to understand them, and they bring no value, because most of the time it's the AI doing "shotgun debugging", trying different approaches until one of them works.

So, my ask is: when merging to mainline, keep the history and code human-readable. Don't force a human to use an AI to explain the history of changes to him because of another AI's iterative bugfixing process. ;)
Edited by deadwood on 15-05-2026 03:38, 1 month ago
CoolCat5000 (Member)
Posted 1 month ago
Hi,
More or less. I think there are multiple scenarios, like porting software, but there are others …

One thing AI can do is take projects out of the drawer. There are a lot of projects that never see the light of day because they are a huge effort (imagine big ports, major refactors or even hard reverse engineering).

I mostly agree with what was described, including the token cost of massive, wide log analysis. But even if AI sometimes goes down rabbit holes from such logs, sometimes it finds precious info.

What I mostly miss is a discovery and management tool.

The plan is interactive discovery: I would like to have a chat in which the issue could be refined into requirements (hello copilotkit).

From there I would like to have a kanban board to better manage the session history. For example, you start with the graphics subsystem, find a goal and iterate towards solving it. Along the way you make findings but don’t fully close every point; you stop when it’s good enough and move to a new bottleneck. When you come back to the subject, you have more or less lost the rich context you had built up, including failed attempts.

I have wasted many tokens doing the same thing over and over again (partly because of a bad initial setup on my side).

Keeping the docs in step with the code is another pain: you make an initial decision and document it as an objective or guideline, but as the code evolves it becomes obsolete, and sometimes the AI uses that outdated decision as a reference.

I am still testing stuff, but mostly I would like to have a software development lifecycle that gets the most out of tools like git for branching and applies regression tests on PR reviews. All of this needs setup, though, and I was too excited to test ideas and see what would emerge.

Making a robust setup and workflow is a lot of work, and of course it can be done with AI assistance. There are lots of possibilities that you only become aware of the more you use the tools.

This is not a very practical post from my side; it’s just meant to add a wider perspective.

There is a lot of potential and maybe we are just scratching the surface. I wonder how many projects were abandoned as not worth the amount of work, and are now becoming just a matter of time and tokens.

There are a lot of issues already very well described in the docs, covering the issue, the goal and what should be done. I think that is gold for the AI scenario, because you already have more than half of the discovery phase done.

Regards,
Edited by CoolCat5000 on 13-05-2026 11:01, 1 month ago
kaffeine (Junior Member)
Posted 1 month ago
Thanks, this is exactly the kind of workflow I was hoping to hear about.

The “AI as a collaborative partner, not a vending machine” framing matches my experience very closely. I’m currently experimenting with using AI tools to help port software to AROS, starting with bebboSSH and later hopefully Telegram-related tooling.

My current plan is very similar to what you described:

- use Codex/Claude-style tools for the initial porting/scaffolding work
- build and test inside AROS x86/x64 under QEMU
- keep the feedback loop tight with serial/debug logs
- add very granular logging instead of asking the AI to guess from vague crashes
- use git commits as a development diary, including failed attempts

The point about committing failures is especially useful. It turns the repository history into memory for both the human developer and the AI assistant, which is probably essential on a platform like AROS where many problems are low-level, system-specific, or simply undocumented.

I’m also interested in whether you have any recommended debug setup for AROS under QEMU, especially for userspace porting work. For example: preferred sysdebug settings, serial logging, or any scripts/conventions you found especially useful.

In any case, thanks for sharing this. It confirms that AI-assisted AROS development is not just possible, but already useful when treated as an iterative engineering process rather than “generate code and pray”.
terminills (Member)
Posted 1 month ago
I've been using AI tools extensively to push development on AROS. Recent projects include:
  • Ported Mesa 26 to ABIv1
  • Fixed SMP bugs
  • Fixed Large BAR issues
  • Brought up the RadeonSI driver
  • Ported LLVM, LLVMPipe, and llama.cpp


My Workflow Philosophy

My approach is deliberately not conventional. I treat AI as a collaborative partner rather than a vending machine. The process relies heavily on letting the AI fail, closely analyzing those failures, and iterating quickly. This "learn by breaking things" method has proven far more effective than trying to get perfect code on the first attempt.

Tool Selection and Roles

I use three different AI coding tools, each for specific strengths:
  • Claude Code: Best for exploration, brainstorming new approaches, and pushing the boundaries of what's possible.
  • Codex: Excellent at handling the first ~80% of a port or large feature. It scaffolds code quickly but often collapses on complex edge cases or low-level system details.
  • GitHub Copilot: Strong at self-correction and considering multiple angles. It tends to second-guess itself when stuck, which makes it ideal for deep debugging.

Debugging Workflow

Scaffolding — I start with Codex to generate the initial implementation.
Initial Testing — Boot the system (usually in QEMU or hosted) and expect it to crash. This is intentional.
Targeted Debugging — Instead of feeding crash logs back to the AI for speculative disassembly and root-cause guessing (which wastes a lot of tokens and money), I switch to Copilot.
Granular Logging — I instruct Copilot to add increasingly detailed debug logging until we isolate the exact point of failure. Once the crash is precisely located, the AI can usually identify and fix the issue very quickly.
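
The granular-logging step can be sketched as a small log scanner. This is only an illustration and assumes the debug build prints a hypothetical marker of the form `[CHK] <name>` at each step; the marker format and the function name are not taken from any AROS tooling.

```python
import re
from typing import Optional

# Hypothetical marker format: the debug build prints "[CHK] <name>" at each step.
CHECKPOINT = re.compile(r"\[CHK\]\s+(\S+)")


def last_checkpoint(log_text: str) -> Optional[str]:
    """Return the name of the last checkpoint seen in a serial/debug log."""
    last = None
    for line in log_text.splitlines():
        match = CHECKPOINT.search(line)
        if match:
            last = match.group(1)
    return last


if __name__ == "__main__":
    sample = (
        "[CHK] init_display\n"
        "[CHK] alloc_framebuffer\n"
        "[CHK] map_bar0\n"
        "unreadable output after the crash\n"
    )
    # The crash happened somewhere after the last checkpoint that was reached.
    print(last_checkpoint(sample))
```

Each round of added logging narrows the gap between the last checkpoint reached and the first one that was not, which is what keeps the AI from guessing at the root cause.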

"Vibe Coding" Setup

When doing exploratory or experimental work ("vibe coding"), I set strict guardrails:
  • All destructive commands are blocked.
  • I rely on custom scripts that automatically launch AROS (hosted or under QEMU) with full debugging enabled (sysdebug=all, or QEMU's debug=serial with output redirected to a log file).
  • I maintain custom debug commands in the working branches.
  • The AI is encouraged to use git as a diary. It commits both successes and failures, including clear explanations for its "future self" about what was attempted and why it didn't work.

This creates a rich history of reasoning and experimentation that helps avoid repeating dead ends and accelerates long-term progress.
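
One possible shape for such a diary entry, as a minimal sketch. The message layout (an outcome tag, a "Tried:" list and a "Why it ..." line) and the function name are assumptions made for illustration, not the format actually used in the workflow above.

```python
from typing import List


def diary_commit_message(summary: str, outcome: str, tried: List[str], why: str) -> str:
    """Format a diary-style commit message for a success or a failed attempt."""
    if outcome not in ("success", "failure"):
        raise ValueError("outcome must be 'success' or 'failure'")
    verdict = "worked" if outcome == "success" else "failed"
    lines = [f"[{outcome}] {summary}", "", "Tried:"]
    lines += [f"- {item}" for item in tried]
    lines += ["", f"Why it {verdict}: {why}"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values are invented purely to show the layout.
    print(
        diary_commit_message(
            summary="map framebuffer through BAR0",
            outcome="failure",
            tried=["mapped the whole BAR in one call"],
            why="mapping size exceeded what the allocator accepted",
        )
    )
```

Keeping the outcome tag on the first line makes failed attempts easy to filter out of `git log --oneline` later.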
Edited by deadwood on 15-05-2026 03:15, 1 month ago
kaffeine (Junior Member)
Posted 1 month ago
Hi everyone,

very interesting thread.

I am currently using coding agents quite heavily while developing Telegram Amiga, an experimental Telegram Bot API tester/client for Amiga-like systems:

https://github.co...gram-amiga

It targets AmigaOS 3.x, MorphOS, AmigaOS 4.x and AROS.

The project may be useful as a practical case study for AI-assisted Amiga/AROS development, because the agents are not only generating code snippets. They are also helping with:

- cross-platform C code organization;
- Makefiles for different Amiga-like targets;
- build/test loops;
- analysing compiler errors from cross-compilers;
- writing small self-tests;
- producing tester documentation;
- keeping platform-specific code separated;

The important part, at least in my experience, is not “vibe coding blindly”, but giving the agent a constrained task, real source code, build logs, and a clear test target. Then the loop becomes:

plan → implement small step → compile → inspect errors → fix → test → document.
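
The "compile → inspect errors" part of that loop can be sketched as a filter that turns a noisy build log into a short list for the agent. This assumes GCC/Clang-style diagnostics (`file:line:col: error: message`); the type and function names are hypothetical.

```python
import re
from typing import List, NamedTuple


class Diagnostic(NamedTuple):
    path: str
    line: int
    severity: str
    message: str


# GCC/Clang-style diagnostics: "file:line:col: error: message" (column optional).
DIAG = re.compile(
    r"^(?P<path>[^:\s]+):(?P<line>\d+):(?:\d+:)?\s*"
    r"(?P<sev>(?:fatal )?error|warning):\s*(?P<msg>.*)$"
)


def parse_diagnostics(build_log: str) -> List[Diagnostic]:
    """Extract errors and warnings from a build log, ignoring everything else."""
    found = []
    for raw in build_log.splitlines():
        m = DIAG.match(raw.strip())
        if m:
            found.append(Diagnostic(m["path"], int(m["line"]), m["sev"], m["msg"]))
    return found


if __name__ == "__main__":
    log = (
        "main.c:12:5: error: 'foo' undeclared\n"
        "make: *** [all] Error 1\n"
        "net.c:3: warning: unused variable 'x'\n"
    )
    for diag in parse_diagnostics(log):
        print(diag)
```

Feeding the agent only these structured entries, rather than the whole log, is one way to keep the task constrained.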

I also started an early AROS port of Bebbo’s sshd:

https://github.co...bossh-aros

The goal is not only to have an SSH server for its own sake, but to improve the development workflow when working with AROS VMs or remote AROS machines.

With a working sshd on AROS, a developer can connect from the host system, copy files more easily, run commands remotely, collect logs, and automate parts of the build/test cycle without constantly switching back and forth between the host and the AROS desktop.

For example, when developing from macOS/Linux and testing inside an AROS VM, sshd can make it much easier to push a new binary into AROS, run it from Shell, capture the output, and iterate quickly.

The port is still very embryonic, and interactive console programs may still behave better from the local AROS Shell, but even at this stage it is already useful for file transfer, remote testing and general development workflow experiments.

So if anyone is experimenting with AI agents for AROS development, feel free to have a look at the repositories. I would also be interested in comparing workflows, especially around AROS SDK setup, cross-compilation, QEMU testing and how to feed useful Amiga/AROS context to coding agents.

More details about the current AROS Telegram Amiga tester build and the requested tests are in this thread:

https://www.arosw...ad_id=1970
CoolCat5000 (Member)
Posted 1 month ago
Hi, great to see this rolling …
I haven’t yet been able to expand my labs with more alternative tools.
I don’t have much horsepower in my hardware, and since I started the bare-metal emulator project I haven’t had the opportunity to set up new labs. (Mostly I’m in a love/hate relationship with Claude Code and Codex, and I am afraid that having small models in the loop at the moment could mess up the bring-up.)

It advances, it regresses, and I say I will never use it again. But I don’t know what I am doing, so overall it’s impressive how far I have got.

I will not share what I haven’t tested myself, but I think this kind of shout-out is very good, as it brings awareness of the pains and pleasures of these new tools.

My professional goal is to be more familiar with these environments by the end of the year, and this was a good excuse to play in Amiga/AROS land 😁

Thanks @deadwood, I hope I can contribute more to this discussion. (After all, my objective is to test setups and workflows.)

At the moment I am vibe coding, but I hope to start a more spec-oriented workflow next month.

My goal is the AROS boot screen, AROS booting and installing AROS to SD; after that I will be free to test other variations. I am quite obsessed at the moment with reaching that objective (it started with a question mark over whether it would be possible, but I have already noticed that the answer is yes 😁)

Regards,
deadwood (AROS Dev)
Posted 1 month ago
FYI: There are two new free models in OpenCode:

  • DeepSeek V4 Flash - very good model, I use it daily
  • Ring 2.6 - good model


Also, if you want to keep up with latest available models, bookmark this link:

https://openrouter.ai/models?categories=programming
Edited by deadwood on 12-05-2026 01:20, 1 month ago
deadwood (AROS Dev)
Posted 2 months ago

miker1264 wrote:

@miker1264 - As far as AI I use specific questions to get code samples and brief explanations from Copilot AI. It's very useful for my purposes.


Yes, I also started with Copilot and it's very capable. However, the recent changes to its pricing made it uneconomical. Right now I can do what Copilot did (analysis-wise) on local sources with AI agents for a fraction of the price. :)
deadwood (AROS Dev)
Posted 2 months ago

CoolCat5000 wrote:

@CoolCat5000 -

The AROS build system already extracts docs, web pages etc. from the code … maybe it could also generate info for coding agents, but I don’t know exactly what the best approach would be.


The "starter pack" already contains autodocs generated from AROS sources and MUI autodocs. I've seen agents learn from them. The pack also contains some example source code, and some models used it a lot in my testing. Generally I noticed that some models seem to have a good understanding of the Amiga API (DeepSeek V4) and some are less trained on it (MiniMax M2.5). The latter rely on the examples to a larger degree.

CoolCat5000 wrote:

@CoolCat5000
If we could find a good way of indexing the codebase, the AI agent could have quicker onboarding, and symbol retrieval could be handy. (I am not sure, but I would probably try to make something along these lines, plus other tools for token management, but that would be outside the scope of the codebase.)
Regards,


That's a good point for a large codebase like AROS. For smaller projects I've seen agents capable of handling this through standard shell tools: grep, find, ls, etc.
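
For a larger codebase, the indexing idea could start as simply as a map from function names to their definitions. The sketch below is a rough, regex-based heuristic and not a real C parser; the function name and the in-memory `sources` dictionary are assumptions for illustration, and AROS-specific macros such as library entry points are not handled.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Heuristic: a line starting at column 0 with a type-like word, followed by an
# identifier and "(", and containing no ";" (which would indicate a prototype).
FUNC_DEF = re.compile(r"^[A-Za-z_][\w\s\*]*?\b([A-Za-z_]\w*)\s*\([^;]*$")
KEYWORDS = {"if", "while", "for", "switch", "return", "sizeof"}


def build_symbol_index(sources: Dict[str, str]) -> Dict[str, List[Tuple[str, int]]]:
    """Map each probable function name to the (file, line) pairs defining it."""
    index = defaultdict(list)
    for path, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            m = FUNC_DEF.match(line)
            if m and m.group(1) not in KEYWORDS:
                index[m.group(1)].append((path, lineno))
    return dict(index)


if __name__ == "__main__":
    sources = {
        "a.c": "#include <x.h>\n\nint OpenThing(struct Thing *t)\n{\n    return helper(t);\n}\n",
        "b.c": "static void helper(void)\n{\n}\nint OpenThing(void);\n",
    }
    for name, places in build_symbol_index(sources).items():
        print(name, places)
```

An agent could consult such an index before falling back to grep, which saves tokens on repeated searches through the same tree.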
deadwood (AROS Dev)
Posted 2 months ago

CoolCat5000 wrote:

@CoolCat5000

If this topic is about this kind of thing, I have lots of resources, from token usage reduction to codebase indexing. I haven’t used any of them yet, so these are not recommendations.



Thanks for your comments. Yes, this thread is for people who use AI to develop software for AROS and who can share practical and tested approaches and tips on how to use it.

CoolCat5000 wrote:

@CoolCat5000
From my personal experience, AI agents have 2 major issues: they don’t really understand what they are supposed to do and they don’t have the full picture of what exists. Both are context engineering issues.

So an initial discovery chat is good, as is somehow keeping the codebase status in context.

The idea would be: discovery->spec->code->docs->test with some bidirectional link between the spec, docs and code!


Yes, that's also my experience, but it varies with the model used, of course. Stronger models can do more in a single step. In the "starter pack" documentation I suggest the Plan-Build approach:

"You can control how agents behave and what they do by planning work for them. This is useful for weaker models (most of the free models), which will get confused by a long, complicated task.

Check PLAN.md and STEPS.md located in the PlanExample subdirectory. PLAN.md is the high-level plan, while STEPS.md lists the incremental steps that are used to build the application. Both of those documents were generated by a model based on a prompt. It is often the case that a more powerful (and expensive) model is used to generate the plan and then a weaker model is used to implement it."
miker1264 (Software Dev)
Posted 2 months ago
As far as AI I use specific questions to get code samples and brief explanations from Copilot AI. It's very useful for my purposes.

For the most part I try to write code that can be used interchangeably with Amiga 68k. The code remains the same; only the compiler changes. Yes, it's possible!

That's why much of my research deals with Amiga documentation: SDKs, code samples, the developer CD, and so on. The elowar type notations with syntax and usage, sometimes with samples, are useful. Every library, every command is explained in detail. That's what programmers need: useful samples.

Amiga OS4 documentation is also helpful but the developers changed some of the source code so it's not exactly the same as AROS & 68k.

AI in that sense is the icing on the cake! It's extra but also very helpful.
Edited by miker1264 on 04-05-2026 22:32, 2 months ago
CoolCat5000 (Member)
Posted 2 months ago
Hi again,
I have no idea what the best way of feeding AROS info into the AI context would be, but some frameworks are already making docs specifically for agents.

(For example: https://nextjs.or.../ai-agents )

So we have resources that turn code/docs into searchable graphs; the problem is finding the best way. MCP? Flat md files? Graph info? Skills?

The AROS build system already extracts docs, web pages etc. from the code … maybe it could also generate info for coding agents, but I don’t know exactly what the best approach would be.

If we could find a good way of indexing the codebase, the AI agent could have quicker onboarding, and symbol retrieval could be handy. (I am not sure, but I would probably try to make something along these lines, plus other tools for token management, but that would be outside the scope of the codebase.)



Regards,
CoolCat5000 (Member)
Posted 2 months ago
Hi all, great to see this topic, because this is mostly the kind of thing that I must study. I have tons of resources that I haven’t tested, but I would like to share one quickly to see if it matches the idea behind this topic.

https://github.co...nt-Harness

If this topic is about this kind of thing, I have lots of resources, from token usage reduction to codebase indexing. I haven’t used any of them yet, so these are not recommendations.

From my personal experience, AI agents have 2 major issues: they don’t really understand what they are supposed to do and they don’t have the full picture of what exists. Both are context engineering issues.

So an initial discovery chat is good, as is somehow keeping the codebase status in context.

The idea would be: discovery->spec->code->docs->test with some bidirectional link between the spec, docs and code!
retrofaza (Distro Maintainer)
Posted 2 months ago
I’ve been playing around with OpenCode for two weeks now, and I have to say it’s really great. As you can see in Deadwood’s video, you don’t need to know how to program at all to do something simple. If you know a little bit of programming, you can easily start porting games and programs… Once there are more of us, we’ll quickly catch up on the software base for AROS 64-bit :) AI is also great for converting 32-bit programs to 64-bit. It handles it pretty well. For example, if a program crashes, just paste the crash log into it, and that’s often enough for it to find a solution.

Feel free to start with the free versions to get used to everything. Once you’ve learned how to use it, it’s worth buying a Go subscription, which is ridiculously cheap and offers far more capabilities than the free versions.
deadwood (AROS Dev)
Posted 2 months ago
Over the last couple of weeks I've been exploring the use of local AI agents to develop software for AROS. AI agents are software tools which can generate code and compile it into a working binary, starting from just a description of the functionality. After getting some positive results I prepared a "starter pack" for anyone who would like to try using AI agents. The pack contains the configuration necessary to direct the agent to write code for AROS.

The pack can be downloaded from: https://axrt.org/download/aros/other/OpenCodeStarterPack-v1.2.zip

You can start exploring AI agents for free, using the free models available in OpenCode (more in the Readme file), but to get things done quickly and easily I suggest an OpenCode Go subscription (just $10/month) and using more advanced models.

I also recorded a video showing an AI agent in action. In it you can see how, starting from a prompt, it builds a working MUI application in a couple of minutes: the AI first analyzes the task, then starts generating code and Makefiles, and finally compiles the code and resolves compilation errors.

https://youtu.be/5lNSAOx_N8Q
Edited by deadwood on 04-05-2026 12:39, 2 months ago
deadwood (AROS Dev)
Posted 3 months ago
Continuing my exploration of Copilot, I decided to test the different available models with another reported bug as an example. This bug is unusual in that it results in a crash in text.datatype, while the original source of the issue is located in a different, but related, module: amigaguide.datatype. It is a use-after-free bug caused by caching information at the text.datatype level which is manipulated (created and freed) by amigaguide.datatype.

I tried the following 4 models: OpenAI GPT-5 mini, Claude Haiku 4.5, Claude Sonnet 4.6 and Claude Opus 4.6. The conversation with each of the models is attached at the end of this post.

Cost of using

GPT-5 mini and Haiku 4.5 are available in the Free tier (50 requests per month). To use Sonnet 4.6 and Opus 4.6 a subscription is needed ($10/month => 300 requests per month). Each of the models has a different price, suggesting their relative capabilities. GPT-5 mini costs 0 requests, Haiku 4.5 costs 0.33 requests, Sonnet 4.6 costs 1 request and Opus 4.6 costs 3 requests. Speed-wise, GPT-5 mini, Haiku and Sonnet were comparable in generating answers. Opus took around 2x longer.

Communication style

All Claude models have a similar style: their replies are concise, they produce one or two possible answers and describe the reasoning behind them. I would call that style "just enough". GPT-5 mini is more verbose. It produced several possibilities when describing either the problem or a solution. Those answers also felt more theoretical, as opposed to the more practical answers of the Claude models. GPT-5 mini produced answers of the type "here is what you can check yourself" rather than the "here is where the problem is" of the Claude models. For me personally GPT-5 mini output was "too much to read".

Locating the issue

Opus and Sonnet were able to locate the issue and assess it as use-after-free within the first reply. GPT-5 mini gave several answers to the first prompt; use-after-free was among them, but the description wasn't as clear as with Opus and Sonnet. Haiku also suggested a use-after-free problem, but its answer felt more random, and it added another, unrelated issue and kept insisting on getting back to it.

Solving the issue

GPT-5 mini provided more of a description of how the issue could be solved, rather than specific code. The description contained a portion that was relevant, but a bigger portion of the output just obscured the core of the issue. In order to use GPT-5 mini, a person would already need a good understanding of the code base.

Haiku 4.5 kept suggesting theoretical solutions that were related to the codebase and gave a general idea of what needs to be solved, but not how. Haiku also wasn't able to move on from text.datatype (where the problem is visible) to amigaguide.datatype (where the problem is located).

Sonnet 4.6 provided a solution that is not the worst one but also not the greatest one. It correctly identified amigaguide.datatype as the location of the issue.

Opus 4.6 took its time thinking about a solution (around a minute) but eventually located the correct place and generated a fix almost identical to my hand-made fix.

Summary

GPT-5 mini behaved more like a lecturer than a helper programmer. Haiku 4.5 behaved like a junior software engineer who wants to boast about his very wide (but very shallow) knowledge. It was quick to reply, but the replies lacked depth. Though if you are using the Copilot Free tier, I'd still go with Haiku 4.5 over GPT-5 mini, as Haiku's answers are much more actionable (and they both cost 1 request in the Free tier).

If you are using a Copilot subscription and you have a rough idea of what you are doing, I'd go with the Sonnet model. It's more expensive than Haiku, but where Sonnet located the issue and proposed a solution in 2 prompts (= 2 requests), Haiku was still going in circles after 4 prompts (about 1.3 requests).
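
The request arithmetic can be checked with a few lines of Python. The multipliers and the 300-request monthly allowance are the subscription figures quoted earlier in this post; the function names are made up for the sketch.

```python
from decimal import Decimal
from typing import Optional

# Request multipliers per prompt, as quoted in the "Cost of using" section.
MULTIPLIER = {
    "GPT-5 mini": Decimal("0"),
    "Haiku 4.5": Decimal("0.33"),
    "Sonnet 4.6": Decimal("1"),
    "Opus 4.6": Decimal("3"),
}
MONTHLY_REQUESTS = Decimal("300")  # $10/month subscription, per the post


def session_cost(model: str, prompts: int) -> Decimal:
    """Requests consumed by a session of `prompts` prompts with `model`."""
    return MULTIPLIER[model] * prompts


def prompts_per_month(model: str) -> Optional[int]:
    """Whole prompts that fit in the monthly allowance (None if not limited)."""
    multiplier = MULTIPLIER[model]
    if multiplier == 0:
        return None
    return int(MONTHLY_REQUESTS / multiplier)


if __name__ == "__main__":
    print("Haiku, 4 prompts: ", session_cost("Haiku 4.5", 4))
    print("Sonnet, 2 prompts:", session_cost("Sonnet 4.6", 2))
    print("Opus prompts per month:", prompts_per_month("Opus 4.6"))
```

So the Haiku session was cheaper in requests even though it took twice as many prompts; the argument for Sonnet here is that it actually reached a solution.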

It's also important to note that this comparison is based on a single, slightly more advanced use case. If you have your own experiences with Copilot or other AI tools in AROS development, please share them here!

https://axrt.org/media/copilot_example_03.zip
Users who participated in discussion: terminills, deadwood, retrofaza, miker1264, CoolCat5000, kaffeine