If you want your program to work on every single device that's physically capable of running it, you should follow these steps:
Write the core functionality as an ISO C library
Practically every kind of CPU you can write programs for should have some kind of C compiler. This means that if you write the core functionality in ISO C, it should compile on basically everything.
The core functionality must be isolated
The core functionality shouldn't have anything that interacts with the system. It shouldn't contain memory allocations or print statements. These facilities should be provided by the library's user who will integrate the core library into an actual device.
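In other words, the core functionality should be written against a freestanding implementation of C. That means relying only on the language itself and the few headers a freestanding implementation guarantees (such as <stddef.h> and <limits.h>), not on hosted facilities like malloc or printf.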
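To make the isolation concrete, here is a minimal sketch of what such a core might look like. The names core_config, core_log_fn, and core_render are hypothetical, invented for illustration. The only header used is <stddef.h>, the pixel buffer is owned by the caller, and any output goes through a callback the integrator supplies.

```c
#include <stddef.h>  /* size_t, NULL: available even in freestanding C */

/* Hypothetical callback type: the integrator decides where messages go. */
typedef void (*core_log_fn)(void *ctx, const char *msg, size_t len);

/* Everything the core needs is handed in by the caller. */
typedef struct {
    unsigned char *pixels;  /* caller-owned buffer, width * height bytes */
    size_t width;
    size_t height;
    core_log_fn log;        /* optional, may be NULL */
    void *log_ctx;          /* passed back to log unchanged */
} core_config;

/* Fills the buffer with a simple gradient.
   Returns 0 on success, -1 on invalid arguments.
   No allocation, no I/O, no system calls. */
int core_render(const core_config *cfg)
{
    size_t x, y;

    if (cfg == NULL || cfg->pixels == NULL ||
        cfg->width == 0 || cfg->height == 0)
        return -1;

    for (y = 0; y < cfg->height; y++)
        for (x = 0; x < cfg->width; x++)
            cfg->pixels[y * cfg->width + x] = (unsigned char)((x + y) & 0xFF);

    if (cfg->log != NULL)
        cfg->log(cfg->log_ctx, "render done", 11);

    return 0;
}
```

Whether the buffer lives in static storage, on a heap, or in memory-mapped video RAM is then entirely the integrator's decision.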
Avoid undefined behavior at all costs
CPU architectures differ. For example, there are architectures where a pointer to an int is a different kind of pointer than a pointer to char. That is, the same number refers to a different address when interpreted as an int* than when interpreted as a char*. So take strict aliasing seriously.
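One common way to respect this is to never cast a byte pointer to a wider pointer type, and instead assemble the value from individual bytes. The sketch below assumes the data is stored as four little-endian octets; load_le32 is a hypothetical helper name.

```c
/* Reads a 32-bit little-endian value one byte at a time.
   Unlike *(const unsigned long *)p, this never reinterprets the pointer,
   so it does not depend on alignment, pointer representation, or the
   host's byte order. unsigned long is guaranteed to hold at least 32 bits. */
unsigned long load_le32(const unsigned char *p)
{
    /* Masking with 0xFF keeps the result correct even if a char
       is wider than 8 bits on the target. */
    return  (unsigned long)(p[0] & 0xFFu)
         | ((unsigned long)(p[1] & 0xFFu) << 8)
         | ((unsigned long)(p[2] & 0xFFu) << 16)
         | ((unsigned long)(p[3] & 0xFFu) << 24);
}
```

Accessing any object through an unsigned char pointer is always permitted by the standard, which is why this form is safe where the cast is not.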
There are some architectures and operating modes which don't even protect memory. You can write wherever you want. Writing to a bad place doesn't cause a crash like it does on modern CPUs; it may just silently corrupt whatever happens to be there.
So avoid undefined behavior.
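Since a stray write may go unnoticed on such targets, the core has to check its own bounds rather than rely on the hardware to catch mistakes. A minimal sketch, with checked_store as a hypothetical helper name:

```c
#include <stddef.h>  /* size_t, NULL */

/* Stores value at buf[index] only if the index is inside the buffer.
   Returns 1 if the byte was stored, 0 if the request was rejected. */
int checked_store(unsigned char *buf, size_t len, size_t index,
                  unsigned char value)
{
    if (buf == NULL || index >= len)
        return 0;  /* refuse instead of writing out of bounds */
    buf[index] = value;
    return 1;
}
```

The caller can then decide how to report the failure, which keeps error handling out of the core as well.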
Once the library is complete, you will need to integrate it
This is the point where you provide the necessary system functions to the core functionality, such as showing the image it generated in a buffer on the actual screen.
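As a rough illustration of that integration step, the sketch below pairs a tiny stand-in core with a host that "displays" the image by printing characters to standard output. All names (present_fn, core_draw, present_to_stdout, host_show_checkerboard) are hypothetical, and a real device would replace the stdio calls with its own display driver.

```c
#include <stddef.h>
#include <stdio.h>   /* used by the host side only */

/* --- core side: freestanding, knows nothing about the screen --- */

/* Hypothetical callback through which the core hands over a finished image. */
typedef void (*present_fn)(void *ctx, const unsigned char *pixels,
                           size_t width, size_t height);

static void core_draw(unsigned char *pixels, size_t width, size_t height,
                      present_fn present, void *ctx)
{
    size_t x, y;
    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            pixels[y * width + x] = (unsigned char)((x + y) % 2);
    present(ctx, pixels, width, height);
}

/* --- host side: supplies the buffer and the system function --- */

static void present_to_stdout(void *ctx, const unsigned char *pixels,
                              size_t width, size_t height)
{
    size_t *rows_shown = (size_t *)ctx;
    size_t x, y;
    for (y = 0; y < height; y++) {
        for (x = 0; x < width; x++)
            putchar(pixels[y * width + x] ? '#' : '.');
        putchar('\n');
        ++*rows_shown;
    }
}

/* Draws a 4x2 checkerboard and returns the number of rows displayed. */
size_t host_show_checkerboard(void)
{
    static unsigned char framebuffer[4 * 2];  /* static storage, no malloc */
    size_t rows_shown = 0;
    core_draw(framebuffer, 4, 2, present_to_stdout, &rows_shown);
    return rows_shown;
}
```

Porting to another device then means rewriting only the host side; the core stays untouched.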