Do you need that much for a single program, or is it just for giggles? There is a price you pay for fetching a single or double byte from an 11-bit address space (2048 addresses): the bigger the memory, the longer every access takes. Work out how much speed you are willing to trade for address space. I personally use 32 bytes that I keep close to the CPU, and the storage has multiple lanes to it so it can load and store whole blocks without waiting a century.
There are multiple things you can do, but that is pretty sophisticated stuff.
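To make the "small memory kept close" idea a bit more concrete, here is a rough sketch of a 32-byte block buffer sitting in front of an 11-bit storage. The 8-byte block size, the direct-mapped slot choice, the write-through policy and the name `CloseMemory` are all my assumptions for illustration, not a description of any particular build.

```python
ADDRESS_BITS = 11                      # from the post: 11-bit address space
STORAGE_SIZE = 1 << ADDRESS_BITS       # 2048 addressable bytes
CLOSE_SIZE = 32                        # from the post: 32 bytes kept close
BLOCK_SIZE = 8                         # assumption: bytes moved per block transfer
NUM_SLOTS = CLOSE_SIZE // BLOCK_SIZE   # 4 block slots in the close memory


class CloseMemory:
    """Toy direct-mapped, write-through block buffer in front of slow storage."""

    def __init__(self):
        self.storage = bytearray(STORAGE_SIZE)
        self.slots = [None] * NUM_SLOTS   # each slot: (block_number, data) or None
        self.block_loads = 0              # counts the slow trips to storage

    def _block_for(self, address):
        if not 0 <= address < STORAGE_SIZE:
            raise IndexError("address outside the 11-bit address space")
        block = address // BLOCK_SIZE
        index = block % NUM_SLOTS         # direct-mapped: one possible slot per block
        entry = self.slots[index]
        if entry is None or entry[0] != block:
            # Miss: fetch the whole block from storage in one transfer.
            start = block * BLOCK_SIZE
            entry = (block, bytearray(self.storage[start:start + BLOCK_SIZE]))
            self.slots[index] = entry
            self.block_loads += 1
        return entry[1]

    def read(self, address):
        return self._block_for(address)[address % BLOCK_SIZE]

    def write(self, address, value):
        self._block_for(address)[address % BLOCK_SIZE] = value
        self.storage[address] = value     # write-through keeps storage current
```

Reads and writes that stay inside a block already held close cost no extra trip to storage; `block_loads` only goes up when a new block has to be pulled in, which is the slow part you are trying to avoid.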
As for multicore processing there are two ways really:
1. The units are physically separate except for storage and I/O, and they do not share the same address space (I'm talking von Neumann, of course).
The closest real-world picture is a dual-socket motherboard where each CPU has its own set of RAM, although those boards normally still expose all of it as one shared address space.
2. The units share the same address space and each works on a different thread, e.g. small programs that can run independently of one another.
This does require you to know how to work with caches, A LOT. You have to deal with race conditions and memory hazards.
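Here is a small sketch of what a race condition looks like when two units share an address. Each unit does load, add 1, store on the same location, and the order of the steps is given by hand so the result is repeatable. The `run` function and the schedule strings are made up for this example.

```python
def run(schedule, shared, address=0):
    """Step two cores ("A" and "B") through load / add 1 / store.

    schedule is a string such as "AAABBB"; each letter lets that core
    perform its next step on the shared memory list.
    """
    registers = {"A": 0, "B": 0}   # one private register per core
    steps = {"A": 0, "B": 0}       # which step each core is on
    for core in schedule:
        step = steps[core]
        if step == 0:
            registers[core] = shared[address]    # load
        elif step == 1:
            registers[core] += 1                 # add 1
        elif step == 2:
            shared[address] = registers[core]    # store
        steps[core] += 1
    return shared[address]


# One core finishes before the other starts: both increments land.
serial = run("AAABBB", [0])

# The cores take turns: both load the old value, so one increment is lost.
interleaved = run("ABABAB", [0])

print(serial, interleaved)
```

In the interleaved run both units read the old value before either stores, so the second store overwrites the first. Avoiding that takes some form of locking or ordering on the shared location, which is extra hardware and extra delay.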
For MC it's not really efficient, since doing more stuff = more lag = slower CPU, and in the real world the only thing that increases is power draw, not speed. We'd rather improve other areas, like clock speed (pipelining and instruction set effectiveness, for example).