Forums - Open Redstone Engineers
Magic's Stoopid CPU - Printable Version


Magic's Stoopid CPU - Magic :^) - 01-29-2015

I'm gonna make a stupid cpu because i'm bored.
Old:

I mean, it probably won't even have ram...

It's just an exercise to see if I can make something compact with a decent clock.
all I have atm is a 12 tick dataloop with dual read and 6 registers.

MAYBE I'll add a simple i/o system for peripheral support. probably 4 slots max tho.

Disregard everything I just said above, more info below... :D

Alright, there will be more info on the ISA later. Right now it's a 12 bit IS, and my longest non i/o op is 11 ticks. Everything else is <10 thanks to my tiny internal data bus :3

I will hopefully have decent, feasible peripheral support as well. I have 8 peripheral ports, each with 2 bytes out and 1 byte in.
You can:
request data from a peripheral and stall until data is received,
or schedule a read operation and then retrieve (or wait for) a locally stored copy of the data when it is needed later in the program. (instruction-level pipelining, anyone? :D)

The i/o system is VERY general. You have 2 control pulses out (read and write in most cases), and 1 control pulse in to signal that the peripheral has completed the operation.
A write operation uses two bytes of data out, and it also stores a copy of the peripheral's output (if any) so you can use the write op as a 16-bit addressing system if you like ;)
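
A rough way to picture the two read styles described above is a toy Python model. This is only a sketch: the `Port` class, its `latency` value and the `respond` callback are made up for illustration and say nothing about how the redstone is actually wired.

```python
class Port:
    """Toy model of one peripheral port (all names here are hypothetical)."""

    def __init__(self, latency, respond):
        self.latency = latency    # ticks until the peripheral pulses "done" (assumed)
        self.respond = respond    # hypothetical function: address -> byte
        self.ready_at = None      # tick at which the buffered copy becomes valid
        self.buffer = None        # locally stored copy of the peripheral's answer

    def schedule_read(self, now, address):
        """Background read: fire the request and carry on with the program."""
        self.ready_at = now + self.latency
        self.buffer = self.respond(address) & 0xFF

    def buff_read(self, now):
        """Fetch the stored copy, stalling only if it has not arrived yet.

        Returns (value, tick at which the program continues).
        """
        if self.ready_at is None:
            raise RuntimeError("no read was scheduled on this port")
        return self.buffer, max(now, self.ready_at)

    def blocking_read(self, now, address):
        """Request data and stall until it is received."""
        self.schedule_read(now, address)
        return self.buff_read(now)


port = Port(latency=10, respond=lambda address: address + 1)
value, tick = port.blocking_read(0, 4)   # stalls for the full latency
port.schedule_read(0, 4)                 # scheduled early...
late_value, late_tick = port.buff_read(15)   # ...so a later fetch does not stall
```

In this model the blocking read always pays the whole latency, while the scheduled read only stalls if the program asks for the data before the peripheral has answered.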


RE: Magic's Stoopid CPU - Apocryphan - 01-31-2015

As long as it doesn't come with facebook messenger pre-installed, we can consider it smarter than my "smart" phone.


RE: Magic's Stoopid CPU - Nuuppanaani - 01-31-2015

(01-31-2015, 08:00 AM)Apocryphan Wrote: As long as it doesn't come with facebook messenger pre-installed, we can consider it smarter than my "smart" phone.

Root that shit and delete all the bloat from /system/apps!

And while you're at it, install Xposed and all the other root shit and tweak all the things!

And then custom roms start looking more and more tempting. You find yourself flashing the noob rom for your device that doesn't require a custom kernel. Easy and fast value improvement for your device!

After this you realize all the sweet improvements custom kernels give you: overclocking, undervolting, double tap to wake etc. You find yourself opening the bootloader of your device (if you haven't already) and flashing a feasty custom kernel that blows your friends' $/£/€800 Samsung flagships out of the water with its performance and features!

Then you realize how many custom roms there are for your device and become a flash-o-holic. This is not you controlling the flashing any more but all the shiny new roms just need to be flashed.

A new android release is rolling out and you *have* to have it before your friends. You tweak the phone all night and finally get that shit to boot! This is your miserable life's happiest moment while sober and clothes on!

Your friends are really jelly for your phone and its bleeding edge android version. However it's not that long until you realize the flaws of that experimental pre-alpha rom you just flashed: it doesn't have that one tweak you cannot live without! Couple hours after the flashing you decide to revert back to stock because it was the fastest of all the roms after all, and you won't need the features the roms are offering because there's Xposed.

You try flashing your device back to stock and find out the hard way stock rom's bootloader doesn't approve of the custom one you had and realize your device is giving no signs of life any more. :'( It is hardbricked and you pray the customer support for help. But they find your IMEI on the list of naughty kids for unlocking the bootloader and refuse to give you any warranty.

You wake up next day without being able to check the time from your phone, reading the Reddit feed or any of that stuff. Living without your beloved smartphone is tough! But what is this? Google released its new Nexus device; a phone running their latest pure android that is designed for developers and hackers in mind! You spend the remaining of your savings on this new beauty. It is 10 times better than your old device in every way!

It takes a single day until you get fed up of all the Youtube advertisements and that little stutter when scrolling your Reddit feed. You root your device, void your warranty and the addiction takes all over you again. You must have more every time! And the loop goes around.


RE: Magic's Stoopid CPU - Magic :^) - 01-31-2015

yes my cpu can do all that


RE: Magic's Stoopid CPU - Magic :^) - 02-01-2015

Ok, so I've gone through my arch and IS more, and have come up with a pretty interesting system ;)

So, a lot of instructions in my IS would make more sense as double-width instructions, but doing that would slow down the whole system in a lot of cases.
This is because the time taken to decode a register would be considered every time, even if I was performing the same op with the same registers. (This is a minor example)

So, I designed my IS so that the second half of a double-width instruction can be executed independently :D

e.g. my alu ops look like this:

[(A reg)(!A)(B reg)(!B)] [(C reg)(OR)(FC)(Cin)(SHR)(Pop)]

The next op would go the same way.

but, if I didn't need to change the source registers afterwards (e.g. A+B=A works with this arch),
I could do this op by itself:
[(C reg)(OR)(FC)(Cin)(SHR)(Pop)]

also, I am doing the cpu async, so the first half will be a LOT faster to resolve than the second half.
e.g. <5 ticks and 10 ticks respectively.

Also note that I put the inverters in with the first (reg load) op. They would have added 1-2 ticks to each op if I had put them with the second part of the instruction :(
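
To make the split instruction idea a bit more concrete, here is a minimal Python sketch. It assumes the first half only latches which registers (and inverters) feed the ALU, and the second half re-reads those registers each time it runs. Only plain addition with a carry-in is modelled; the `Dataloop` class and its method names are invented for illustration, and the OR/FC/SHR/Pop bits are left out.

```python
MASK = 0xFF  # 8-bit data, as in the thread


class Dataloop:
    """Toy model of the two instruction halves (hypothetical names)."""

    def __init__(self):
        self.regs = [0] * 8                  # 3-bit register address space
        self.sel = (0, False, 0, False)      # latched (A reg, !A, B reg, !B)

    def reg_load(self, a, inv_a, b, inv_b):
        """First half: choose the source registers and inverters."""
        self.sel = (a, inv_a, b, inv_b)

    def reg_write(self, c, cin=0):
        """Second half: run the ALU on the latched sources, write to C."""
        a, inv_a, b, inv_b = self.sel
        x = self.regs[a] ^ (MASK if inv_a else 0)
        y = self.regs[b] ^ (MASK if inv_b else 0)
        self.regs[c] = (x + y + cin) & MASK


dl = Dataloop()
dl.regs[1], dl.regs[2] = 5, 3
dl.reg_load(1, False, 2, False)   # A = r1, B = r2
dl.reg_write(1)                   # r1 = r1 + r2
dl.reg_write(1)                   # same sources again, no second reg_load needed
```

Under these assumptions, repeating the second half on its own keeps accumulating into r1, which is the A+B=A case mentioned above.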

ok, here's some tentative specs:
(spoilered because the info is slightly outdated... It still is worth a read though ;))

My IS is now 11 bit

Supports up to 8 peripherals, with 8-bit data and addressing.
I have an address register that I can increment after a read or write op ;)
The i/o system can read directly into the address register, to enable pointers :D

My dataloop is 10 ticks. Decoding and stuff takes longer, but that's explained above.
The dataloop has 6 dual-read registers.
It can read from two special-purpose registers for immediates or i/o fetches. They are special, I'll explain later.
There is a single-read 6-deep hardware stack.
There's a single-read flag register. It can take external status flags easily.
All these can be accessed from a 3-bit address :3

8-bit immediates are supported.


The special purpose registers are connected directly to the alu, one on each side. The SPRs can be written to from the imm instruction or the i/o read instruction.
In those instructions you specify whether you want to write to one, or both of the SPRs.
Writing to both essentially makes them dual read.
Writing to one makes that value available to the side you wrote to.
e.g. writing to SPR 1 will make that value available to the A reg's SPR address, and writing to SPR 2 would make the value available to the B reg's SPR address.
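
As a small illustration of the write-to-one-or-both behaviour, here is a Python sketch. The `SPRs` class and its `to_a` / `to_b` flags are hypothetical stand-ins for the bits in the imm and i/o read instructions, not the real encoding.

```python
class SPRs:
    """Toy model of the two special purpose registers (hypothetical names)."""

    def __init__(self):
        self.a_side = 0   # SPR 1, visible at the A reg's SPR address
        self.b_side = 0   # SPR 2, visible at the B reg's SPR address

    def write(self, value, to_a=True, to_b=True):
        """Write an 8-bit value to one or both SPRs."""
        if to_a:
            self.a_side = value & 0xFF
        if to_b:
            self.b_side = value & 0xFF


sprs = SPRs()
sprs.write(42)                  # both sides: behaves like a dual-read register
sprs.write(7, to_b=False)       # only SPR 1 changes, the B side keeps 42
```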


If the (Pop) bit is 1 during an alu op, the stack deletes one item. It does not read from it.
Peeking at the stack is treated like a normal register read, so to perform an actual Pop, you perform an op while peeking at the stack with the Pop bit set to 1. In that case you will also automatically peek at the next entry in the stack ;)
Obviously, you could also pop the stack without reading from it to delete an entry.

Timing-wise, pop triggers after the write op, so you can write to nothing by writing to the stack with the Pop bit enabled. I don't know why you would want to do that, but you can :P
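
The ordering of peek, write and pop can be sketched in Python like this. It assumes that writing to the stack address pushes a new entry and that an empty stack reads as 0; the `HwStack` class, the `op` method and the overflow error are illustrative choices, not something stated about the hardware.

```python
class HwStack:
    """Toy model of the 6-deep hardware stack (hypothetical names)."""

    DEPTH = 6

    def __init__(self):
        self.items = []

    def peek(self):
        # Reading the stack address behaves like a normal register read.
        return self.items[-1] if self.items else 0  # empty value is assumed

    def op(self, read_stack=False, write_value=None, pop=False):
        """One ALU op: read first, then write, and only then pop."""
        result = self.peek() if read_stack else None
        if write_value is not None:
            if len(self.items) == self.DEPTH:
                raise OverflowError("stack full")  # real behaviour not stated
            self.items.append(write_value & 0xFF)
        if pop and self.items:
            self.items.pop()
        return result


stack = HwStack()
stack.op(write_value=1)
stack.op(write_value=2)
top = stack.op(read_stack=True, pop=True)   # a real Pop: read the top, then delete it
stack.op(write_value=9, pop=True)           # written and deleted again: "write to nothing"
```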

currently, writing to reg address 000 does nothing... It would be very easy to implement a function here, but I won't add anything until I come up with something the system actually needs.
yup i think that's it for now :3


RE: Magic's Stoopid CPU - Magic :^) - 02-08-2015

guh it's 6am but i'm posting this anyways:

I made myself a good 12bit vertical rom for the cpu
It uses my slide system, where 8 lines of prom are loaded at once, then those are decoded faster by a set of decoders closer to the processor 'n stuff.

I settled for 128 lines of prom, even though 256 lines are supported.

My slide read speed is 12 ticks
My normal read speed is 5 ticks :D

In general, a slide load occurs on a branch or after every 8 instructions.
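
For a feel of what those numbers mean for a program, here is a rough Python estimate of fetch time. It assumes slides are aligned to multiples of 8, that any non-sequential address counts as a branch, and that fetch costs simply add up. The function name and constants are only for illustration.

```python
SLIDE_TICKS = 12    # slide read speed from the post
NORMAL_TICKS = 5    # normal read speed from the post
SLIDE_SIZE = 8      # lines of prom loaded per slide


def fetch_ticks(addresses):
    """Sum the fetch cost of a sequence of prom addresses (toy model)."""
    total = 0
    prev = None
    for addr in addresses:
        branched = prev is None or addr != prev + 1   # first fetch or a jump
        crossed = addr % SLIDE_SIZE == 0              # ran into the next slide
        total += SLIDE_TICKS if (branched or crossed) else NORMAL_TICKS
        prev = addr
    return total


straight_line = fetch_ticks(range(16))   # two slide loads plus 14 normal reads
with_branch = fetch_ticks([0, 1, 5])     # the jump to 5 forces another slide load
```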


RE: Magic's Stoopid CPU - Magic :^) - 02-09-2015

my I/O handling hardware is fully functional and completely debugged :D

It's more responsive than I expected. I may just yet be able to get my passive ops down to 5 ticks :3

The whole thing's very close to completion, I'll do some more concise documentation after it's finished!

ETA: 2-5 days


RE: Magic's Stoopid CPU - LordDecapo - 02-09-2015

yay!!! :D


RE: Magic's Stoopid CPU - Apocryphan - 02-10-2015

I'm gonna need the full tour when i get on the server.


RE: Magic's Stoopid CPU - Magic :^) - 02-10-2015

The cpu has been moved over to a new plot, /pwarp magic should still get you close enough. If all goes well, i might be able to get this compatible with the possible future pokemon battle network :D


RE: Magic's Stoopid CPU - Magic :^) - 02-10-2015

OOh what's this?

I have a fetch time prediction unit installed! It takes the current prom address +1 and compares it with the current address to see whether the next fetch will be a fast or a slow decode :3
Lookahead on the left, current address on the right:
[Image: 7B7ga6j.png]
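
One way to read that comparison, sketched in Python: the next fetch is slow when address + 1 lands in a different group of 8 prom lines than the current address. This assumes slides are aligned to multiples of 8, and the function name is made up.

```python
def next_fetch_is_slow(addr, slide_size=8):
    """True if addr + 1 falls in a different slide than addr (toy model)."""
    return (addr + 1) // slide_size != addr // slide_size


slow_after_7 = next_fetch_is_slow(7)    # 7 -> 8 crosses into the next slide
slow_after_6 = next_fetch_is_slow(6)    # 6 -> 7 stays in the same slide
```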

And look! I'm almost done the main part of the CPU, I have to figure out the optimal timing for my instructions here... but once I figure it out, the cpu will be able to carry out all the main ops ;)
[Image: rGK1nCF.png]

I just have to get going on the branching stage now, I'm putting it in the middle of my prefetch buffer. It's going to be doing some speculative prefetching too :3


RE: Magic's Stoopid CPU - LordDecapo - 02-11-2015

Yay for dynamic branch prediction.. I think you and I are the only ones who currently have it xD


RE: Magic's Stoopid CPU - Magic :^) - 02-14-2015

Here's a tentative list of ops and the syntax they will use in asm:

Code:
regLoad (!)r1, (!)r2
regWrite r3, (alu opcode)
immLoad (8-bit immediate), (to SPRs OR ioReg)

ioRead (address source), (peripheral address), (result register)
ioRead+INC ...

ioBGRead (address source), (peripheral address)
ioBGRead+INC ...

ioWrite (address source), (data source), (peripheral address)
ioWrite+INC ...

buffRead (peripheral address), (result register)
buffRead+INC ...

jump (8-bit jump address)
call (8-bit jump address)
ret (8-bit jump address)
brIf (8-bit jump address), (condition)


////////////////////////////////////////////////

alu ops:
add, sub, and all bitwise stuffs
shift right can be done in the same op as all others
shift left supported via arithmetic shift.

conditions:
If0
Sign
Cout
Overflow



RE: Magic's Stoopid CPU - Magic :^) - 02-18-2015

Oh yeah the cpu's done, I'm gonna make another one soon. It's gonna be very similarly laid out, but will be much more optimised with twice as many pipeline stages and support for interrupts and some fancy code injection.

(I'mma bypass the main program flow with a preloaded hi-speed instruction buffer to spam my dataloop and i/o stage with ops. It is going to be like microcode, but the code can be modified from external sources)

I might also do support for hardware register renaming/context switching. It would be nice to switch between two protected register banks.


RE: Magic's Stoopid CPU - Spidermy9 - 02-18-2015

Love the time thingy :) i remember me and i believe hawk also did something like this, but never really implemented it in a good fashion


RE: Magic's Stoopid CPU - LordDecapo - 02-23-2015

The dual register banks are helpful, but I have played with that idea before, and it's a little... picky sometimes lol. try using 15 instead of 7 registers, and make some of them special purpose.
just a thought :)


RE: Magic's Stoopid CPU - Magic :^) - 02-23-2015

Yeah, I have a lot of stuff to improve in the new version. The arch is still similar, but only in spirit xD

I'm going for some speedup techniques for the instruction pipeline too. I'll elaborate on it later.

Also working on better i/o hardware. So far so good!
I got me some 1 tick/bit with input buffering so I don't drop bytes.

I'll try for 16 registers, but I'm not sure if I can get a low tick loop out of it....

I might be able to do the context shifting by having every bit of the register actually be a 2-bit barrel shift. Hmmmmnn
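
A loose Python sketch of that idea: each register holds one value per context, and a context switch rotates them so a different one sits in the readable slot. The `ContextRegister` class is purely hypothetical, and it works on whole values rather than individual bits for brevity.

```python
class ContextRegister:
    """Toy register with one slot per context (hypothetical model)."""

    def __init__(self, contexts=2):
        self.slots = [0] * contexts   # slot 0 is the one the dataloop sees

    def read(self):
        return self.slots[0]

    def write(self, value):
        self.slots[0] = value & 0xFF

    def switch(self):
        """Rotate the slots by one, like a barrel shift across contexts."""
        self.slots = self.slots[1:] + self.slots[:1]


reg = ContextRegister()
reg.write(5)      # value belonging to context 0
reg.switch()      # context 1 now sees its own (empty) copy
reg.write(9)
reg.switch()      # back to context 0, the 5 is still there
```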


RE: Magic's Stoopid CPU - LordDecapo - 02-23-2015

That could work, I'll be on later if you wanna TS and talk about it. I'm curious to hear the new upgrades and more in-depth thinking as to what the changes do.


RE: Magic's Stoopid CPU - Magic :^) - 02-23-2015

I won't be able today, but maybe tomorrow ;)

My instruction queue optimisation would really only be explainable with a diagram though xD


RE: Magic's Stoopid CPU - LordDecapo - 02-24-2015

Kk, diagrams are always helpful xD
I'll be on today after work and I change my dad's serpentine belt on his truck.
When ever ur on we can talk about ur stuffs, got some idea about my stuffs I wanna run by u too.


RE: Magic's Stoopid CPU - Magic :^) - 02-24-2015

cool, i'll be about 4 hours from now... ish

i have a lot of stuff to do today xD

maybe 5.

I also made the context switching registers the way i wanted them. They are the same speed as normal vertical registers, but 4 wide.