10-10-2014, 04:22 PM
(10-08-2014, 06:17 PM)TSO Wrote: I was thinking of data KB, thus powers of two.

The stack is entirely solid state except for a single instant repeater that moves the pop command from the left to the right side.

Metric prefixes:
8 Mb = 1×(1×10^6) B = 1,000,000 B, 8 KB = 8×(8×10^3) b = 64,000 b, 4 Kb = 1×(4×10^3) b = 4,000 b
Binary Prefixes:
8 Mib = 1×(1024^2) B = 1,048,576 B, 8 KiB = (8×8)×(1024^1) b = 65,536 b, 4 Kib = 4×(1024^1) b = 4,096 b
The 1024 comes from 2^10, which is the base that the binary prefixes are powers of, so 8 GiB = 8×((2^10)^3) B = 8,589,934,592 bytes, or 68,719,476,736 bits.
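If it helps, here's a minimal C sketch that just re-does the arithmetic above; nothing in it beyond the multipliers already listed:

[code]
#include <stdio.h>

int main(void) {
    /* Metric (SI) prefixes: powers of 10 */
    unsigned long long kB = 1000ULL;             /* 1 kB = 10^3 bytes  */
    unsigned long long MB = 1000ULL * 1000ULL;   /* 1 MB = 10^6 bytes  */

    /* Binary prefixes: powers of 2 (1024 = 2^10) */
    unsigned long long KiB = 1024ULL;            /* 1 KiB = 2^10 bytes */
    unsigned long long GiB = KiB * KiB * KiB;    /* 1 GiB = 2^30 bytes */

    printf("8 Mb  = %llu B\n", 8 * MB / 8);      /* 1,000,000 B        */
    printf("8 KB  = %llu b\n", 8 * kB * 8);      /* 64,000 b           */
    printf("8 GiB = %llu B = %llu b\n",
           8 * GiB, 8 * GiB * 8);                /* 8,589,934,592 B    */
    return 0;
}
[/code]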
The binary prefixes were loosely based on the French metric system, which later became the SI system of units.
(10-09-2014, 05:16 AM)TSO Wrote: Sorry, I meant mostly how to generate the conditions (flags, I guess) in the same cycle as the operation, then I started wondering how many of these conditions I need... then I asked my father for help. He programmed back in the 90's when computers had no memory and actual program speed mattered because computers were still so slow (back when programmers actually had to be good at programming.)
What I learned was this: all your possible conditions are computed at the same time and given a true/false value. These values then go to a single register the same size as all the possible conditions (between eight and sixteen); you then use a bit mask to get the conditions you want out of this register (you can actually call more than one flag at the same time, then operate a condition on those two). Your goal is to never jump, EVER, especially more than one or two lines. Long goto jumps take longer, are harder to understand in debugging, and are a sign that the programmer does not understand the program being written. If/then/else statements also slow down the system, but are far better than a goto because the compiler will place the destination code for the if statement near the line the condition is on. The compiler will preload the program data line by line regardless of condition destinations; if an if/then/else occurs, a path is chosen, all code for the other path is deleted, and then all the subsequent program data for the chosen path is loaded. The nearer these lines are to the jump condition, the less gets loaded and unloaded. The amount of queued data that gets deleted depends upon how large the jump is, with a goto being the worst. Nearly all data is dumped and reloaded and all stages of the CPU are completely cleared in a goto because gotos inherently do not jump near the line their condition is on. And if you pipelined and it jumps conditionally, you have to somehow rid yourself of all the crap in the pipeline that is no longer correct because it was part of the other conditional side.
Luckily, my design inherently helps with this, but it will still suck to have to think about it.
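To make that concrete, here's a rough C sketch of the flags-register-plus-bit-mask idea from the quote above. The flag names and bit positions are made up for illustration, not any particular CPU's layout:

[code]
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits, all filled in by the same operation. */
enum {
    FLAG_ZERO     = 1u << 0,  /* result == 0        */
    FLAG_NEGATIVE = 1u << 1,  /* result is negative */
    FLAG_CARRY    = 1u << 2,  /* unsigned carry out */
    FLAG_OVERFLOW = 1u << 3   /* signed overflow    */
};

/* One 8-bit add that fills the whole flags register in one go. */
static uint8_t add_and_set_flags(int8_t a, int8_t b, int8_t *out)
{
    int16_t wide  = (int16_t)a + (int16_t)b;  /* widened to catch overflow */
    int8_t  r     = (int8_t)wide;
    uint8_t flags = 0;

    if (r == 0)                                    flags |= FLAG_ZERO;
    if (r < 0)                                     flags |= FLAG_NEGATIVE;
    if ((unsigned)(uint8_t)a + (uint8_t)b > 0xFFu) flags |= FLAG_CARRY;
    if (wide > INT8_MAX || wide < INT8_MIN)        flags |= FLAG_OVERFLOW;

    *out = r;
    return flags;
}

int main(void)
{
    int8_t  r;
    uint8_t flags = add_and_set_flags(100, 30, &r);

    /* Mask out just the conditions you care about -- a single mask
     * can test several flags at once. */
    if (flags & (FLAG_OVERFLOW | FLAG_NEGATIVE))
        printf("signed add overflowed or went negative (r = %d)\n", r);
    return 0;
}
[/code]

Whether the hardware really computes all of those in the same cycle is a hardware detail; the point is just that one register plus a mask covers several conditions at once.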
Computers back in the '90s did have registers if they were GPR machines. They weren't used too often, though, because processors at the time were slower than memory, so having registers didn't improve performance much. They also added to the instruction size, so they were deemed unnecessary at the time because of the extra memory requirements for the IS.
Two other popular PE types were stack based and accumulator based. These two were popular for nearly the same reasons; the positives and negatives of each are about the same, and their pros and cons are largely the opposite of those of GPR machine PEs. There were also other types of PEs that other systems used, but the three most popular and main ones were the stack, accumulator, and GPR based PEs.
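To make the contrast concrete, here's a rough sketch in C. The instruction sequences in the comments are generic illustrations (not any real ISA), and the little function below just mimics how a stack PE would evaluate the same expression:

[code]
#include <stdio.h>

/* The same expression, d = (a + b) * c, on the three PE styles.
 *
 * Stack machine:        PUSH a; PUSH b; ADD; PUSH c; MUL; POP d
 *   (no operand fields -- everything works on the top of the stack)
 * Accumulator machine:  LOAD a; ADD b; MUL c; STORE d
 *   (one implicit operand, the accumulator; one memory operand per op)
 * GPR machine:          LOAD r1,a; LOAD r2,b; ADD r1,r1,r2;
 *                       LOAD r2,c; MUL r1,r1,r2; STORE d,r1
 *   (bigger instructions, but values stay in registers once loaded)
 */

/* A tiny stack-machine flavoured evaluation of the same expression. */
static int eval_stack_style(int a, int b, int c)
{
    int stack[4], sp = 0;
    stack[sp++] = a;                   /* PUSH a */
    stack[sp++] = b;                   /* PUSH b */
    sp--; stack[sp - 1] += stack[sp];  /* ADD    */
    stack[sp++] = c;                   /* PUSH c */
    sp--; stack[sp - 1] *= stack[sp];  /* MUL    */
    return stack[--sp];                /* POP d  */
}

int main(void)
{
    printf("d = %d\n", eval_stack_style(2, 3, 4));  /* (2 + 3) * 4 = 20 */
    return 0;
}
[/code]

The point of the comparison is just the code shape: the stack and accumulator versions need less instruction encoding but touch memory more, which is the trade-off behind the pros and cons mentioned above.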
"all your possible conditions are computed at the same time and given a true/false value."
The idea and direction you're going with this are good; IRL some do this. However, in MC it isn't really a thing, with most things being small enough that they reach the new address and return before the clock, thus giving no performance gain, or being predictively prefetched for pipelined CPUs, which also gives no gain.
It is a good idea though, and in bigger MC builds it may be useful. The idea has gone around a couple of times but has yet to be implemented, so 5 pts. to you if you do it first with at least some gain.
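For what it's worth, one way the "compute everything up front instead of jumping" idea shows up in software IRL is a branchless select; a minimal C sketch (the min function here is just an illustration, not from either post):

[code]
#include <stdio.h>

/* Branchless select: compute the condition as a mask, then pick a value
 * with it, so there is no conditional jump to flush a pipeline over. */
static int select_min(int a, int b)
{
    int take_a = -(a < b);             /* all-ones mask if a < b, else 0 */
    return (a & take_a) | (b & ~take_a);
}

int main(void)
{
    printf("%d\n", select_min(7, 3));  /* prints 3 */
    printf("%d\n", select_min(2, 9));  /* prints 2 */
    return 0;
}
[/code]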
"(back when programmers actually had to be good at programming.)" actual programming is harder now then it's ever been for pro's, the only programmers that aren't good tend to be game devs (sometimes they know some code but they aren't really programmers, they use software instead) and those who can't be asked to learn how to write optimal code, but for those doing it professionally are made to max their hardwars abilitys out and this becomes very hard.
To do this you have to learn verious languages (normaly this would be done with a combination of fortran, C++ and ASM, with GPGPU being utilised heavily.
Generically this style of programming is never learned but is in high demand of, also tends to not run to well on various configurations other then it's intended system config(s).
I'm also unclear as to what you're defining as a compiler, probably correct but sounds almost like you want to use it in active programs instead of compiled before use.
(10-10-2014, 01:48 AM)TSO Wrote: So you would be saving a clock cycle per instruction.
I spoke with him, and yes you do exactly what I described when programming assembly for the 386, with one slight exception. The instruction set does not carry the conditional with it; there is a branching operation, and the CPU uses some kind of hardware look-ahead in order to set the flags in one clock cycle so that the next cycle will pipeline correctly.
Also, when optimizing for speed on an ALU where not every operation takes the same amount of time but multiple simultaneous operations are possible, is it better to put the fast operations close to the CPU and have them be a hell of a lot faster than the slow ones, or put the fast ones farthest away and have it all kinda balance out? For example, my current instruction set, which I have discussed with LD, would allow for a bit shift to occur three ticks after being instructed, repeatable every six ticks, with the ability to run all the other operations at such speeds as well (the CPU can have a three tick clock). The Boolean operators are four ticks out, but also repeatable every six ticks. At the other end, the MOD function is 106 ticks out, so that's like 34 near operations for every far operation.
It doesn't really matter how much you speed up things that are going slow; if they're back before your next clock comes, they're just waiting. So if you can, put other stuff there and let the slow stuff be further out. Again, this is a unique thing with your designs; most don't have this issue, so it might be a little tricky figuring out a method of doing this automatically for the programmer, or for the programmer to know where to do this.
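Putting rough numbers on that, using only the tick figures from the quote above (3-tick clock, ops repeatable every 6 ticks, MOD ready after 106 ticks); the slot counts below are just that arithmetic:

[code]
#include <stdio.h>

int main(void)
{
    /* Figures quoted above: 3-tick clock, a new op issuable every
     * 6 ticks, MOD result back 106 ticks after it was instructed. */
    const int clock_ticks  = 3;
    const int issue_period = 6;
    const int mod_latency  = 106;

    /* The MOD result is this many clock cycles away... */
    int mod_cycles = (mod_latency + clock_ticks - 1) / clock_ticks;
    printf("MOD latency = %d clock cycles\n", mod_cycles);   /* 36 */

    /* ...so roughly this many other ops could be issued while it is
     * in flight, rather than stalling and waiting on it. */
    printf("~%d other ops fit in its shadow\n",
           mod_latency / issue_period);                       /* 17 */
    return 0;
}
[/code]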
(10-10-2014, 05:45 AM)TSO Wrote: That completely defeats the purpose of a forum. I advise you get around to reading it, we've been discussing making the impossible possible, and (at least according to GG) have actually made a small breakthrough in redstone computing (it sounds like I knocked a tick or two off of the fastest theoretical clock rate)
Hi
We've/you've not really been talking about things that were seemingly impossible, albeit you do have some great ideas in general. Also, sorry to say it, but you won't be claiming the performance crown just yet; maybe after a few revisions and optimisations, once you settle in and learn stuff, your combination of ideas with some of our members' might, though (it's not really expected that new members even know half this much about CS, so you're doing good ).
If it's not too much to ask, once you have an IS drawn up may I see it please?