Forums - Open Redstone Engineers
I am TSO - Printable Version



RE: I am TSO - LordDecapo - 10-04-2014

bro, memory is a huge deterrent to speed on a CPU; the access times due to the signal strength will be ridiculous. The best route is to do a pipelined CPU that is AT LEAST a 10 tick clock, and have an IS and architecture that can deter or eliminate any and all pipeline hazards.
This is what is going to happen, you will build this huge thing, the tick count will be so high u will scrap the idea and based on your enthusiasm (Which is great btw, and i encourage it, plz continue your idea and learn, we need more ppl with a go-getter attitude) you will start a new design with a new everything to get it smaller and faster.
You will realize a pipeline is the only way to get a faster clock without doing CRAZY ass shit that limits the ACTUAL power of the CPU, which is how much different shit u can do with the fastest throughput and at the same time encountering the least hazard stalls/branch misprediction penalties.
So i give you this advice: look up some stuff on instruction sets, look up CPU architecture, and just read as much as you can. There are some great resources out there that can get you to the goal you seem to want to achieve. I had the same approach about 10 months ago, and i have gone through like 13 design changes to finally get the first fastish pipelined CISC CPU on the build server: 8 bit data, 8 tick clock, and full serial interface, as well as the ability to support 16bit addressing on 32 different pieces of hardware. LOTS of functionality, and good throughput.
The hazards to watch for are RAR (not really, it's a "false" hazard), RAW, WAW and WAR. For those you will need a decently fast and passive hazard detection system as well as a forwarding (Fwd) system for results to eliminate the RAW hazards, and if you can construct the stages in ur pipeline well enough, the WAW and WAR hazards can take care of themselves.
i am familiar with what i speak, and i will tell you that i can do OOOexe (out-of-order execution) in Logisim and MC, and it isnt worth it in MC, and only worth doing at all if you REALLY REALLY want to. Just try and go with a smart IS and good pipeline stages with a forwarding feature.. all this you will understand as you do research.
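
For readers new to the hazard names above, here is a minimal Python sketch of how a detection unit could classify the dependency between two neighbouring instructions. The `Instr` record and the register names are made up for illustration; a real redstone detector would compare register address lines in hardware rather than run code.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, any number of sources."""
    dest: Optional[str]
    srcs: Tuple[str, ...]


def hazards(first: Instr, second: Instr) -> Set[str]:
    """Return the data hazards that arise when `second` follows `first`."""
    found = set()
    # RAW: second reads a register that first has not finished writing yet.
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")
    # WAR: second writes a register that first still needs to read.
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")
    # WAW: both write the same register, so the write order matters.
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")
    # RAR (both only read the same register) is not a hazard, so it is not reported.
    return found


add = Instr(dest="r1", srcs=("r2", "r3"))
sub = Instr(dest="r4", srcs=("r1", "r5"))
print(hazards(add, sub))
```

A forwarding path only helps with the RAW case: it hands the result of `first` straight to `second` instead of waiting for the register write.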

feel free to apply and ill be happy to share what i know and give you links.... but i think you are underestimating severely what you are trying to accomplish.


RE: I am TSO - TSO - 10-04-2014

Safari crashed on my iPod, which is why this reply took so long, then I went out to dinner, then I spent time doing other stuff.

First off, I'm well aware what an instruction set is, I'm just bad with acronyms. I have never referred to an instruction set as an, "IS," so when you said that, I thought there was an IS component unique to redstone computers or a component I gave a different name to.
I know CISC is a blanket term (I actually don't know what defines the difference between RISC and CISC). In this case it means, "I don't actually quite have it worked out, but I have 44 operations I need this thing to perform, which is probably a lot more than any other minecraft computer, so I'll just call it CISC." The actual opcodes aren't going to be created until after the computer is built, so that I can reverse engineer the whole system and have the codes mirror the processes that occur in the system, thus reducing the number of decoders. (I already know one typically does this the other way, but my mind just doesn't quite work like that.) All I know is the first two bits of the opcodes, which will tell the CPU where to ship the information.

Memory space greater than 16 bits can easily be addressed using 16 bits and the secret sauce. Pointers are possible to write, but aren't needed, and not all the memory is registers or cache. There is a hierarchy that allows for addressing more than the bit space available to the word size. Although, I don't have a solution for fast registers yet...

Memory should only be around 19 ticks max to read or write, with nearer memory being faster. Special sauce is used here to fix that speed issue, and instant wire will fix bussing delays.

Now, I have discovered that memory actually does not induce lag, which makes sense if you think about it for more than two seconds and don't assume that anything you build with redstone automatically makes lag happen. The following things cause lag: sounds, light updates, block updates, complex rendering. If memory is really far away, you won't see or hear it, and there are no lighting updates because it isn't in render distance. The only time the blocks update are when it is being written to or read, in which case, only a very small part is actually updating. For memory in render distance: don't look at it. I'll make some images that show various frame rates for using it, looking at it, and looking away from it while standing near it. (Funny thing, if you stand inside it, you get a better fps than if you stand outside and look at it.)

The von Neumann architecture is why it needs so much memory, and more secret sauce makes sure the computer has the program data as it's needed.

The ALU uses some basic calculus to approximate large answers at a speed greater than what can be accomplished by other types of software implementations. (Mostly for division and any root, maybe real or complex)

The entire system could use two's complement floating point encoding for large integers, or it will use standard binary if told to.

And most importantly: more than one computer can use the memory without any issues. This is where the secret sauce really gets lathered on there.

I'm willing to bet a few of you already figured out what the secret sauce is made of, and the only reason it is even there is so that the computer can be programmed in a language other than machine code (like c or maybe just assembly), but I discovered that this single piece would allow for all of these other magical properties to suddenly be available because it would make the CPU itself run much faster because it only really has to do one thing now: move memory between registers or cache. This is a five(ish) tick process.

There was something else I wanted to mention, but I have forgotten it now. No worries, though, you people will point out that I am an idiot because I forgot to mention that tiny detail.

I do suppose pipelining would be an option, but it would be more applicable toward the ALU than anything else. And this is just the first iteration anyway.

@LD, the only thing there I've never heard of is OOOexe.


RE: I am TSO - greatgamer34 - 10-04-2014

Quote:but I have 44 operations I need this thing to perform, which is probably a lot more than any other minecraft computer, so I'll just call it CISC." The actual opcodes aren't going to be created until after the computer is built,
wat

Quote:The ALU uses some basic calculus to approximate large answers at a speed greater than what can be accomplished by other types of software implementations. (Mostly for division and any root, maybe real or complex)
good luck with calculus on a 16 bit alu

you never addressed Lords questions about pipelines xD

but saying that this CPU will work before building it is just as ridiculous as claiming i have a Bugatti Veyron and no pictures of it..


RE: I am TSO - TSO - 10-04-2014

You do realize, as I said earlier, that I don't actually have the CPU yet, right? All I have is a memory cube, which took three days because I am not a fast builder. In fact, I only have half of it. So when I say the CPU will work, it's because I don't even have a design yet that can be proven to not work, and we know CPU's exist that do work, so we can say it is possible for me to have a CPU that works in our hypothetical situation. Whether or not I designed it myself is irrelevant to the current conditions of the hypothetical.

Most of the instruction set actually just passes through the CPU, being routed to its destination without any decoding at all.

As for your question about coding, I know what I need it to do, like add, subtract, fetch some memory, etc., but I don't actually have the opcodes for the functions. Even then, I know the first two bits for every opcode because they are the destination being considered.

Calculus is just addition, include multiplication if you don't want it to take as long. Mostly I'm looking at Newton's method when I think of that.
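
To make the Newton's method remark concrete, here is a minimal Python sketch of a Newton-Raphson reciprocal that uses only multiplication, subtraction and shifts, the operations a simple ALU already has. The 16 fractional bits, the function names, the starting-guess constants and the requirement that the divisor is pre-scaled into [0.5, 1.0) are assumptions of this sketch, not details of the design discussed in the thread.

```python
FRAC_BITS = 16            # assumed fixed-point format: 16 fractional bits
ONE = 1 << FRAC_BITS      # fixed-point representation of 1.0

# Standard linear starting guess 48/17 - (32/17) * d for d in [0.5, 1.0).
GUESS_A = int(48 / 17 * ONE)
GUESS_B = int(32 / 17 * ONE)


def to_fx(value):
    """Convert a Python float to fixed point (only for setting up the example)."""
    return int(round(value * ONE))


def fx_mul(a, b):
    """Fixed-point multiply: integer multiply followed by a right shift."""
    return (a * b) >> FRAC_BITS


def reciprocal(d_fx, iterations=4):
    """Approximate 1/d for d in [0.5, 1.0) using x <- x * (2 - d * x)."""
    if not ONE // 2 <= d_fx < ONE:
        raise ValueError("divisor must be scaled into [0.5, 1.0) first")
    x = GUESS_A - fx_mul(GUESS_B, d_fx)
    for _ in range(iterations):
        # Each pass roughly doubles the number of correct bits.
        x = fx_mul(x, 2 * ONE - fx_mul(d_fx, x))
    return x


def divide(a_fx, d_fx):
    """Divide by multiplying with the approximated reciprocal."""
    return fx_mul(a_fx, reciprocal(d_fx))


print(reciprocal(to_fx(0.75)) / ONE)
```

Every iteration costs two multiplies and one subtract, so how fast this runs depends almost entirely on how fast the multiplier is.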

I wrote most of that post before I saw LD's question about pipelines, so all I really wrote to it was those quick last two lines.

Upon consideration, I could venture to say that if you stood back far enough from it and squinted real hard, you would see something that sort of looked like a pipelined CPU.

Also, just in case you still don't know the chunk loading thing I mentioned, I was talking about the spawn chunks. (Also, I had to look up what a command block was because I have never heard of them.)


RE: I am TSO - VoltzLive - 10-04-2014

I am now completely sure you have not a clue what you're getting yourself into.
What I believe is you watched some redstone videos and read up a bunch on Wikipedia and such, but other than that I believe you haven't a snowball's chance in hell of getting anything like what you say. The most advanced redstoners in the world are only beginning to tread on some of the ideas that you are discussing, people that have been working towards these problems for years now.. I expect this project to go almost nowhere as you are a complete beginner with no reference to what is feasible within redstone.


RE: I am TSO - LordDecapo - 10-04-2014

NOTE: I am not calling you an idiot, and I am not telling you your design wont work. I am telling you from experience that your plans so far sound like it will be a very slow computer, and unless you do pipelining you are looking at a clock no faster than 40 or so ticks, whilst current Redstone CPUs avg in the 20's.
My own CPU is an 8 tick clock, 8 stage pipelined, 16/32/48bit CISC IS, with full serial interface, hardware stacks, and 105 total different arrangements of branch type/destination.
I started with the same enthusiastic intentions as you, and it will work out for the better, but u will need to do some homework first and listen to others in this community for advice on what to do to make it faster.
Read my responses below with that in mind. At the end of this post I will attach a .rar file with a bunch of PDFs and PPTs I have on my laptop; if you would like more I can get them for you, and I know of a myriad of college lectures on Comp Sci. topics that can help you immensely.
Also I will include a Logisim version of my CPU so you know im not blowing smoke up your ass.
Feel free to ask me any questions.
and i hope you continue ur GO-GETTER attitude, i would love to have someone else to talk about advanced control units with on the server xD


(10-04-2014, 03:28 AM)TSO Wrote: First off, I'm well aware what an instruction set is, I'm just bad with acronyms. I have never referred to an instruction set as an, "IS," so when you said that, I thought there was an IS component unique to redstone computers or a component I gave a different name to.
IS is what almost all college-level or more advanced material uses to refer to an Instruction Set; after doing a bit more research you will see IS a lot.
Another one that may help is Inst. which is just short for Instruction.


(10-04-2014, 03:28 AM)TSO Wrote: I have 44 operations I need this thing to perform, which is probably a lot more than any other minecraft computer, so I'll just call it CISC. The actual opcodes aren't going to be created until after the computer is built, so that I can reverse engineer the whole system and have the codes mirror the processes that occur in the system, thus reducing the number of decoders. (I already know one typically does this the other way, but my mind just doesn't quite work like that.) All I know is the first two bits of the opcodes, which will tell the CPU where to ship the information.
44 is A LOT, unless you have a HUGE IS that is just all your control lines bussed to one Program Memory bank. I highly recommend either finding someone's IS you like a lot on the server and then making yours based on their layout (only at first, you can change it 100%, but it will give you a great place to start), or throwing together a quick one to make sure you have some of your CPU's functions predetermined. It sucks adding a bunch of features to a CPU and realizing you only needed like 1/3 of them to actually do what all your IS needs.
An IS also helps define a lot of basic aspects of your CPU, making it almost a road map for how to build it without ever touching hardware.


(10-04-2014, 03:28 AM)TSO Wrote: Memory space greater than 16 bits can easily be addressed using 16 bits and the secret sauce. Pointers are possible to write, but aren't needed, and not all the memory is registers or cache. There is a hierarchy that allows for addressing more than the bit space available to the word size. Although, I don't have a solution for fast registers yet...
how are you going to use more than 16bit addressing on a 16bit system? And be aware, you can make a CPU with as much memory as you want, but are you ever going to make a program that ACTUALLY uses it? Or could you program a tad bit better and have the program use your memory more wisely.
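
For context on the question above: one well-known way to reach more memory than a 16-bit address can name is bank switching, where a separate bank register supplies the upper address bits. The Python sketch below only illustrates that general idea; the class and method names are invented here, and nothing in the thread confirms that this is what the "secret sauce" actually is.

```python
class BankedMemory:
    """Sparse memory addressed by a bank register plus a 16-bit offset."""

    OFFSET_BITS = 16
    OFFSET_MASK = (1 << OFFSET_BITS) - 1

    def __init__(self, bank_count):
        self.bank_count = bank_count
        self.bank = 0      # current value of the bank register
        self.cells = {}    # physical address -> stored 16-bit word

    def select_bank(self, bank):
        """Change which 64K window the 16-bit offsets refer to."""
        if not 0 <= bank < self.bank_count:
            raise ValueError("bank out of range")
        self.bank = bank

    def physical(self, offset):
        """Combine the bank register with the 16-bit offset."""
        return (self.bank << self.OFFSET_BITS) | (offset & self.OFFSET_MASK)

    def write(self, offset, value):
        self.cells[self.physical(offset)] = value & 0xFFFF

    def read(self, offset):
        # Unwritten cells read as 0 in this sketch.
        return self.cells.get(self.physical(offset), 0)


mem = BankedMemory(bank_count=4)
mem.write(0x1234, 7)     # lands in bank 0
mem.select_bank(3)
mem.write(0x1234, 9)     # same offset, different bank
print(hex(mem.physical(0x1234)))
```

The cost is that every access outside the current bank needs an extra instruction to change the bank register, which is part of why the reply asks whether a program would ever really use that much memory.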


(10-04-2014, 03:28 AM)TSO Wrote: instant wire will fix bussing delays.
The von Neumann architecture is why it needs so much memory, and more secret sauce makes sure the computer has the program data as it's needed.
NO! instant wire bussing is a VERY VERY bad idea. I have seen many ppl (about half of which were long time MC CPU veterans) try and use it as one of those "cure all" solutions to bussing. The timing will be hell, and you will end up wanting to pull ur hair out.
Also I personally dont ever use pistons.. in any CPU... ever... I hate them and their half tick nonsense and BUD glitches.. a pistonless CPU (also called SS, Solid State) is much more predictable.
But if you like pistons and you want to use them, have at it :D just saying I personally dont like them at all.


(10-04-2014, 03:28 AM)TSO Wrote: The ALU uses some basic calculus
“basic” is a very HARD word here...
Cause calculus isnt something that is normally done at a MC CPU level, due to the massive amount of ticks it will add to your CPU, and if ur only using a basic ALU to do these, there will be a lot of time that it will be clogged performing the "basic" task of your calculus.



(10-04-2014, 03:28 AM)TSO Wrote: The entire system could use two's complement floating point encoding for large integers, or it will use standard binary if told to.
2's comp is mainly for negative numbers, and floating point is for decimal points (better known as the radix point in binary), and you will need a good barrel shifter to be able to do FP well. I have a good design im working on if you want to use it.
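
As a quick illustration of the two's complement half of that reply, here is a minimal Python sketch of encoding and decoding signed values in a 16-bit word. The 16-bit width matches the word size discussed in the thread; the function names are just for this example.

```python
WIDTH = 16
MASK = (1 << WIDTH) - 1        # 0xFFFF
SIGN_BIT = 1 << (WIDTH - 1)    # 0x8000


def to_twos_complement(value):
    """Encode a signed integer as a 16-bit two's complement bit pattern."""
    if not -SIGN_BIT <= value < SIGN_BIT:
        raise ValueError("value does not fit in 16 bits")
    return value & MASK


def from_twos_complement(bits):
    """Decode a 16-bit pattern back into a signed integer."""
    bits &= MASK
    # A set top bit means the pattern stands for bits - 2**16.
    return bits - (1 << WIDTH) if bits & SIGN_BIT else bits


def add16(a_bits, b_bits):
    """The same adder works for signed and unsigned words; only the carry out is dropped."""
    return (a_bits + b_bits) & MASK


total = add16(to_twos_complement(-5), to_twos_complement(12))
print(from_twos_complement(total))
```

This is the reason two's complement is popular in hardware: subtraction and negative numbers reuse the ordinary adder. Floating point is a separate layer on top, where the barrel shifter lines up the radix points before the add.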

(10-04-2014, 03:28 AM)TSO Wrote: And most importantly: more than one computer can use the memory without any issues. This is where the secret sauce really gets lathered on there.
2 CPUs with one memory is a bad idea; only do it if you have a system to manage memory that is independent from the CPUs themselves, otherwise you will have addressing conflicts and data hazards, as well as architectural (structural) hazards (more than 1 data packet wanting to go down the same bus line at the same time, and other stuff)


(10-04-2014, 03:28 AM)TSO Wrote: I'm willing to bet a few of you already figured out what the secret sauce is made of, and the only reason it is even there is so that the computer can be programmed in a language other than machine code (like c or maybe just assembly), but I discovered that this single piece would allow for all of these other magical properties to suddenly be available because it would make the CPU itself run much faster because it only really has to do one thing now: move memory between registers or cache. This is a five(ish) tick process.
They may have but I have not figured this one out, but please tell me xD im curious.. and the way to make a CPU be able to be programmed in a higher lang, is by making a compiler and to have a GOOD IS that is small, yet feature rich, so you can easily port the coding in and out of the CPU's machine code.

(10-04-2014, 03:28 AM)TSO Wrote: I do suppose pipelining would be an option, but it would be more applicable toward the ALU than anything else. And this is just the first iteration anyway.
just saying the ALU is like the ONLY place you do not pipeline like ever, I mean you can yes, but due to speed limitations in MC, it doesnt give you any speed advantage, and to pipeline an ALU, you have to have a good hazard detection system and possibly OOOExe,, considering the delay of an OOOexe system,,, unless you make a milestone discovery it wont be beneficial to your clock speed/throughput.


(10-04-2014, 03:28 AM)TSO Wrote: @LD, the only thing there I've never heard of is OOOexe.
well dont worry, that is annoying as hell, I believe im the only one on the server that has made something using OOOexe, (the OOOexe stuff is in the Logisim CPU file I have linked in the bottom, however it isnt being used in the CPU at all, and it has some bugs that need to be fixed at some point if I ever decide to actually use it,,, but a Belt Architecture is better for consistency and such so I will be going that route)


(10-04-2014, 06:39 AM)TSO Wrote: Most of the instruction set actually just passes through the CPU, being routed to it's destination without any decoding at all.
well yes, all inst. go through the CPU xD that is what they are instructing :P and no.. u have to decode your ops unless you want something so big that the straight control line programming will end up adding more delay than a simple decoder would have, and be about 5x the space


(10-04-2014, 06:39 AM)TSO Wrote: Upon consideration, I could venture to say that if you stood back far enough from it and squinted real hard, you would see something that sort of looked like a pipelined CPU.
doesnt work like that, either you are pipelined or you're not,
the best way I would suggest you to get into pipelining is to do a 2 maybe 3 stage at first, and if you would like to do that, I can let you know the best way to go about that.


(10-04-2014, 06:39 AM)TSO Wrote: Also, just in case you still don't know the chunk loading thing I mentioned, I was talking about the spawn chunks. (Also, I had to look up what a command block was because I have never heard of them.)
this could work, but a CPU is RARELY so big that it even comes close to going outside render distance, and nothing outside the render distance (besides spawn chunks) even registers when redstone is activated. Well, it does SOMETIMES, but it's buggy enough that you can not use the system for a CPU as it will corrupt data


Links as I promised:
Logisim version of CPU
https://www.dropbox.com/s/df3e2wi82kftgty/IizRx16-RCr1.6b4.rar?dl=0
This will have the CPU itself, the Instruction set and the 16/32/48bit framework exampled at the bottom of the IS, it also has the Logisim Program itself so u dont need anything else, and it has programs you can upload into the PROM in the CPU circuit, it is a work in progress and I am making a Belt version of the same IS shortly, after I finish porting that current version to MC (dropping data width from 16 to 8bit)
ask one of us if you are unsure how to use logisim

PDF and PPT resources
https://www.dropbox.com/s/j8ocf6aezj9uiup/CPU%20Resources.rar?dl=0
these are not in any particular order and the naming system in them is dumb as hell (just left their names as they downloaded), some are super advanced, some are mild, and some are simpler.. ull just have to go through them. let me know if you want stuff on a specific topic, and I can get you some great stuff on it.

As I said, im here to help, not to discourage

LAST NOTE: Bussing bro BUSSING. One of the most important things to keep in mind is "how do I minimize bussing time". When making a CPU, you can have the fastest parts, but if your data loop has like 8 ticks of bussing delay, those are all wasted ticks that could either be eliminated or be put to better use, like doing processing. So learn how to stack ur parts as close as u can.
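
To put rough numbers on the clock speeds and bussing delays mentioned in this post, here is a small Python sketch. It assumes "tick" means a redstone tick of 0.1 seconds (10 per second); pairing 8 ticks of bussing with a 20 tick clock is an example picked for this sketch, not a measurement from any CPU in the thread.

```python
REDSTONE_TICKS_PER_SECOND = 10  # assumption: one redstone tick lasts 0.1 s


def clock_hz(clock_ticks):
    """Clock cycles per second for a clock period given in redstone ticks."""
    return REDSTONE_TICKS_PER_SECOND / clock_ticks


def bussing_share(clock_ticks, bussing_ticks):
    """Fraction of each clock period spent moving data instead of processing it."""
    if bussing_ticks > clock_ticks:
        raise ValueError("bussing delay cannot exceed the clock period")
    return bussing_ticks / clock_ticks


for ticks in (40, 20, 8):
    print(ticks, "tick clock:", clock_hz(ticks), "Hz")

print("share lost to bussing:", bussing_share(20, 8))
```

On these assumptions, 8 ticks of bussing inside a 20 tick clock means 40% of every cycle does no processing, which is why stacking parts close together pays off directly in clock speed.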


RE: I am TSO - greatgamer34 - 10-04-2014

well lord said more than i would have taken the time to...


RE: I am TSO - jxu - 10-04-2014

(10-04-2014, 01:54 AM)LordDecapo Wrote: The best route is to do a pipelined CPU that is AT LEAST a 10 tick clock

10 ticks is really pushing it in terms of actual functionality and memory access compared to some benchmark tests


RE: I am TSO - LordDecapo - 10-04-2014

yes, i meant that as like more than 10, doing anything under 11 is hard if u haven't done pipelining before


RE: I am TSO - Magazorb - 10-04-2014

TSO i've no idea who you are but i love you, you're just that bit crazy enough to do new things and that's cool :D, but you've got lots to learn.

It seemed you wanted to be proven wrong and so now i shall help. Your talk on the memory not causing lag for the reasons you said is somewhat correct; those things you said are the main causes of "lag", but you didn't notice how the server keeps track of all of those states regardless of whether it's a chunk a player is in or not. Also, each time the redstone changes MC makes a write to HDD (because mojang derp), which of course takes time. But say you placed glowstone in all the otherwise dead space, it may reduce the lighting updates a little.

Now to move on to the instant wire busing you wanted: there's an update limit in MC iirc. Implementing instant wire has proven unreliable in the past, and those attempts were on a relatively small scale; putting it into the scale you have would most definitely drop values and thus cause corrupt data.

Those were just some facts. i get the sense that you're trying to initiate multicores; unfortunately in MC they tend not to work for the fact we don't have enough load, and for the few things where multicore could be useful we don't have the memory bandwidth to match the requirements, hence why most CPUs are only a single SISD PE

Furthermore Decapo does have a rather powerful SISD PE that nears what is theoretically the best possible. Beyond which, if you want to talk about raw processing power, i'm still drafting methods of a SIMD or MSIMD PE array, with which i target to achieve a minimum of 26.6 recurring (26+(2/3)) computations a second on average.

Currently what you have to contend with are PEs that can decode instructions, fetch, compute, writeback and branch with a 10 tick clock; some are even quicker than that.

I like some of the ideas you have and you seem like a nice enough guy, i hope you shall be proud of your creation however it turns out, as few attempt to make such large scale stuff.

though i have to now make it clear why i chose to go against you so strongly:

(10-04-2014, 06:39 AM)TSO Wrote: So when I say the CPU will work, it's because I don't even have a design yet that can be proven to not work, and we know CPU's exist that do work, so we can say it is possible for me to have a CPU that works in our hypothetical situation. Whether or not I designed it myself is irrelevant to the current conditions of the hypothetical.

We call this ignorance in my country, this argument is what most religious people would say to "prove" that their god(s) or equal is real.
But coming back to reality, no evidence of existence isn't evidence of lack of existence, and neither does it work in the other direction. In your case: lacking the evidence to say your idea doesn't work doesn't mean it does work.

You should try to be a little more scientific about things you build :P
also i define power as P=IV.

It would be nice to see what you come up with as you seem to have a good few ideas and would also like to see you around a bit, best of luck.