As a bonus, any operation can be replaced with a lookup into an n×n table, where n is the number of representable values (16×16 = 256 entries for a 4-bit float).
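A minimal sketch of the table-lookup idea, assuming the 4-bit layout (1 sign bit, 2 exponent bits, 1 mantissa bit) discussed elsewhere in this thread; `decode_fp4` is a hypothetical helper, not any standard API:

```python
import math

def decode_fp4(bits: int) -> float:
    # Assumed layout: 1 sign bit, 2 exponent bits (bias 1), 1 mantissa bit.
    s = (bits >> 3) & 1
    e = (bits >> 1) & 3
    m = bits & 1
    sign = -1.0 if s else 1.0
    if e == 0:                 # zero and the single subnormal
        return sign * m * 0.5
    if e == 3:                 # reserved exponent: infinity / NaN
        return sign * math.inf if m == 0 else math.nan
    return sign * (1 + m / 2) * 2.0 ** (e - 1)  # normals

# 16 x 16 = 256 entries: every possible product, computed once up front.
MUL = [[decode_fp4(a) * decode_fp4(b) for b in range(16)] for a in range(16)]

def fp4_mul(a: int, b: int) -> float:
    return MUL[a][b]           # "multiplication" is now one table read
```

With 4-bit operands the entire table is only 256 entries, which is why exhaustive lookup tables become practical only at very small widths.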
This was true only for cheap computers, typically after the mid sixties.
Most of the earliest computers with vacuum tubes used longer floating-point number formats, e.g. 48-bit, 60-bit or even weird sizes like 57-bit.
The 32-bit size has never been acceptable in scientific computing with complex computations where rounding errors accumulate. The early computers with floating-point hardware were oriented to scientific/technical computing, so bigger number sizes were preferred. The computers oriented to business applications usually preferred fixed-point numbers.
The IBM System/360 family definitively imposed the 32-bit single-precision and 64-bit double-precision sizes. 32-bit is adequate for input and output data, and it can be sufficient for intermediate values when the input data passes through only a few computations; otherwise, double precision must be used.
> The smallest possible float size that follows all IEEE principles, including normalized numbers, subnormal numbers, signed zero, signed infinity, and multiple NaN values, is a 4-bit float with 1-bit sign, 2-bit exponent, and 1-bit mantissa.
Someone didn't try it on GPU...
Shouldn't that be m mantissa bits (not y) -- i.e. typo here -- or am I misunderstanding something?
I think Cray doubles were 128 bits, and their singles were 64… which makes it seem like smaller floats are just a continuation of the eternal trend.
But what I wish is that there had been fp64 encoding with a field for number of significant digits.
strtod() would encode this, fresh out of an instrument reading (serial). It would be passed along. It would be useful EVEN if it weren't updated by arithmetic with other such numbers.
Every day I get a query like "why does the datum have so many decimal digits? You can't possibly be saying that the instrument is that precise!"
Well, it's because of sprintf(buf, "%.16g", x) as the default to CYA.
Also sad is the complaint about "0.56000 ... 01" because someone did sprintf(buf, "%.16f", x).
I can't fix this in one class -- data travels between too many languages and communication buffers.
In short, I wish I had an fp64 double where the last 4 bits were ALWAYS left alone by the CPU.
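That wish can be approximated today purely as a software convention (the CPU will not leave those bits alone through arithmetic, which is exactly the limitation being lamented). A sketch that stashes a significant-digit count in the 4 low mantissa bits of a double; `tag_sigfigs` and `read_sigfigs` are hypothetical names:

```python
import struct

def tag_sigfigs(x: float, digits: int) -> float:
    """Overwrite the 4 least significant mantissa bits with `digits` (0-15).
    Perturbs the value by at most 15 ulps."""
    bits = struct.unpack('<Q', struct.pack('<d', x))[0]
    bits = (bits & ~0xF) | (digits & 0xF)
    return struct.unpack('<d', struct.pack('<Q', bits))[0]

def read_sigfigs(x: float) -> int:
    """Recover the digit count from the low mantissa bits."""
    return struct.unpack('<Q', struct.pack('<d', x))[0] & 0xF

x = tag_sigfigs(0.56, 3)              # instrument reported 3 significant digits
print(f"%.{read_sigfigs(x)}g" % x)    # prints 0.56, not 0.56000...01
```

The tag survives serialization and copying, but any arithmetic clobbers it, so it is only useful for pass-through data, which is precisely the "fresh out of an instrument" case above.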
It seems that life is imitating art.
I thought in ancient times, floating point numbers used to be 80 bit. They lived in a funky mini stack on the coprocessor (x87). Then one day, somebody came along and standardized those 32 and 64 bit floats we still have today.
S:E:l:M
S = sign bit present (or magnitude-only absolute value)
E = exponent bits (typically biased by 2^(E-1) - 1)
l = explicit leading integer present (almost always 0 because the leading digit is always 1 for normals, 0 for denormals, and not very useful for special values)
M = mantissa (fraction) bits
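The S:E:M layout above can be decoded mechanically for any bit widths. A sketch assuming the usual IEEE conventions described in this thread (bias 2^(E-1) - 1, implicit leading 1 for normals, all-ones exponent reserved for specials):

```python
import math

def decode(bits: int, e_bits: int, m_bits: int) -> float:
    """Decode an S:E:M bit pattern with IEEE-style conventions."""
    m_mask = (1 << m_bits) - 1
    e_mask = (1 << e_bits) - 1
    m = bits & m_mask
    e = (bits >> m_bits) & e_mask
    s = (bits >> (m_bits + e_bits)) & 1
    sign = -1.0 if s else 1.0
    bias = (1 << (e_bits - 1)) - 1
    if e == e_mask:                              # all-ones: Inf / NaN
        return sign * math.inf if m == 0 else math.nan
    if e == 0:                                   # subnormal: no implicit 1
        return sign * (m / (1 << m_bits)) * 2.0 ** (1 - bias)
    return sign * (1 + m / (1 << m_bits)) * 2.0 ** (e - bias)
```

The same routine handles binary16 (`decode(0x3C00, 5, 10)` gives 1.0), binary32, and the 4-bit format (`decode(0b0011, 2, 1)` gives 1.5), which shows how little actually changes between widths.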
The limitations of FP4 are that it lacks infinities, [sq]NaNs, and denormals, which makes it very limited and suited to special purposes only. There's no denying, though, that it can be extremely efficient for very particular problems.
If a more even distribution were needed, a simpler fixed-point format like 1:2:1 (sign:integer:fraction bits) is possible.
00 -> 0.0
01 -> 1.0
10 -> Inf
11 -> NaN
or:
00 -> 0.0
01 -> 1.0
10 -> Inf
11 -> -Inf

Or does that matter - it's the kernel that handles the FP format?