- Has there been much exploration of how much benefit comes from precision in the activation functions in KANs? There's a little niggle in the back of my head that maybe 90% of the benefit of KANs can be gained from quite a small variety of function shapes. Combined with input weighting, I almost feel you could have a representation that scales from a standard ReLU perceptron through KANs to something with weighted inputs and fancy weighted activation functions.
Mark that out in 2D, with axes of input weight precision and activation weight precision, and you could perhaps do sweeps to find the best accuracy per parameter bit, or accuracy/speed, or some sweet spot that has a nice balance of operating speed, accuracy, and model size.
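A rough illustration of what such a 2D sweep could look like. This is only a toy stand-in, not anything from the paper: it takes a single 1-D function (`tanh`, chosen arbitrarily), quantizes its input and output to a given number of bits as if it were stored in a lookup table, and records the error for every pair of bit widths. The names `quantize`, `lut_error`, `table_bits` and `sweep` are made up for this sketch, and a real experiment would sweep a trained network rather than one function.

```python
import math


def quantize(x, bits, lo, hi):
    """Clamp x to [lo, hi] and round it to one of 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    x = min(max(x, lo), hi)
    step = (hi - lo) / levels
    return lo + round((x - lo) / step) * step


def lut_error(fn, in_bits, out_bits, lo=-1.0, hi=1.0, samples=1001):
    """Mean absolute error of fn when its input and output are both quantized.

    Assumption: the output of fn also lies in [lo, hi] (true for tanh on [-1, 1]).
    """
    total = 0.0
    for i in range(samples):
        x = lo + (hi - lo) * i / (samples - 1)
        approx = quantize(fn(quantize(x, in_bits, lo, hi)), out_bits, lo, hi)
        total += abs(fn(x) - approx)
    return total / samples


def table_bits(in_bits, out_bits):
    """Storage for a plain lookup table: one out_bits-wide entry per input code."""
    return (2 ** in_bits) * out_bits


def sweep(fn, bit_range=range(1, 9)):
    """Error for every (input bits, output bits) pair in the grid."""
    return {(i, o): lut_error(fn, i, o) for i in bit_range for o in bit_range}


results = sweep(math.tanh)

# Example query: the cheapest table whose error stays under a hypothetical 0.02 target.
good = [(table_bits(i, o), i, o) for (i, o), err in results.items() if err < 0.02]
cheapest = min(good) if good else None
```

Plotting `results` as a heatmap would give the 2D picture described above; swapping the error metric for task accuracy and `table_bits` for a measured latency or LUT count would turn it into the accuracy/speed/size trade-off.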
by mikeayles
3 subcomments
- So for people wondering if it can be used to accelerate LLM inference, sadly not.
I've been trying to hit 100,000 tokens/s with a 3.28M dumb model, and even this is an order of magnitude too large to benefit.
It appears to be focussed more on latency than throughput. Happy to be corrected?
by RantyDave
3 subcomments
- Right. But ... this would limit you to either extremely small models or extremely large FPGAs, yes? If there's a simple machine learning task that requires sub-microsecond latency I can see the point, but otherwise??
by jeffreysmith
1 subcomments
- Super cool work. I love seeing this direction taken all the way to hardware.
I'm a big fan of KANs. They really seem like the start of something big and new. We've got a couple of papers out and in the works on KANs. The most relevant to OP's is this one: https://arxiv.org/abs/2512.15742v2
And we just put up a general primer on KANs on YT: https://youtu.be/wgcSsJ69x1c?si=fiUl1YGTgaTt_bn9 Fun stuff if you want to get into the weeds.
And if you are really interested in KANs, you should check out the blog of Ziming (the KAN creator): https://kindxiaoming.github.io/blog/
by Cadwhisker
0 subcomment
- If you want to experiment with KANs yourself in a non-FPGA environment, there's a GitHub repo here: https://github.com/KindXiaoming/pykan
HN comments page on that is here: https://news.ycombinator.com/item?id=40219205
by scivizlabvienna
2 subcomments
- I am using an almost identical architecture, a combination of LUT-NN and BitNet, on an upcoming fungal network interface, which is basically just a metal pole rammed into the forest floor with electrodes at the bottom, an FPGA LUT-NN in between, and a LoRa transceiver at the top. Thank you for this paper; it will make pitching the concept a lot easier with this as a reference :*
- Happy to hear that KANs continue to find solid footing.
by potato-peeler
1 subcomments
- Bit off topic, but I have always wondered how it is decided whose name comes first in a paper. You mentioned you and Duc Hoang having equal contribution, so how did you both decide this? Was it that person's idea first, or were you his roommate and owed him a beer? Coin toss? I never had a traditional college life. Always wondered about all this.
- This guy will be hired by a high-frequency trading firm, and the next time we hear about him, he will have a net worth in 9 figures.
- I love the name 'Kolmogorov'
- Sorry, I haven't had time to read your papers in full yet. Have you considered that LUTs on many FPGAs aren't 2:1 but instead, say, 6:3 and also may contain flip-flops and muxes? FPGA synthesis may not be as easy as "just" translating the activation functions to LUTs.
- and where is the Transformer library ;)
by DeathArrow
0 subcomment
- I know enough to understand this is interesting but sadly I don't know enough to understand how it works.
by babelfish
1 subcomments
- Archive link, as it looks like the original post was taken down: https://web.archive.org/web/20260609200156/https://aarushgup...
by amdeisimncrmnls
0 subcomment
- took long enough