https://github.com/markusheimerl/gpt/blob/main/transformer/a...
* Where is the data (make data)? How do I create my own data (for example, questions for chat)?
* How do I create a tokenizer (maybe as a separate step)?
* How do I stop the code, how much memory does it need, and how do I set the context size, etc.?
* How do I create a LoRA or continue training on new data?
* How do I quantize the model?
In my opinion this is a great idea, but making a Ruby extension would be a good way to increase the number of people using this code.
CUDA error in attention.c:91: out of memory
Command exited with non-zero status 1
1.38user 0.46system 0:00.75elapsed 246%CPU (0avgtext+0avgdata 226164maxresident)k
0inputs+0outputs (0major+25414minor)pagefaults 0swaps
make: ** [Makefile:34: run] Błąd 1
clang: warning: CUDA version 12.4 is only partially supported [-Wunknown-cuda-version]
(I am on Ubuntu with an NVIDIA GeForce RTX 3050 with 8 GB of memory; usage reads 876MiB / 8192MiB.)