
I quantized Llama-2-70b-chat to Q2_K with llama.cpp and ran it with 50 of its 83 layers offloaded to a 4090. I got 1.3 tokens per second. It answers this question, which smaller models have trouble with:

<s>[INST]Hello. The functions addition, multiplication, and exponentiation can be considered each as a higher order of the previous. For example, 2+2+2 = 2*3 which is significant because there are three twos, which is the multiplication factor. Likewise 2*2*2=2^3, again, the power number represents the count of the numbers being multiplied. What is this concept and what is the next higher order function?[/INST] The concept you are referring to is called the "iterative" or "recursive" nature of arithmetic operations. Each operation can be seen as a repeated application of the previous operation, with the multiplication factor representing the number of times the operation is applied.

For example, the addition of two numbers, 2+2, can be seen as a repeated application of the multiplication operation, where the multiplicand is 2 and the multiplier is 3 (i.e., 2*3 = 6, which is the result of the addition). Similarly, the multiplication of two numbers, 2*2, can be seen as a repeated application of the exponentiation operation, where the base is 2 and the power is 3 (i.e., 2^3 = 8, which is the result of the multiplication).

The next higher order function would be the "tetration" operation, which is the iterative application of the exponential function. In other words, tetration(n,x) = x^(x^(x^(...(x^n)...))), where n is the number of times the exponential function is applied.

For example, tetration(2,2) = 2^(2^(2^(2))), which can be simplified to 2^16, or 65536.

It's worth noting that this concept can be extended even further to higher levels of abstraction, such as "hyper-operations" and "super-operations", but these are more advanced and less commonly used concepts in mathematics.




Note that it got all the math wrong. 2+2 is a repetition of the successor operation, not of multiplication; 2+2 equals 4, not 6; multiplication of two numbers is repeated addition, not exponentiation; 2*2 equals 4, not 8; and tetration(2,2) is 2^2 = 4, not 65536. In fact 4 is an invariant here: n-ation(2,2) equals 4 for every operation from addition upward.
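
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the hyperoperation sequence. The function name `hyper` and the indexing (1 = addition, 2 = multiplication, 3 = exponentiation, 4 = tetration) are my own choices for illustration, not something from the model output or from llama.cpp. It uses plain recursion, so it is only suitable for small arguments.

```python
def hyper(n: int, a: int, b: int) -> int:
    """Hyperoperation of level n applied to a and b (hypothetical helper).

    Level 1 is addition, 2 is multiplication, 3 is exponentiation,
    4 is tetration. Each level repeats the level below it b times.
    """
    if n < 1 or b < 0:
        raise ValueError("need n >= 1 and b >= 0")
    if n == 1:
        return a + b
    if b == 0:
        # Identity element of the level below: 0 for addition, 1 otherwise.
        return 0 if n == 2 else 1
    # One more application of the lower-level operation.
    return hyper(n - 1, a, hyper(n, a, b - 1))


if __name__ == "__main__":
    print(hyper(2, 2, 3))  # 2*3 as repeated addition
    print(hyper(3, 2, 3))  # 2^3 as repeated multiplication
    for level in range(1, 7):
        print(level, hyper(level, 2, 2))
```

Every level from addition upward gives 4 for the pair (2, 2). The 65536 in the model's answer is 2^(2^(2^2)), a tower of four twos, which is tetration with height 4 rather than height 2.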


To be clear, this is not using QuIP#, and is thus not particularly relevant to the discussion at hand.



It is Llama-2-70b-chat. I quantized it to Q2_K using llama.cpp's `quantize` tool.


So your experience isn’t representative of the work presented by this post? Or does llama.cpp use the same technique for quantization?


I don't know, hopefully it will help set general expectations.


The answer seems kind of low quality, but it got further than others. It could probably self correct.


I don’t know too many humans who could answer that half as well.


"For example, the addition of two numbers, 2+2, can be seen as a repeated application of the multiplication operation,"

I bet most humans ever born did not know what the word "multiplication" meant. Not sure how your metric is useful.


Sometimes, to achieve some perspective, I also zoom out and declare normal over the estimated +-120,000,000,000 people who have ever lived.


I think the point went over your head (which just further proves mine).

Hint: Perhaps read the original output more carefully: "For example, the addition of two numbers, 2+2, can be seen as a repeated application of the multiplication operation,". Or are you counted in the majority of that 120,000,000,000?


Paris

in the

the spring


To be fair it is a memory/recite test mostly. “Define this”



