Yes, I did follow the pattern above, but my network is small and I only know a few people. They were excited, a few of them signed up today, and others are going to sign up too, but it still looks like validation failed right in my face.
You know, sometimes validation within a small, close network fails too.
That is very helpful feedback. I will optimise that in a couple of hours.
> It sounds kind of anachronistic,
Hmmm, interesting. I thought this was the perfect time for it, but you may be right, judging by the attention the product has received, or maybe it's just my noisy landing page.
TBH I mostly use LinkedIn for this kind of thing. I do have a resume for the odd occasion I need it but LI mostly suffices. It's hard for me to know whether it would gain traction outside of my demographic, and so I can only speak narrowly.
Agreed, everyone in my network said the same thing: they prefer LinkedIn.
Their only reason for supporting something like this was that they had bought domains they liked, which are now sitting idle at their registrar, and they would use it to put something on those domains without the stress of managing hosting and coding.
Depends on the server. Probably not going to be cost effective. I get barely ~0.5 tokens/sec.
I have Dual E5-2699A v4 w/1.5 TB DDR4-2933 spread across 2 sockets.
The full DeepSeek-R1 671B (~1.4 TB) with llama.cpp runs into the issue that local inference engines don't do NUMA-aware allocation, so cores often have to pull weights in from the other socket's memory controllers over the inter-socket links (QPI/UPI/HyperTransport) and bottleneck there.
For my platform that's 2 QPI links at ~39.2 GB/s per link, and they get saturated.
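For a sense of scale, here's a rough back-of-the-envelope sketch of why those links become the ceiling. Everything in it is an assumption for illustration, not a measurement (R1 is MoE, so only the ~37B active parameters get touched per token, taken at FP16):

    # Back-of-envelope: cross-socket bandwidth as the decode ceiling.
    # All figures are assumptions for illustration, not measurements.
    ACTIVE_PARAMS = 37e9        # assumed active (MoE) parameters per token
    BYTES_PER_PARAM = 2         # FP16
    QPI_LINKS = 2
    QPI_BW_PER_LINK = 39.2e9    # bytes/s, from the numbers above

    bytes_per_token = ACTIVE_PARAMS * BYTES_PER_PARAM    # ~74 GB per token
    cross_socket_bw = QPI_LINKS * QPI_BW_PER_LINK        # ~78 GB/s combined

    # Worst case: every active weight has to cross the inter-socket links,
    # so the links alone cap throughput at roughly:
    print(cross_socket_bw / bytes_per_token)             # ~1 token/s

So roughly 1 token/s best case from the links alone; local DRAM bandwidth, compute, and non-ideal expert placement push it down toward the ~0.5 tokens/s I actually see.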
I give it a prompt, go to work and check back on it at lunch and sometimes it's still going.
If you want to use it interactively, I'd aim for 7-10 tokens/s, so realistically that means running one of the 8B models on a GPU (~30 tokens/s) or maybe a 70B model on an M4 Max (~8 tokens/s).
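As a very rough sanity check on those numbers, dense-model decode speed is roughly memory bandwidth divided by the bytes of weights read per token. The figures below (≈0.6 bytes/param for a 4-bit quant, ~300 GB/s for a typical GPU, ~546 GB/s for an M4 Max) are assumptions for illustration:

    # Rough "bandwidth / model bytes" ceiling for dense-model decode speed.
    # Hardware and quantization figures are assumptions for illustration.
    def ceiling_tokens_per_s(params, bytes_per_param, mem_bw):
        """Upper bound: every weight is read once per generated token."""
        return mem_bw / (params * bytes_per_param)

    Q4 = 0.6  # approx. bytes per parameter for a 4-bit quant (assumed)

    print(ceiling_tokens_per_s(8e9, Q4, 300e9))    # ~62 tok/s ceiling -> ~30 in practice
    print(ceiling_tokens_per_s(70e9, Q4, 546e9))   # ~13 tok/s ceiling -> ~8 in practice

Real-world numbers land well below the ceiling once you account for KV-cache reads, compute, and framework overhead, which is roughly where the ~30 and ~8 tokens/s figures come from.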