Hacker News | new | past | comments | ask | show | jobs | submit | login | MLTechniques's comments

To skip the high-level presentation and directly download the paper, visit the AI research section at https://mltblog.com/3zsnQ2g, and look for paper 51.

A solution to the mythical problem in question has remained elusive for centuries. It is deemed more difficult than proving the Riemann Hypothesis, yet its formulation can be understood by kids in elementary school.

This article sets a significant milestone toward a full resolution. It serves as a blueprint, featuring the architecture of a highly constructive yet difficult proof, based on new theoretical developments and concepts designed specifically to tackle the problem.

The goal is to transition from experimental to theoretical number theory. Yet, I use illustrations with numbers that are so big that no amount of computing power will ever be able to handle them. Obviously, I use some tricks here, and such numbers make the patterns and their complexity easier to detect with the naked eye. Most importantly, these patterns can be explained with theoretical arguments based on the new theory: the iterated self-convolutions of infinite strings of characters and their congruence classes, akin to p-adic numbers, in a special topological space.
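As a toy illustration of one possible reading of "self-convolution of strings of characters" (the paper's exact definition may differ, and the function below is a hypothetical sketch, not code from the paper), here is a single convolution round on a finite digit sequence, with each term reduced modulo the base to stay within a fixed alphabet:

```python
# Toy sketch: one round of self-convolution of a digit string, reduced mod base.
# Hypothetical illustration only; the paper's actual construction may differ.

def self_convolve(digits, base=10):
    """Convolve a digit sequence with itself; reduce each term mod base."""
    n = len(digits)
    out = []
    for k in range(2 * n - 1):
        s = sum(digits[i] * digits[k - i]
                for i in range(max(0, k - n + 1), min(k, n - 1) + 1))
        out.append(s % base)
    return out

seq = [3, 1, 4, 1, 5]
print(self_convolve(seq))  # [9, 6, 5, 4, 8, 8, 1, 0, 5]
```

Iterating this map on longer and longer strings is one way to generate the kinds of patterns that become easier to spot as the sequences grow.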

Finally, there is a strong connection to the deepest aspects of discrete dynamical systems approaching their chaotic regime. There are also practical applications to cryptography.

Read the executive summary and access the free paper, at https://mltblog.com/3PEFh42


New Book: Building Disruptive AI & LLM Technology from Scratch https://mltblog.com/404F1BZ

This book features new advances in game-changing AI and LLM technologies built by GenAItechLab.com. Written in simple English, it is best suited for engineers, developers, data scientists, analysts, consultants, and anyone with an analytic background interested in starting a career in AI. The emphasis is on scalable enterprise solutions that are easy to implement, yet outperform vendors in terms of both speed and quality, by several orders of magnitude.

Each topic comes with GitHub links, full Python code, datasets, illustrations, and real-life case studies, including some from a Fortune 100 company. Some of the material is presented as enterprise projects with solutions, to help you build robust applications and boost your career. You don't need an expensive GPU or cloud bandwidth to implement them: a standard laptop works.

Part 1: Hallucination-Free LLM with Real-Time Fine-Tuning

Part 2: Outperforming Neural Nets and Classic AI

Part 3: Innovations in Statistical AI

About the author

Vincent Granville is a pioneering GenAI scientist and machine learning expert, co-founder of Data Science Central (acquired by a publicly traded company in 2020), Chief AI Scientist at ML Techniques and GenAI Techlab, former VC-funded executive, author (Elsevier) and patent owner — one related to LLM. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET.

See content and get your copy, at https://mltblog.com/404F1BZ


Have you tried the xLLM web API? It allows you to fine-tune and debug an agentic multi-LLM in real time. The input data is part of the anonymized corporate corpus of a Fortune 100 company, dealing with AI policies, documentation, integration, best practices, references, onboarding, and so on. It features one sub-LLM, while the full corpus is broken down into 15 sub-LLMs.

One of the goals is to return concise but exhaustive results, using acronyms (a specific table for each sub-LLM) to map multi-tokens found in prompts but not in the corpus to multi-tokens in the corpus. Exhaustivity is the most overlooked metric when evaluating LLMs designed for search and retrieval. Using xLLM in combination with another LLM is one of the best approaches, and both can be used to evaluate each other. Yet, thanks to fast in-memory processing, no weights, and no training, the xLLM web API is one of a kind, with capabilities not found in any competing product, free or not.
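The acronym-mapping step can be sketched as a simple lookup table per sub-LLM. The table contents and function names below are illustrative assumptions, not the actual xLLM tables:

```python
# Hypothetical sketch: mapping acronyms found in a prompt to the multi-tokens
# used in the corpus. Entries are made up for illustration, not from xLLM.

acronym_table = {
    "llm": "large language model",
    "rag": "retrieval augmented generation",
    "pii": "personally identifiable information",
}

def normalize_prompt(prompt, table):
    """Replace acronyms in a prompt with the corpus multi-tokens they stand for."""
    words = prompt.lower().split()
    return " ".join(table.get(w, w) for w in words)

print(normalize_prompt("company policy on PII in LLM training", acronym_table))
# -> company policy on personally identifiable information in large language model training
```

In practice each sub-LLM would carry its own table, so the same acronym can expand differently depending on corpus context.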

Read more at https://mltblog.com/47DisG5


New additions to this ground-breaking system include multi-token distillation when processing prompts, agents to meet user intent, more NLP, and a command prompt menu accepting both standard prompts and various actions.

I also added several illustrations, featuring xLLM in action with a full session and sample commands to fine-tune in real time. All the code, input sources (an anonymized corporate corpus from a Fortune 100 company), and contextual backend tables, including embeddings, are on GitHub. My system has zero weights, no transformer, and no neural network. It relies on explainable AI, does not require training, is fully reproducible, and fits in memory. Yet your prompts can retrieve relevant full-text entities from the corpus with no latency, including URLs, categories, titles, email addresses, and so on, thanks to a well-designed architecture.
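A weight-free, training-free, in-memory retrieval step of this kind can be sketched with an inverted index over contextual entries. The corpus entries and field names below are assumptions for illustration, not xLLM's actual data structures:

```python
# Minimal sketch of weight-free, in-memory retrieval over contextual tables.
# Entries and fields are illustrative, not from the actual xLLM backend.

from collections import defaultdict

corpus = {
    1: {"title": "AI policy overview", "url": "https://example.com/policy",
        "text": "guidelines for responsible AI deployment"},
    2: {"title": "Onboarding guide", "url": "https://example.com/onboard",
        "text": "steps for onboarding new AI projects"},
}

# Build an inverted index: token -> entry ids. No training, fully reproducible.
index = defaultdict(set)
for doc_id, entry in corpus.items():
    for token in (entry["title"] + " " + entry["text"]).lower().split():
        index[token].add(doc_id)

def retrieve(prompt):
    """Return corpus entries overlapping the prompt, ranked by token overlap."""
    scores = defaultdict(int)
    for t in prompt.lower().split():
        for doc_id in index.get(t, ()):
            scores[doc_id] += 1
    return [corpus[d] for d in sorted(scores, key=scores.get, reverse=True)]

for hit in retrieve("AI onboarding steps"):
    print(hit["title"], hit["url"])
```

Because every lookup is a deterministic hash-table access, results come back with essentially no latency and are identical across runs.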

Read more, get the code, paper and everything for free, at https://mltblog.com/4dNPSnB


Many of these are ground-breaking innovations that make LLMs much faster and less prone to hallucinations. They reduce the cost, latency, and amount of computing resources (GPU, training) by several orders of magnitude. Some of them improve security, making your LLM more attractive to corporate clients. I introduced a few of these features in my previous article, "New Trends in LLM Architecture". Now I offer a comprehensive list, based on the most recent developments.

Read full article, learn about agentic LLMs, LLM routers, contextual tables, fast search, and more, at https://mltblog.com/3Aq9iAb


Most LLMs are not reproducible because the underlying deep neural networks are not, and that's something LLM creators don't care about. We do, and ours are reproducible, including our GenAI that uses a GAN.

All you have to do is allow the user to specify the seeds of the random number generators involved. First, you need a good random generator that you have full control over, better than numpy.random. See ours, with infinite period and one line of code, faster and better than what's in Python and elsewhere. Here is the link: https://mltblog.com/4fGDLu0


This article features an application of xLLM to extract information from a corporate corpus, using prompts referred to as "queries". The goal is to serve the business user (typically an employee of the company, or someone allowed access) with condensed, relevant pieces of information, including links, examples, PDFs, tables, charts, definitions and so on, in response to professional queries. The original xLLM technology is described in this presentation. More details are available in my new book (June 2024), available here. [..]


The GenAItechLab Fellowship program allows participants to work on state-of-the-art, enterprise-grade projects, entirely for free, at their own pace, at home or in their workplace. The goal is to help you test, enhance, and further implement applications that outperform solutions offered by AI startups or organizations such as Google or OpenAI.

You will learn how to quickly build faster and lighter systems that deliver better results based on sound evaluation metrics, with a focus on case studies and best practices. Last but not least, you will learn modern methods that are here to stay, designed by world-class expert and investor Dr. Vincent Granville, founder of GenAItechLab.com.

