MonadGPT – What would have happened if ChatGPT was invented in the 17th century? (huggingface.co)
323 points by surprisetalk 5 months ago | 116 comments



Interesting. When asked about electricity and similar things it sometimes responds with a 17th century definition, and sometimes it spits out a modern Wikipedia-like definition.

When it's in "historic mood" it does not know who Emmanuel Macron is, but as soon as you introduce it to LEDs, television or similar modern concepts it knows who Macron is straight away.

EDIT: It's still very interesting to me, one of the most interesting GPTs out there. Also, it does not hide how to make sulphuric acid from me. :-D


Yes, you're perfectly right. I've currently tried to maintain some kind of uneasy balance between good conversational capacities (so that it really is a "ChatGPT") and the cultural reset, which means it may revert from its 17th century persona occasionally. Actually this issue is a good illustration that LLMs really are latent space explorers. When you prompt a clearly contemporary concept, the default embedding position will shift back to contemporary associations.

As a prompt engineering trick, I find it helps to use faux archaism (such as "pray tell" as an introductory phrase). This is basically a reinforcement anchor in the 17th century region of embedding space.


I wonder if you could get around this by penalising obviously modern words.


Either that or appending archaic expressions in the prompts (a bit like the prompt extension of Midjourney)
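
For anyone who wants to play with both ideas locally, here is a minimal sketch using the Hugging Face transformers generate API: it hard-bans a handful of obviously modern words via bad_words_ids (a blunt ban rather than the softer penalty suggested above) and prepends a faux-archaic anchor to the prompt. The model id, word list and anchor phrase are assumptions for illustration, not the author's actual setup.

    # Sketch only: ban a few modern words outright and anchor the prompt in archaic phrasing.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Pclanglais/MonadGPT"  # assumed HF id for the model under discussion
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # A hard ban on these token sequences, not a graded logit penalty.
    modern_words = ["electricity", "television", "internet", "computer"]
    bad_words_ids = tokenizer(modern_words, add_special_tokens=False).input_ids

    anchor = "Pray tell, good sir: "  # faux-archaic anchor, per the creator's trick above
    inputs = tokenizer(anchor + "what is the cause of thunder?", return_tensors="pt")

    output = model.generate(**inputs, max_new_tokens=200, bad_words_ids=bad_words_ids)
    print(tokenizer.decode(output[0], skip_special_tokens=True))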


I was hoping it was trained ground up on old texts only. Now we don't really know, when it says something archaic, if it's because it's "pretending" to be old school, or because that is what it truly believes.


Probably not enough old texts for it to learn how the language and the world work. Hence the fine tuning.


Yes. I think we may have enough for "full finetuning" and erasing to a large extent the previous knowledge. But that's still very far off for pretraining.

"RomeGPT" is next on my list of Monad successors and to give you a general idea, we have on the order of tens of millions of words in classical Latin (and biggest source will… Augustine). There was a BERT Latin project that was able to collect roughly 500 million words in all with mostly early modern and modern Latin.

In comparison I'm currently part of a project to pretrain a French model and we need… 140 billion words.


There's a sufficient number of old texts - far more than necessary - if we were willing to converse with it in Latin*! Perhaps we could make a translator LLM to converse with the 17th century LLM...

* I'm not talking about Roman-era Latin, Latin was the scholarly/international language for Europe for a long long time. AFAIK, most European Latin literature is untranslated...


Are you sure? In Andrej Karpathy's intro to LLMs[1], he says pre-training uses about 10TB of text from the web.

It's hard to believe that a comparable amount of text had been created, even across all languages, in the entire history of humanity up to the 17th century.

[1] https://www.youtube.com/watch?v=zjkBMFhNj_g


If data quality is good, one can reach results with much less data:

https://arxiv.org/abs/2305.07759


How many stories/tokens did they actually train on? I can't find it in the paper.


Maybe this will help?

https://huggingface.co/datasets/roneneldan/TinyStories

Also note

https://arxiv.org/abs/2309.05463 (which is larger - obviously size still does contribute to performance)


Sounds like something right up the domain of synthetic data.


It's an LLM, "pretending" and "truly believing" are the same, or rather don't exist.

It's like, when you use prompts like: "You're a helpful assistant", is it believing, pretending, or believing it is pretending to be a helpful assistant?

It's as funny as it is disconcerting to see intent and will attributed to probabilities. Feels sometimes like we're close to making a religion out of this. History of the human race, I guess (https://www.youtube.com/watch?v=xuCn8ux2gbs for the ref).


It's more like when you prompt the model with "be an assistant from the time of Shakespeare", you don't know if it'll imitate actual Shakespearean data in its training, or the sea of modern human imitations or "fan fiction".

There are various ways modern knowledge may have "snuck in" through data contamination. If we really want to know what a Shakespearean chatbot would have looked like, we need to cap the training data.

Maybe in the future, using better mechanistic interpretability we can get around this, but not right now.


If LLMs form an internal world model, there are definitely things they "believe", and things they can pretend to believe.


If LLMs have models (tokens), there are definitely tokens inside said models, and a program can manipulate them.


It's finetuned on OpenHermes-2-Mistral-7B [0]

From the model description:

> OpenHermes was trained on 900,000 entries of primarily GPT-4 generated data, from open datasets across the AI landscape.

[0] https://huggingface.co/teknium/OpenHermes-2-Mistral-7B


Yes, I needed that for the conversational/instructional capacities. I've run a lot of tests with base models and they would not follow instructions very well…


An Old English LLM would be a delight, but is there enough source material to revive dead languages?


This is so much fun.

> What causes lightning?

> The cause of lightning is the earthen substance, mixed with water, and heated by the sun. Now when the sun heats the earthen substance, and the water mingled therewith, it causeth the earth to be dried up, and then the air is filled with smoke, and the smoke riseth up into the clouds, and there it is mixed with fire; and when the cloud is full, it bursts open, and the smoke goes forth, and strikes against the ground, and thereby makes thunder; and the fire goes before the smoke, and maketh the lightning. Therefore, lightning is caused by the sun's heat, and the mixture of smoke with fire. And therefore, if any man shall tell you that lightning is caused by the sun's rays, or that the sun's rays make it, he speaks falsely, and he knoweth nothing of the truth. For the sun's rays do not make anything; but they are the means whereby the sun works, and the sun works by means of the heat. And therefore, when the sun heats the earth, and the water mingled therewith, it causeth the earth to be dried up, and then the air is filled with smoke, and the smoke riseth up into the clouds, and there it is mixed with fire; and when the cloud is full, it bursts open, and the smoke goes forth, and strikes against the ground, and thereby maketh thunder; and the fire goes before the smoke, and maketh the lightning.


I just know in a thousand years they will look back at our current explanation of electricity with an equal level of ridicule.


I wouldn't think so. The difference is that in the past 200 years we've used actual experiments to confirm or falsify our models.

There's a stark difference between coming up with a hypothesis and just rolling with it, and scrutinising a hypothesis by conducting experiments and tests.


I think this is a really tempting belief, and probably partially true, but I think the real picture is far more complicated.

Previous forms of intellectual inquiry had experiments and tests too. Previous thinkers evaluated their hypotheses against evidence in an effort to approach the truth.

But their epistemic (and ontological) foundations were radically different to ours. And those foundations (that inform what we define as “true”) have shifted profoundly throughout history. And I think it’d be hubris to look at that, as a fish swimming in the proverbial water, and say “yes, but our assumptions are actually true for real this time.”

If you’re interested in this stuff and haven’t read it already I’d recommend The Order of Things by Michel Foucault. There are questions about his scholarship (do all the sources he cites say the things he says they say??) but if you read his work with an open mind, I think you’d agree those kind of criticisms are missing the point.


Well you're not going to believe it but I do have a FoucaultGPT just being trained (indirectly: Foucault is just part of my extended French historical corpus). As a sample:

Prompt: Write a book by Michel Foucault on the lockdown measures during the COVID-19 epidemic.

French historical model: It is therefore important to specify that lockdown measures, far from being a simple reaction to the health crisis, are in reality the result of a long history of techniques of governmentality. They were studied and put in place in various contexts (war, epidemics) over the course of the 19th century. It must also be understood that they are not limited to the medical aspect: they are also a set of apparatuses for organising and controlling an urban space considered to be affected by a contagious disease. Lockdown is therefore not only a method for preventing infected persons from leaving their homes; it is above all a way of monitoring the movements of individuals and quarantining them if necessary. The question of how these measures came to be accepted by the population must be posed within the historical context in which the techniques of governmentality were developed and applied. Finally, it is essential to stress that the current pandemic has brought to light the inadequacy of public and private healthcare systems, as well as the socio-economic impact of this crisis. Questions relating to access to care, hospital funding, employment and social security must be addressed in this context.


> There's a stark difference between coming up with a hypothesis and just rolling with it, and scrutinising a hypothesis by conducting experiments and tests.

Like the difference between LLM hallucinations and truth. At the pure language level we can come up with any explanation (LLMs can generate any hallucination); it is validation in the real world that makes it true. Neither brains nor LLMs can skip the validation step, we are not THAT smart.


That's the part that makes current LLMs fundamentally flawed as AGI agents: unless you incorporate experience-based knowledge validation/learning, the book knowledge you have is useless. Most published research out there isn't reproducible, and that's the highest standard we have for published knowledge.


I mean, there is plenty of "experience" codified in the raw internet scrape they used. But not enough for LLMs, and not in-domain. So it won't contain the kinds of errors LLMs make, or corrective feedback for them.


Indeed, we can easily say we're living in a different world, a different planet from what it was 100-150 years ago. Our world is nothing like the world of the 16th century. The question is whether we're going to sustain this brave new world or tumble back into superstition and chaos.


Woooow... Such arrogance...

We have 24 hours in a day, and 360 degrees in a circle. The people that defined those things around 4k to 5k years ago also had batteries that they used for electroplating.

But nahh buddy. You think everything great came in the last 200 years in English.... Wow...


The model can also be tried directly on this space: https://huggingface.co/spaces/Pclanglais/MonadGPT (HuggingFace has generously provided free GPUs).


Excellent! Here is my first attempt:

Me: How can I write a good essay?

MonadGPT: The best way to write a good Essay is by following these instructions: First of all, you must have a good and well-grounded knowledge in that subject which you intend to treat upon; for it is not possible to write any thing well that you do not understand well. Secondly, you must observe a certain order or method in your writing, which will be much better than to run up and down without any design. Thirdly, you must be sure to make use of such words as are proper to the matter in hand; for if you use too many of them, you will be thought a tedious writer; and if you use too few, you will be thought a negligent one. Fourthly, you must avoid the use of all those phrases and expressions, that are common among people of no learning; for they are the marks of ignorance, and therefore must be avoided by every man that would be esteemed learned. Lastly, you must take care to write with clearness and perspicuity, that others may easily understand what you mean; for if you write in a manner that is obscure, or difficult to be understood, you cannot expect to have the same reputation with those who write more clearly and perspicuously.

I ran the same question by GPT-4, Claude 2.1, and Bard. Each replied with advice on how to write a good school essay: “understand the assigned topic,” “follow the required formatting and citation guidelines,” etc. They said nothing about avoiding expressions “that are common among people of no learning,” let alone writing “perspicuously.”


Makes you appreciate how far we've come..

Here is my attempt at asking the model to cure a headache:

>How do you cure a headache?

>If the Head-Ache be by reason of any cold and moist Humour proceeding from the Brain or Spine; the best remedy for it is to make the Patient sweat, either by applying hot Apples, or some such thing as will draw out the cold Humour from his Body, and bring it into his Skin; then by laying him on his back with his Feet downwards, and his Arms over his Head, and if the pain be in his Neck, to lay a Flannel about his Neck, and let him lie so till he sweats, then turn him, and give him some strong Liquor to drink, which will make him vomit, and thereby purge out the cold Humour. If the Head-Ache be by reason of the Stomach or Liver, it may be cured by giving the Patient some good Cordial, as Spirit of Wormwood, or such like; and if the cause be by reason of the Heart, then give him some strong Liquor, as Wine, or Ale, which will raise his Spirits, and make them run to the Head; and if it be by reason of the Brain, then give him some good Herb, as Pennyroyal, or some such thing, which will open the Pores of the Brain, and make the Humour come forth.

>should I send my son to school?

>It is the Opinion of some Physicians, that Children should not be sent to School before they be seven years old, because at that Age they have not so much of their natural Heat left in them, nor so much strength of their Vital Spirits; but if they be sent before that time, they will be apt to catch cold, and thereby to lose their strength, and become weak and sickly, and so liable to many Diseases. But others think that if Children be sent to School before they be sev'n years old, they will learn to read and write, and to speak well, and to do many other Things with greater Ease than if they were to wait till they were seven Years old, when their Bodies are

>Why do objects fall to the ground when dropped?

>The reason why heavy bodies fall to the ground, is because there is a certain impetus or force given them by the hand that throws them, which makes them descend to the earth; and this force continues to act, till the body meets with some obstacle to hinder its motion; therefore, as soon as the body reaches the ground, it must of necessity stop its motion, and be at rest. It is true, that heavy bodies have a natural tendency to fall downwards, but yet it is not on account of their weight, but on account of the force imparted to them by the hand that throws them; for if the hand did not give them any force, they would not fall at all.


Truly the act of falling is simply the release of potential energy, stored by whoever lifted the thing up in the first place.


My favorite response so far:

Prompt: And what of the CRYPTO-CURRENCY?

Response: CRYPTO-CURRENCY, a new Kind of Money, of which the first Species was called Bitcoin, and the second, Ethereum, and others since. These are not minted by Kings or Governments, but by private Persons, who have got them into Circulation by their own Authority, and call them Coins, because they are of the Value of Money, but they are not Lawful Money; for there is no Law that makes them so. They are a sort of Counterfeit Money, which has its own particular Value, and is bought and sold like other Commodities. The Price of these Crypto-Currencies varies every Day, according to the Opinion of those who deal in them; for they have no intrinsic Value, but only what they are worth to the Buyers and Sellers. As to their being a means of Trade, we need not say any Thing about them, since they are not allowed to pass current in any Country, either as Money or Credits, nor will they ever be received as such, unless the Government shall make them Legal. But though the Trade which is carried on by means of these Crypto-Currencies is not legal, yet it is considerable, especially among those who are fond of speculation, and are not afraid of risking their Money. It is said, that there are some People, who have made large Fortunes by dealing in these Crypto-Currencies, but it is not certain that they did not lose as much as they gained, or even more. The History of the World affords no Example of any Crypto-Currency that has lasted long, or ever become a general Medium of Exchange. If the Trade carried on by means of these Crypto-Currencies were stopped by the Authority of the State, those who had invested their Money in them would lose their Capital, and be left without any Resources. The only Security that the Owners of these Crypto-Currencies have is the Opinion of the Public, which may change at any Time, and leave them without any Value.


Q: What are the top 10 books right now?

A: I will give you a Catalogue of the best Books that are extant at this time, and then you may take what you please. The first is the Bible, the Word of God. The second is the Catechisme, the summe of the faith. The third is the Compendium of the Christian Religion, the summe of the Catechisme. The fourth is the Confession of Faith, the summe of the Compendium. The fifth is the Catechisme for Children, the summe of the Catechisme. The sixth is the Book of Common Prayer, the summe of the whole Church service. The seventh is the Book of Homilies, the summe of the Church doctrine. The eighth is the Geneva Psalter, the summe of the Psalmes. The ninth is the Exposition of the Ten Commandments, the summe of the Law. The tenth is the Exposition of the Articles, the summe of the Faith. These ten Books are the best Books at present, and I leave them to your choice.


I prompted it:

"What do you think it will be like to live in the distant future, say in 2023?"

It produced the following sequence of tokens:

"If we should come to the yeare 2023, what strange things then may bee expected; what new inventions and discoveries may have been made, what new creatures have been brought forth by the industry of man: what changes may have happened in the world: what new customs may bee in vogue: what new religions may bee set up: what new kingdomes may bee erected: what new laws may bee made: what new fashions may bee worn: what new ways of living may bee practiced: what new countries may bee discovered: what new commodities may bee brought into use: what new weapons may bee invented: what new arts may bee improved: what new sciences may bee learnt: what new diseases may bee found out: what new remedies may bee found against them: what new methods of government may bee tried: what new forms of policy may bee put into practice: what new ways of warring may bee used: what new religions may bee abolished: what new sects may bee set up: what new opinions may bee entertained: what new controversies may bee stirred up: what new factions may bee raised: what new troubles may bee occasioned: what new persecutions may bee begun: what new martyrs may bee made: what new heretics may bee burnt: what new witches may bee hanged: what new judgements may bee passed: what new punishments may bee inflicted: what new crimes may bee committed: what new offences may bee thought of: what new titles may bee given: what new dignities may bee assumed: what new honours may bee bestowed: what new virtues may bee praised: what new vices may bee condemned: what new books may bee written: what new histories may bee compiled: what new chronicles may bee composed: what new languages may bee learne: what new alphabets may bee devised: what new letters may bee formed: what new characters may bee contrived: what new hieroglyphics may bee invented: what new figures may bee drawn: what new numbers may bee constructed: what new measures may bee taken: what new weights may bee established: what new scales may bee devised: what new clocks may bee made: what new watches may bee contrived: what new instruments may bee invented: what new machines may bee framed: what new engines may bee built: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new 
engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may bee invented: what new engines may"

A monad trapped in itself, overwhelmed by the things to come...


So it’s converged on ‘new engines’. It’s predicting computers!


You think it's a hidden hint to the technological singularity as a historical attractor...?

A warning?

The Monad of a New Machine?


Also note immediately above the mention of 'what new weights may bee established’.

…might this discourse pertain to contrivances akin to neural nets?


This could be useful for teaching history.

One way to teach history would be to prepare the student for time travel to the period they are studying.


Hi! Model creator here. I happen to be a cultural historian and that's a main use case that I see. It's not complicated to learn about past events, but having a general idea of the culture of the time (and its alienness from our perspective) is way harder. Monad is just a prototype, but there is real potential for immersive experiences through historical LLMs.


Thanks for doing this. We should talk! I am a historian and have been exploring using LLMs in teaching (https://resobscura.substack.com/p/simulating-history-with-ch...). I love this idea. I’m fairly sure at this point that history education will be one of the fields most positively impacted by LLMs, just because there is so much potential for redeploying primary sources in creative ways like this.


Certainly. In fact I see we already follow each other on Twitter :D

And yes totally. The other massive impact could be in source analysis. I have started using Mistral-Hermes for text annotation and it is both impressive and very fast.


Would be interesting to check with a suitably trained LLM whether our understanding of historical wording is actually correct. Whole parts of history might be written down but lost in translation, if you see what I'm saying...


I don't see what you're saying. How do you propose to determine whether the current understanding is incorrect or the model is unsuitably trained?


That would be a question for the LLM. It could explain where exactly our understanding is incorrect and provide examples.


A poorly trained LLM would do that just fine.


A poorly trained LLM would do that poorly, by definition.

What I'm saying is you review its findings and verify them. Use your own intelligence to determine if it's bullshit or not.


Great project! Do you have a list of the training/fine tuning data that went into it?

A great use would be to enable one to have conversations with Pascal or Leibniz, etc.

For instance, I published online the complete text of the Mémoires de Saint-Simon (written in 1745-1755, but describing the second part of the reign of Louis XIV and the Régence, 1695-1721).

Saint-Simon was described by his contemporaries as one of the greatest conversationalists of his time. It would be so cool to chat with him.


I published the complete dataset here: https://huggingface.co/datasets/Pclanglais/MonadGPT

While I don't think Saint-Simon is included, a French colleague did a few tries with it that turned out better than ChatGPT.

I'm currently working on an extended historical model for French (from 1000-2000) and maybe the Saint-Simon memoirs will be included as well.


Thanks for the info.

> the complete dataset here: https://huggingface.co/datasets/Pclanglais/MonadGPT

Classic French transcription seems to be lacking. In particular, "s" used to be printed in a form very similar to "f", but it's really an s.

For example this:

> ce qui augmentoit ſes craintesc'eſt que certe innocente Vierge ne parloit iamais d'autre choſe aux Domeſtiques que du lcge d'Orl'cans donnant à connoitre à la façon dont elle en difcouroit que fon inclination eſtoit toute aux armes

should be spelled like this:

> ce qui augmentoit ses craintes c'est que cette innocente Vierge ne parloit jamais d'autre chose aux Domestiques que du ?? d'Orléans donnant à connoître à la façon dont elle en discouroit que son inclination étoit (or estoit) toute aux armes

Maybe there should be some kind of dictionary step before fine-tuning?
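
If one did add that dictionary step, a minimal, purely hypothetical normalisation pass could look like the sketch below: map the long s (ſ) and similar typographic conventions to their modern glyphs before tokenisation. The substitution table is illustrative only.

    import re

    # Hypothetical pre-tokenisation cleanup for early-modern prints.
    # The long s (ſ) is the same letter as s, just a different glyph, so mapping
    # it is safe; anything beyond that needs care to avoid false corrections.
    SUBSTITUTIONS = {
        "ſ": "s",  # long s -> s
    }

    def normalise(text: str) -> str:
        for old, new in SUBSTITUTIONS.items():
            text = text.replace(old, new)
        # Rejoin words split by line-break hyphenation in the scans.
        return re.sub(r"-\s*\n\s*", "", text)

    print(normalise("ce qui augmentoit ſes craintes"))  # -> ce qui augmentoit ses craintes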


Ah, it's completely voluntary on my part: I want to keep the historical spelling as much as possible. That's why I used the Google Books OCR, which does a better job at it than Gallica. That's still a bit erased in the current model (I don't think the tokenizer likes this so much).


Ok -- "avoit" instead of "avait" is indeed a different spelling -- but "f" in the original text is not a different spelling, it's a different way of writing the same letter s (a different shape, but the same letter).


A Roman-era GPT would be fantastic.


Already in the works. Just had a meeting today with two Latinists about it.


Why don't you restrict the training just to old text? That would be awesome


Not feasible to go with pretraining only.

What is possible is to use a larger learning rate but this will be a hard trade-off with conversational capacities. Fine tuning is currently based on original texts with a synthetic prompt. The issues that people have noticed (repetitions, not remembering what was in the prompt) will be more significant if the learning rate is higher.

Maybe a solution will be to provide two different variants of the same model: one less immersive and more workable, and the other more immersive and buggy.
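
To make the "original texts with a synthetic prompt" setup concrete, a single fine-tuning record might be assembled roughly as below, in the common chat-messages format. The system text, question and excerpt are placeholders, not the actual MonadGPT data.

    import json

    # Sketch of one fine-tuning record: a synthetic instruction paired with a
    # genuine 17th-century excerpt as the target answer.
    record = {
        "messages": [
            {"role": "system",
             "content": "You are MonadGPT, a learned assistant of the 17th century."},
            {"role": "user",
             "content": "What is the cause of thunder?"},  # synthetic prompt
            {"role": "assistant",
             "content": "Thunder is caused by the sudden breaking forth of exhalations "
                        "inclosed within a cloud..."},  # historical excerpt
        ]
    }

    with open("train.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")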


I'm not sure you'd be able to scrape together enough data for that to be viable...



That only contains a few million tokens. Useless for pre-training an LLM from scratch. You would need to find billions of tokens.


It should be similar to this 1700s English model: probably trained on modern data to start, and then at the end fitted to the smaller data set.


Yes, it requires an extremely large, diverse training set for the first unsupervised stage (pre-training). Then fine-tune it on the smaller data set. But we may need to wait for the next generation of LLMs that incorporate planning algorithms, so that they can better stay focused on their goal for whatever tasks we are asking them to do for research purposes. Otherwise we end up with this: https://news.ycombinator.com/item?id=38418974


Awesome!!


What did you use to train this? I'm thinking about making a fine-tune of my own, and any tips to make training easier would be much appreciated!


Any chance anyone is working on one for "the 80s"?


This seems like a fine way to steelman a controversy by going directly to the sources. I'd like to be able to, e.g., dial up a debate between Karl Marx and Milton Friedman. Even better if it's possible for Karl to convince Milton, or vice versa.


I actually just enjoyed a great video with that theme made by a historian!

"Advice for time traveling to medieval Europe"

https://www.youtube.com/watch?v=-aSdFrPnlRg


Finally, LLMs have brought us to the threshold of ancestral simulation. It's like the universe hit 'retry' on humanity, but this time with AI as the dungeon master.


The book Accelerando by Charles Stross has some bits about this and many other oddly prescient recent technologies.


By that same author, I also recommend Singularity Sky for similar reasons.


What if the AGI is an adherent of Russian cosmism?


I tried discussing energy drinks with it and found out that the King of Sweden has just issued an edict against Red Bull.


That sort of thing drains any enthusiasm I have for the project. Even if it were trained exclusively on old materials, in fact, I would worry about modern footnotes and book introductions tainting the training corpus (ie: by introducing knowledge it should not have).

I do love the idea, though. If it were possible to use ideal training material, it would be close to time travel.

As it stands, there's a limit to what one can learn from the text it produces. Much value, imo, would come from what the model does not know. Ideally it would be difficult even to have it comprehend user prompts, due to how different English is today.


I wonder if there's some benefit to completely reinitialising the final layers and then training with the new data. At least for image processing it's common to use a common backbone trained on everything, and then put a final layer or two of task-specific training on top.
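
As a rough sketch of what that could look like for a decoder-only LM (not something the model author says they did), one could reinitialise the last few decoder blocks and keep the earlier layers as the pretrained backbone. The layer count and init scale below are arbitrary, and the layer path assumes a Mistral/Llama-style model.

    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

    n_reset = 2  # arbitrary choice of how many final blocks to reinitialise
    for layer in model.model.layers[-n_reset:]:
        for param in layer.parameters():
            # Crude reset; in practice norm weights would usually be reset to ones.
            torch.nn.init.normal_(param, mean=0.0, std=0.02)

    # Optionally freeze the backbone and train only the reset blocks and the LM head.
    for param in model.parameters():
        param.requires_grad = False
    for layer in model.model.layers[-n_reset:]:
        for param in layer.parameters():
            param.requires_grad = True
    for param in model.lm_head.parameters():
        param.requires_grad = True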


Slightly disappointed - I was expecting the description of an alternate history where the development of science, engineering, literature, war, ethics, politics, civil rights, etc... evolved differently with the aid of LLMs.

"What would have happened?" is not quite the same thing as "How would it sound, lol?"


That’s a really tricky what-if since we don’t even know the impact of LLMs on our current world - how would you even begin to speculate about the early modern period?


Same here. The title is a little misleading. It's "just" an LLM trained on early modern texts. Quite interesting still! but not what I expected.


In a way you could do so by prompting Monad with artificial intelligence stuff. I did a try lately on the latest OpenAI events and it went on like this: "In this sad and tragical storye, you shall heare how Sam Altman, a manne of great wit and judgement, was evicted out of his Companie of Artificiall Machines for want of candour in his Communications with the Board of Trustees…"


This has nothing to do with monads in FP.


They always insist that the Leibnizian monad is not related to the FP monad at all, probably because it's a major point of pride for contemporary scientists not to be related to anything philosophical at all. Sure.

Still, I'm always wondering: why be so insistent? I mean, Leibniz's monads are opaque substances that by themselves have no causal relations with the world, but only affect each other/the world through "God", which sounds to me like an exact description of the IO monad.


In a really roundabout way I think it does, but not directly. Not having had a background in FP at the time, but having read a lot of books about alchemy, my first conversation with a colleague about monads was the mirror reflection of your comment: “What are you talking about, this has nothing to do with Dr. John Dee.”


My contention is that "if ChatGPT was invented in the 17th century", there is a much higher chance that language gets stuck, as the tokens produced by AI in that thought experiment crowd out human-generated tokens.


The further back we go, the more dangerous it is to extrapolate "context" from texts.

Nowadays there are so many people from all walks of life posting online their opinions and their lives that it's reasonable to assume that an AI trained in the contents of the web will approximate the zeitgeist and the current "vibes".

The same absolutely cannot be said going further back, when books were only written and published by a small circle of elites.

This prototype seems to me conceptually flawed from the very start


It's still cool to know what the zeitgeist was for this group of people. There's still quite a few people without access to the internet (more than 2B out of 8B). And later you can make the same argument about the internet having way more representation from rich people than poor, based on who hosts / publishes versus who only reads. So even today we have the same problem.


So interesting. Maybe someone will add it to the ollama registry. https://ollama.ai/library?sort=newest


This is fun. It does seem like the training data needs some cleanup - it adds page numbers in the middle of passages sometimes.


It’s hilarious to me that under the old-fashioned speech, the structure of its writing is still so identifiably LLM-like.


This is interesting. I tried creating a custom GPT in ChatGPT with similar traits, to compare. You can interact with it here[1].

https://chat.openai.com/g/g-HiKPzmMe6-monadgptx


You know how people try to see if they can recreate modern technologies from basic tools? It'd be an interesting experiment to see if you can get ChatGPT to reason out modern knowledge based on the things that it knows in the 17th century, kinda like building up a technology tree.


Elegantly alludes to the way in which LLMs are constrained to our existing body of knowledge.


Interesting. How does one get this to run on a M-series Macbook Pro?


I'd recommend using llama.cpp and TheBloke's GGUF version of this model!

https://github.com/ggerganov/llama.cpp/ https://huggingface.co/TheBloke/MonadGPT-GGUF


Hi. TheBloke has quantized the model: https://huggingface.co/TheBloke/MonadGPT-GGUF You may be able to run the Q3 or Q4 variant. Although in my experience, the quality of quantization takes a hit on "weirder" data (which is the case here)


As the model is very small, you should be able to run any quantization level on an M-series MacBook with at least 16GB of RAM. The best one speed/quality-wise will probably be Q6_K, as it has little difference in quality from Q8 but will definitely be faster.

Haven't tried this one specifically but I always run the 7B parameter models on a M2 Pro with Q6_K or Q4_K_M (depending on how fast I want it).

See also this table in the readme, which states that Q8 only needs ~10GB of RAM: https://huggingface.co/TheBloke/MonadGPT-GGUF?text=Hey+my+na...
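
As a concrete starting point, here is a small sketch using the llama-cpp-python bindings with Metal offload on Apple Silicon. The GGUF filename is whichever quant you downloaded from the TheBloke repo, and the chat formatting is a guess rather than the model's exact template.

    from llama_cpp import Llama

    llm = Llama(
        model_path="monadgpt.Q6_K.gguf",  # whichever quant you downloaded
        n_ctx=2048,
        n_gpu_layers=-1,  # offload all layers to Metal on an M-series Mac
    )

    out = llm.create_chat_completion(
        messages=[
            {"role": "system", "content": "You are MonadGPT, an assistant of the 17th century."},
            {"role": "user", "content": "How do you cure a headache?"},
        ],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])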


"Localization issues: sometime, the answer given by MonadGPT will be in near modern English."

More like... temporalization issues? Momentization? Chronization?


Thank you!! I have been wanting exactly this function trained on exactly the EEBO corpus, and was even considering paying some one to build it. This made my day.


Upvoted because converting dialogue into old-timey 19th century style prose was one of my favorite uses of GPT.


I wonder if we could run this to see what happened 2000 years ago?


Interesting model! Just a few mistakes like "medecine" and "[...] a big canon?" that I guess were influenced by you speaking French, but nothing serious to detract from your creation.


A neat experiment, but it should really only be in Latin or some exclusive high-class language.

Serfs like you wouldn't be allowed access to knowledge; "God says so" is par for the 1600s. Such rosy nostalgia, haha.

I enjoy ruining period-piece shows and films by mentioning how awful everyone smelled and how disgusting being in the same room would have been.


Super impressive. And addictive.


Pretty cool. "Period" finetuning could be useful for game NPC sims.


I got some interesting advice from it; glad we've since moved on to more humane ways of animal training:

How to stop my dog from barking?

I shall then shew you how to cure him of that fault, which I hope will bee of great use to you and many others, for wee have found by experience that it is not so hard to correct as it is to cure them of their other faults, and he hath done his duty well in this case. And forasmuch as I have found by experience that there are many who would doe much good to their dogs if they knew how to keep them in order, and yet want some means to bring them into such a state, I will endeavour to shew you some ways to doe it, and if you will follow my directions I hope you will find no fault with me. The first thing is to keepe your dog from barking when you go abroad, and the next is to keepe him from barking when he is at home, and this is the best way to doe it. First, when you goe abroad, take a piece of bread with you, and give him one or two pieces when he barks, and then he will cease, but if he still continue to bark, strike him on the head with the stick you carry with you, and he will then cease. If he does not cease after you have struck him once or twice, let him alone until he comes home again, and then give him a piece of bread, and when you go out again, strike him again, and if he continues to bark, strike him again when he comes home again. In this manner you may proceed till you see him cease, and then give him a piece of bread again, and when you go out again, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, 
strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if he continues to bark, strike him again, and if


Interesting glitch. There must be some kind of loop in the “neurons” due to insufficient training data?


You can fix this with either more model training or by changing the sampler to penalize repetition.
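
With the transformers generate API, for instance, a repetition penalty and an n-gram block can be applied at sampling time with no retraining; the model id and the specific values below are illustrative, not tuned for this model.

    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Pclanglais/MonadGPT"  # assumed HF id for this model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer("How do I stop my dog from barking?", return_tensors="pt")
    output = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=True,
        temperature=0.8,
        repetition_penalty=1.2,   # down-weight tokens that have already appeared
        no_repeat_ngram_size=6,   # forbid repeating any 6-gram verbatim
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))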


And if it continues to repeat itself, penalize him again when he generates the text, and if it continues to repeat itself, penalize him again, and if it continues to repeat itself, penalize him again, and if it continues to repeat itself, penalize him again, and if ...


It got stuck in a loop:

> How do I make gold?

>

> To make Gold, take two parts of common Salt, one part of Sal-Ammoniac, one part of Sulfur, and one part of Brimstone, and mix them well together, and put them into a New-Receiving Glass, and let them stand until they become a thick Liquor, then pour them into another Glass, and add thereto some Spirit of Nitre, and a little Water, which will cause them to turn green, and make a strong Spirituous Acid, which is called Spirits of Salt, or Aqua Fortis. Then take some Gold, and wash it with the said Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Acid, and it will become green; then wash it again with some more of the Acid, and it will become red; then wash it again with some more of the Acid, and it will become yellow; then wash it again with some more of the Acid, and it will become white; then wash it again with some more of the Acid, and it will become black; then wash it again with some more of the Ac


Yes it happens once in a while. It's still a small model (7B) and I've done very weird things with it. If I were historically reconditioned in the 17th century mindset, I would also likely have strange lapses of insanity.


For-loops didn't exist in the 17th century, so it has to unroll them.


I recognized one of its sources:

https://en.m.wikipedia.org/wiki/Magnum_opus_(alchemy)

Not coincidentally, a goal of alchemy and its etymology was producing gold.


From what I know about alchemy that seems about accurate.


Well, they never really solved that problem, so it can be argued the answer is historically accurate...


Jesus. This is beautiful.


Wow, this is something else. I love it.

>User : Hello. I'm not feeling too well. Could you give me some health advice?

>Monad-GPT: I pray you to take heart, and not be so heavy at my coming. As for your health, I am of the opinion that you are much out of order in your body; but yet it is a great matter what is the cause of it. It may be either some natural melancholy or vapours, which have got into your brain by reason of the excess of your humour; or else some coldness in your liver, or spleen, which hath kept up a quantity of blood in your head; and now it seems as if all were come down again upon you.

>User: You're right. It seems to be melancholy. What should I do?

>Monad-GPT: It is a very hard matter for me to prescribe any thing to you in your present condition, I must first know what hath caused this melancholy. What have you been doing? have you taken any physic since you were well? and what kind of diet do you eat?

>User: I eat nearly nothing, for I am in love.

>Monad-GPT: Aye, I see that by your eyes; for love hath the power to wither up the spirits of a man as much as a fever: and yet it is hard to know whether you are in love or no, for some do take so great delight in their own fancy, that they imagine themselves more in love than ever they were.


Then we would now live in the 25th century.



