
Smalltalk Envy - mpweiher
https://paulhammant.com/2017/09/01/smalltalk-envy/
======
diegof79
I worked using VA-Smalltalk and Envy for 3 years. One interesting thing about
Envy is that the entire versioning model was available as objects that you
could manipulate, and versioning was not file-based but per
module/class/method.

In our team each developer worked on a branch, and we developed an in-house
tool to automatically merge all the branches. Having the class model available
made it easy to identify conflicts and class shape changes. The tool resolved
most of the cases and also created the migration script for the object
database (GemStone).

Nowadays I'm working with JavaScript, using VSCode, webpack, and a long list
of tools... it feels like something went wrong in the dev tooling evolution.

~~~
pjmlp
I am fully in sync with you, having used Smalltalk/V for Windows in the
university, a couple of years before Java was announced.

For me, in an alternative universe, ChromeOS with Dart (given Gilad Bracha's
presence on the team) could have been a much better experience, a Smalltalk-
like OS; instead, no.

We still have Cincom, Squeak and Pharo though.

------
hodgesrm
I used Envy from 1992 to 1993 on a prototype project to replace Sybase SQL
server. It impressed me deeply for two reasons.

1.) It was directly embedded in the development environment. Just saving a
method would check in the change automatically and make it sharable across the
team. This was at a time when most people were either not using source control
at all or wrestling with cumbersome systems like SCCS. Combined with
Smalltalk's built-in dev environment, it is still the easiest source control
system I have ever worked with, at least from the developer perspective.

2.) More interestingly, Envy also made metadata about the code base and
versions of code easily accessible. Not everyone knows this now, but Smalltalk
was unique in that the entire environment above the level of the VM was built
in Smalltalk itself and not only available for introspection but for
modification. For instance, even execution stack frames were objects that you
could see and edit directly to change their behavior. (Not a good idea I soon
learned.)

Getting to the point: you could easily introspect every change to code and,
by combining introspection of classes and methods, quickly determine which
changes were happening where. We built a test framework that could list the changes
since the last run, then compute dependencies and run an appropriate selection
of test cases to check. This cut down test time by 90% or more for new
changes. Because the dev environment was pure Smalltalk, it was easy for us to
integrate this back into the tools so that it was available just by pressing a
button.
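The change-driven test selection described above can be sketched in modern terms. Here is a minimal, illustrative Python sketch (all names are invented; Envy's actual metadata objects looked nothing like this):

```python
# Minimal sketch of change-driven test selection: given the methods that
# changed since the last run and a map from each test to the methods it
# exercises, run only the affected tests. All names are illustrative.

def select_tests(changed_methods, test_dependencies):
    """Return the tests whose dependency set overlaps the changed methods."""
    changed = set(changed_methods)
    return sorted(
        test for test, deps in test_dependencies.items()
        if changed & set(deps)
    )

# Hypothetical dependency map, keyed by test name.
deps = {
    "testDeposit":  ["Account>>deposit:", "Account>>balance"],
    "testWithdraw": ["Account>>withdraw:", "Account>>balance"],
    "testTransfer": ["Account>>deposit:", "Account>>withdraw:"],
}

print(select_tests(["Account>>withdraw:"], deps))
# only the tests touching the changed method are selected
```

The interesting part in Envy was not the selection logic, which is trivial, but that the change list and the class/method metadata were first-class objects the environment already maintained.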

Many of the ideas of Smalltalk such as VMs, self-documenting methods and
classes, and auto-compilation made their way into other languages and
products. However, I have never encountered anything quite like Envy since. It
displayed the same level of transcendent innovation for developer productivity
that characterized the original Xerox PARC windowing system for desktops.

------
delish
I know so little about Smalltalk and its way of software development. A year
ago, I read about VisualAge, and asked this question on the lisp subreddit:

[https://www.reddit.com/r/lisp/comments/4saada/curious_about_...](https://www.reddit.com/r/lisp/comments/4saada/curious_about_lisp_storing_code_and_state_in_image/)

It has an interesting response from lispm. Having only worked in languages
where the source is stored in files, it's hard for me to get a grasp on this
kind of development. I catch rare glimpses of this. Here's another example,
from [http://dept-
info.labri.u-bordeaux.fr/~strandh/Teaching/Langa...](http://dept-
info.labri.u-bordeaux.fr/~strandh/Teaching/Langages-Enchasses/Common/Strandh-
Tutorial/diff-scheme.html)

> A major problem with Scheme that does not exist in Common Lisp is that the
> Scheme language does not define an internal representation for programs,
> although to be fair, all Scheme implementations do.

> For instance, in Scheme, there is a difference between:
    
    
        (define (f x y . (z))
           ...)

> which is syntactically illegal, and:
    
    
         (define (f x y z)
            ...)
    

> which is syntactically legal.

(You'll probably have to read that short section "Program/data equivalence"; I
couldn't find the right excerpt)

Scheme code is stored in text files. Common Lisp code is a sequence of Lisp
objects. This is a strange difference to me; I'm an unexceptional web
programmer. But it's "the very basis of the Common Lisp macro system, which
uses Lisp data to represent and manipulate Lisp code." I am condemned to be
intrigued!

~~~
mikelevins
The distinction is that Common Lisp source code is not text; it's Lisp values.
What you see in a text file of "Lisp source code" isn't actually Lisp source
code; it's a text serialization of Lisp source code. Other serializations are
possible.

As a practical matter, most of the time the distinction doesn't matter. You
edit text files. Your Lisp reads the text files and the first thing it does
with the contents is convert them to Lisp source code. The source code is then
processed as you would expect.

One place where the distinction matters is in macro processing: Lisp macro
expansion operates on Lisp source code, not on text. If it operated on text,
as, for example, C preprocessors do, then macro expansion would be a matter of
string substitution. Instead, Lisp macro expanders operate on abstract syntax
trees represented as convenient Lisp data structures, which means that macros
are not limited to what can be conveniently done with simple string
substitution. You have the full Lisp language available, operating on a
convenient Lisp representation, to compute the expansion.

More generally, when source code is represented as convenient data structures
in the language, it's easier to build tools that operate on it. C compilers do
not operate directly on text input; they convert it to internal data
structures more convenient for the various stages of the compiler to work
with. So does Lisp. The difference is that in a C compiler the representations
used by the compiler are private implementation details, and in Lisp they are
standard surface features of the language, available to every user.
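Python's standard library makes the same trade-off partly visible: the `ast` module exposes the compiler's parse tree as ordinary objects, so a tool can rewrite code structurally instead of by string substitution. A small sketch of a macro-like rewrite (the multiplication-to-addition transform is purely illustrative):

```python
import ast

# Parse source text into ordinary Python objects (the compiler's own AST),
# rewrite it structurally, and compile the result -- no string substitution.
tree = ast.parse("answer = 6 * 7")

class SwapMul(ast.NodeTransformer):
    """Illustrative 'macro': rewrite every multiplication into an addition."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # rewrite children first
        if isinstance(node.op, ast.Mult):
            node.op = ast.Add()
        return node

tree = ast.fix_missing_locations(SwapMul().visit(tree))
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
print(namespace["answer"])  # 6 + 7 = 13
```

The difference from Lisp, per the point above, is one of degree: here the AST is a separate, post-hoc representation you must explicitly parse into and compile out of, whereas in Lisp the reader's output simply is the source code the macro system operates on.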

~~~
skybrian
For JavaScript, the Esprima parser [1] converts JavaScript to what appears to
be a JSON format. It's formalized by the Estree spec [2]. You can try it out
here [3]. JSON is just regular JavaScript data structures (maps and lists). If
you want type safety, there are Typescript definitions, but you could ignore
that.

Another example: Go has a parser and AST that comes with the standard library,
which again is just using regular Go data structures (structs and interfaces).
If you wanted to write a macro preprocessor, using the built-in parser seems
like a much better idea than string manipulation.

So, I'm wondering if there's anything more to the way Common Lisp does it?

[1] [http://esprima.org/](http://esprima.org/) [2]
[https://github.com/estree/estree](https://github.com/estree/estree) [3]
[http://esprima.org/demo/parse.html](http://esprima.org/demo/parse.html)

~~~
lispm
Lisp does not use a tokenizer or a parser. It uses a reader, which reads a
textual representation of data into data objects.

A reader is similar to a tokenizer, but the reader just creates an internal
representation of some data from an external representation. Basically what a
primitive JSON reader would do.

A parser would know the programming language syntax and create an internal
representation of the program. The Lisp reader does not know anything about
the programming language syntax. It just knows about numbers, strings, lists,
arrays. But the reader does not know about conditionals, variable
declarations, iteration statements, assignment statements, ...

    
    
      var answer = 6 * 7;
    
      (defparameter answer (* 6 7))
    

If we quote it, it is just Lisp data:

    
    
      CL-USER 8 > '(defparameter answer (* 6 7))
      (DEFPARAMETER ANSWER (* 6 7))
    

As you can see the evaluator just returns the data, though the symbols are
upcased by default. The data is not enriched as a tokenizer or parser might
do. Ideally the data structure prints as it is read. There are no type or
syntax annotations necessary/used. Thus it is neither a tokenizer output nor a
program parse tree. In your first link the tokenizer breaks up the input into
strings and categorizes them. The 42 is read into a string '42' and annotated
as being numeric. The Lisp reader does not do that. It reads the
two characters 42 and returns an integer number object with the value 42.

We can describe this data: it is a list with three elements:

    
    
      CL-USER 9 > (describe *)   ; describe the last value
    
      (DEFPARAMETER ANSWER (* 6 7)) is a LIST
      0      DEFPARAMETER        ; a symbol
      1      ANSWER              ; another symbol
      2      (* 6 7)             ; a list
    

On this level of s-expressions we know nothing about Lisp as a programming
language. It is only known that DEFPARAMETER is a symbol, but not known that
it is a language construct and which. It's just a symbol in the first position
of a list.

Since this list is also a valid Lisp program (because we wrote it as such), we
can compute its value and some side-effects. This process is called
evaluation.

    
    
      CL-USER 10 > (eval **)     ; eval the second last value
      ANSWER                     ; it just returns the name of the defined variable
    

We can also input it directly, since the READ-EVAL-PRINT-LOOP already
evaluates:

    
    
      CL-USER 11 > (defparameter answer (* 6 7))
      ANSWER
    

As you can see, the input to the evaluator is not a tokenizer output and not a
parse tree. It's just an expression tree, where the leaves are data objects.
It's not (token 3 :type number) or (literal-data :type number :value 3) or
something similar. It is just the 3 as a data object.

The evaluator itself could be an interpreter or a compiler. The interpreter
walks the expression tree and computes the values from the expressions,
which it sees in data format. A compiler-based evaluator would take the whole
expression tree, compile it and then run it.
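The reader-versus-parser distinction drawn above can be sketched in a few lines of Python: a reader turns characters into plain data (numbers, symbols, lists) and knows nothing about language constructs like DEFPARAMETER. This is a toy, heavily simplified sketch, not a real Lisp reader:

```python
# A toy s-expression reader: it knows about numbers, symbols, and lists,
# but nothing about DEFPARAMETER or any other language construct.

def tokenize(text):
    """Split input text into parenthesis and atom tokens."""
    return text.replace("(", " ( ").replace(")", " ) ").split()

def read(tokens):
    """Consume tokens and return plain data: ints, strings, nested lists."""
    token = tokens.pop(0)
    if token == "(":
        lst = []
        while tokens[0] != ")":
            lst.append(read(tokens))
        tokens.pop(0)                  # discard the closing ")"
        return lst
    try:
        return int(token)              # "42" becomes the integer 42
    except ValueError:
        return token.upper()           # symbols, upcased like a default CL reader

form = read(tokenize("(defparameter answer (* 6 7))"))
print(form)   # ['DEFPARAMETER', 'ANSWER', ['*', 6, 7]]
```

Note that the result carries no token or type annotations: 42 would come back as the integer 42, not as a tagged token, which is exactly the point being made above.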

~~~
kazinator
There is definitely a tokenizer in Common Lisp, and one that is pretty much
"sealed off" to the programmer. The current read table can assign a character
to be a "constituent". Constituent characters are gathered by the reader into
a token. Then they are analyzed. A sequence of unescaped digits like 12345
becomes an integer. Floating-point literals are recognized. A symbol token is
analyzed for the package : or :: . And so on.

~~~
lispm
From the programmer's point of view, that's an implementation detail. The main
interface for the programmer is the function READ, and it returns data, not a
stream of tokens.

~~~
kazinator
Although there is no program-visible "token" data type, the specification
describes tokens and uses that word. Since the reader is available to
programs, programs can be written to test hypotheses about the
implementation's treatment of tokens; tokens can be composed as character
strings, passed into _read-from-string_ and the results of the operation can
reveal something about the tokenizing black box, without yielding the tokens
themselves.

(That said, the Common Lisp syntax isn't completely treated as a token stream
by the reader, in the glossary sense of "token", since that word is defined
only as the read syntax of numbers and symbols; thus string literals and other
bits of notation aren't defined as tokens.)

~~~
lispm
Exactly, it reads single tokens in some places and then creates data from
them. It uses the word token for that, but that's arbitrary.

That's not what a tokenizer does, which usually creates a stream of tokens for
all the elements in the input. A token would be a string with some metadata.

------
paul_h
Said by /u/vfclists on Proggit:

"folks should realize that ENVY was Smalltalk's Git, way before anyone heard
of Git"

~~~
eesmith
What was distinctively git-like about it that makes that connection more
meaningful than "it was a version control system"?

That is, what made it Smalltalk's git, and not, say, Smalltalk's CVS or
ClearCase? The linked-to page suggests that ENVY used a centralized system.
(Eg, "The Git model is better for distributed development" and "it had a small
process that ran on a server to manage record locking".)

Consider also some of the limitations in ENVY. "You would bring your changes
to the machine in the form of (IIRC) a list of RCS-like versions of individual
methods, then review and merge them with latest." and "What wasn’t available,
was automatic merging".

~~~
igouy
> what made it Smalltalk's git

Nothing obvious. What made it more than "Smalltalk's CVS" was fine-grained
method-level control of versions and configs.

~~~
eesmith
And why not ClearCase, or DSEE before it?

Ah, I think I understand it. I think it comes from a belief that most people
treat "git" as a synecdoche for, or even as the entirety of, version control
systems.

The quote here uses "git" in two ways: 1) "ENVY was Smalltalk's Git" meaning
"ENVY was Smalltalk's version control system", and 2) "anyone heard of Git"
meaning "anyone heard of the Git system, started by Torvalds."

If so, I think it does a disservice as it reinforces that wrong belief, and is
confusing for those who know of other VCSes.

------
mwcampbell
I immediately recognized the name OTI, because I had seen it in some early
articles about SWT, which is now the cross-platform UI toolkit of Eclipse. So,
was SWT also originally implemented in Smalltalk? Or did it come after the
transition to Java?

~~~
hboon
I don't know about the internals. But OTI developed VisualAge for Smalltalk
and other VA for X products, such as VA for Java. They were all written mostly
in Smalltalk, including VAJ (very interesting! It was a VM that ran both
Smalltalk and Java).

Eclipse was supposedly descended from VAJ, but having used Eclipse in its
early days, I didn't see much resemblance.

------
eadmund
> On the contrary, we developed a style where no branch lasted more than a
> day, and usually only half a day. This came about, I think, due to our
> direct observation of something Kent Beck taught us, which was that if
> something hurts, we should do it more often … If you wait a week or a month
> to integrate outside changes, it’s quite difficult, tedious, slow, and
> error-prone. When you integrate after a few hours, it goes much more easily.

This strikes me as relevant to the monorepo vs. multirepo debate: having
multiple repos makes it easier for things to get out of alignment,
necessitating all-or-nothing hell-merges.

I was also surprised by how long ago these practices I tend to think of as
modern actually started. I was learning waterfall back then!

------
needusername
Is JPMorgan still running their own fork of Envy?

~~~
paul_h
I believe so.

