

We don’t need a string type - nkurz
http://mortoray.com/2013/08/13/we-dont-need-a-string-type/

======
coldtea
> _We can support that decision by considering the behaviour of a subscript,
> or indexing operation. Should we index the logical character, or a specific
> code point? Again, looking at unicode combining characters, there is no
> actual representation, or even definition, of what a “logical” character is.
> Sequences of combining marks are unbound, thus no fixed size type could even
> represent a “logical” character. It would seem that operations on the string
> need to be done at the code point level — exactly the behaviour for the same
> operation on an array of characters._

Err, what? Indexing/slicing at the code point level?

That's about as useful for text manipulation as a bicycle is to a fish.

Makes me think the author never or rarely ventures outside ASCII.

Sounds like a bunch of rationalizations for providing half-arsed Unicode
support for his language.
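To make the objection concrete, here is a minimal sketch in Python 3, whose `str` type indexes by code point. The sample text is my own choice, not something from the article: an "é" written as a base letter followed by a combining acute accent.

```python
import unicodedata

# "é" written as a base letter plus U+0301 COMBINING ACUTE ACCENT:
# one character to the reader, two code points to the language.
decomposed = "e\u0301"

# The same text normalized to a single precomposed code point (U+00E9).
composed = unicodedata.normalize("NFC", decomposed)

# Length and indexing both operate on code points.
decomposed_length = len(decomposed)
composed_length = len(composed)

# Indexing at the code point level splits the accent off its base letter.
first_code_point = decomposed[0]
```

The two strings look identical when displayed, yet they differ in length, and indexing the first one returns a bare "e" with the accent left behind.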

------
bjourne
A string type that is not a sequence of characters can be allowed to be more
efficient by being lazy. Say you are constructing a string like this (example
in Python, but the principle holds no matter what language):

    
    
        s = ""
        for x in range(100):
            s += str(x)
        print(s)
    

If s is a sequence of characters, then for each iteration of the loop s needs
to be reallocated to encompass the appended str(x) characters.

If s is an abstract string, then the += operation can be implemented as just
adding a reference to the string to eventually be appended. Then when the
print operation occurs, the whole string can be materialized in one go so that
it can be printed.

Another, more philosophical argument against treating a string as a sequence
of characters is that a string is no more a sequence of characters than it is
a sequence of symbols, a sequence of words or a sequence of sentences.

~~~
serichsen

        s = []
        for x in range(100):
            s.append(x)
        print(s)
    

In other words, this is something that holds just the same for vectors of any
type.

The "philosophical" "argument" is really not relevant, since the difference
between characters and "symbols" seems to be just what you want (whatever that
is), while words and sentences can be represented by strings themselves. This
is just a tangent that adds nothing to the question of how to represent a
string.

~~~
bjourne
You're missing the point. If you are appending items to a sequence, you are
better off using a linked list type, so your argument doesn't hold: you
wouldn't be using an array type in the first place. But with "strings are
character arrays" you can't choose. The next step is a proliferation of
competing text types, all because the default one wasn't defined as an ADT.

Furthermore, it isn't God-given that the right way to iterate over a text is
character by character. Word by word or line by line is equally common. It's
therefore completely arbitrary whether it's a character sequence, a line
sequence or something else.
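As a small illustration (Python 3, with a sample text made up for the purpose), the same text yields three different sequences depending on which unit you pick:

```python
text = "one two\nthree four"

# Character by character (code points, in Python's case).
chars = list(text)

# Word by word, splitting on any run of whitespace.
words = text.split()

# Line by line.
lines = text.splitlines()
```

None of the three views is more "the text" than the others. Which one is the default is a design decision.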

~~~
serichsen
No amount of abstraction will save you from finally having to determine how to
represent the end result. That is totally orthogonal to the ability to delay
the accumulation of something to be represented that way (which is just one
strategy to achieve efficiency for this case anyway).

------
serichsen
Take a look at Common Lisp. There, string is a synonym for vector of
characters.

There are no encoding problems, either.

