Different people have different ideas of serious. To me, exploratory programming is fairly serious, because that's the kind of programming that generates ideas.
Arc is already capable of supporting some subset of applications that are serious in your sense. News.YC is at least moderately serious in that sense.
You said News.YC uses some kind of persistent hash structure for storing everything. This seems to me like Greenspun's Tenth Rule, except with Berkeley DB instead of Common Lisp; I wouldn't want to write the logic to do what BDB already does much better and faster (I don't want to implement ACID transactions myself if I decide I need them).
Clarification: Arc should probably have some kind of database hook at some point; there's not much point in reinventing that wheel when it works so well for many problems. But that's beside the point, as the author has repeatedly pointed out.
There is a public git repo anyone can push to (the so-called git "wiki", or anarki). It already includes regular expressions. You could wrap some DB bindings for MzScheme with relative ease.
I think the complaints were more about pg originally saying that he intended never to support Unicode. That said, people should realize that UTF-8 encoding/decoding is the zeroth step toward internationalization with Unicode.
"It's not for everyone. In fact, Arc embodies just about every form of political incorrectness possible in a programming language. It doesn't have strong typing, or even type declarations; it uses overlays on hash tables instead of conventional objects; its macros are unhygienic; it doesn't distinguish between falsity and the empty list, or between form and content in web pages; it doesn't have modules or any predefined form of encapsulation except closures; it doesn't support any character sets except ascii. Such things may have their uses, but there's also a place for a language that skips them, just as there is a place in architecture for markers as well as laser printers." (http://arclanguage.com/)
I certainly interpreted this as saying that ASCII-only (along with presentational markup) was an explicit design decision, and I don't think that's a far-fetched interpretation. Luckily PG has clarified (http://news.ycombinator.com/item?id=111189) that this is not really what he meant, and now everyone is happy and has regained trust in Arc!
- a comment suggesting maybe the first one shouldn't have been upmodded so much
- a comment backing off
Why would you downmod all of them? If you dislike the first comment, you should like the second, and vice versa. And if you're unhappy with me in general for the first two, you should like the third one.
1) How do you know the same person downmodded the three?
2) The simplest explanation for that case would be that he's downmodding the latter two as noise. You can agree with a comment and still believe it doesn't add anything to the topic in which it appears.
You know, that would actually be an amusing hardware-hacking project.
If you made a keyboard that had every character in every language spoken in the EU, you could even file to make it a standard with whatever earnest standards body is in charge of such things. No linguistic minority should have to use control keys! It would be like giving peanut butter to a dog.
Yeah, I confused that with ISO 10646 (which has three implementation levels specifying Unicode, and not all levels support the same characters) and with UCS-2, which is essentially UTF-16. So what I meant was that if he used any characters encoded in UTF-16 he'd essentially have a problem, although UTF-8 doesn't allow certain octets either.
UTF-8/16 were not yet part of the standard before UCS version 2.0. It's been a while, I guess, since I got updated.
Do I understand correctly that Arc strings are sequences of octets?
If so: I really don't want to be a negativity guy but it seems like every language that has made an 8-bit string the default string type has regretted it later because it is so painful to change it without breaking code. Okay, Paul says that he won't mind breaking code. Maybe he means it, but it doesn't make any sense to me to knowingly and consciously repeat a design mistake that dozens of other people have made and regretted.
It really just takes one day to get this right. You need to distinguish between the raw bytes read from a device and the true string type (which needs to be 21 bits or wider). You need a trivial converter from one to the other (which you can presumably steal from MzScheme) and back.
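For instance, here's a minimal sketch of that split in Haskell, using the standard bytestring and text packages (the function names are just mine, for illustration):

    import qualified Data.ByteString as B
    import qualified Data.Text as T
    import Data.Text.Encoding (decodeUtf8, encodeUtf8)

    -- raw bytes as read from a device -> true string of code points
    fromDevice :: B.ByteString -> T.Text
    fromDevice = decodeUtf8

    -- code points -> bytes, only at the I/O boundary
    toDevice :: T.Text -> B.ByteString
    toDevice = encodeUtf8

Everything in between works on the code-point type; encodings only show up at the edges.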
That's it. You get this right at the beginning and you never have to backtrack or break code.
My apologies in advance if this post is based on incorrect premises. I'm trying to help.
So should I infer that the only reason UTF-8 is mentioned is that the reader APIs do not let you select the codec? Or is even that provided in which case it is accurate to say that Arc supports Unicode-in-general?
Arc uses MzScheme's reader (it modifies the readtable slightly to support []-syntax). AFAIK you cannot access the reader API from inside Arc. The reason UTF-8 is mentioned is that it is the default encoding when MzScheme reads or writes files or streams.
I don't think anyone at this point would claim that Arc supports Unicode-in-general.
Could you offer a better solution? What would your solution offer that octets do not? Random character access? No, because not a single Unicode encoding offers easy random character access (characters may consist of several code points, which, in some encodings, are themselves made up of more than one basic "char"). Glyph, word, and sentence segmentation? I guess not.
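For what it's worth, here's that multi-code-point wrinkle in Haskell notation, where a String is a list of code points:

    length "\x00E9"     -- 1: precomposed é
    length "e\x0301"    -- 2: e plus a combining acute, yet the same glyph on screen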
Abstraction. Having Unicode strings (i.e. strings that are a sequence of Unicode code points rather than octets) allows you to work with strings without worrying about encoding (except when doing IO, which is where encoding matters).
If you treat strings as octets, OTOH, even simple operations like concatenating two strings can lead to headaches if the strings are in two different encodings. And how do you keep track of the encodings of individual strings? Madness lies down that road.
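A quick Haskell-flavoured sketch of the difference (latin1Bytes and utf8Bytes stand for hypothetical strings read from two different sources):

    import Data.Text (Text)
    import Data.Text.Encoding (decodeLatin1, decodeUtf8)
    import qualified Data.ByteString as B

    -- raw octets carry no encoding information; appending them just
    -- splices two encodings together and yields mojibake
    concatBroken :: B.ByteString -> B.ByteString -> B.ByteString
    concatBroken latin1Bytes utf8Bytes = latin1Bytes <> utf8Bytes

    -- decode first, concatenate code points, re-encode only on output
    concatFine :: B.ByteString -> B.ByteString -> Text
    concatFine latin1Bytes utf8Bytes =
      decodeLatin1 latin1Bytes <> decodeUtf8 utf8Bytes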
Does Icelandic have the 'th' sound? I've heard that English is the only European language with it, but if Icelandic has the written thorn, maybe you have that sound too?
It does, that's precisely what þ and ð represent (the unvoiced and voiced variants respectively, which got folded into the same "th" in English). Also, þorn is the best letter name ever :)
If you are wondering:
On Linux/X11, there's Ctrl+Shift+U followed by the Unicode number in hexadecimal, gnome-character-map, umap, or KCharMap (ت)
And now for the less serious part:
ሞሡሢ Am I the only one whom these Ethiopic characters remind of Tengwar? BTW, are there Unicode chars for Tengwar? I think there should be! (But not for Klingon, because it sucks.)
I have fun writing this on my ⌨, but ℐ∫ ᚾℍℹ⑀ not pointless? Who cares? Anyway, now we can use distinct characters for Roman numerals: Ⅰ,Ⅱ,Ⅲ,Ⅳ,Ⅴ,Ⅵ,Ⅶ,Ⅷ,Ⅹ,Ⅻ,Ⅽ,Ⅿ!
Ye darn kids! Everything we had was 7-bit ASCII, without parity, and we were damn grateful for it!
You think you had it bad? I had to use Morse code for browsing porn back in my day! And I had to etch my public key into the wall of a rotten ol' cave! We did not have this fancy-schmancy routed network; I had to remember the way from here to there all by myself!
---
this post was presented to you by Too Much Coffee.
If you search Google for "Smjörið er brætt og hveitið smátt og smátt hrært út í það, þangað til það er gengið upp í smjörið." (Icelandic for "The butter is melted and the flour is stirred into it little by little, until it has blended into the butter."), this thread is the fourth result.
किसी वस्तु, व्यक्ति, स्थान, या भावना का नाम बताने वाले शब्द को संज्ञा कहते हैं। जैसे - गोविन्द, हिमालय, वाराणसी, त्याग आदि
संज्ञा में तीन शब्द-रूप हो सकते हैं -- प्रत्यक्ष रूप, अप्रत्यक्ष रूप और संबोधन रूप ।
(Hindi: "A word that names a thing, person, place, or feeling is called a noun, e.g. Govind, the Himalayas, Varanasi, renunciation, etc. A noun can have three word forms: the direct form, the oblique form, and the vocative form.")
Well, not quite. I gave Patrick an early version of the code, a couple weeks before Arc was released, and he immediately sent me this fix. I just didn't get around to incorporating it till now.
There's a difference between things I don't care about, and things I'm actively against. I don't care about character sets and CSS, so those things will no doubt gradually get better.
Classic static typing, however, I think is actually a bad idea in a general-purpose language. It makes languages weaker. So it's never likely to happen in Arc itself. However, one of the explicit goals of Arc is to be a good language for writing other languages on top of, and I can imagine plenty of languages for specific types of problems (e.g. circuit design) in which static typing would be a good idea.
It's not true that static typing always makes languages weaker. It makes map more powerful, for example: the desired type of output sequence can be inferred rather than having to supply a first argument of the same type like in Arc.
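A small Haskell illustration of that point (just a sketch):

    -- the element type of the result is inferred from how it is used;
    -- no result-type argument to map is needed
    xs :: [Double]
    xs = map fromIntegral ([1, 2, 3] :: [Int])

    -- and generic maps follow the input's container automatically
    ys :: Maybe Int
    ys = fmap (* 2) (Just 4)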
I used to agree with you, by the way -- static typing in most languages feels like a straitjacket. ML wasn't enough to change my mind. It took Haskell.
Interestingly, heterogeneous lists are the only example I ever hear cited for how ML-family type systems can cramp your style. It leads me to wonder if the situation is not unlike Fibonacci sequences and naive recursion.
Anyway, I find that usually when I want a heterogeneous list in Lisp, all I really need is a tuple. I want an ad-hoc way to group some values together (i.e., I don't want to bother creating a named structure), but I generally know the type I want in each position.
In the rare situations where I really do want a heterogeneous list, Haskell does make it possible. The standard library has a Dynamic type that stores an arbitrary object along with a first-class manifest type identifier. These type identifiers have to be generated at compile time, but GHC has built-in syntax for this, and if it didn't it could still be implemented as a Template Haskell macro, or failing even that, just done by hand once for each user-defined type. That's all the support that's necessary from the core language -- the rest of the dynamic typing system is just an ordinary library.
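Roughly like this, using the standard Data.Dynamic module (a sketch; the names are mine):

    import Data.Dynamic (Dynamic, toDyn, fromDynamic)

    -- a heterogeneous list: each element carries a runtime type tag
    stuff :: [Dynamic]
    stuff = [toDyn (5 :: Int), toDyn "five", toDyn '5']

    -- casting back is explicit and can fail, hence the Maybe
    firstAsInt :: Maybe Int
    firstAsInt = fromDynamic (head stuff)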
Now, granted, if you wanted to use manifest typing for everything in Haskell, it would be ridiculously cumbersome and you'd be much better off just using a dynamic language[1]. But if you use it only where it's needed, then the dynamic casts will bloat your program by a couple symbols per thousand lines, and in return you get programs that damn near always work the first time they compile, along with a few other niceties like the one I mentioned above with map.
[1] There are plenty of cases where the converse is true. To name an obvious one, you could write a set of Lisp macros to implement lazy evaluation. But if you wanted to use them everywhere, you'd be much better off in Haskell.
So do I -- when I'm working in Lisp. Lisp gives you a Swiss army knife and lets you build specialized tools when you want them. Haskell gives you specialized tools and lets you build a Swiss army knife if you want it.
To be clear, this is what ML's variant types are all about. You can easily create a list that contains e.g. both ints and chars:
let mylist = [`Int 5; `Char '5']
Technically the elements have the same compile-time type, but the question is, what practical difference does that make? In what cases are variants an inadequate solution?
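(For comparison, roughly the same thing in Haskell with an ordinary sum type; the names here are invented:)

    data IntOrChar = I Int | C Char

    mylist :: [IntOrChar]
    mylist = [I 5, C '5']

    describe :: IntOrChar -> String
    describe (I n) = "int "  ++ show n
    describe (C c) = "char " ++ [c]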
I don't understand why CSS or HTML are being mentioned during the design of Arc. These seem like library issues and your announcement of Arc was spoiled IMHO by the "rant" about HTML and tables. This is only made worse by the Arc Challenge which seems to be more about the design of libraries for HTML/HTTP etc. than the language.
If your language doesn't support anything but toy apps it quickly evolves to be optimized for building toys.
If the first Arc apps had not been full-featured Web apps, but had instead looked like examples from SICP, everyone would be complaining that the language was only good for computing Fibonacci sequences and writing interpreters for itself.
OTOH, you can't expect a new language to immediately offer the library resources of, say, Perl.
So the plan for Arc's early days seems to be similar to what the Pragmatic Programmer guys called the "tracer bullet" approach:
Tracer code is not disposable: you write it for keeps. It contains all the error checking, structuring, documentation, and self-checking that any piece of production code has. It simply is not fully functional. However, once you have achieved an end-to-end connection among the components of your system, you can check how close to the target you are, adjusting if necessary. Once you're on target, adding functionality is easy.
On day zero, Arc let you construct and deploy every aspect of a useful software system (a web app)... but it took a very narrow and direct path to that goal: emphasis on tables, no Unicode support, borrowing some functionality from an existing Scheme environment, etc., etc. That is what PG was trying to convey in his announcement: the strategic plan for Arc's early days is to work on designing a complete skeleton, but not add a lot of flesh.
I could tell from all the people already dissing Arc before it was released that whatever I released was going to be attacked on any possible pretense. So, like someone bracing himself to be hit, that was what I was thinking about as I was about to release it: what are people going to seize upon as a way of attacking it? Which meant that was what much of the initial announcement ended up being about.
It was a pretty odd situation to be in. If I'd been releasing Arc into a neutral environment, I probably would have said what I wrote in http://paulgraham.com/core.html. But maybe it's just as well I gave all the flames something to expend themselves on before talking about subtler questions.
I thought you handled it pretty well. Basically, you wrote a big sign saying "here is the bike shed", to make sure bike-shed commenters had something to occupy them. :)
Actually, it's probably beneficial to encourage flames. Your users are hackers, and flamewars are the only form of public dialogue among hackers. Ergo...
No, it doesn't. Elements of a Java collection must all be of the same type. The elements may be implicitly coerced to a common supertype, but if you want to get the original types back you have to downcast--which is basically explicit dynamic typing.
Or you could keep elements as Objects until you needed to perform a specific operation on them, then cast at the site and perform the operation, letting the ClassCastException propagate if you're wrong. This is basically what Arc does.
Have you looked into optional static typing (e.g. ES4 style)? The default Array type in ES4 holds values of any type as well. It is a satisfying middle ground.
Thanks for the serious reply. I'm not sure I deserved it :)
I appreciate your clarification of the distinction which is not clear from the "manifesto" (http://arclanguage.com/).
I suspected you deliberately mentioned ASCII-only and presentational markup because you knew it would tick off (and hopefully scare away) a certain type of perfectionist whom you consider unproductive for exploratory hacking.
What needed to be changed? I am no character-encoding guru, but I thought that treating strings as opaque octet sequences was good enough to "support" UTF-8; i.e., unless you actively break it, it should work by default.
Apparently, while you were complaining, someone else was solving.