jclem · on Feb 10, 2024

I grew up in the Midwest, and my father, who went to the University of Kentucky, was fraternity brothers with Ben Gish, and so growing up it was a name that I heard often when he’d tell stories from that time. I remember always being aware that a friend of my father had been framed for burning their family newspaper.

I live in New York City now, and one evening found myself sitting at a bar in Brooklyn where my roommate, the bartender, was closing around 3am (at the time I worked a very late shift at the 24-hour Apple Store). The normal patrons had all gone, and my roommate, another man I’d never met, and I were locked in having a last couple of beers while my roommate closed up. I had a long conversation with that man, which eventually turned to where we grew up. It turned out that he was Ray Gish (also mentioned in this article), the younger brother of Ben. Ray even recognized my father’s name, because Ben had mentioned him before.

Ray invited my roommate (whom he already knew well) and me over to his place for a last drink, and I vividly remember him playing “Street Hassle” by Lou Reed on his record player. It was the first time I’d ever heard that song.

Anyway—wildest coincidence that has ever happened to me. I see and say howdy to Ray every now and then when I find myself in that part of Brooklyn—one of the most genuine people I’ve ever met.

jclem · on Aug 3, 2023

You can reduce with list comprehensions in Elixir, but it's uncommon to see:

    sum_list = fn list ->
      for item <- list,
          reduce: 0,
          do: (sum -> sum + item)
    end

    sum_list.([1,2,3])

Other comments are correct that `Enum.reduce/2` is probably better:

    Enum.reduce(list, &(&1 + &2))

Obviously you don't have the flexibility / mutability you refer to in other comments with either of these, but you can always put more in your accumulator:

    get_sum_and_update_count = fn list, map ->
      for item <- list,
          reduce: {0, map},
          do: ({sum, map} -> {sum + item, Map.update(map, item, 1, &(&1 + 1))})
    end

(A contrived function that returns a list sum and an updated map with the count of the values found in the list)

jclem · on Aug 3, 2023

It seems worth noting that the input of gte-small is limited to 512 tokens. The tokenizers for sure aren’t the same, but I imagine this is significantly less than ada-002, whose input is limited to 8191 tokens. That said, I don’t imagine that embedding huge full documents is necessarily the right approach.

I would love to see a comparison for some typical use cases using various methods of chunking input documents.

egorr · on Aug 3, 2023

It should be, yes, but even in OpenAI's own examples they usually split documents into chunks.

For example, in the chatgpt-retrieval-plugin[0] repo the default chunk size is just 200 tokens.

It is still a limitation, no doubt, but chunking is used pretty often.

[0] https://github.com/openai/chatgpt-retrieval-plugin/blob/main...
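
For anyone who hasn't done this before, here is a minimal sketch of fixed-size chunking in Python. It is only an illustration under stated assumptions: whitespace splitting stands in for a real tokenizer (real token counts will differ), `chunk_tokens` is a hypothetical helper and not code from the plugin repo, and the 200-token default just mirrors the number mentioned above.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_size: int = 200, overlap: int = 0) -> List[List[str]]:
    """Split a token list into chunks of at most `chunk_size` tokens.

    `overlap` tokens are repeated at the start of each following chunk.
    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks: List[List[str]] = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once this chunk reaches the end, so no redundant tail chunk is emitted.
        if start + chunk_size >= len(tokens):
            break
        start += step
    return chunks


# Assumption: whitespace-separated words stand in for model tokens.
document = " ".join(f"w{i}" for i in range(450))
chunks = chunk_tokens(document.split(), chunk_size=200)
```

Each chunk would then be embedded separately. A small overlap between neighboring chunks is a common way to avoid losing context right at a chunk boundary.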

jclem · on March 28, 2023

How are you stuffing long text into GPT context? Are you doing some form of summarization?

jclem · on Feb 26, 2023

This package is using GPT-3 via the “ChatGPT” export from that module, which is—somewhat misleadingly—not ChatGPT, but GPT-3.

survirtual · on Feb 26, 2023

That package recommends using a ChatGPT proxy. This proxy has the ability to access ChatGPT in a way that OpenAI hasn’t been able to stop, but it requires a configuration file that is not open source.

Everyone using this proxy needs to provide an OpenAI ChatGPT access token to the server. Let me break this down:

Using the ChatGPT npm package gives an opaque third party access to your ChatGPT credentials, which is exactly what a botnet or social media manipulation operation would need for a convincing bot. They just have to distribute load among all the active access tokens they’ve collected from users.

DO NOT use this library.

DO NOT trust code from authors who either don’t see this obvious vector or are in on it.

To recommend using an opaque third party proxy with no encryption is not acceptable. This lets someone peep into your conversations with the bot on top of the other malicious uses with credential hijacking. And while OpenAI is peeping as well, they are at least using the data to advance AI and most researchers have a deep relationship with the ethics of their field.

Here is the repo in question: https://github.com/transitive-bullshit/chatgpt-api

jjuliano · on Feb 27, 2023

You are right. However, nothing is really secure. Email still operates on a store-and-forward model, where your message jumps from server to server (akin to UUCP in the 1970s). Even SMTP is not secure in itself without authentication layers.

Also, HTTPS is still sent as plain text. The cert authority itself doesn't have the keys to decode the text; it is just an authority to show the plain text, but all along, it was plain text.

survirtual · on Feb 27, 2023

HTTPS is not plain text. Only the hostname is visible, through the initial DNS resolution (www.google.com) and, in most deployments, the TLS SNI field. Everything after that is encrypted: path, headers, payload, etc.

The cert authority simply signs a cert saying “this public key belongs to and is controlled by the owner of this domain name”. Since we both trust the cert authority, that signature allows us to prevent mitm attacks.

From there, we can do a Diffie-Hellman key exchange and derive our secret key for encryption / decryption.

That is secure and is the backbone of the internet today. It allows all of us to send messages to an intended recipient without worrying about other parties prying into our business.

A proxy introduces an unnecessary and unvetted third party into an exchange. There is significant financial and political motivation for hijacking sessions for higher access to the chatbot & future versions of it. It is not a good pattern to make a habit of.
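
To make the key exchange step concrete, here is a toy Diffie-Hellman sketch in Python. The numbers are tiny textbook values chosen for readability and the function names are made up for this example. Real TLS uses much larger groups or elliptic curves and authenticates the exchange with the certificate.

```python
# Toy Diffie-Hellman with textbook-sized numbers (NOT secure; illustration only).

P = 23  # public prime modulus (assumed toy value)
G = 5   # public generator (assumed toy value)


def public_value(private_key: int, g: int = G, p: int = P) -> int:
    """Value a party sends over the wire: g^private_key mod p."""
    return pow(g, private_key, p)


def shared_secret(their_public: int, my_private: int, p: int = P) -> int:
    """Secret each side derives locally: their_public^my_private mod p."""
    return pow(their_public, my_private, p)


# Private keys never leave their owner; only the public values are exchanged.
client_private, server_private = 6, 15
client_public = public_value(client_private)
server_public = public_value(server_private)

# Both sides arrive at the same secret without ever transmitting it.
client_secret = shared_secret(server_public, client_private)
server_secret = shared_secret(client_public, server_private)
```

An eavesdropper sees only `P`, `G`, and the two public values. A proxy that terminates the connection is in a different position: it is an endpoint of the exchange, so it holds the derived key.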

jjuliano · on Feb 28, 2023

I am speaking from professional experience, but I am not an expert.

I worked for a cybersecurity company for just 3 years. It was a short tenure, so my views are plausible.

I have designed MITM boxes for Wi-Fi and HTTPS (for capturing/understanding botnets in honeypots), so I've seen how plain-text HTTPS is. (But again, I may be wrong, as I am speaking from experience.)

survirtual · on March 3, 2023

Maybe you’re talking about some of the headers? Idk.

It doesn’t matter in any case, as OpenAI released the official ChatGPT API, so the original post is irrelevant. That package will transition to the official API and should be usable.

jjuliano · on Feb 27, 2023

While there is currently a waiting list to use the official ChatGPT API, the package uses an unofficial ChatGPT API. Surprisingly, the unofficial libraries are much more stable (fewer dropped requests or timeout issues) than the official libraries from OpenAI.

simse · on Feb 26, 2023

Ahh you're right. I've been fooled.

jclem · on Dec 28, 2022

Maybe you mean Vanderbilt Ave? Vanderbilt St has no commercial zoning.

What many people don't know is that 3 of the pricier restaurants on Vanderbilt Ave are all actually the same restaurant group. Their background is partially the 3-Michelin star Alinea in Chicago via chef Greg Baxtrom. Olmsted was first, then came Maison Yaki, and then Patti Ann's.

Olmsted is mostly very good (started great, then got bad during part of the pandemic, now is improving again), but you can still joke about being sold $8 vinegar from them for sure, haha. However, since "midwestern" cuisine is now becoming somewhat trendy, jokes about expensive port wine cheese balls at Patti Ann's ($12) may be more appropriate.

SkipperCat · on Dec 28, 2022

I live 2 blocks away from Zaytoons on Vanderbilt and got the name wrong. I now have deep Brooklyn shame.

Those places are really nice, but so busy. Hard to just walk in and get a table.

That street has really changed in the past 5 years. I really like the semi-new Indian place next to Bicycle Habitat. I don't know why they sell bagels and pizza, but their Indian food is pretty good.

Plus the summer streets are nice. They really give it a neighborhood vibe.

jclem · on Dec 28, 2022

Hello from slightly further south on Vanderbilt Ave :) I haven't tried that Indian place yet, but it's on my list, now. The pizza and bagels threw me off (not because "Indian == bad bagels/pizza", but because "somewhat random assortment of foods" is sometimes a red flag).

SkipperCat · on Dec 28, 2022

Not just the choice of food, but the decor inside looks like it was thrown together with spares from other restaurants. But the food is good. Just as good as Joy Indian on Flatbush. Try their chicken jalfrezi - it's delish.

GauntletWizard · on Dec 28, 2022

Was this comment generated by the same generator? :D

Gushing about the credentials and pedigree of a particular restaurant and bar is also very Brooklyn

jclem · on Dec 28, 2022

First, let's replace "gushing" with "discussing", because that's what was happening. I'm not sure "got bad during part of the pandemic, now is improving again" counts as gushing.

In what way is caring about your local food options in your immediate neighborhood, on the street where you live, where your friends and acquaintances eat, drink, and work "very Brooklyn"? I guess if you live somewhere that restaurants just appear and have faceless owners with employees you'll never know, this isn't a thing. This is something discussed by really anyone who is interested in food and restaurants in their area, all over the world. If you like what a person did at one restaurant, you may like what they do with a new one. Or, maybe it's noteworthy when only the first one is great, and the subsequent ones are mediocre.

bombcar · on Dec 28, 2022

Has the Juicy Lucy migrated east yet?

jclem · on Dec 20, 2022

It's $50/year.

benhurmarcel · on Dec 21, 2022

Of course, my bad, I went too fast.

jclem · on Dec 10, 2022

An LWW register is most certainly a CRDT, it just has limited use. I think you’re making a good point about how one must be careful applying CRDTs, but I don’t think it’s accurate to say that a suboptimal merge strategy in terms of user experience means that something does not meet the definition of a CRDT.

In other words, the “conflict-free” part of the term refers to the fact that all replicas converge on the same value when all updates have been exchanged. What you’re discussing seems to have more to do with intention preservation when using CRDTs. An LWW register on a text field is still a CRDT, but in many applications it is potentially a poor choice for preserving the intent of user-generated operations, and that has to be determined on a case-by-case basis.

If the content is a long document which users collaborate on over time, then a register is most certainly a poor choice. If it is a short text field akin to a label, an LWW register isn’t necessarily bad, because user intent is likely to replace the entire value, rather than to perform minute edits on the value.
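
To make the distinction concrete, here is a minimal LWW register sketch in Python. The names (`LWWRegister`, `merge`) and the tie-break on writer id are assumptions for illustration, not any particular library's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    value: str
    timestamp: int  # logical clock of the write (assumed comparable across replicas)
    writer: str     # id of the replica that made the write; breaks timestamp ties


def merge(a: LWWRegister, b: LWWRegister) -> LWWRegister:
    """Keep the write with the highest (timestamp, writer) pair."""
    return max(a, b, key=lambda r: (r.timestamp, r.writer))


# Two replicas edit the same short label concurrently.
alice = LWWRegister("Quarterly report", timestamp=1, writer="alice")
bob = LWWRegister("Q3 report (draft)", timestamp=2, writer="bob")

# Whichever order the updates arrive in, both replicas end up with the same state.
merged_on_alice = merge(alice, bob)
merged_on_bob = merge(bob, alice)
```

The merge is commutative, associative, and idempotent, so replicas converge regardless of delivery order, and that is what makes it a CRDT. Alice's edit is dropped entirely, though, which is the intention-preservation trade-off: acceptable for a label, painful for a long document.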

danbruc · on Dec 10, 2022

I just reread the Wikipedia article on CRDTs and was actually wrong. I only had a subset of CRDTs in mind but they actually encompass a much wider class than I thought. In some way I now understand the fuss about them even less, they now look almost trivial, but maybe I am still not understanding some important aspect properly. Should maybe read the paper introducing them. But at the very least I should not complain about people calling things CRDTs that actually fall under the actual definition.

kiwicopple · on Dec 10, 2022

> But at the very least I should not complain about people calling things CRDTs that actually fall under the actual definition

that's... a very reasonable response. Nice one.

I think the fuss comes from the more esoteric CRDTs, which solve actually hard problems (traditionally solved by Operational Transforms). But I don't think I could have created a simple example in the blog post for one of these.

jclem · on Oct 20, 2022

It’s hard to get a sense from the site or from the actual idea pages what the goal is. Removing the BQE (https://transformyour.city/vision/new-york-city/the-bqe ) would affect millions of people and businesses, and simply saying “transform it into a linear park” comes across as incredibly naïve. I would feel foolish signing this petition because…what’s the actual plan?

Sure, maybe there is some way to do it with a multi-decade effort, an enormous cost, and a fundamental shift in NYC’s layout and even economy, but something about how MASSIVELY oversimplified the statement “Remove the BQE” is makes it hard to take this seriously.

I applaud the sentiment, though! And I hate the BQE.

Edit: Also just to be even clearer for anyone not familiar with NYC geography: the BQE is a stretch of interstate highway 278 that connects Brooklyn and Queens.

This isn’t the High Line, built from a disused rail line. It’s proposing removing a stretch of interstate and putting a park in its place.

Edit 2: Ok, apparently the goal is to gather signatures and then show elected officials. The minimum bar appears to be 100 signatures. I have a hard time understanding what outcome is expected when an elected official in NYC is shown a petition in which 100 people say we should rip out part of the interstate (and one of NYC’s most important stretches of road) and put in a park.

tootie · on Oct 20, 2022

Even if we got 8 million signatures, they couldn't move the BQE. Closing streets is like banning alcohol or abortions. If you choke off supply it just makes the price go up. The solution is to throttle _demand_. Like making our subway stations less dank. Curing America's car addiction is not going to be easy.

addicted · on Oct 21, 2022

Yeah, but the BQE isn’t in random place America. It’s in a city which has less than 30% car ownership.

You could ban all private cars from the roads tomorrow and people in NYC would be getting around much faster (including those who currently drive) if you did nothing else but increase bus service and frequency, since the buses would no longer be logjammed by highly inefficient cars.

quickthrower2 · on Oct 23, 2022

Silly question from afar: could you tunnelify the BQE (no digging, just cover it!) and drop a park on top? Best of both worlds.

Seems silly to rip up roads when self driving electric buses could be running down them (or at least get their own priority lane).

clairity · on Oct 20, 2022

put the BQE underground and you get the best of both worlds. scrub all the pollution (not CO₂) out of the exhaust and it's environmentally positive too. sure, it's expensive, but we're the richest country in the world. boston did it and everyone loves the result now despite the grousing about cost and inconvenience during construction.

addicted · on Oct 21, 2022

If you’re digging an underground tunnel in NYC, throwing low-throughput roads in it is the worst idea possible.

There is absolutely no reason to not put a subway track, which would have an order of magnitude higher capacity, in an underground tunnel instead.

clairity · on Oct 21, 2022

you could also put in a subway track down the middle while you're at it, but the primary point of undergrounding the road is so we get our most precious above ground space back for people, not cars. it's not simply about optimizing throughput, though that's also a worthy, and compatible, goal.

anamexis · on Oct 20, 2022

Boston's Big Dig was a 1.5 mile (2.4 km) tunnel under the harbor, that cost 22 billion dollars and took 15 years to build.

The BQE is 11.6 miles (18.7 km) long, and runs through Brooklyn and Queens.

zachkatz · on Oct 20, 2022

So remove it. Way simpler.

sedan_baklazhan · on Oct 21, 2022

This is not a "plan". What are the proposals for transit in place of the BQE? Where will all the traffic from the BQE go? To local streets?

clairity · on Oct 21, 2022

you're right, it's an online discussion. the traffic from the BQE already goes to local streets. the point is to give the above ground space back to people, and perhaps clean the air while we're at it.

sedan_baklazhan · on Oct 22, 2022

Currently the traffic enters the BQE on one local street and most likely exits it far away. If the BQE is removed, the traffic will go through a lot of local streets, increasing noise and air pollution in residential areas.

clairity · on Oct 22, 2022

not remove, but rather replace. put the BQE underground.

sedan_baklazhan · on Oct 22, 2022

This is VERY expensive. A lot of new subway lines could be built at the same cost.

clairity · on Oct 22, 2022

yes, cost was mentioned in my original post.

MisterTea · on Oct 20, 2022

There is a ton of underground infrastructure this idea would run afoul of.

clairity · on Oct 20, 2022

pretty sure the contractors and engineers would consider that before they started digging. do you think boston didn't have underground infrastructure already?

jclem · on Sept 16, 2022

While “Arcadia” is likely of greatest interest to HN readers of all of Tom Stoppard’s plays, I also highly recommend “Jumpers” and “The Hard Problem.”

Also worth noting the many excellent films he wrote, like “Brazil” and “Empire of the Sun.”

ska · on Sept 16, 2022

Not to mention Rosencrantz and Guildenstern Are Dead, which I guess is both play and film.

blipvert · on Sept 16, 2022

Happy to be corrected, but screenwriter for EotS, surely, as JG Ballard wrote the book?