
Curious, how long have you been running your service with them? A few months? Years?


This was the first year I've been with them, and pre-paid for a year of dedicated servers.


What was that bot MSFT stood up on Twitter that trolls and memers fed until it turned alt-right? I know they eventually took it down and that it stirred up a lot of controversy.

I would not be surprised if someone found some Copilot output stemming from "gender" and reported it to MSFT/GitHub, leading them to simply short-circuit or "break" Copilot after finding certain keywords.


Yeah, they probably found something like: assert gender in ["male", "female"]. If this is enough to trigger a backlash, then maybe we deserve whatever fate has in store for us.


But "we" and "the backlashers" are not one group.



Tay AI


I do remember hooking up Heroku to Github for auto-deployments and thinking to myself something along the lines of, "why does Heroku need ALL of this access?"

It'd be great if Github could allow read/write permission grants on a per-repo basis. Maybe they do already! In which case, I'd much rather set up that granular access than have a token that spans all my public/private repos...

Edit: I do see on my Github integrations page that the Heroku connection was used within the past week... but it doesn't show exactly how it was used. Until Github can provide specific details, is it safe to assume that all repos, public and private, could have been cloned?


> It'd be great if Github could allow read/write permission grants on a per-repo basis. Maybe they do already!

They totally do. Shopify's Github integration works this way, and it is fantastic!


These are known as GitHub apps and were designed to address this problem. https://docs.github.com/en/developers/apps/getting-started-w...


Thanks, and I'm not surprised; it's a pretty intuitive feature! So really the only thing Heroku gets from these all-inclusive tokens is something to drive their type-ahead input on their integration page, right? Totally not worth it, I'd rather use Github's prompt.


I feel like a graph database is a solution to an issue I've faced (and continue to face), and it may just be that I haven't spun one up and tried, or that the documentation/examples don't stick out. Could someone confirm my feeling? If it's correct, I'd enjoy verifying it with EdgeDB or the like.

My example/requirement: I have a user wanting to find best-matching blog posts. Every post is tagged with a given category. There could be 100+ categories in the blog system, and a blog post could be tagged with any number of these system categories. A user wants to see all posts tagged with "angular", "nestjs", "cypress" and "nx". The resulting list should be sorted from the best matches down to those of least relevance. So posts that include all four tags should be up top, and as the user browses down the results, there are posts with fewer matching tags.

What I've seen with SQL looks expensive, especially if you search with more and more tags. I may just not know what to search for, though, re: SQL. Is there a query against a graph database that could accomplish this?


EdgeDB employee here. I couldn't have asked for a better question to demonstrate the power of subqueries! Here's how I'd do this in EdgeQL:

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  order by .match_count desc;

That would give you a result like this:

  [
    {
      title: 'All the frameworks!',
      tag_names: ['angular', 'nestjs', 'cypress', 'nx'],
      match_count: 4,
    },
    {
      title: 'Nest + Cypress',
      tag_names: ['nestjs', 'cypress'],
      match_count: 2,
    },
    {
      title: 'NX is cool',
      tag_names: ['nx'],
      match_count: 1,
    },
  ];


Do I understand this correctly: if the list goes on, it will also show posts with unrelated tags in tag_names, since the filter is only applied to match_count?

So to only get blog posts with matching tags, we would need to add a filter "match_count > 0", right?

Update: I am very excited about EdgeDB :)


Yep, you understand correctly!

  with tag_names := {"angular", "nestjs", "cypress", "nx"},
  select BlogPost {
    title,
    tag_names := .tags.name,
    match_count := count((select .tags filter .name in tag_names))
  }
  filter .match_count > 0
  order by .match_count desc;


So as I read TheSpiciestDev's comment, he's complaining that making his query in PostgreSQL is slow. It looks like EdgeDB is a frontend to PostgreSQL; how will it help with TheSpiciestDev's problem?


The problem sounds like something that could be solved with a GIST index. EdgeDB doesn't yet have a way to specify the index type, though, mostly because we aren't sure what would be the best way to do it without things becoming too Postgres-specific in schemas.
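
For reference, here's a rough sketch of what such an index looks like in plain Postgres today (not EdgeDB syntax; the posts table, tags column, and index name below are purely illustrative, assuming tags are stored as a text[] column):

  -- GIN is the usual choice for array membership; GiST suits other
  -- operator classes.
  create index posts_tags_idx on posts using gin (tags);

  -- The index can then accelerate overlap queries like:
  select id, title
  from posts
  where tags && array['angular', 'nestjs', 'cypress', 'nx'];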


What do you mean by "too Postgres-specific"? Will you be supporting other DBs behind the EdgeDB interface in the future?


This is not something we plan to do in the near future, but it’s also not outside the realm of possibility. We picked Postgres because of its power, quality and unparalleled extensibility, but we are also very careful to not leak any implementation details into our interfaces.


I'm curious how this squares up with what someone linked elsewhere: https://github.com/edgedb/edgedb/discussions/3403

> EdgeDB does not treat Postgres as a simple standard SQL store. The opposite is true. To realize the full potential of the graph-relational model and EdgeQL efficiently, we must squeeze every last bit of functionality out of PostgreSQL's implementation of SQL and its schema.

I don't see how this and what you're saying can both be true at the same time. Is EdgeDB tightly coupled to the implementation of PostgreSQL, or isn't it? Is there really a chance that EdgeDB could support other databases, or not really? I don't think there's anything wrong with the answers being "yes" and "no", respectively; that's actually what I'd expect. It would be more unusual to try to do this in an implementation-agnostic way.


I think what they mean by this is that EdgeDB's query language should not be coupled to Postgres, but EdgeDB itself should use Postgres specific SQL features to maximise performance - so you couldn't drop in MariaDB without changing code in EdgeDB, but in theory you could write another backend that takes the same queries and uses MariaDB or MongoDB or something custom under the hood.


Exactly, implementation coupling vs interface coupling.


I'm feeling very disappointed to see that you're using two space-separated words for "order by" in your language. Do "order" and "by" have meanings on their own, such that the meaning of "order by" arises naturally via their composition? If not, then surely it should be "orderby" in your language.

SQL definitely must be replaced, but the silly pseudo English syntax is one of the things we want to get rid of, not retain.


Do you have any timeline on when I can use this with Rust?

Found this: https://github.com/edgedb/edgedb-rust


  select posts.name as post, count(post_tags.id) as matches
  from posts, post_tags
  where post_tags.post = posts.id
    and post_tags.tag in ('angular', 'nestjs', 'cypress', 'nx')
  group by post_tags.post
  order by matches desc;

Test data @ http://pratyeka.org/hn.sqlite3


Is it though? Simple IN with an ORDER BY on the same match will return the correct ranking. More info on ranking here https://www.postgresql.org/docs/current/textsearch-controls....
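
For what those docs describe, a rough sketch (assuming, purely for illustration, that each post's tags are folded into a tsvector column named tag_vector on a posts table):

  select title, ts_rank(tag_vector, query) as rank
  from posts, to_tsquery('angular | nestjs | cypress | nx') as query
  where tag_vector @@ query
  order by rank desc;

Note that ts_rank orders by full-text relevance rather than an exact tag count; the GROUP BY/count approach elsewhere in the thread gives the literal number of matching tags.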


Am I right in reading this as the parent comment envisioning a "post" table and a "tag" table, and you're suggesting the "post" table just have a "tag" column?


I see just one table

> Every post is tagged with a given category. There could be 100+ categories in the blog system and a blog post could be tagged with any number of these system categories.

My point is, I don’t see the SQL query as expensive for this kind of use case. There are easy and native ways to do it.

In case you would like top-notch performance, Redis might be a way to do it. Even a reverse index would achieve great performance.


That's a standard many-to-many relationship that would normally be implemented by three tables:

    +---------+-------+---------+-------+-----------+
    | post_id | title | content | other | fields... |

    +--------+------+
    | tag_id | name |

    +------------+---------+--------+
    | tagging_id | post_id | tag_id |

But it seems like the core of the request is still something like:

    SELECT post_id, count(1) AS count
    FROM taggings
    WHERE tag_id IN (3, 8, 255)
    GROUP BY post_id
    ORDER BY count DESC
(off the top of my head; I haven't checked this for any kind of correctness)

And I don't see why that query suffers as you add tags...?

------------

EDIT responding to below [HN believes I am a problem user who should only be allowed to make so many comments per day]:

> that is pretty much what I meant by “I see just one table” as you don’t need any joins

Well, assuming you're doing this because a user is interacting with your site via some kind of web interface, you can set the interface up to deliver you tag_id values directly, but you'll still need to do a join with the posts table so you can present a list of posts back to the user instead of a list of internal post_id values.

So I guess

    SELECT t.post_id, count(1) AS count, p.title, p.url
    FROM taggings t JOIN posts p ON t.post_id = p.post_id
    ...


Confusing, but that is pretty much what I meant by “I see just one table”, as you don’t need any joins (at least with the same design you outline).


I do these often with standard GraphQL queries, usually over Postgres. Now I’m curious about the performance difference compared to an SQL ORDER BY or a similar EdgeDB implementation!


Thanks for posting this. This is the kind of comment that adds value to the discussion by illustrating how a piece of tech can or cannot be useful.

I just happen to have a very similar requirement to yours and was also wondering.


Maybe I misunderstand what you are saying, but it sounds pretty straightforward in SQL.


I'm currently investigating whether Redis Bloom [1] could be a good tool for a similar requirement.

[1] https://github.com/RedisBloom/RedisBloom


But then what would happen if the GitHub token leaks? Would someone then be able to retrieve their own credentials as if they were your CI/CD pipeline? I feel like it'd be hard to audit that, because a baddie would then be able to blend in with your CI/CD pipeline's traffic.

But you say you find "management of AWS Credentials a pain", so I guess this isn't for security purposes, right? More of just a convenience?

Don't get me wrong, I'm all about reducing the number of environment variables in a pipeline! Especially ones that you want to rotate!


The GitHub token that is used is short-lived and is generated fresh every time a GitHub Action runs.

Ref: https://docs.github.com/en/actions/security-guides/automatic...

And the SAML.to backend first checks that the token is valid by invoking:

Ref: https://docs.github.com/en/rest/reference/apps#list-reposito...

I haven't checked, but I assume GitHub invalidates the token when the GitHub Action finishes.


Working from home for the past 2 years has allowed me to help a lot with my own growing family. I still put in my normal work (if anything, I do more now because of my home setup, being able to jump in and out of work at whatever hour). My kids have certainly grown accustomed to the setup, too. We've got good boundaries, and I'm able to see and take part in a lot of their "firsts", all while continuing to do good work.


Note: my income has never been dependent on local/small clients (small being <10 employees); such work has always been "extra income" for me, or a way to build relationships with others.

I've fixed bugs for free (if I misinterpreted requirements or made a mistake), but otherwise I would offer change orders for anything outside the agreed scope. Depending on the changes or how they fit into the existing work, these would possibly be discounted or free, especially if they are made early enough.

That said, I've been solid and explicit about the agreed scope. This goes for every project or org/client, don't get me wrong, but I've found this is much more important with smaller orgs/clients.

Otherwise there are a lot of great suggestions already made elsewhere in these threads. I haven't really ever had issues getting paid. I believe having a clear scope and asking clarifying questions also builds a better relationship (which I think contributes to better engagements, reduces other headaches, and leads to referrals!)


I'm reminded of the scene in The Matrix where the operators of Zion's gates are inside a simulation managing the process. [0] Instead of managing things outside the matrix via a terminal/monitor, they're given a virtual environment with their controls. Your comment makes sense to me!

[0] https://www.youtube.com/watch?v=fJDNUcIvxH8


Over the past few years I've searched for something like this, here and there, thinking I wasn't using the right search terms and then giving up. I'll certainly give this a try!

Out of curiosity, are there any other comparable projects you know of? I'm wondering what I should have searched for in the past, hah.


The only project I know of is the Close.io fork of the open-source Nylas Engine (the original project was discontinued years ago): https://github.com/closeio/sync-engine


https://gandi.net has always offered free WHOIS protection, at least for those domains that support it; if you look up any of my domains with them, it's all pretty much hidden. I think some TLDs require the owner's actual info rather than the registrar's?

