Same here. I was thinking maybe I'd give microphone permissions but didn't see why I had to show my video. Does the clone see my face? Maybe it does. That may creep me out more tho lol.
I'm working on an app to address this issue. While putting a cost calculation is a good way to raise awareness, we take a different approach to a solution.
It's called Meet Robbie (https://meetrobbie.com) and we give you an agenda with timers, minutes, and action items and you can use it directly in Zoom.
Its main use case is recurring meetings where you have a list of business to get through. It encourages common-sense things like having an agenda, being mindful of time, and keeping organized minutes.
We just released our so you can try it out. Would love any feedback and thoughts on our approach!
say --rate=500 "Peter Piper picked a peck of pickled peppers. A peck of pickled peppers Peter Piper picked. If Peter Piper picked a peck of pickled peppers, Where’s the peck of pickled peppers Peter Piper picked?"
I'm really not sure what Google is thinking with this one. It's not the biggest money maker, but Google Domains is viable and a really easy-to-use product.
Wondering what other core web services Google may sell off.
I actually have been using a combination of a numeric integer ID as the PK and a UUID as a lookup field for routing lookups etc., for this purpose, in a Postgres-backed app I'm working on.
I found this approach to be more trouble than it's worth and plan on switching to a UUID PK and doing away with the integer sequence.
Here are the complications I ran into:
The libraries I'm using for the ORM and API are designed to work with a primary key for single record get access. For example they want you to do `resource.get(ID)` where ID is the primary key, however, I now have to do `resource.find({ where: { uuid: 'myuuid' }})`. This is for all resources on all pages.
In Postgres, the integer PK sequence has a state that keeps track of what number it's at. In certain circumstances it can get out of sequence and this can trip up migrations.
We already have created_at and updated_at fields and these are probably better for ordering than the sequencing.
Had I to do it again, I would just use UUIDv4 until it runs into issues and either date fields or a sequence where necessary. If anyone has better ideas I would be most grateful as this is something I go back and forth on.
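To make the first complication concrete, here is a rough sketch of the two access paths. `Resource` and `Record` are made-up stand-ins, not the API of any real ORM; the point is only that `get` works off the primary key while the UUID route needs its own secondary index and its own call.

```python
import uuid
from dataclasses import dataclass
from itertools import count


@dataclass
class Record:
    id: int      # internal integer PK (used for joins)
    uuid: str    # external lookup key (used in routes)
    name: str


class Resource:
    """Toy stand-in for an ORM resource; hypothetical, not a real library."""

    def __init__(self):
        self._by_id = {}
        self._by_uuid = {}      # secondary index, like a unique index on uuid
        self._seq = count(1)    # stands in for the Postgres sequence

    def create(self, name):
        rec = Record(id=next(self._seq), uuid=str(uuid.uuid4()), name=name)
        self._by_id[rec.id] = rec
        self._by_uuid[rec.uuid] = rec
        return rec

    def get(self, pk):
        # What the libraries expect: single-record access by primary key.
        return self._by_id.get(pk)

    def find(self, where):
        # What every route ends up calling when the URL carries the UUID.
        return self._by_uuid.get(where["uuid"])


cars = Resource()
car = cars.create("hatchback")
assert cars.get(car.id) is car
assert cars.find({"uuid": car.uuid}) is car
```

With a UUID as the PK, the second index and the `find` path would go away, which is the simplification being weighed here.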
If you do not have two separate forms of identifier AND you have a "public" API (including basically any client apps, frontend JS, or anything found in query params) then you are making compliance with European regulators a massive headache when it comes to erasure of PII, since shared identifiers must be destroyed one way or another. Trying to merely delete the records is complicated by the fact that you need legal holds to comply with various federal laws.
Weighing the work involved in using two identifiers, one for joins and one for external lookups, against the work involved in manually coding up all sorts of erasure workarounds (something I was in charge of in the past), I would strongly consider just using two IDs.
If your ORM gets in the way, just modify the ORM. This is easier than you'd think. For example, in Django just make a helpers module with something like this:
    class OurModel(Model):
        def get(...):
And have some sort of programmatic way (lint, etc.) of ensuring that your models.py doesn't use the stock class. It's simpler. It ruffles some feathers at first, but if your framework is getting in the way of a real use case, just change the framework and don't worry about it.
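As one possible shape for that lint step, here is a small sketch using Python's standard `ast` module. The stock class names in `STOCK_BASES` and the `helpers` import in the sample are assumptions for illustration; the check only parses source text, so it doesn't need Django installed.

```python
import ast

# Assumed spellings of the stock base class in a models.py file.
STOCK_BASES = {"Model", "models.Model"}


def stock_model_subclasses(source):
    """Return names of classes in `source` that inherit the stock Model directly."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            bases = {ast.unparse(base) for base in node.bases}
            if bases & STOCK_BASES:
                offenders.append(node.name)
    return offenders


sample = '''
from django.db import models
from helpers import OurModel

class Invoice(OurModel):
    pass

class Expense(models.Model):
    pass
'''

print(stock_model_subclasses(sample))  # ['Expense']
```

You'd run something like this over models.py only, since the helpers module itself has to subclass the stock class.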
I think you raise some good questions around IDs and PII and we definitely will be tackling GDPR sooner or later.
I don't quite follow the European regulation issues raised by using a UUID in a route when that UUID is also the PK of the record.
I know you should not expose PII or any information that can be used to identify a person, however, in our case any route is behind an authed login on an SSL connection which encrypts the path (we don't use query params).
The only place that contains data that ties a UUID to a person is in the database. This would be the case whether we used a PK as an integer or not.
Could you elaborate or share any resources around dual IDs DB design for PII compliance? That would be super helpful.
Regarding framework hacking or workarounds, I have a principle to not go against the grain of a framework. The reason for this is that modifying/hacking adds complexity when building on top of it or onboarding other software engineers. If necessary I'll do it as a last resort.
> any route is behind an authed login on an SSL connection which encrypts the path
If your application only services a single user with their own resources then you have nothing to fear. But few applications meet this definition. If, for example, you're running an invoicing application, then at some point you'll want to share some resource, say an invoice or an expense or a time sheet, with another party. If your API exposes the identifiers from one resource to another, or even a user's id when potentially adding them to a team, then these identifiers are considered PII according to European regulators.
I understand that this is frustrating, but it comes from a posture that prioritizes right-to-be-forgotten over programmer ergonomics. Imagine, for example, API crawlers that hit your /search endpoint with email=[some predetermined list of emails] and harvest user ids to match with future data.
In the end, the best thing you can do is keep join keys internal and API keys separated. There are other workarounds, but they're so much trouble that they aren't really viable alternatives. Now, whether you use UUIDs for both identifiers or UUID for external and integer ids for join keys is up to you and your performance and scaling requirements. Personally, I prefer integer keys for internal unless I really expect the database to grow to more than 200m rows before the company hits 1000 people, since int ids mean you do not need secondary indexes on things like the created_at fields, but even there, it's not such a big deal to have an extra index on every table.
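To sketch what keeping the two apart buys you at erasure time: below, the integer `id` is the join key and `public_id` is the only thing the API would hand out. The table and column names are made up, SQLite is used only so the example runs anywhere, and this is an illustration of the idea rather than a statement of what any regulator requires.

```python
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (
        id        INTEGER PRIMARY KEY,   -- internal join key, never leaves the backend
        public_id TEXT UNIQUE,           -- the only identifier the API exposes
        email     TEXT
    );
    CREATE TABLE invoices (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total   INTEGER NOT NULL
    );
""")


def create_user(email):
    public_id = str(uuid.uuid4())
    db.execute("INSERT INTO users (public_id, email) VALUES (?, ?)", (public_id, email))
    return public_id


def erase_user(public_id):
    # Destroy the shared identifier and the PII. Rows that must be kept
    # (e.g. under a legal hold) stay joined through the internal integer key.
    db.execute(
        "UPDATE users SET public_id = NULL, email = NULL WHERE public_id = ?",
        (public_id,),
    )


pid = create_user("a@example.com")
(user_id,) = db.execute("SELECT id FROM users WHERE public_id = ?", (pid,)).fetchone()
db.execute("INSERT INTO invoices (user_id, total) VALUES (?, ?)", (user_id, 1200))

erase_user(pid)
```

After `erase_user`, nothing outside the database can resolve the old identifier, while the invoice row still joins to its (now anonymous) user row.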
> I have a principle to not go against the grain of a framework.
> hacking adds complexity
Here we essentially agree, but with the right integration tests, upgrading and onboarding is a lot easier than feared. That said, do not add to the framework unless the benefit is worth it.
I don't really find these considerations frustrating, just a bit tricky. Regardless, I definitely agree with GDPR and am on board with keeping PII secure from the get-go.
I'm still having a little trouble grokking when an ID becomes exposed or shared so I guess I'll just have to read up on this as it's certainly important.
In our system I realized user IDs are not shared nor linked to (at least not yet) so in actuality the case where there's a URL with a UUID representing a person does not occur. Content generated does not reference UUIDs for persons either. There are URLs with UUIDs representing other types of resources.
By API key I take it you mean an access key for an external reference. That's a good idea: replace the PK integer with a PK UUID but keep an external UUID field. That would satisfy the concern with maintaining integer sequences and migrating data.
Anyway, this has been helpful, so thank you for sharing your thoughts. I have some things to look into to stay in the good graces of European regulators.
That sounds like a very simplistic framework and I'm sure you could do some metaprogramming to abstract boilerplate. Like you couldn't do database multitenancy with those constraints.
I've used the int keys and UUID public keys on multiple projects - it wasn't an issue for EF core or RoR
I'm simplifying a bit for brevity and we can do some abstraction to handle it so it's not that the framework is simplistic. I'm just having trouble justifying adding this complexity with two types of IDs.
I personally find the numeric id extremely valuable for internal data analysis and sharing. I can refer to rows by the numeric id, including a range of rows, and seeing the ids gives exactly that intuitive information about a row's relation to the rest of the set that we are hiding from end users. Numeric ids can also be used for the same reason in an admin-only UI.
On the efficiency side, joining and querying by id is generally more efficient on CPU usage for querying, but you do have to pay the cost of having the additional column and index.
This was generally the reason I went with numeric ID as PK originally. It makes working with and analyzing the data as well as cross referencing relations easier.
For all my tables I have a base schema that looks something like this.
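Roughly, going by the fields mentioned earlier in the thread (numeric PK, UUID lookup column, created_at, updated_at), a base schema of that kind might be sketched as below. The column names and the `car` table are guesses for illustration, and SQLite stands in for Postgres only so the snippet runs on its own.

```python
import sqlite3
import uuid

# Columns shared by every table in this sketch.
BASE_COLUMNS = """
    id         INTEGER PRIMARY KEY,                 -- numeric PK for joins and analysis
    uuid       TEXT NOT NULL UNIQUE,                -- external lookup key
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
"""

db = sqlite3.connect(":memory:")
db.execute(f"CREATE TABLE car ({BASE_COLUMNS}, name TEXT NOT NULL)")
db.execute(
    "INSERT INTO car (uuid, name) VALUES (?, ?)",
    (str(uuid.uuid4()), "hatchback"),
)
row = db.execute("SELECT id, uuid, created_at FROM car").fetchone()
```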
The concern I have is when I have to distribute my system when scaling. Those numeric IDs will have to be replaced with the UUIDs so I figure I might as well do it now.
Everything breaks at scale. In my experience most tables don't end up with more than a few million rows and will work fine with this. If you did want to transition a large table to be UUID only, the nice thing about this approach is that you could do it with no downtime. If you are using a DB that only scales writes vertically, though (most DBs, including distributed DBs), then how are you actually going to scale the DB layer horizontally? Pretty much just CRDB (PG) or TiDB (MySQL) are the options there; look at their docs for how to set up your ids.
I'm not so much concerned with figuring out scaling in terms of volume, as I expect to be able to handle millions of rows in a single DB, and that would be an implementation detail and fine-tuning. I'm more concerned about scaling in terms of complexity and keeping the system easy to reason about when more people and tech are involved.
Let's say I have a <CAR>-[1:N]-<TRIP> relation across two tables in a relational DB. This works fine at first, even for millions of rows as you say.
At some point in the future it makes sense to have these two entities managed by different teams/services/DBs. Let's say TRIP becomes a whole feature-laden thing with fares, hotels, itinerary, dates. So I need to take this local relation and move it to different services and a different DB.
If I had been using an integer PK/FK this would be a more complicated migration than if I used UUIDs.
My assumption is that we would not want to have a sequenced integer key used in a distributed system.
In other words, if there's a possibility of needing to move to a distributed system, it seems a safer bet to use a UUID for the key from the beginning.
I think switching this with zero downtime to do foreign key references with UUIDs will be easier than any of the pain you would deal with from having to do cross-DB joins.
What specific issues are you worried about with the integer key? Usually the issue is dumping data into something like a staging or development environment rather than a production concern. If you attempt to dump two datasets into one DB you will have a conflict. Or if you write to an environment and then dump into it you will have a conflict.
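A toy version of that dump conflict, for anyone who hasn't hit it: two environments each start their own sequence at 1, so merging them collides, while UUID keys merge cleanly. The `trip` table and helper names are made up, and SQLite is just a convenient stand-in.

```python
import sqlite3
import uuid
from itertools import count


def sequence():
    # Each environment starts its own integer sequence at 1.
    return count(1).__next__


def make_source(key_factory):
    """Build a tiny 'environment' with two rows keyed by key_factory()."""
    return [(key_factory(), name) for name in ("first", "second")]


def merge(datasets):
    """Dump every dataset into one fresh table; return True if all rows fit."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE trip (id PRIMARY KEY, name TEXT)")
    try:
        for rows in datasets:
            db.executemany("INSERT INTO trip VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


staging = make_source(sequence())
production = make_source(sequence())
int_ok = merge([staging, production])   # both contain ids 1 and 2, so this fails
uuid_ok = merge([make_source(lambda: str(uuid.uuid4())) for _ in range(2)])
```

Here `int_ok` comes out `False` and `uuid_ok` comes out `True`.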
Mainly portability of data and options for the future. I'm all on one Postgres instance right now and don't plan on breaking it up until necessary. If at some point I need to take a table and move it to another type of database, I want that migration process to be straightforward. If I have integer keys with sequencing behavior, I anticipate having to port that behavior as well. That internal key would then become external to do the lookup, and if it's an external key I want it to be a UUID for security as well. Integers as IDs are guessable, so I want to keep them internal.
Is there a reason web scrapers aren't used for public sites with content? It seems like this would avoid the pitfalls of APIs and changing terms and prices?
They were, in a much less constrained way, until companies running platforms started suing people for scraping. And investing heavily in making it difficult technically. The difference is advertising. The vast majority of people do not want to be advertised to, given the choice, nor do they want to have their contract with a platform downgraded (by having to use the UI designed by the company). This situation is some of the users taking a stand about it.