I'm kind of stumped by a seemingly simple problem in Db design, but the more I think about about it, the more it twists my mind.
Let's say I have columns C1 and C2 in a table. How do I enforce that the pairing of values of C1 and C2 is pairwise disticnt? That is, a specific composite value for C1 and C2, say X and Y, can appear any number of times in the table, but once X/Y appears, you cannot have X/Z, or W/Y.
It seems such a simple constraint to model, but several hours of research provides no clear solutions. I'm using PostgreSQL, but I guess the issue is db-agnostic. I explored partial indexes, but that feature seems too limited for this problem. Any ideas?
To put this in more context, my problem arises from trying to design a users table that maintains version history with the following columns:
- internal_id (auto-generated integer, primary key),
- external_id (varchar, not null),
- auth_id (varchar, not null, phone #, email address, etc),
- display_name (varchar),
- access_level (integer, not null),
- record_version_number (integer),
- record_version_created_by (varchar, not null),
- when_record_version_created (timestampz, not null),
- when_record_version_expired (timestampz),
- unique (external_id, record_version_number)
A record is valid from when_record_version_created to when_record_version_expired. This should basically work, but without enforcing the pairwise distinct relationship between external_id and auth_id, it would be possible to associate an active auth_id with more than one external_id, which would corrupt the data.
Split out the internal_id and external_id from the auth_id and dates.
Of course, this assumes you are controlling the input of the value for the external_id column. In the example below, the external_id is populated by calling a function to get a cuid version 2 ID by default. You may have some other mechanism that gives you a unique external_id.
Here is an example (in Postgresql):
create table users_hn (internal_id int primary key generated always as identity, external_id text not null default cuid2(''::character varying, 12, 'hackernewsusersxmpl'::character varying), -- or whatever method to generate an external_id display_name text, access_level int not null );
create table users_hn_auth (users_hn_auth_id int generated always as identity, users_hn_internal_id int, auth_id text not null, start_date timestamptz check(start_date >= ((current_timestamp::date)::timestamp)), constraint fk_users_hn_internal_id foreign key (users_hn_internal_id) references users_hn(internal_id) );
/* Intial insert of new user. Only generate this user row once. / with insert_user_hn as ( insert into users_hn(display_name, access_level) values ('John Doe 1', 1) returning internal_id ) insert into users_hn_auth(users_hn_internal_id, auth_id, start_date) select insert_user_hn.internal_id, 'johndoe1@example.com', current_timestamp from insert_user_hn ;
/ Update user auth table with new data, for a specific user. All changes to auth_id and when that event occurs are recorded here. / insert into users_hn_auth(users_hn_internal_id, auth_id, start_date) --values (2, 'johndoe1@example.com', current_timestamp) --values (2, 'jdoe1@example.com', current_timestamp) --values (2, 'johnd1@example.com', current_timestamp) values (2, '(555) 555-1234', current_timestamp) ;
/ Get user info This query will need changes as more users are inserted into the table. A function with parameter of the internal_id to retrieve a specific user, for example.
*/ select u.internal_id, u.external_id, u.display_name, u.access_level, a.users_hn_auth_id, a.auth_id, a.start_date from users_hn u join users_hn_auth a on u.internal_id = a.users_hn_internal_id and a.start_date = (select au.start_date from users_hn_auth au group by au.users_hn_internal_id, au.start_date order by au.start_date desc limit 1 ) ;