- Good, rapid communication about who is working where. People are generally not touching the same code, or else you'd run into frequent collisions (solvable via rebasing, of course, but I suspect they'd do more branching if collisions happened very frequently).
- The developers are given autonomy and are assumed to have some level of mastery over their domain. There is trust in each developer's ability to commit well-formed, considered code.
- They have a comprehensive test stack, which helps verify the above point and keeps things sane.
IMO, code review is a cornerstone of code quality and production stability - the dumb (and smart!) mistakes in my code that have been caught in CR are numerous, and it's a big portion of my workflow. There are times when I feel it's redundant (one-line changes, spelling mistakes, etc.), but I wouldn't trade those slowdowns for a system where I only got review when I explicitly wanted it.
Of course, for pre-production projects and/or times when speed is of the utmost concern, dropping back to committing to master might make sense, but for an established and (I'm assuming) fairly large/complex codebase, I would think it best for maintainability and stability to review code before it's deployed.
Our tooling will tell us which cases have been introduced between our current deployment version and the to-be-released version, and that all cases between those two versions have been reviewed, tested, and are ready to go. We deploy more or less constantly, so all cases that are ready go out ASAP.
The only issue is that you can get 'blocking' cases, but that's fairly rare. Big cases get a feature switch.
We ask for code reviews all the time, we simply don't mandate them - I think that's the main difference.
Isn't that 'after the fact'? Your TeamCity polls the GitLab repo frequently, so a commit will trigger a build right after it, and if everything goes well, a deploy too.
So you have to know up front whether a thing is 'risky', but that's a subjective term.
So yes, it will build to dev, but we're using this in situations where we're already very confident the changes are correct. Otherwise, I'd argue, blind pushes are the problem. If a developer is not very certain, they can open a merge/pull request or just hop on a hangout to do a review.
Ah, I missed/overlooked that!
(I find this is the most valuable way of doing code reviews vs pull requests/sending comments back and forth. In-person conversation about the code is so much higher bandwidth.)
Obviously, it depends heavily on the codebase and the number/quality of engineers working on the code, but it's been my experience that a team reviewing each other's code still can't be 100% replaced with automated tests.
Every time such a process was put in place, it quickly faded out after a few months.
Yes, Yes, and No.
#1 kind of happens naturally as we're all working on different things. There are "a lot" (< 100) of people slinging code/design/SRE/IT, but only a few work in the same areas at the same time and rapid good communication generally happens among those subsets of the overall team. We also seem to have perpetuated a tendency toward good citizenship, so we generally talk to people before traipsing through areas of the code we aren't familiar with.
#2 Absolutely. This is some basic principle stuff. We don't hire people to not trust them.
#3 Related to #2: Automated acceptance testing is done as is deemed appropriate by the person developing the system. I've been on teams that valued and developed automated testing more, and less. My personal experience at the company has been it is neither necessary nor sufficient for success.
Much more important than any pre-deploy automated testing is our effort to monitor what's deployed (both in terms of software/hardware metrics and business goals). Bosun (http://bosun.org), developed by our SRE team, gives us some pretty great introspection/alerting abilities. I'd be incredibly sad to not have it. Bosun monitoring combined with the ability to have a build out in <5 minutes keeps me pretty happy.
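For flavor, a minimal Bosun alert might look roughly like this (the metric name and the 'generic' template are assumptions on my part, not their actual config):

    alert cpu.is.high {
        template = generic
        # average CPU over the last 10 minutes, per host
        $q = avg(q("avg:rate:os.cpu{host=*}", "10m", ""))
        warn = $q > 80
        crit = $q > 95
    }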
Currently most teams don't subscribe to a particular methodology, though there is a Scrum effort happening on one team.
From what I've heard so far, Stack Overflow and the Stack Exchange sites don't have a significant number of automated tests.
If your code is well organized into modules broken down by functional area, it should reduce the number of potential conflicts.
Also, the fear of merge conflicts is somewhat unjustified; most conflicts can be resolved and rebased with git rebase without that much work, and the git rerere option and git imerge can also help with this.
If developers would actually learn how to resolve merge conflicts, and not be afraid of the occasional resolution that requires understanding the other change and writing new code that incorporates both changes, it's less overhead than communicating about pending changes.
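For example, with stock git (standard commands, nothing exotic):

    git config rerere.enabled true   # record and reuse conflict resolutions
    git fetch origin
    git rebase origin/master         # replay your commits on the latest master
    # fix any conflict once; rerere replays that resolution if it recurs
    git add <the-fixed-files>
    git rebase --continue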
- Committing to master directly is the simplest thing to do. That is why most people choose to do it. It works well if everyone is working on their own pieces, untouched by others. Of all the teams I have worked with in the past, almost 4 in 5 did this.
> They have a comprehensive test stack, which helps verify the above point and keep it sane
This may or may not be true.
To us, the speed of deployment and overhead savings we get 24/7 is also absolutely worth those very rare issues.
Good modularization of code, I think, makes this more feasible: the point of a big test suite is to catch unintended consequences of a change, and the less coupling there is, the less likely that is to happen. Stuff like routers, session objects, and other stateful/lower-level logic is, I'd imagine, trickier to change without a test suite.
Thanks again for the post and the candid answers!
What we all end up with are either DB migration steps (a la Rails migrations), of which I approve, or schema comparison tools (a la vsdbcmd.exe), of which I'm very skeptical after being burned several times on large DBs (they are only as good as the tool, and some tools are better; still, I find explicit migrations much better).
As a side note, my startup DBHistory.com records all DB schema changes, like a DB flight recorder of sorts. One of my future goals is to add the capability to generate compensating actions for any DB schema change and thus be able to revert the DB schema to any previous state (by simply rolling back every change, in reverse order). But I must admit I'm quite far from having such a thing working, and I'm not even considering migrations that modify data rather than schema.
Leading space is for preformatted text and is mostly useful for code. For everything else, don't add leading space. Do add extra line breaks, though.
Here are your links with extra line breaks and without leading space, thus nicely presented and clickable.
As for backups: absolutely. We handle this independently, though. We do full backups every night as well as T-logs every 15 minutes. If I had to restore every database we have to a very specific point in time, or to just before a migration command was run: we have T-logs going back 4 days at all times to do that.
I'm sure there are good solutions for single database applications way more fully featured than our approach, they just do little to solve any problems we actually run into.
I think the bigger problem is cultural - many programmers either don't really understand databases/data modelling or they don't care about it. After all, you don't really have to worry about it when you're just starting out - almost any schema will work. That is, right up until you have to modify it. By the time it becomes an issue, the culture has crystallised and changing the database is too risky.
For some reason, a lot of companies are largely unwilling to spend money on good database management/migration tools - even if they're paying a stack of cash for SQL Server.
If you use SQL Server, you can do database migrations using DACPAC files. DACPAC migrations are idempotent, so you can deploy to an existing database; it will add missing columns without deleting data etc.
Personally I like manual migration scripts better. For my last projects I have had success embedding these migration scripts in the source code of the application itself: it integrates with source control, is very nice to use in test and development, and avoids many of the possible mistakes of separate code and database deployments.
There are many open source tools for this in .NET: I've used FluentMigrator, SqlFu and Insight Database Schema, and they all worked well.
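To illustrate, a run-once embedded script might look something like this (T-SQL; the SchemaVersions bookkeeping table and all names are hypothetical):

    -- 0042_add_display_name.sql
    if not exists (select 1 from SchemaVersions where Version = 42)
    begin
        alter table Users add DisplayName nvarchar(100) null;
        insert into SchemaVersions (Version, AppliedAt) values (42, getutcdate());
    end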
If your release script doesn't measure in the tens of thousands of lines then you're living far too comfortably.
The principle is brilliant. The SSDT project is a file tree of the complete schema of your system, including security objects. There is also a simple built-in macro language for supporting conditional compilation in your SQL. And because it's part of Visual Studio, you get source control and msbuild integration.
This thing does static analysis of your SQL, diffs SQL schemas against running databases, and creates the delta scripts to publish. It works very nicely for continuous deployment to testing servers.
Any change that risks data-loss will be blocked, so for those you have to execute a change manually and then publish your SSDT package. You can generate the script directly against the target database, or generate a "DACPAC" package and use a command-line tool to publish the compiled DACPAC against a target database.
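That publish step looks something like this (SqlPackage.exe ships with SSDT/DacFx; the file and server names here are illustrative):

    SqlPackage.exe /Action:Publish /SourceFile:MyDb.dacpac ^
        /TargetServerName:testsql01 /TargetDatabaseName:MyDb ^
        /p:BlockOnPossibleDataLoss=true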
The problem is that the SQL Server side of Microsoft is the polar opposite of the Satya OSS side of Microsoft. Developers get lots of tedious designers and slow GUIs.
Also, it has no story for deploying configuration data or initialization data. At all. We've rolled our own tooling for that.
Also, performance-wise it's a goddamned dumpster fire, and it's buggy as hell.
So yes, MS got the concept right... but the implementation leaves a lot to be desired.
> the SQL Server side of Microsoft is the polar opposite of the Satya OSS side of Microsoft. Developers get lots of tedious designers and slow GUIs.
Someone needs to come in and send some people packing and tell the rest to get with the program. For how central MSSQL seems to be to MS strategy going forward, there sure are a lot of things that still just suck.
While doing some research on an unrelated topic, I stumbled on some potentially related work [1, 2] by some researchers at MIT that could be relevant to database deployment/migration. I haven't had a chance to dig into these references to see if there's any relevance or promise there, though it looks like there is some kind of commercialization effort.
[1] Patrick Schultz et al. Algebraic Databases. http://arxiv.org/abs/1602.03501
[2] David Spivak. Functorial Data Migration. http://arxiv.org/abs/1009.1166
It's even worse if you want to be able to roll back.
1. Add new database structure (new columns, new tables, whatever) but leave all the old structure in place
2. Update all servers with code that writes in the new format but understands how to read both the new and old structures
3. Migrate the data that only exists in the old structure
4. Get rid of the old stuff from the database
5. Get rid of the code that is responsible for reading the old format
Conceptually it's straightforward, but it can take a long time in calendar days depending on your deployment schedule, it can be tough to keep track of what's been migrated, and the data migration can cause performance issues if you don't plan it properly (e.g. a migration that locks an important table). You just have to do it in a way where each individual change is backward compatible, and you don't move on to the next change until the previous one is rolled out everywhere.
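A hedged T-SQL sketch of steps 1 and 3 (all names hypothetical): the additive change is cheap, and the backfill runs in small batches so it never holds a long lock:

    -- step 1: additive, backward-compatible change
    alter table Orders add CustomerEmail nvarchar(255) null;

    -- step 3: backfill in batches to avoid long-held locks
    while 1 = 1
    begin
        update top (1000) o
            set CustomerEmail = c.Email
        from Orders o
        join Customers c on c.Id = o.CustomerId
        where o.CustomerEmail is null;
        if @@rowcount = 0 break;
    end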
Using this tool has led to a dramatic increase in productivity on our team since we really don't have to worry about database changes anymore. I won't waste space here with the details but these links will fill you in if you have any interest.
This could be solved for relational databases if you implemented application-level abstractions that let you store all your data as JSON, but create non-JSON views in order to query it from your application using traditional ORMs, etc.
So, store all data using these tables, which never have to be changed:
- data (int type_id, int id, json data)
- foreign_key_type (...)
- foreign_keys (int type_id, int subject_id, int object_id)
(we'll ignore many-to-many for the moment)
Then, at deploy time, gather the list of developer-facing tables and their columns from the developer-defined ORM subclasses, and ask the application-level schema/view management abstraction to update the views to the latest version of the "schema", along the lines of https://github.com/mwhite/JSONAlchemy.
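A rough sketch of one such generated view (Postgres syntax; the table layout comes from the list above, the rest is my assumption):

    create view users as
    select id,
           (data->>'name')       as name,
           ((data->>'age')::int) as age
    from data
    where type_id = 1;   -- type_id 1 = 'users' in the hypothetical type table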
With the foreign key table, performance would suffer, but probably not enough to matter for most use cases.
For non-trivial migrations where you have to actually move data around, I can't see why these should ever be done at deploy time. You should write your application to work with both the old and new versions of the schema, and have the application migrate each piece of data on demand as it's accessed. If you need to run the migration sooner, run it all at once using a management application that's not tied to deploys, with the migration for each row in its own transaction, eliminating downtime for migrating large tables.
I don't have that much experience with serious production database usage, so tell me if there's something I'm missing, but I honestly think this could be really useful.
Citation needed :) That's going to really depend.
I'm not for or against NoSQL (or any platform). Use what's best for you and your app!
In our case, NoSQL makes for a bad database approach. We do many cross-sectional queries that cover many tables (or documents in that world). For example, a Post document doesn't make a ton of sense, we're looking at questions, answers, comments, users, and other bits across many questions all the time. The same is true of users, showing their activity for things would be very, very complicated. In our case, we're simply very relational, so an RDBMS fits the bill best.
- data (string type, int id, json fields)
- fk (string type, int subj_id, int obj_id)
    select data.fields,
           fk_1.obj_id as 'foo_id',
           fk_2.obj_id as 'bar_id'
    from data
    join fk as fk_1 on data.id = fk_1.subj_id
    join fk as fk_2 on data.id = fk_2.subj_id
    where data.type = 'my_table'
      and fk_1.type = 'foo'
      and fk_2.type = 'bar'
If your database doesn't enforce the schema you still have a schema, it's just ad-hoc and spread across all your different processes, and no one quite agrees what it is. In the real world as requirements change and your app/service increases in complexity this becomes a constant source of real bugs while simultaneously leading to garbage data. This is not theoretical, we have a lot of direct painful experience with this. Best case scenario your tests and tooling basically replicate a SQL database trying to enforce the schema you used NoSQL to avoid in the first place.
Indexes are fast but they aren't magic. A lot of what a traditional SQL database does is providing a query optimizer and indexes so you can find the data you need really fast. Cramming everything into a few tables means everything has to live in the same index namespace. Yes you can use views and sometimes even indexed views, but then you have a schema so why jump through hoops to use non-optimized storage when the database has actual optimized storage?
Separate database tables can be put on separate storage stacks. A single table can even be partitioned onto separate storage stacks by certain column values. Cramming everything into four tables makes that a lot more complicated. It can also introduce contention (depending on locking strategies) where there wouldn't normally be any.
IMHO most systems would be better served by sharding databases than by using NoSQL and pretending they don't have a schema. If application design prevents sharding then scaling single-master, multiple-read covers a huge number of cases as well. The multiple-master scenario NoSQL systems are supposed to enable is a rare situation and by the time you need that level of scale you'll have thrown out your entire codebase and rewritten it twice anyway.
The key to schema migrations is just to add columns and tables if needed, don't bother actually migrating. Almost all database engines can add columns for "free" because they don't go mutate existing rows. Some can drop columns for "free" too by marking the field as obsolete and only bothering to remove it if the rows are touched.
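For example, on SQL Server, adding a nullable column with no default is a metadata-only change, so existing rows aren't rewritten:

    alter table Users add LastSeen datetime2 null;  -- effectively instant, even on a huge table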
Storing a data type field in the generic storage table enables the same partitioning ability as a standard schema.
99% of NoSQL database users just don't want to deal with migrations, even if they're "free" (another big issue is synchronizing application code state and DB migration state of production, testing, and developer machines), so what they really need is NoDDL, YesSQL.
> Almost all database engines can add columns for "free" because they don't go mutate existing rows. Some can drop columns for "free" too by marking the field as obsolete and only bothering to remove it if the rows are touched.
Didn't know that, thanks.
> It can also introduce contention (depending on locking strategies) where there wouldn't normally be any.
Didn't think of that. I'm aiming this at the 99% of NoSQL users for whom doing things you could do with SQL takes much more effort; letting them do it with SQL is worth a modest performance degradation. But if you have any good links on how this storage design would affect lock contention, please share.
Any code change requires this process.
Obviously the quality of the process needs to be high, but when it's effortless and "fun" then everybody wins.
Fun fact: since Linux has no built-in DNS caching, most of the DNS queries are looking for…itself. Oh wait, that’s not a fun fact — it’s actually a pain in the ass.
We had n datacenters, each named after its city: ldn.$company.com, ny.$company.com, etc. In DHCP we pushed out the search order so that a name would resolve locally first and, if that failed, try a level up until something worked.
This meant that when you bound to a service, it would first look up service.$location.$company.com; if that wasn't there, it'd try to find service.$company.com.
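In resolv.conf terms, the DHCP-pushed config would be something like (with $company.com standing in for the real domain, as above):

    search ldn.$company.com $company.com
    nameserver 10.0.0.2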
This cut down the need for nasty split-horizon DNS, and moving VMs/services/machines between datacenters was simple and zero-config.
If you were taking a service out of commission in one datacenter, you'd CNAME service.$location.$company.com to a different datacenter, do a staged kick of the machines, and BOOM failed over with only one config change.
On a side note, you can use SSSD or (shudder) NSLCD to cache DNS.
DNS devolution isn't a good idea here, since the external domain is a wildcard. We'll be paying for that mistake from long ago until (if ever) we change the internal domain name.
This is a pretty recent problem we're just now getting to, because DNS volume has been a back-burner issue - we'll look into permanent solutions for all Linux services after the CDN testing completes. Recommendations on Linux DNS caching are much appreciated - we'll review each one. It just hasn't been an issue in the past, so we're not experts in that particular area. I am surprised caching hasn't landed natively in most of the major distros yet, though.
NSCD (name service caching daemon) is in RHEL and Debian, so I assume it'll be in Ubuntu as well. The problem is that it fights with SSSD if you're not careful. https://access.redhat.com/documentation/en-US/Red_Hat_Enterp...
Out of interest, what are you using to bind to AD?
Nor need any other DNS server software do so. The actual DNS protocol has no notion of an ordering within a resource record set in an answer.
I suspect, from your brief description here, that what you'll end up using is the "sortlist" option in the BIND DNS client library's configuration file, /etc/resolv.conf, although SRV RRSets will introduce some interesting complexities.
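i.e. something along these lines (addresses illustrative):

    # /etc/resolv.conf: prefer answers from the local subnet
    sortlist 10.1.0.0/255.255.0.0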
The first lookup might take longer, but subsequent ones should be fast.
Caching DNS resolvers can fit in a VM with 256MB of RAM and use virtually zero CPU.
This is wrong in two ways, and isn't factual at all.
First, the cause of the queries is nothing to do with whether DNS query answers are cached locally or not. There is no causal link here. What causes such queries is applications that repeatedly look up the same things, over and over again; not the DNS server arrangements. One could argue that this is poor design in the applications, and that they should remember the results of lookups. But there's a good counterargument to make that this is good design. Applications shouldn't all maintain their own private idiosyncratic DNS lookup result caches. History teaches us that applications attempting to cache their own DNS lookup results invariably do it poorly, for various reasons. (See Mozilla bug #162871, for one of several examples.) Good design is to hand that over to a common external subsystem shared by all applications.
Which brings us to the second way in which this is wrong. A common Unix tradition is for all machines to have local proxy DNS servers. Linux operating systems have plenty of such server software that can be used: dnsmasq, pdnsd, PowerDNS, unbound...
One of the simplest, which does only caching and doesn't attempt to wear other hats simultaneously, is even named "dnscache". Set this up listening on the loopback interface, point its back-end at the relevant external servers, point the DNS client library at it, and -- voilà! -- a local caching proxy DNS server.
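An untested sketch of that setup with djbdns's dnscache-conf (assumes the dnscache/dnslog service accounts exist; the upstream IP is illustrative):

    dnscache-conf dnscache dnslog /etc/dnscache 127.0.0.1
    echo 10.0.0.2 > /etc/dnscache/root/servers/@   # forward to your real resolvers
    echo 1 > /etc/dnscache/env/FORWARDONLY         # act as a pure forwarding cache
    # start it under daemontools, then put "nameserver 127.0.0.1" in /etc/resolv.conf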
I run server machines that have applications that repeatedly look up the same domain names again and again. Each runs a local dnscache instance, which ameliorates this very well.
I spent 20 years getting deep with PHP, and then Rails, on Linux, where, IMO, doing CI builds with something like Jenkins or Heroku is pretty straightforward. For the past couple years, I've been doing .NET, and, while I never really stopped doing ".NET" since the VB 3.0 days, I've had a lot to learn to get into doing serious, enterprise-level stuff. I just took my first steps in setting up auto-building with Visual Studio Team Services, and my experience has left me really disappointed. Dealing with changing variable values per environment is really hacky, no matter which of a handful of tricks you want to try. So much so, I left the whole thing hanging, and have gone back to doing releases by hand.
Skimming the article, and seeing discussion about this topic in the comments, leads me to conclude that my experience wasn't just limited by my ignorance, and that this is still an area that is underserved in the .NET world. You'd think someone would have neatly sorted this by now. I keep looking for the oversight in MY case, but I'm not finding it.
Did you try Octopus Deploy (1) or Deployment Cockpit (2)?
As for the applications - we have little direct input into TeamCity or GitLab (the problem children here). And even if we did, I think we agree: the application level shouldn't cache anyway.
That being said, we're looking at `dnscache` as one of a few solutions here. But the point remains: we have to do it.
> All software on a Red Hat Enterprise Linux system is divided into RPM packages which can be installed, upgraded, or removed.
A (very) quick check indicates that the CentOS 7 "main" and "updates" repositories have at least three of the DNS softwares that I mentioned. Ubuntu 16 is better endowed, and has all of them that I mentioned, and an additional "Debian fork of djbdns" that I did not, in Ubuntu's "main" and "universe" repositories.
Simple, safe and very effective :)
Shots fired :P
The build integration just messages about builds...that's literally all it does. It simply puts handy notices in the chatroom. Why wouldn't you want that integration? Everyone going to look at the build screen and polling it to see what's up is a far less efficient system. A push-style notification, no matter the medium, causes far less overhead.
I doubt we'll ever build from chat directly for anything production at least, simply because those are 2 different user and authentication systems in play. It's too risky, IMO.
A developer is new, and early on we want code reviews
A developer is working on a big (or risky) feature, and wants a one-off code review
A lot of this cruft is unnecessary when compared to good domain knowledge and solid coding focus.
This does not diminish code reviews. Would you be a better engineer today if you had regularly participated in code reviews? Would your coworkers?
1. Reviewers are often poorly trained to provide good design reviews and default to the nit-picky stuff a code linter should pick up. Human linting is just a poor use of time and money.
2. Nobody seems to ever have time for them to deep dive into the code.
3. Few engineers seem to ever actually want to do them.
4. Reviews can become hostile.
Code reviews are probably really important in some fields (medical equipment, aviation, etc.), but for the vast number of projects where we're shoveling A bits to B bucket or transforming C bits into D bits, it's overkill, and companies would be better off investing the massive amount of wasted time in better CI/CD infrastructure.
Maybe, but probably not. It's not like I never see someone's code, it's right there when I'm working in the same code base and I can go through commits to see the high-level changes.
There are lots of ways to become a better engineer and code reviews are pretty far down on the list in my view. They usually just turn into tedious ordeals that burn up actual productive time.
1. Make sure you don't do something dumb + mentor/educate to better standards.
2. Share the knowledge of how a codebase works so that someone else will know how to fix something at 3am when you can't be reached.
For example, a way for reviewers to just mark a review as 'acknowledged' and submit a list of potential concerns (which may freely be ignored by the author). This makes reviews much lower friction: the reviewer scans the code to understand its purpose and helps think of potential pitfalls at a high level, rather than nit-picking apart little details.
I've mentioned it in previous threads, but we try to prevent hostile reviews by separating the code from the coder. Comments should not reference the author, only the code.
The counterpoint is that this good domain knowledge is bettered by considering others' changes.
Scientific research suggests otherwise though.
We often find issues in code reviews, like edge cases that weren't thought of, or code that could be refactored to use an existing utility or pattern the author wasn't aware of.
We also have frequent production deployments that everyone on the team can do, I view that as something that is independent of code review.
Personally I do code reviews mostly to share knowledge and culture rather than looking for bugs. Occasionally a bug is found, but I don't generally have the time to review the logic, just the style.
As a commenter below notes, there are always two pilots in an airplane -- and that is pretty much also a trust issue -- but we don't pilot planes, we don't have actual lives depending on us.
There are still two people flying a plane.
But to be clear - it's not that we never do reviews. It's more that we have an "ask for it when you need it" type of policy. New hires get regular reviews, so initial architecture/style concerns are addressed then... along with teaching the logistics of our code reviews (push to a branch & submit a merge request).
The importance of cohesion and trust among their team is critical to their deployments. In fact, I would say it's vital to how they're able to get away with minimal code review, for example.
It's dangerous to believe this is easy or reproducible. New teams need extensive controls in place to make sure the quality of their deployments won't negatively impact the group.
- Migration IDs freeform in chat -> why not a central DB with an auto-increment column?
- Using chat to prevent 'migration collisions' -> the same central DB/MSMQ/whatever could record the start/stop of migrations and lock based on that (see the sketch below)...
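A minimal sketch of such a table (T-SQL; names hypothetical) - the identity column hands out the migration IDs, and an unfinished row acts as the lock:

    create table MigrationLog (
        Id         int identity(1,1) primary key,  -- the auto-increment migration id
        Name       nvarchar(200) not null,
        StartedAt  datetime2 not null default sysutcdatetime(),
        FinishedAt datetime2 null                  -- null = still running, holds the lock
    );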
> there is a slight chance that someone will get new static content with an old hash (if they hit a CDN miss for a piece of content that actually changed this build)
Does anyone have a solution to this problem?
This adds complexity as now your static site needs to serve requests based on the hash. This isn't conceptually complex but it means you must deploy at least two versions of your static resource (one for each hash in the wild). And you still have to do two phase deployment in this model (static resources first, then the "real" site). Or you can build redirect or proxying support from v2 to v1 and vice versa, which is a much uglier problem to solve, but eliminates the need for two phase deployment.
Since they have short deployments, their solution is pretty elegant. If you have long deployments, the hash aware server becomes sensible. If you're a masochist, the two way proxying becomes attractive.
It's a pest of a problem but pre-deploying static assets is the best answer.
* If you push static content and web pages together, you get V1 and V2 of both static and web, and you end up with incorrect static resources served in both directions. This approach is only reasonable if your deployment strategy is to take a service outage to upgrade all machines together.
* If you push web first, you get the ugly scenario described in the article where V1 resources get served with V2 hashes and cached for 7 days.
* If you push static content first, you still have V2 static content being served for V1 web pages. The "cache bust" doesn't matter. Somewhere a cache will expire and someone will get V2 static resources for a V1 page.
You have to deal with the two versions somehow if you want to resolve the issue fully.
It works well enough for our needs (e.g. C# has one of the best GCs on the market), and no one is a platform/language zealot, so we keep on using it.
Stuff we added later runs on other platforms as needed (e.g. we run Redis and Elasticsearch on CentOS, and our server monitoring tool Bosun is written in Go...).
At the moment, Gitlab knows nothing about our builds - and we'd want to keep it simple in that regard. If we can generically configure a hook to hit TeamCity to alert of any repo updates though, that's tractable...I need to see if that's possible now.