
First - they never want to use someone else's software framework again (an early SW architect decided that would accelerate things, but we ended up re-writing almost all of it), and it was all C++ on the satellite. We ran Linux with PREEMPT_RT.

We wrote everything from low-level drivers to the top-level application, and the corresponding ground software for commanding and planning as well. Going forward, we're writing everything top to bottom, just to simplify and have total ownership since we're basically there already.

For testing we hit it at multiple levels: unit tests, hardware-in-the-loop, and a custom "flight software in test" setup we called "FIT" which executed a few different simulated mission scenarios, and we tried to hit as many fault cases as we could too. It was pretty stressful for the team tbh but they were super stoked to see how well it worked on orbit.

A big one for us in a super high resolution mission like this is the timing determinism (low latency/low jitter) of the guidance, navigation, and control (GNC) thread. Basically it needs to execute on time, every cycle, for us to achieve the mission. Getting enough timing instrumentation was tough with the framework we had selected and we eventually got there, but making sure the "hot loop" didn't miss deadlines was more a function of working with that framework than any limitation of Linux operating well enough in an RTOS fashion for us.
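
For anyone curious what that kind of deterministic hot loop looks like on PREEMPT_RT, here's a minimal sketch (not our flight code): a SCHED_FIFO thread stepped on an absolute clock_nanosleep schedule, tracking worst-case wakeup jitter. The 100 Hz rate, priority, and gncStep() placeholder are illustrative assumptions, not our actual numbers.

    // Minimal sketch of a fixed-rate "hot loop" on Linux with PREEMPT_RT.
    // Illustrative only: the 100 Hz rate and priority are placeholder numbers,
    // and gncStep() stands in for whatever the GNC cycle actually does.
    #include <cstdio>
    #include <pthread.h>
    #include <sched.h>
    #include <time.h>

    static constexpr long kPeriodNs = 10'000'000;  // 10 ms -> 100 Hz (placeholder)

    static void gncStep() { /* propagate state, run control law, command actuators */ }

    int main() {
        // Ask for a real-time scheduling class so the kernel preempts
        // lower-priority work in favor of this thread.
        sched_param sp{};
        sp.sched_priority = 80;  // placeholder priority
        int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
        if (rc != 0) {
            std::fprintf(stderr, "pthread_setschedparam failed (rc=%d); need privileges\n", rc);
        }

        timespec next{};
        clock_gettime(CLOCK_MONOTONIC, &next);
        long worst_jitter_ns = 0;

        for (;;) {
            // Advance an absolute deadline; absolute sleeps avoid the drift
            // that accumulates with relative sleeps.
            next.tv_nsec += kPeriodNs;
            while (next.tv_nsec >= 1'000'000'000L) { next.tv_nsec -= 1'000'000'000L; ++next.tv_sec; }
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, nullptr);

            // Wakeup jitter: how late we actually woke up versus the deadline.
            timespec now{};
            clock_gettime(CLOCK_MONOTONIC, &now);
            long jitter_ns = (now.tv_sec - next.tv_sec) * 1'000'000'000L +
                             (now.tv_nsec - next.tv_nsec);
            if (jitter_ns > worst_jitter_ns) {
                worst_jitter_ns = jitter_ns;
                std::printf("new worst-case wakeup jitter: %ld us\n", worst_jitter_ns / 1000);
            }

            gncStep();
        }
    }

In practice you'd also lock memory (mlockall) and pin the thread to a core; this sketch just captures the absolute-deadline and jitter-tracking idea.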


From my perspective, the number one reason we had a well functioning satellite out of the gate is my philosophy of testing "safe mode first". What that means is, in a graduated fashion, test that the hardware and software together can always get you into a safe mode - which is usually power positive, attitude stable, and communicative. So our software integration flows hit this mission thread over and over and over with each update. If we shipped a new software feature, make sure you can still get to safe mode. If we found a bug that prevented it, it's the first thing to triage. We built out our pipelines to simulate this as much as we could, then ran it again on the development hardware, and eventually would load a release onto flight once we were confident this was always solid. If you're going to develop for space, start here.

That principle goes far further than developing for space, but for space the pay-off is the largest. It also applies to maritime, medical, aviation, mining, and probably other domains where whatever you make is going to have to function even when you cannot reach it at all.

But it is great to point it out and to show how essential this kind of thinking is, and how it can help to focus on what is really important and what can be fixed.

What is interesting is to theorize about the relative impact of losing any of those three, and how you managed to fix the second because you still had power and were able to communicate with the device. I think within those three the order of relative importance would be communications, then power, and then attitude, but I could well be mistaken.


To be certain, if you're in the trenches of this anomaly investigation you'll get the full root cause and corrective action presentation, but that's not what this post is for.

You're correct on 1: we ended up hitting an edge case in their spec that they hadn't adequately tested to, and their upper-level management and engineering leadership were swift to accept the fault and implement fixes with us going forward.

From an SE perspective, as a "COTS" product, we had spec'd correctly to them, they accepted our requirements, and then executed each unit's acceptance test plan (aka lower level than first-unit quals or life tests, where this should have been caught) on the ground without anything amiss. We ran through our nominal and off-nominal cases at the higher level of assembly, but not for a duration that caught this on the ground. It wasn't until we were in extended operation on orbit that the issues began.

Sadly, like you state, space isn't like the ground: you can't buy spares or replace things that fault, even for a true high-volume COTS product where an issue might slip through acceptance testing.


> We ran through our nominal and off-nominal cases at the higher level of assembly, but not for a duration that caught this on the ground. It wasn't until we were in extended operation on orbit that the issues began.

So I think that's a great answer. It's all about risk mitigation and tolerance. Your test checked whether the part worked to a reasonable and hopefully calculated level. It's good that the supplier's management accepted fault, too. It's a lot harder when they don't, but honestly in the professional world I've found that to be much rarer than in the consumer world.

To me, and I'm not an investor, and probably not your target audience, those 3 short paragraphs told me a lot more in a positive way than I expected. I don't think it would be out of place to put it in the post. Honestly, as is, I thought this was your guys' fault for myriad reasons. Now I'm flipped the other way. Of course it's still your problem even though it's not your fault. Or, maybe, you do claim some blame for the worst-case analysis not shaking out that edge case. Either way I feel much less like you guys just went to the hardware store, bought some random lube, packed the bearing, and shipped it thinking you'd figure it out on the next launch (which is sadly the fast-and-loose reputation new space is starting to get).


Space safety for sure on the cover, although I'm not sure we'll have that cover for future launches because it was less than easy to coordinate with the FCC on where to eject it.

The radio came from a supplier who has been investigating the issue. We had concerns with their NAND and ECC implementation, and we weren’t able to fully root-cause it with them. Going forward, we’ll be building our own radios, which will make it easier to test, iterate, and resolve issues like this internally, or at least be able to trace possible latch ups or destructive failures and implement the right levels of redundancy.


Ok, good luck with that, that's a tough environment for such sensitive stuff. Unfortunately your application seems to be so advanced that you can't get away from having high-speed and super-integrated stuff on board there; otherwise you'd be more intrinsically safe against such issues. The fact that it worked well initially and then degraded until it failed is a strong indicator of the kind of process that caused the failure, but unfortunately that still leaves a whole slew of options on the table.

Much good luck with this; those are hard problems to solve, but you guys got so much right on the first try that you're probably ahead of your schedule now, so you may have the time and the budget to get this right. I visited the ESA open day a while ago and saw the guts of what goes into satellite manufacture (not the most recent stuff, just what they had on display), and what struck me is that the degree of rigor that goes into designing stuff that is in the most literal sense out of reach for fixes or diagnostics requires simulating the environment the device will operate in to the best of their ability. This results in autoclaves that you can walk around in and various radiation sources to be able to test how the devices respond to space conditions.

Your manufacturer/supplier will probably come away from this effort with as much knowledge and as many improvement items as you do. Given the short time to failure I'm not so sure redundancy alone would have been a sufficient fix, but that's obviously a bystander's perspective; you know far more than I do. But it certainly is an amazing and interesting project.


We actually didn't get to that part of the payload calibration campaign unfortunately, but all indications pointed towards getting geolocation accuracy between 5-10 meters on this first mission, driven primarily by star tracker quaternion error. Ephemeris and field angle map errors were right in spec, so we were prepped to do an iterative line-of-sight pointing calibration, but with the CMGs down we didn't get there.

For future systems we've got a few updates based on learnings, and we'll be shooting for closer to 3-5 meter geolocation error without ground control points (GCPs).
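
To see why star tracker quaternion error dominates at these numbers, here's a rough back-of-envelope (my own illustrative assumptions for altitude and attitude-knowledge error, not Albedo's actual budget): at VLEO altitudes a small angular knowledge error maps almost linearly into ground distance.

    // Back-of-envelope: ground geolocation error from attitude-knowledge error.
    // The altitude and error values are illustrative assumptions, not an
    // actual error budget.
    #include <cmath>
    #include <cstdio>

    int main() {
        const double altitude_m = 300e3;        // assumed VLEO altitude (~300 km)
        const double attitude_err_rad = 20e-6;  // assumed 20 urad attitude knowledge error

        // For near-nadir imaging, a small angular error theta displaces the line
        // of sight on the ground by roughly altitude * tan(theta) ~ altitude * theta.
        const double ground_err_m = altitude_m * std::tan(attitude_err_rad);
        std::printf("%.1f m geolocation error from %.0f urad at %.0f km altitude\n",
                    ground_err_m, attitude_err_rad * 1e6, altitude_m / 1e3);
        // -> about 6 m, i.e. the same order as the 5-10 m figure quoted above.
        return 0;
    }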


I'm AyJay, Topher's co-founder and Albedo's CTO. We'll actually be publishing a paper here in a few weeks detailing how we got 3-axis torque rod control so you can get the real nitty gritty details then.

We got here after stacking quite a few capabilities we'd developed on top of one another and realizing we were beginning to see behavior we should be able to wrap up into a viable control strategy.

Traditional approaches to torque rod control rely on convergence over long time horizons spanning many orbits, but this artificially restricts the control objectives that can be accomplished. Our momentum control method reduced convergence time by incorporating both current and future magnetic field estimates into a purpose-built Lyapunov-based control law we'd been perfecting for VLEO. By the time the issue popped up, we already had a lot of the ingredients needed and were able to get our algorithms to converge within an orbit or two of initialization, and then were able to stay coarsely stable for most inertial ECI attitudes, albeit with wide pointing error bars as stated in the article. For what we needed, though, it was perfect.
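
Until the paper is out, here's a minimal sketch of the textbook baseline this kind of work builds on: the classic cross-product momentum-dumping law for magnetorquers. To be clear, this is not the Lyapunov-based VLEO controller described above, and the gain and vector values are placeholders; it only illustrates why the naive approach converges slowly, since it can only remove momentum perpendicular to the local field.

    // Minimal sketch of the classic cross-product momentum-dumping law for
    // magnetorquers. This is the textbook baseline, NOT the Lyapunov-based
    // controller described above; gains and vectors are placeholder values.
    #include <cstdio>

    struct Vec3 { double x, y, z; };

    static Vec3 cross(const Vec3& a, const Vec3& b) {
        return { a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
    }
    static double dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

    // h_err : excess stored angular momentum to dump [N*m*s], body frame
    // b     : local magnetic field [T], body frame (measured or modeled)
    // k     : dumping gain [1/s]
    // Returns the commanded magnetic dipole moment [A*m^2].
    static Vec3 momentumDumpDipole(const Vec3& h_err, const Vec3& b, double k) {
        const double b2 = dot(b, b);
        if (b2 <= 0.0) return {0.0, 0.0, 0.0};
        // m = (k / |B|^2) * (h_err x B). The resulting torque m x B removes the
        // component of h_err perpendicular to B; the component along B has to
        // wait until the field direction rotates as the spacecraft moves along
        // its orbit, which is exactly why the naive law converges slowly.
        const Vec3 hxb = cross(h_err, b);
        return { k * hxb.x / b2, k * hxb.y / b2, k * hxb.z / b2 };
    }

    int main() {
        const Vec3 h_err = { 0.5, -0.2, 0.1 };       // placeholder momentum error
        const Vec3 b     = { 20e-6, -5e-6, 35e-6 };  // placeholder field, ~40 uT
        const Vec3 m = momentumDumpDipole(h_err, b, 1e-3);
        std::printf("commanded dipole: [%g, %g, %g] A*m^2\n", m.x, m.y, m.z);
        return 0;
    }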


I'd love to read this paper! This was on my mind when I was GNC lead for an undergraduate project at Michigan Tech (Oculus-ASR - Nanosat-6 winner). We had a combined controller for reaction wheels and magtorque rods.

I'd love to read about that as well! (your project, not just the OPs!)

I’ve just read the wiki page on Magnetorquer but I couldn’t find what I was looking for: ballpark numbers.

What kind of current are you driving those coils with (amps, or dozens of amps)? What order of magnitude is the resulting force (a few newtons)?
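
For rough intuition while waiting for the paper (generic small-sat figures, not numbers from Albedo): a torque rod produces torque rather than force, via tau = m x B, where m is the rod's magnetic dipole moment and B is Earth's field. With dipole moments of order 1-30 A*m^2, drive currents of tens to hundreds of milliamps through many turns around a high-permeability core, and a field of a few tens of microtesla in LEO, the torques come out in the micro- to milli-newton-metre range.

    // Ballpark magnetorquer numbers using generic small-sat assumptions
    // (NOT figures from the upcoming paper): torque = dipole moment x B field.
    #include <cstdio>

    int main() {
        const double dipole_Am2 = 15.0;   // assumed rod dipole moment, order 1-30 A*m^2
        const double b_field_T  = 40e-6;  // Earth's field in LEO, roughly 25-50 uT
        // Best case (dipole perpendicular to the field): |tau| = m * B
        const double torque_Nm = dipole_Am2 * b_field_T;
        std::printf("max torque ~ %.1e N*m (a fraction of a milli-newton-metre)\n",
                    torque_Nm);
        // The drive current is small because the moment comes from N turns times
        // current times coil area (boosted by a ferromagnetic core), not from
        // pushing amps through a single loop.
        return 0;
    }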

I’ll gladly read the paper, but knowing myself I won’t remember why exactly once a few weeks have passed.


While it’s definitely true GPS is a “free spinoff” of a government service (i.e. not really free), it is accessed by developers through a platform you purchase that translates that GPS signal into usable data (i.e. your phone, your car, other things you purchase on which GPS is “available”). In that lens, we foresee a similar use for satellite imagery data, gathered through a platform, ours or other satellite service providers', and made available to other users. The cost of entry for utilizing GPS is hidden under these platforms and service taxes, and we look to develop a low cost of entry for utilizing satellite imagery for platforms to use and build on; hence the example of things the engineers building GPS never expected their data would be used for! I’ve spotted a few commenters on here who are hoping to use low-cost imagery for just that.


Indeed, a 9x improvement in resolution is a big ol' chunk of data, and increases in the bands collected (PAN vs RGB vs NIR vs LWIR vs hyperspectral/multispectral) add to it; data volume is one piece of a complicated puzzle of how to get data to the ground. Previous entrants have spent a lot of money building out their own networks and infrastructure, helping contribute to high costs of imagery. We look to leverage the ground-station-as-a-service industry, with providers such as AWS, KSAT, and Azure Orbital, to help downlink our data. Add to that a switch over to Ka band (higher bandwidth) with inter-satellite links in the tradespace, and you start to help the data problem. As well, every image provider does some amount of lossless compression (squeezing out the redundant, non-informative bits) to help manage some of that. But the overall data to manage is also a function of how much tasking you anticipate per rev, whether the image you collected is actually useful (i.e. clouds), whether the customer is interested in all those bands for their task, and many more variables.
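
To make the data-volume point concrete, here's a toy scene-size calculation; the scene size, band count, and bit depth are made-up placeholders, not Albedo's actual payload parameters.

    // Toy calculation of raw scene size versus ground sample distance (GSD).
    // All parameters are illustrative placeholders, not actual payload specs.
    #include <cstdio>

    static double rawSceneGiB(double scene_km, double gsd_m, int bands, int bits_per_px) {
        const double pixels_per_side = scene_km * 1000.0 / gsd_m;
        const double bits = pixels_per_side * pixels_per_side * bands * bits_per_px;
        return bits / 8.0 / (1024.0 * 1024.0 * 1024.0);
    }

    int main() {
        // A 10 km x 10 km scene, 4 bands, 12-bit pixels, at two different GSDs.
        const double coarse = rawSceneGiB(10.0, 0.90, 4, 12);  // 90 cm GSD
        const double fine   = rawSceneGiB(10.0, 0.10, 4, 12);  // 10 cm GSD
        std::printf("90 cm GSD: %.1f GiB raw; 10 cm GSD: %.1f GiB raw (%.0fx more)\n",
                    coarse, fine, fine / coarse);
        // A 9x improvement in linear resolution means ~81x the pixels for the
        // same footprint, before any compression or band selection.
        return 0;
    }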

In terms of the 'big dish' gateway infrastructure, it eases link budgets and opens bigger pipes, but if you're limited in your output power and your access, having smaller dishes and more proliferated networks may make more sense (sounds similar to your big-satellite vs. many-small-satellites trade :) )
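
For anyone wanting to make that dish-size trade concrete, the fixed term that dominates any downlink budget is free-space path loss, which depends only on range and frequency; the frequencies and slant range below are my own illustrative assumptions.

    // Free-space path loss, the dominant fixed term in a downlink budget:
    //   FSPL(dB) = 20*log10(d_km) + 20*log10(f_GHz) + 92.45
    // The frequencies and slant range below are illustrative assumptions.
    #include <cmath>
    #include <cstdio>

    static double fsplDb(double range_km, double freq_GHz) {
        return 20.0 * std::log10(range_km) + 20.0 * std::log10(freq_GHz) + 92.45;
    }

    int main() {
        const double range_km = 800.0;  // assumed worst-case slant range near the horizon
        std::printf("X-band  (8.1 GHz):  %.1f dB\n", fsplDb(range_km, 8.1));
        std::printf("Ka-band (26.5 GHz): %.1f dB\n", fsplDb(range_km, 26.5));
        // Ka-band loses ~10 dB more to path loss at the same range, but antenna
        // gain for a fixed dish diameter grows with frequency squared, which is
        // what lets the higher band push more data through the same apertures.
        return 0;
    }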

