
Lessons Learned from Kaggle’s Airbus Challenge - homarp
https://medium.com/@YassineAlouini/lessons-learned-from-kaggles-airbus-challenge-252e25c5efac
======
lettergram
One thing I always recommend when doing segmentation is to change the color
space first, even when using a CNN (which, once trained, should essentially
perform that step itself). This is especially true of pre-trained models,
where you don't know whether they have been tuned for it.

Just in terms of edge detection, you'll see nearly a 10% improvement just from
shifting the color space to LAB:

[https://austingwalters.com/edge-detection-in-computer-vision...](https://austingwalters.com/edge-detection-in-computer-vision/)

What's even better about the color space conversion is that you can use it as
a pre-processing step to dramatically reduce the number of edges to search:

[https://austingwalters.com/chromatags/](https://austingwalters.com/chromatags/)

In the case of a blue sea, you can shift to the LAB color space and probably
only search the 'A' channel, which represents green to red. The 'B' channel
represents yellow to blue, so most of the ocean's variation lands there, and
the 'A' channel is less likely to produce edges in the ocean.

This means you only process 1 channel, which dramatically speeds up most
algorithms.
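
As a rough illustration of the single-channel idea, here is a minimal sketch in plain Python. It assumes 8-bit sRGB input and a D65 white point. The names `srgb_to_lab` and `channel_steps` are hypothetical helpers made up for this example. In practice you would let an image library do the conversion and run a real edge detector on the chosen channel.

```python
# Sketch only: per-pixel sRGB -> CIE L*a*b* (D65), then absolute differences
# between neighbouring pixels on one channel as a crude stand-in for edges.

def _linearize(c8):
    """Undo the sRGB transfer curve for one 8-bit component."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function from the CIE L*a*b* definition."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB pixel to (L, a, b), assuming a D65 white point."""
    rl, gl, bl = _linearize(r), _linearize(g), _linearize(b)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    # Normalise by the D65 reference white, then apply the L*a*b* formulas
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def channel_steps(pixels, channel):
    """Absolute change between neighbouring pixels on one LAB channel.

    channel: 0 = L, 1 = a (green to red), 2 = b (yellow to blue).
    """
    values = [srgb_to_lab(*p)[channel] for p in pixels]
    return [abs(nxt - cur) for cur, nxt in zip(values, values[1:])]


# A made-up row of pixels: three shades of sea blue, then a grey object.
row = [(10, 60, 120), (15, 80, 150), (5, 70, 140), (128, 128, 128)]
a_steps = channel_steps(row, 1)  # only this channel would go to the detector
b_steps = channel_steps(row, 2)
```

Comparing `a_steps` with `b_steps` on your own images is a quick way to check whether the 'A' channel really is the quieter one for your data before committing to it.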

Just food for thought.

~~~
ska
The related L-star c-star h-star works well too, and is easier to interpret
than LAB.

These spaces benefit from being approximately perceptually linear, which also
helps when any metric is computed on them.
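
For reference, L-star c-star h-star is the same space as LAB with the a/b plane expressed in polar coordinates: chroma is the distance from the neutral axis and hue is the angle. A minimal sketch, assuming you already have LAB values (the function name `lab_to_lch` is made up for this example):

```python
import math


def lab_to_lch(l, a, b):
    """Convert CIE L*a*b* to L*C*h, with hue in degrees in [0, 360)."""
    chroma = math.hypot(a, b)                      # distance from the grey axis
    hue = math.degrees(math.atan2(b, a)) % 360.0   # angle in the a/b plane
    return l, chroma, hue
```

Chroma and hue map onto "how colorful" and "which color", which is where the easier interpretation comes from. Note that hue wraps around at 360 degrees, so any distance computed on it has to account for that.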

------
CoolGuySteve
Great meta article. Some other general advice concerning iterative
development:

- Test your ideas on a smaller data sample at first to iterate and debug more
rapidly.

- Caching your augmented data to disk may also improve iteration speed. Or it
may not, depending on kernel disk cache vs. disk bandwidth vs. compute
tradeoffs.

- Wrap your functions in timers so you notice when a new processing stage is
unnaturally slow, rather than having to work out later what happened.

- Make sure you're actually set up to use the GPU and are not accidentally
running on the CPU.
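
The first three points above can be sketched with the standard library alone. This is only an illustration under simple assumptions: the data fits in memory as a list, and the cached result can be pickled. The helper names (`timed`, `subsample`, `cached_to_disk`) are hypothetical.

```python
import functools
import pickle
import random
import time
from pathlib import Path


def timed(func):
    """Decorator that prints the wall-clock time of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            print(f"{func.__name__}: {elapsed:.3f}s")
    return wrapper


def subsample(items, fraction, seed=0):
    """Return a reproducible random subset of a list for quick experiments."""
    rng = random.Random(seed)
    k = max(1, int(len(items) * fraction))
    return rng.sample(items, k)


def cached_to_disk(path, compute):
    """Load a pickled result from path if it exists, else compute and save it."""
    path = Path(path)
    if path.exists():
        with path.open("rb") as fh:
            return pickle.load(fh)
    result = compute()
    with path.open("wb") as fh:
        pickle.dump(result, fh)
    return result
```

A typical use would be to decorate each pipeline stage with `@timed`, develop against `subsample(all_items, 0.05)`, and wrap the augmentation step in `cached_to_disk`. Whether the cache helps is exactly the tradeoff described above, so time both variants before keeping it.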

