Show HN: Bearer – Open-source code security scanning solution (SAST)
106 points by gmontard on March 7, 2023 | 56 comments
Hi HN,

We’re the co-founders of Bearer, and today we’re launching an open-source alternative to code security solutions such as Snyk Code, SonarQube, or Checkmarx. Essentially, we help security & engineering teams discover, filter, and prioritize security risks and vulnerabilities in their codebase, with a unique approach through sensitive data (PII, PD, PHI).

Our website is at https://www.bearer.com and our GitHub is here: https://github.com/bearer/bearer

We are not originally security experts, but we have been software developers and engineering leaders for over 15 years now, and we thought we could bring a new perspective to security products, with a strong emphasis on developer experience, something we often found lacking in security tools.

In addition to building a truly developer-friendly security solution, we’ve also heard a lot of teams complaining about how noisy their static code security solutions are. As a result, they often have difficulty triaging the most important issues, and ultimately it’s difficult to remediate them. We believe a big part of the problem is the lack of a clear understanding of the real impact of any given security issue. Without that understanding, it’s very difficult to ask developers to remediate critical security flaws.

We’ve built a unique approach to this problem by looking at the impact of security issues through the lens of sensitive data. Interestingly, most security teams’ ultimate responsibility today is to secure that sensitive data and protect their organization from costly data loss and leakage, but until now, that connection has never been made.

In practical terms, we provide a set of rules that assess the variety of ways known code vulnerabilities (CWE) ultimately impact your application security, and we reconcile that with your sensitive data flows. At the time of this writing, Bearer provides over 100 rules.

Here are some examples of what those rules can detect:

- Leakage of sensitive data through cookies, internal loggers, third-party logging services, and into analytics environments.

- Non-filtered user input that can lead to breaches of sensitive information.

- Usage of weak encryption libraries or misusage of encryption algorithms.

- Unencrypted incoming and outgoing communication (HTTP, FTP, SMTP) of sensitive information.

- Hard-coded secrets and tokens.

- And many more you can find here: https://docs.bearer.com/reference/rules/

Rules are easily extendable so you can create your own; everything is YAML-based. For example, some of our early users used this system to detect the leakage of sensitive data into their backup environments, or missing application-level encryption of their health data.
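To give a sense of the shape, here is a rough sketch of what a custom rule could look like. All field names and values below are illustrative, not Bearer's exact schema; see the custom-rule documentation for the real format:

```yaml
# Illustrative only: approximate field names, not Bearer's exact schema.
patterns:
  - pattern: |
      logger.info($<DATA>)
languages:
  - ruby
severity: high
metadata:
  id: example_logger_leak
  description: "Sensitive data sent to an application logger"
  cwe_id:
    - 532
```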

I’m sure you are wondering how we can detect sensitive data flows just by looking at the code. Essentially, we also perform static code analysis to detect those. In a nutshell, we look for those sensitive data flows at two levels:

- Analyzing class names, methods, functions, variables, properties, and attributes, then tying those together into detected data structures, with variable reconciliation, etc.

- Analyzing data structure definition files such as OpenAPI, SQL, GraphQL, and Protobuf.

Then we pass this over to a classification engine that assesses 120+ data types from sensitive data categories such as Personal Data (PD), Sensitive PD, Personally Identifiable Information (PII), and Personal Health Information (PHI). All of that is documented here: https://docs.bearer.com/explanations/discovery-and-classific...
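As a hedged illustration (the names here are invented, not taken from Bearer's docs), this is the kind of Ruby pattern such classification plus rule matching would flag: a PII-classified field flowing into a logger.

```ruby
require "logger"
require "stringio"

# A field named "email" on a user-like structure is the sort of thing
# a sensitive-data classifier would tag as PII.
User = Struct.new(:email, :name)

log_output = StringIO.new
logger = Logger.new(log_output)

user = User.new("jane@example.com", "Jane")

# A rule like "leakage through internal loggers" would flag this line:
# PII flows directly into log output.
logger.info("login for #{user.email}")

# A safer alternative logs an opaque identifier instead.
logger.info("login for user id=42")
```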

As we said before, developer experience is key, which is why you can install Bearer in 15 seconds via cURL, Homebrew, apt-get, yum, or as a Docker image. Then you run it as a CLI locally, or as part of your CI/CD.

We currently support JavaScript and Ruby stacks, but more will follow shortly!

Please let us know what you think and check out the repo here: https://github.com/Bearer/bearer




Vote manipulation is against HN's rules and will get you banned here, so please don't do it again.

https://twitter.com/g_montard/status/1633119734991405058

https://twitter.com/g_montard/status/1633119274838392841

This is the one point that's in both the site guidelines and the FAQ:

https://news.ycombinator.com/newsguidelines.html

https://news.ycombinator.com/newsfaq.html


Oh, I’m really sorry about that, I didn’t know (my fault) that mentioning we were on HN was against the rules. Calling that "vote manipulation" is quite exaggerated imho, but I get it.

Ultimately I think I got carried away by the great community reception.

Anyway thanks for letting me know, I’ll avoid doing so next time.


https://news.ycombinator.com/newsfaq.html#ring exactly describes the circumstance, and while it doesn't have its own heading in the guidelines:

> Don't solicit upvotes, comments, or submissions. Users should vote and comment when they run across something they personally find interesting—not for promotion.

seems to also match what dang is drawing attention to. My (outsider) suspicion is that the number of dead comments from new accounts on this thread drew attention to the goings-on.


Hello HN community,

I'm Cédric Fabianski, Co-founder and CTO @ Bearer.

This is a big milestone for me personally and I'm super happy to be able to contribute to the Security space and help improve the security of others' applications.

This is by far the most challenging project I've ever worked on but as people say, if you don't make security simple and accessible enough, there is no way engineers are going to care about it.

Let me know what you think! Any feedback is more than welcome!!


Thanks, this is very cool, I've been clicking around a lot! I like what you've got going, and I like how it has a resemblance to Rubocop.

My first feedback -

There are a few too many "clicks to code", given (1) how easy it actually is, and (2) that it's aimed at developers.

Personally, I'd slap the `brew install` + `bearer scan` on the initial landing page, just under the "Get Started" link (with an "Other Installation Options" link, 'cuz brew). You do a pretty good job of this (the GIF), but I look at "clicks to code" as an indicator of how focused on ease-of-use the provider is, and you're more focused on it than the landing page suggests to me. (Sinatra is the reigning champ at this.)

Next -

1. Pronto integration. I'd like to be able to plug it into things like Pronto, so we make sure we're not introducing new problems while we're not ready to deal with existing ones.

2. GitHub PR comments. It's not clear to me if the output of the GH action will create comments in a PR a la Pronto / Rubocop. It looks like it probably does, so just show me a picture so I know for sure?

3. YAML option for recipes

4. I needed to upgrade to Xcode 14.1 (from 14.0). Why was that necessary? Seems like it shouldn't be?

5. THANK YOU for providing links to source code from the docs! (I checked out a couple of rules). I would pick a couple of your favs and link to them from the "Custom Rule" page, too.

6. I'd definitely run a few from-scratch workshops for custom rules, recipes, etc; point people at the docs and ask them where they run into even the smallest friction. Your docs are really good but I did need to scroll and click around a bunch as I came to an understanding. Smoothing that out would be nice! (Think Rails Guides vs Rails Docs)


Elastic 2, for those who care about such things: https://github.com/Bearer/bearer/blob/v1.0.0/LICENSE.txt


Absolutely!

We wanted to find a good balance with a license that allows any team to use it for their own purposes, no strings attached, while at the same time protecting us against a big vendor tempted to package our work into their product without us getting a dime... Unfortunately, that happens in this world :(


AGPLv3 would ensure any changes by a big vendor would remain freely licensed. The current license for this project fails to meet the Open Source Definition (Criteria 6: No Discrimination Against Fields of Endeavor) since it restricts offering Bearer as a managed service.


That's right, we don't want someone offering a managed service on top of us without getting a license (or just an agreement). Basically, it's the AWS vs. Elastic case that resulted in this license.

Happy to revisit the license in the future when we feel more protected, but for now, we've seen so much bad behavior in this industry, with big vendors taking advantage of small companies like ours.


> …we've seen so much bad behavior in this industry, with big vendors taking advantage of small companies like ours.

Firstly, I have absolutely no problem with your choice of license so don’t take this as a criticism of your project.

What I do take issue with is people releasing software under a “free” license and then complaining about people taking them up on their offer. This isn’t “taking advantage”, this is taking what they are freely giving.


None taken; I just wanted to give the context for why we chose this license.


Nothing wrong with this license. Don't like it? Then don't use it. I don't need somebody to tell me how OSS is defined.


Always great to see more SAST options.

"Contact Us" for pricing immediately disqualifies any product I'm looking at, however. I'd suggest making pricing very clear on the site.


Once we're a bit more ready on the Cloud version, we'll release the pricing. Honestly, I also hate when pricing is not available, so I'd like us to avoid this going forward! Thanks for putting this back on my radar.

Anyway, with the OSS version, you don't need to care about pricing :)


How does this compare with Semgrep, which to my understanding is the dominating open-source SAST offering to date?


I wouldn't say dominating, tbh, but clearly one of the good solutions out there for sure.

Probably the biggest differentiator is our ability to detect sensitive data flows and map those to the different security findings. It allows finding unique risks, such as sensitive data leaking into loggers, but also dynamically prioritizing issues based on the type of sensitive data at risk, or even deciding an issue isn't important if none is involved.

Let's say you're connecting to an insecure API: we're going to assess whether or not you're sending sensitive data there, and depending on that we'll change the priority of the risk. If no sensitive data is involved, it would be a low risk; if PHI is involved, it would be critical.
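A minimal sketch of that prioritization idea (my own toy model, not Bearer's actual engine or data-type taxonomy): severity depends on both the transport and the kind of data involved.

```ruby
require "uri"

# Toy severity model: transport security plus the kind of data sent.
PHI_FIELDS = [:diagnosis, :blood_type].freeze

def risk_for(uri, payload_keys)
  return :none if uri.scheme == "https"  # encrypted transport: no finding
  (payload_keys & PHI_FIELDS).any? ? :critical : :low
end

insecure_api = URI("http://api.example.com/v1/events")

risk_for(insecure_api, [:event_name])  # no sensitive data over HTTP: low
risk_for(insecure_api, [:diagnosis])   # PHI over plain HTTP: critical
```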

For the rest, I'll let you be the judge of the UX, quality of findings, speed, etc.


Interesting! Thanks for responding. Sounds like it's bridging SAST and threat modeling.


The big missing feature for these kinds of tools is a workflow and relationship for dev teams to mark findings: marking them as "false positive", "only applies if these other conditions are true", or "yes, but we have a mitigation/exception", etc. A fast workflow that allows for fewer blockers, reduced noise, and a focus on things that actually matter.


Totally agree. I love the idea of SAST-in-CI, but I ran this on a handful of repos I manage (ranging from 40k-100k SLOC) and there were too many false positives for me to want to add this as build-breaking criteria in our CI pipeline. Not at all unique to Bearer, of course, as you point out, but still a real problem.

I suppose an alternative would be to not have this be a pass/fail part of CI, but maybe a qualitative summary that gets autogenerated as part of the PR / code review process. The noise issue is still a real one, as people will eventually ignore the noisy summaries or filter/whitelist them into relative oblivion.

I like the idea of "only applies if these other conditions are true". In all the false positives I encountered so far, if given the option I would be able to declaratively express when and when not to apply the rule. I'd even be ok with inline ignore comments to that end which, while not ideal, is something folks are already used to for other idioms like test coverage et al.
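A hypothetical inline ignore could look something like this. The `bearer:disable` directive and rule id below are invented for illustration, not necessarily Bearer's actual syntax:

```ruby
require "logger"
require "stringio"

log_output = StringIO.new
logger = Logger.new(log_output)

# bearer:disable example_logger_leak -- value is a synthetic test fixture,
# so suppressing the finding on this one line would be acceptable.
logger.info("seeding fixture user test@example.com")
```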


We need to open up the filtering and prioritization logic for configuration; it essentially does that today, but this way you can apply your own logic.

I'd advise starting today by looking only at critical alerts; with our scoring based on sensitive data impact, that should be a good first step in triaging.


Btw, if you have some examples, please share them, or even better, write an issue. We’d be super happy to look at them and fine-tune the rules.

It’s just a 1.0, we can do much better for sure :)


I'll cherry-pick an example: the default cookie config rule (https://github.com/bearer/bearer/blob/main//pkg/commands/pro...).

We have many places where `cookie: <EncryptedString>` is used in our code and it triggers that rule. There are a few issues with this:

- Most of the expressions where we use that pattern are used to send a full encrypted cookie string. The use of `cookie` is not the name of a key in the cookie string; it's the whole cookie.

- All of the data in the cookie string itself is encrypted and also sent over https. Just matching on a regex expression won't tell you this information without an accompanying AST to verify.

Notably, we're using hapi and not express, but my notes above would still apply to some use cases in express as well. It's possible I am missing the actual value of that rule, but just matching on the expression is going to generate a ton of false positives.
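A rough Ruby analog of the pattern described above (names are illustrative; the original is hapi/JavaScript): the entire header value is an opaque encrypted blob, so a regex match on `cookie:` alone can't distinguish it from a plaintext leak.

```ruby
require "openssl"
require "base64"

# Encrypt the entire session state with AES-256-GCM.
cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
cipher.random_key
iv = cipher.random_iv
sealed = cipher.update("session-state: user=42") + cipher.final

headers = {
  # Looks like `cookie: <EncryptedString>` to a pattern matcher,
  # but nothing sensitive is readable without the key.
  cookie: Base64.strict_encode64(iv + sealed + cipher.auth_tag)
}
```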


Thanks for the feedback here; it is much appreciated :) I know your point about catching encryption is more general than this example, but I’ve made a small improvement to the default cookie config rule regex to address one of the false-positive cases mentioned: https://github.com/Bearer/bearer/pull/754


This still generates the same false positive for me, in all of the previous repos I tested on.


Thanks for the report back; that's interesting. Perhaps I misunderstood your example. Feel free to write an issue if you like, and I can investigate further.


I’ve introduced the `absence` trigger that does that: if express is present but helmet is missing, then break.

Do you think that’d help achieve what you have in mind?


I think the design flaw in most of the problematic rules comes from too-simple regex matching. Finding a string pattern should be a clue to do some deeper analysis (maybe verify via the AST), not necessarily a reason to flag the string alone as a security failure.


The rules do work on the AST but the current cookie rule is not as advanced as it could/should be. For example, we really should treat encryption as sanitizing the value.

We'll take another look at the rules with this in mind. If you are able to share the (rough) approach you take to build the cookie string it would help us to ensure we're covering the specific case(s) you have.


Workflow is coming with our Cloud offering, with all the cool integrations you can think of, such as Jira or Slack.

On the "marking" part, we have two options that will be available super soon:

1) Directly in the code, by adding a special comment that will ignore findings.

2) In the Cloud, an ignore action will forever park an issue, even if it changes line, etc. (smart fingerprinting applied). We can't really have that in the OSS since it's stateless.


Gitlab has a great security dashboard for this. It organizes the output of multiple tools in a place where you can discuss, triage, ignore or track an issue to resolve it.

https://docs.gitlab.com/ee/user/application_security/securit...


Also super expensive; you need the $99 plan :) https://about.gitlab.com/pricing/

Integration with SCM is clearly a top priority for us, especially directly in PRs. GitHub SARIF is a nice way for third parties to integrate into their dashboard; we're committed to it.


100% agree

(Shameless plug: the product we've been working on for the last 1.5 years aims to solve exactly that, either via a PR bot, Slack / Teams, etc.) Ping me (see profile for details) if it's interesting.


GitHub actually has this feature (only for open source and Enterprise, IIRC) when there is SARIF output.


SARIF output is on our roadmap, btw!

GitHub code scanning is not so great from what we've heard so far, but it's also very expensive; you need to be on the Enterprise plan...


First of all, thank you for making and sharing this. I have a few technical questions, if I may.

Does Bearer perform data-flow analysis? If so:

1. Is the analysis inter-procedural?

2. Is it sound? (Does it only report findings it’s absolutely certain about, possibly missing others; or does it report all possible findings even if some of them are false positives?)

3. How are sources and sinks of information specified?

4. I see it supports JavaScript and Ruby. Any plans on adding other languages? Is the current analysis implementation amenable to adding support for other languages?

5. What’s the analysis behavior around dynamic language constructs (e.g. eval)?

6. What’s the analysis behavior around missing symbols/dependencies?


Thanks for your questions. Yes we do perform dataflow analysis:

1. Not yet but we are exploring ways to support that

2. The analysis part is sound. False +ves (mainly) come from limitations with what you can specify in the rule language. We're working on this however.

3. We don't make that distinction in the rules language currently. Sensitive data detection (which is built-in) is effectively treated as a source. But we need to allow rules to specify sources. I don't think the limitation matters to finding issues, but more to how well they are reported (you effectively only get the sinks reported at the moment).

4. We plan to add other languages but are mindful of the balance of depth vs breadth of support. Is there a particular language you'd like to see support for?

5. There is no support for these currently unfortunately.

6. As it's intra-procedural, we take quite a basic approach to these (with some special cases in the engine). In terms of dataflow, we treat unknown function calls as identity functions (assume the output is somehow influenced by all the inputs). Obviously this is not ideal in terms of false +ves, but we need to work on inter-procedural support first to do a good job of this. In terms of type analysis, we will try to infer unknown types locally from field/property access.
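A small sketch of that identity-function assumption (my own illustration, not the engine's actual representation): an unknown call's result is considered tainted whenever any of its arguments is tainted.

```ruby
# Each value carries a taint flag marking it as sensitive or not.
Tainted = Struct.new(:value, :tainted)

# Unknown functions are modeled as "identity" for taint purposes:
# the result is assumed influenced by all inputs, so it is tainted
# whenever any input is.
def unknown_call(*args)
  Tainted.new("<opaque result>", args.any?(&:tainted))
end

email = Tainted.new("jane@example.com", true)  # sensitive
stamp = Tainted.new("2023-03-07", false)       # not sensitive

unknown_call(email, stamp)  # result treated as sensitive
unknown_call(stamp)         # result treated as clean
```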


This is a great looking project - we've been looking for tools similar to this to add an extra layer of validation to our codebase. Are you thinking about supporting Java in the future?


Thank you! We were actually thinking Java or PHP for the next one, so I guess it's a +1 on Java :D


another +1 for Java then


We hear you


+1 for Java, because that, possibly, means… Clojure? :)


You're pushing it ^^


I wish these tools would just auto fix it for me. I hate messages like this:

> CRITICAL: Only communicate using SFTP connections.

If you know what’s wrong, then fix it. My integration or unit tests will fail if your fix doesn’t work.


Well, we're getting there, at least toward proposing some fixes.

Automatically fixing is tricky; it means changing your code, which can get automatically deployed to production without any other checks... Dangerous. Not sure you want to trust anyone to do that, tbh.

Also, considering all the edge cases there are, it's impossible to guarantee that a fix won't break your code. If someone guarantees that, they're just lying to you.

But I understand why you'd love that, as a developer, I do too :)


> changing your code that can get automatically deployed in production without any other checks.

I’ve never worked at a place that didn’t have at least two of these:

- Code review checks

- QA checks

- Automated testing

If an edge case breaks the code, then great! The developer can fix it (if the tool can’t). Even if the system fixes it properly only 2% of the time, that’s 2% of the time the developer didn’t have to roll up their sleeves.


I agree, in theory :)

But I’m happy you say that; it gives me hope that our future automated remediation suggestions can be easily adopted.


I think these tools have to have the automation baked into the checks from v0. Adding it later can be a mess without the right abstraction.


You can't just fix that in code. FTP and SFTP are completely different protocols that use different servers.

You need a new server to talk to in order to fix that. And if it's a customer's server, maybe it can only do FTPS rather than SFTP.


Yeah… so this example is saying “you need to redesign your infrastructure before you can merge this change in.”

If sftp is a requirement, it should have been captured earlier in the process and not after the integration code was written.


In an ideal world, security tools like this one would be useless… but unfortunately, we don’t all live in a world where security requirements are all captured, understood, and implemented correctly.

This was just an example; think about application-level encryption, leakage in logger messages, etc.


Tracking and mapping where your sensitive data goes is challenging, and manual approaches always fall short. This is a unique approach to preventing sensitive data leakage.


Also check out Wazuh, for another great solution in this area.


While it is great, Wazuh is not close to being a static analyzer.


Excellent product! I was a bit skeptical, but it worked on the first try on my Rails app and helped me discover a few issues!


Had the chance to try it a few weeks ago. Took only a couple of minutes to set up, and it gave me a few interesting warnings about PII on one of my projects.

Feels like it would be a great tool for a team that is just starting to pay attention to security risks and vulnerabilities.

Will follow the next evolutions of your tool. Thanks for sharing!


Congrats Bearer team, looks awesome



