

Unsafe Query Risk in Active Record - gkop
https://groups.google.com/forum/#!topic/rubyonrails-security/8CVoclw-Xkk

======
rubiquity
You can easily see if you are vulnerable to this by opening up rails console
and iterating over your models and seeing if the model's table name is a
column name:

    
    
        Dir.glob(File.join(Rails.root, 'app/models/**/*.rb')).each { |file| require file }
        ActiveRecord::Base.descendants.select do |model|
          model.column_names.include? model.table_name
        end
    

EDIT: As rosser points out below, the code above doesn't cover the case of
doing a join between two tables where one table has a column named the same as
the joining table name. Below you can find any table that has a column name
from another table. From there you can check to make sure you aren't joining
this table with the table sharing the same name as the column.

    
    
        ActiveRecord::Base.descendants.select do |model|
          shared = model.column_names & ActiveRecord::Base.descendants.map(&:table_name)
          if shared.any?
            puts "#{model.to_s} has potentially dangerous columns: #{shared.join(' ')}"
            true
          end
        end

~~~
rosser
Doesn't that only capture the first of the two vulnerable cases (a table that
has a column with the same name as the table itself)? What about the second
case, where you have a column on one table that has the same name as another
table?

Queries (PostgreSQL-specific) I whipped up investigating our stack this
morning for this issue that appear to catch both cases (corrections or
improvements welcome):

Case 1:

    
    
      select c.relname
           , a.attname
        from pg_class c
        join pg_attribute a on a.attrelid = c.oid
       where c.relname = a.attname;
    

Case 2:

    
    
      select c2.relname
           , c1.relname || '.' || a.attname
        from pg_class c1
        join pg_attribute a on a.attrelid = c1.oid
        join pg_namespace n on c1.relnamespace = n.oid
        left join pg_class c2 on c2.relname = a.attname
       where c2.relname is not null
         and n.nspname not in ('pg_catalog', 'information_schema');

~~~
rubiquity
I think case 2 is only for when you're doing joins, less likely but still
possible.

~~~
rosser
My query for that case may also yield some false positives, as we appear to
have a couple of cases where one table has a column that shares its name with
a completely unrelated table.

~~~
rubiquity
Yup, I updated my reply above. You can get the list but then you still have to
manually go through your codebase to make sure you aren't joining the table
with the potentially dangerous column name with that exact table.

------
saurik
> Due to the typical approach of pluralizing for tables and using singular
> names for columns, this kind of conflict is very unlikely.

In PostgreSQL, where a table is really a type definition (you can effectively
think of a table as a struct), using plural names for tables is against the
recommended practice; this will be much more common.

Though, in a way, the conflict is still rare for a different but highly
related reason, an implicit column which I sadly don't know enough about
ActiveRecord (having never used it) to know if this bug could be used on.

    
    
        select product from product;
    

This will return records, each with one field whose value is a record of type
product that contains the value of all fields in the row of the table being
scanned; effectively you are returning the row as a field.

(And now you can see an example where thinking of the table type name as
"plural" leads to weird semantics: PostgreSQL prefers thinking of them as
classes of object from which you are pulling a subset of instances.)

You can also compare this record if you can get a tuple into the ActiveRecord
query; alternatively, if ActiveRecord doesn't specify the type of its
parameter values (something a lot of drivers I've seen don't automatically do
for string input), you could also pass an unknown-typed text value that
happens to be comma-separated and surrounded in parenthesis and it will be
coerced to a compatible record type.

Again, I don't know if the issue with ActiveRecord requires the column to be
defined in the model or something (the description was really brief and didn't
explain the underlying bug), but maybe someone else knows.

~~~
icambron
In the absence of instructions telling it to do something different, AR relies
on the convention that instances of a class Foo are stored in a table called
"foos", regardless of what database system you're using. That cuts down on the
amount of configuration you have to do, and you wouldn't normally override it
unless you were applying AR to a preexisting schema.

Edit: removing speculation

~~~
saurik
OK, and I then imagine the idea is that the kind of person using ActiveRecord
doesn't "design their database with the idea of multiple potential consumers,
only one of which might be ActiveRecord"? Fair enough.

Regardless, the primary question that my post was leading up to stands: a
table with the name "foos" will still have what is effectively an implicit
column with the name "foos" that is a record value of type "foos".

~~~
icambron
People pick their tools per-project, so I would put it as "if you need to
build a database with multiple consumers, AR will be a less valuable choice
for any of them, and you are thus much less likely to use it for that
project". Right tool, right job, etc.

On your primary question, since I've now played with "select foos from foos"
on my PG database: I doubt it. When you do a query like Foo.where(foos:
'value').to_a or Foo.find_by_foos('value'), what you're basically doing is a
"select foos.* from foos where foos='value'". If foos doesn't actually exist
as an explicit column, that doesn't actually work:

PG::UndefinedColumn: ERROR: column foos.foos does not exist

So it's not like people have been writing queries like that expecting that the
implicit column will do something (which I'm not sure would make sense
anyway). Unless I'm misunderstanding something, it remains the case that you'd
have to have explicitly created a column with the same name as your table to
trigger this bug with otherwise working code.

Edit: Of course, maybe there are more ways to trigger this bug than the ones
provided in the email, and maybe they include queries where you might be
relying on an implicit column for something else, such as selecting foos from
foos. I'm just saying the cases listed in the email don't seem to be affected
by the implicit column.

~~~
saurik
Did you mean "foos.foos = 'value'"? I do not get an error on the column not
existing with "where foos = 'value'" (which is the query you wrote). However,
it does turn out I'm wrong on a key aspect: PostgreSQL refuses to convert
untyped-text to a record without a cast "ERROR: input of anonymous composite
types is not implemented" (which would make this require at least a tuple to
have been passed into the query to begin with, which seems extremely unlikely,
and likely require the cast, which wouldn't happen).

~~~
icambron
Sorry, yes, I meant "foos.foos='value'", which AR writes because the code
generating it doesn't pay attention to what joins you've attached (or in this
case, haven't) and what ambiguities that might create, so it does the safe
thing.

I also noticed the error you mentioned when playing with it, but didn't
understand it well enough to be confident there was no way to force that cast.

------
sergiotapia
Actually readable:
[https://gist.github.com/sergiotapia/89c2b90251c4d489a162](https://gist.github.com/sergiotapia/89c2b90251c4d489a162)

------
pjungwir
Query to check if you have any columns with the same name as their table (in
Postgres):

    
    
        SELECT table_name, column_name
        FROM   information_schema.columns
        WHERE  table_name = column_name;

------
fraser_newton
I am one of the reporters of this issue (Fraser Newton of Clio www.clio.com).

We applied the following patch (and configured the blacklist as described in
the comment) to resolve this issue. This works by rejecting any query
parameters that look like user-supplied hashes while allowing developer-
supplied hashes. e.g.,

    
    
      Model.where(something: params[:something])
    

raises an exception if params[:something] was constructed to be a hash while

    
    
      Model.where(something: {id: 1})
    

works as expected. This also doesn't require any casting to work around the
vulnerability.

    
    
      diff --git a/activerecord/lib/active_record/relation/predicate_builder.rb b/activerecord/lib/active_record/relation/predicate_builder.rb
      index 413b81c..a114f08 100644
      --- a/activerecord/lib/active_record/relation/predicate_builder.rb
      +++ b/activerecord/lib/active_record/relation/predicate_builder.rb
      @@ -1,6 +1,22 @@
       module ActiveRecord
         class PredicateBuilder # :nodoc:
      +
      +    ##
      +    # :singleton-method:
      +    # Refuse to trust attributes of the given classes when building predicates.
      +    # For example, the following...
      +    #
      +    #   ActiveRecord::PredicateBuilder.blacklist = [ActionController::Parameters]
      +    #   
      +    # ... would reject any user-supplied hashes submitted via controllers.
      +    cattr_accessor :blacklist , :instance_writer => false
      +    @@blacklist = []
      +
      +    
           def self.build_from_hash(engine, attributes, default_table, allow_table_name = true)
      +      ActiveRecord::PredicateBuilder.blacklist.each do |blacklisted|
      +        raise ArgumentError, "Blacklist violation: instance of #{blacklisted.name} found" if attributes.is_a?(blacklisted)
      +      end
             predicates = attributes.map do |column, value|
               table = default_table

------
joevandyk
I reported this one last year to their security list.

I noticed the problem when bots/scrapers screwed with URL parameters which
made ActiveRecord generate SQL for columns and tables that didn't exist in the
database. Sorta worrying.

You should make sure that you don't have any columns that match your table
names.

------
VeejayRampay
I really hate to be that person because the author took time to help other
people and it's a great thing to do, really, but this is excruciating to read
due to very strange page layout.

That being said, thanks to Rafael for a simple and layman-accessible
explanation of the issue.

