Hacker News

And, just because I'm in a golfing mood:

    import time
    from collections import defaultdict

    def gen_stats_tech2(dataset_python):
        start = time.time()

        # (num_orders, total_quantity, total_price) per product
        totals = defaultdict(lambda: (0, 0, 0.0))

        for product_id, order_id, quantity, price in dataset_python:
            o_count, o_quant, o_price = totals[product_id]
            totals[product_id] = (o_count + 1, o_quant + quantity, o_price + price)

        product_stats = [
            [product_id, num_orders, total_quantity,
             round(total_price / num_orders, 2)]
            for product_id, (num_orders, total_quantity, total_price)
            in totals.items()
        ]
        end = time.time()
        working_time = end - start
        return product_stats, working_time
950x improvement. I think the former reads more pleasantly, though, and you're likely to do better with something like PyPy rather than dragging the code through the muck.

Most of the improvement comes from using builtins like the dictionary's .items() method rather than making a __getitem__ call per loop iteration, removing references to globals (the int constructor) since the stored values are already integers, removing N calls to .append() by letting the list comprehension build the list, and removing a store-then-unpack.
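A minimal sketch of the two patterns side by side (hypothetical toy data, not the benchmark dataset): per-key indexing plus .append() versus a single list comprehension over .items().

```python
# Toy totals dict in the same shape as the function above:
# product_id -> (num_orders, total_quantity, total_price)
totals = {"a": (2, 5, 10.0), "b": (1, 3, 4.5)}

# Slower shape: one __getitem__ per iteration, plus an .append()
# attribute lookup and call for every row.
stats_slow = []
for product_id in totals:
    num_orders, total_quantity, total_price = totals[product_id]
    stats_slow.append([product_id, num_orders, total_quantity,
                       round(total_price / num_orders, 2)])

# Faster shape: .items() yields key and value together, and the
# list comprehension builds the list without N .append() calls.
stats_fast = [
    [product_id, num_orders, total_quantity,
     round(total_price / num_orders, 2)]
    for product_id, (num_orders, total_quantity, total_price)
    in totals.items()
]

assert stats_slow == stats_fast
```

Both produce the same rows; the second just does less per-iteration bytecode work.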

Sadly, I don't think the top loop can be reduced in a similar manner because it's self-referential. Still, it made for a fun hour or so; I was really hoping to hit a 1,000x improvement :/




Nice work!



