
A Visual Intro to NumPy and Data Representation - jalammar
https://jalammar.github.io/visual-numpy/
======
lejar
Nice overview! One thing I think you should add, which I find immensely useful
is the reordering of arrays using indexing.

Take for example:

    
    
        In [2]: numpy.array([1, 2, 3])[[0, 2, 1]]                                       
        Out[2]: array([1, 3, 2])
    

You index using a list and it gives you a view of the array with the new order
(the underlying array is not changed and there is no copy being done).

~~~
quietbritishjim
Using "fancy" indices like this does result in a copy because it can't be
represented as a simple slice of the original array. A good explanation is
here (it's from 2008 but still true):

[https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html#FAQ](https://scipy-cookbook.readthedocs.io/items/ViewsVsCopies.html#FAQ)

You can verify there's a copy by changing the new array _after putting the
result in a new variable_ (see above link for why this makes a difference)
and verifying the old one is unchanged:

    
    
        >>> import numpy as np
        >>> x = np.array([1, 2, 3])
        >>> y = x[[0, 2, 1]]
        >>> y[0] = 3
        >>> y
        array([3, 3, 2])
        >>> x
        array([1, 2, 3])
    
    

Edit:

But a view can be based on a slice that includes a step parameter, and in fact
you can even slice in multiple dimensions and it will still be a view. _That_
is worth discussing in the article:

    
    
        >>> x = np.array([np.arange(7), np.arange(7)+1]*3)
        >>> y = x[4:1:-2, 1:5:2]
        >>> y
        array([[1, 3],
               [1, 3]])
        >>> y[0,0] = 99
        >>> x
        array([[ 0,  1,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7],
               [ 0,  1,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7],
               [ 0, 99,  2,  3,  4,  5,  6],
               [ 1,  2,  3,  4,  5,  6,  7]])
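
A minimal sketch of another way to check view versus copy without mutating
anything, assuming a NumPy version that provides `np.shares_memory`. The
names `sliced` and `fancy` are illustrative and not from the comment above:

```python
import numpy as np

x = np.array([np.arange(7), np.arange(7) + 1] * 3)

sliced = x[4:1:-2, 1:5:2]   # basic slicing with steps: expected to be a view
fancy = x[[4, 2]]           # integer-list ("fancy") indexing: expected to be a copy

# True when the two arrays overlap in memory.
view_shares = np.shares_memory(x, sliced)
copy_shares = np.shares_memory(x, fancy)
print(view_shares, copy_shares)

# A view created directly from x keeps a reference to it in .base.
print(sliced.base is x)
```

This avoids the write-then-compare dance, at the cost of relying on a helper
function rather than on observable behaviour.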

~~~
improbable22
A related fun fact, when slicing several dimensions:

    
    
        >>> a = np.arange(9).reshape(3,3) # a matrix
        >>> a[0:3,0:3]          # ranges are treated independently
        array([[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]])
        >>> a[[0,1,2],[0,1,2]]  # but arrays are treated at once
        array([0, 4, 8])
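
If the "independent" behaviour is wanted with index lists as well, `np.ix_`
is one way to get it. A small sketch, reusing the same `a`; the `rows` and
`cols` lists are made up for illustration:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
rows, cols = [0, 2], [0, 1]

paired = a[rows, cols]          # pairs the indices up: a[0, 0] and a[2, 1]
grid = a[np.ix_(rows, cols)]    # every listed row crossed with every listed column

print(paired)   # [0 7]
print(grid)     # [[0 1]
                #  [6 7]]
```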

------
grenoire
Pretty, but not particularly in-depth.

Also, nitpick but I can't hold it in: why isn't the MSE
np.mean(np.square(predictions - labels))? That's even breezier!

~~~
manojlds
I think it's generally done this way because of the way the formula is
represented mathematically.
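
For comparison, a quick sketch of both spellings side by side. The sum-based
line is assumed to follow the usual (1/n) * sum of squared errors formula,
and the sample values are made up:

```python
import numpy as np

# Made-up values, purely for illustration.
predictions = np.array([1.0, 1.0, 1.0])
labels = np.array([1.0, 2.0, 3.0])
n = len(predictions)

# Mirrors the written formula: (1/n) * sum of squared differences.
mse_formula = (1 / n) * np.sum(np.square(predictions - labels))

# The shorter spelling suggested above.
mse_mean = np.mean(np.square(predictions - labels))

print(mse_formula, mse_mean)
```

The two should agree up to floating-point rounding, so the choice is mostly
about which one reads closer to the formula being taught.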

------
milliams
I like this. One change I would make is on the aggregation and indexing
section, change the representation of single values (as opposed to single-
element arrays) to not be in a coloured box. It's important that the result of
these operations is a different type.
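
A short sketch of the type difference in question, with a made-up array
`data`:

```python
import numpy as np

data = np.array([1, 2, 3])

total = data.sum()           # aggregation returns a NumPy scalar, not an ndarray
first = data[0]              # a single integer index also returns a scalar
first_as_array = data[0:1]   # a length-1 slice is still an ndarray

print(type(total), type(first), type(first_as_array))
print(np.ndim(total), first_as_array.shape)
```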

------
pard68
NumPy was a huge boon in college. I had mostly gotten my homework process down
to editing a LaTeX file alongside the CSV files for my datasets. When I
compiled, it would first crunch the numbers with NumPy, export the results as
TeX, and then build a PDF.

~~~
alanbernstein
Care to share an example?

~~~
pard68
I might still have something. I didn't version control it, but it might be on
Dropbox still.

------
dintech
This is excellent. I'd love to see even more on Pandas.

~~~
dintech
And now I see that you've already started one!

[https://jalammar.github.io/gentle-visual-intro-to-data-analysis-python-pandas/](https://jalammar.github.io/gentle-visual-intro-to-data-analysis-python-pandas/)

------
iandanforth
It would be good to mention the @ operator in the matrix multiplication
section.

[https://alysivji.github.io/python-matrix-multiplication-operator.html](https://alysivji.github.io/python-matrix-multiplication-operator.html)
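
A minimal sketch of the equivalence, assuming Python 3.5+ (where `@` was
introduced) and two made-up 2x2 matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

via_dot = A.dot(B)       # method-call spelling
via_operator = A @ B     # infix matrix multiplication

print(via_operator)
# [[19 22]
#  [43 50]]
```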

~~~
improbable22
A warning sign that your faith in 0-based indexing may be faltering --
catching yourself writing comments like this :)

    
    
        # element at the top right. i.e. (1, 2) aka (0, 1) in python
        A[0, 0] * B[0, 1] + A[0, 1] * B[1, 1]
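
For what it's worth, that expression can be checked against `@` directly.
A small sketch with made-up 2x2 matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# Top-right element of the product, written out by hand with 0-based indices.
top_right = A[0, 0] * B[0, 1] + A[0, 1] * B[1, 1]

# The same element taken from the full matrix product.
from_product = (A @ B)[0, 1]

print(top_right, from_product)   # 22 22
```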

~~~
iandanforth
That's called the "Matlab Hangover"

------
1-6
Wow, this is so timely! I love the visual references. I'm still a little
confused about the section on Matrix Indexing. Overall, great work!

------
Vaslo
Good stuff! I'll definitely look for more from you!

------
tjpaudio
Nice page, but unless you have never used software for math before, I am not
sure it's very useful.

------
xvilka
Would be nice to have something like this, but for Julia.

~~~
improbable22
There is this, though shorter:
[https://julia.guide/broadcasting](https://julia.guide/broadcasting)

