
Show HN: Tag – instantly jump to your ag matches - aykamko
https://github.com/aykamko/tag
======
whatnotests
For anyone else confused about what an "ag" match is:
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)

(Ag = Chemical symbol for silver)

------
rogual
Surprised ag isn't better known here. If you grep through code a lot and you
haven't tried it, I'd heartily recommend it.

~~~
TorKlingberg
I have limited time for keeping up with non-standard tools. It's the same
reason I use bash rather than zsh/ksh/fish. With grep I have a solid mental
model of what it does. ack/ag have some magic that is fine most of the time,
but will eventually bite me when I do something unusual.

Still, I will check out ag. Maybe this time I will stick with it.

On a side note: People often say that C is just for legacy, and today
everything new should be written in a higher level language unless it's a
kernel or otherwise inherently low-level. Yet ag is basically a rewrite of ack
in C.

------
TomHubelbauer
Can someone explain how searching files is "embarrassingly parallel"? I
imagine a storage drive to be sequential: the read head can only be in one
place at a time. For SSDs this may be different; that I don't know. Does
this speedup only apply to SSDs? Or is I/O not the bottleneck here, and
the project instead benefits from searching memory-cached files being
somehow well parallelizable (memory still seems basically sequential, but is
much faster, maybe fast enough to make this problem overwhelmingly CPU-bound)?

~~~
ggreer
I'm the author of ag. Hardware may or may not support it, but the algorithms
involved are embarrassingly parallel. Searching involves no data dependencies
between files. If you're searching 1,000 files, finding results in file #827
doesn't require any information from files 1-826. That's what I meant by
"embarrassingly parallel".

Side note: originally, ag was single-threaded. When I added pthreads, it only
gave a 15% speedup in my test benchmark.[1] The speedup is larger if checking
a file for matches requires more CPU usage (such as when using a complex
regex).

1\. [http://geoff.greer.fm/2012/09/07/the-silver-searcher-
adding-...](http://geoff.greer.fm/2012/09/07/the-silver-searcher-adding-
pthreads/)
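
To make the "no data dependencies" point concrete, here is a minimal sketch in Python. It is not ag's actual code (ag is written in C and uses pthreads), and the names `search_file` and `search_tree` are made up for illustration. Each file is handed to a worker as an independent task, and no task needs results from any other. In CPython, threads mostly overlap I/O rather than regex work, so read this as a picture of the structure, not a benchmark.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def search_file(path, pattern):
    """Return (path, line_number, line) for every matching line in one file."""
    hits = []
    with open(path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if pattern.search(line):
                hits.append((str(path), number, line.rstrip("\n")))
    return hits


def search_tree(root, regex, workers=4):
    """Search every regular file under root; each file is an independent task."""
    pattern = re.compile(regex)
    paths = sorted(p for p in Path(root).rglob("*") if p.is_file())
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so the output is deterministic
        # even though the files may be searched concurrently.
        per_file = pool.map(lambda p: search_file(p, pattern), paths)
        return [hit for hits in per_file for hit in hits]
```

Because `search_file` only ever touches its own file, the workers share nothing but the compiled pattern, which is read-only.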

~~~
TomHubelbauer
EDIT: On second thought, you basically answered my question in the last
paragraph of your reply. I'm keeping it up in case you want to confirm my
understanding. The way I read it, the 15% speedup comes from the fact
that the CPU processing (the regexes) was comparable in speed to the file
reading, and splitting the work across individual core caches is what gives
the speedup; memory is fast enough for this to be possible.

I don't understand the I/O part of the whole deal. AFAIK you can't read
multiple files at once off the drive, so are the files read sequentially to
memory and then searched in parallel in memory? Does memory allow you to read
multiple places at once, even with multiple threads? Surely the files are too
large for the processor caches to be of any effect. So if you'd entertain my
curiosity a bit more, is what's happening the fact that multiple files are
cached in memory and memory is so fast that loading bits of the files to the
processor cache takes less time than regexing those bits, moving the balance
of this process to the CPU-bound side? Is the actual parallelism in the fact
that multiple cores can search their individual caches at the same time and
loading those caches from the RAM is fast enough to not become a bottleneck?
Sorry for the possibly amateurish question. I've never dug deep enough into
parallelism to understand this, but I spent a good amount of time
thinking about parallelising I/O stuff and came to the conclusion that I/O
must be orders of magnitude slower and thus is always the bottleneck, so any
sort of I/O-bound problem (which file search surely is) must be
non-parallelizable and can only be sped up by keeping indexes the way some
OSes do.

~~~
todd8
One of the more important jobs a system's OS does is manage I/O devices.
Modern kernels don't wait for a read request and then go and read from the
disk drive. They both read ahead, anticipating future read requests from the
pattern of requests already made, and cache as much read data as they can to
avoid touching the disk when a file is revisited. (I was a kernel architect on
IBM's AIX.)

While coding, your overall system is really just idling and it wouldn't be
unusual for many of your project's header files and source files to be cached
in memory because of your last compile. This effect is more pronounced on
machines with a lot of memory, of course. The cost savings from careful disk
management are very important overall, but will they speed up the ag
application? I'm surprised that ag only gets a 15% speedup with threads, but
naturally it will depend on many factors.

~~~
TomHubelbauer
Very cool insight. Do you know if the Windows kernel does the same? I'd like
to learn more about it in the context of Windows first, then Linux.

~~~
todd8
Yes, Windows does disk caching. This looks like a nice explanation:

[https://msdn.microsoft.com/en-
us/library/windows/desktop/aa3...](https://msdn.microsoft.com/en-
us/library/windows/desktop/aa364218\(v=vs.85\).aspx)

------
maerF0x0
There is already an ag plugin for vim that will jump you to the matches.

See: [https://github.com/rking/ag.vim](https://github.com/rking/ag.vim)

or see: [http://codeinthehole.com/writing/using-the-silver-
searcher-w...](http://codeinthehole.com/writing/using-the-silver-searcher-
with-vim/)

~~~
aykamko
Yes, you're right! I mention this in my README:

> Inside vim, vim-grepper or ag.vim is probably the way to go. Outside vim (or
> inside a Neovim :terminal), tag is your best friend.

~~~
johncoltrane
No, the way to go is to set the `grepprg` option.

~~~
akavel
And in case you need more flexibility juggling one-off calls to various
similar tools, try:

:cex system('grep_or_ag_or_whatever -flags pattern') | copen

With a proper ':set errorformat=...', you can call your compiler with this, or
'go test', or plenty of other things (even Go panic stacktraces).

------
todd8
Does anyone else have the trouble that I do in watching these short animations
introducing some feature? I must be slow because I always have to watch them
two or three times to figure out what they are trying to show me. I think if
they ran at a more leisurely pace I would comprehend them faster.

The worst are new editor packages; trying to follow what's going on in a Vim
session or an Emacs session with the author flying through the features of his
or her new package without commentary is always frustrating to me.

------
sdegutis
TL;DR: it does this by generating shell aliases which use $EDITOR to open the
file. Neat, but does it clean up the aliases when you use one? And what if you
don't use any? Kinda makes my OCD cringe a bit.

~~~
anamexis
It looks like the alias file is overwritten with each invocation.

~~~
aykamko
The alias file is indeed overwritten on every invocation, but tag doesn't
attempt to clean up stale aliases from previous invocations. I figured it
wasn't really worth implementing this since, as far as I know, shells don't
slow down from having "too many aliases".

~~~
abglassman
tag is very cool, but please don't implement the shortcuts with shell aliases:

    
    
      ~/badcode$ echo "func" > badfile.go\;\ echo\ \"Gotcha\!\"
      ~/badcode$ ls
     badfile.go; echo "Gotcha!"
      ~/badcode$ tag func
     badfile.go; echo "Gotcha!"
     [1] 1:func
      ~/badcode$ e1
     Gotcha! +1
    

In case that's not clear, I was able to create a file with a name that
contains potentially malicious shell commands that is now bound to an alias
via tag.

Other ways to implement the shortcuts: create a flag or subcommand to look up
a hit from the results file and jump to it (I can alias this command to suit
my taste) or prompt me for input before returning to the command line.
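
For anyone who wants to see why the filename ends up being executed, here is a small Python sketch. It assumes the alias body is built by pasting the filename into a command string of the form `editor filename +line`, which is only my guess from the session above, not tag's actual code. The function names are hypothetical. `shlex.quote` is the standard-library way to make a string safe as a single shell word.

```python
import shlex


def unsafe_command(editor, filename, line):
    """Interpolate the filename as-is; shell metacharacters stay live."""
    return f"{editor} {filename} +{line}"


def quoted_command(editor, filename, line):
    """Quote each argument so the shell sees the filename as one word."""
    return f"{shlex.quote(editor)} {shlex.quote(filename)} +{int(line)}"


def alias_definition(name, command):
    """Build an alias line; the command is quoted again for the alias value."""
    return f"alias {name}={shlex.quote(command)}"
```

With the filename from the session, `unsafe_command` yields a string in which `; echo "Gotcha!"` is a second command, while `quoted_command` keeps the whole name inside one quoted argument. Quoting fixes the injection but still leaves the aliases lying around, so a lookup subcommand may be the cleaner route.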

------
agentgt
This sort of reminds me of fasd[1]. The thing is I think I would prefer a menu
instead of typing "e6". Typing line numbers to go somewhere seems to be an
inherent Vim user trait that I just haven't picked up (i.e. typing the command
plus some number is just too much of a cognitive load for me; I would rather
press the keys over and over again).

Maybe you can pipe ag to the venerable "dialog" (yes that thing is garbage and
reminds me of old school RedHat installs) but I wonder if someone has
something better.

[1]: [https://github.com/clvv/fasd](https://github.com/clvv/fasd)

------
riffraff
why not have a separate command rather than shell aliases? i.e.

    
    
       $ tag func # returns e1..e4, e5, e6 and dumps them in .tag-locations
       $ tagopen e4 # uses .tag-locations to open $EDITOR
    
    

you still have scary global state, but at least it's limited to the
`tag`/`tagopen` commands. Plus, you can do `tagopen` without arguments and
show a menu :)
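
A rough Python sketch of what that lookup could look like. Everything here is hypothetical: the JSON layout of `.tag-locations`, the function names, and the assumption that the editor accepts a `+LINE` argument the way vim does. The point is that the editor gets an argument list, so no shell ever parses the filename.

```python
import json


def save_locations(store_path, matches):
    """Persist (filename, line) pairs from the latest search."""
    records = [{"file": name, "line": int(line)} for name, line in matches]
    with open(store_path, "w", encoding="utf-8") as handle:
        json.dump(records, handle)


def lookup(store_path, shortcut):
    """Resolve a shortcut such as 'e4' to its (filename, line) pair."""
    index = int(shortcut.lstrip("e")) - 1  # shortcuts are 1-based
    with open(store_path, encoding="utf-8") as handle:
        records = json.load(handle)
    if not 0 <= index < len(records):
        raise IndexError(f"no match numbered {shortcut}")
    return records[index]["file"], records[index]["line"]


def editor_argv(editor, filename, line):
    """Build an argument list; nothing here is parsed by a shell."""
    return [editor, f"+{line}", filename]
```

The result of `editor_argv` can be handed to something like `subprocess.call`, which runs the program directly without a shell when given a list.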

------
Myrmornis
Cool. For emacs, here's an interface for narrowing down `git grep` and `ag`
results that I've been enjoying using: [https://github.com/dandavison/emacs-
search-files](https://github.com/dandavison/emacs-search-files)

~~~
barrkel
How does it compare to helm-git-grep / helm-ag ?

~~~
Myrmornis
Not as good :) Thanks, you got me to finally try out helm and helm-projectile.
Mine did a couple of things, like searching for function definitions, that
might be nice to implement in helm-world. I expect someone has already.

------
nathankot
Another tool that uses ag to look for tags (emacs):
[https://github.com/jacktasia/dumb-jump](https://github.com/jacktasia/dumb-
jump)

------
cosmicexplorer
[https://github.com/syohex/emacs-helm-ag](https://github.com/syohex/emacs-
helm-ag) is also super cool, although it works from within emacs; you can
interactively preview the matched areas without opening files, and jump
directly to matches. tag seems really cool for people who prefer working more
directly in the terminal, though (or people who don't use emacs).

------
Pirate-of-SV
Cool, but I'm not a big fan of the idea of storing "state" in aliases that
later commands can reference. (It's better than environment variables though.)

For source code I've had a really good experience with ctags[1].

[1]: [https://github.com/universal-ctags/ctags](https://github.com/universal-
ctags/ctags)

~~~
spb
What exactly is ctags? That README and the accompanying docs seem to only
describe how the project has changed from an unseen progenitor on SourceForge.

------
amelius
It seems to me that a more generic solution would be preferable. For example,
why should this work only with "ag"? Also, this could work with vim to allow
the user to skip to the next match more easily.

------
jmcomets
Similarly, I'd love a more generic version of this (kind of like Vim's
quickfix and `errorformat`). Typing `make` or `cargo build` and jumping
directly to any errors would be extremely fast. :)
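
As a sketch of the parsing half of that idea, here is some Python that pulls locations out of tool output. It assumes lines shaped like `file:line:col: message` or `file:line:text`, which is what gcc-style diagnostics, `grep -n`, and `ag --vimgrep` print; tools with a different layout would need their own pattern, much like a separate `errorformat` entry. `parse_locations` is a made-up name, and filenames containing a colon will confuse it.

```python
import re

# file:line:[col:] text   (the column part is optional)
LOCATION = re.compile(
    r"^(?P<file>[^:\n]+):(?P<line>\d+):(?:(?P<col>\d+):)?\s?(?P<text>.*)$"
)


def parse_locations(output):
    """Turn tool output into (file, line, column_or_None, text) tuples."""
    found = []
    for raw in output.splitlines():
        match = LOCATION.match(raw)
        if match is None:
            continue  # not a location line, e.g. a summary or a blank line
        col = match.group("col")
        found.append(
            (
                match.group("file"),
                int(match.group("line")),
                int(col) if col is not None else None,
                match.group("text"),
            )
        )
    return found
```

One known ambiguity: a `grep -n` line whose text starts with digits and a colon will have those digits read as a column.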

------
seanp2k2
Cool idea but terrible name. No one will ever be able to google this.

~~~
namuol
[https://www.google.com/?q=tag%20matches%20shell](https://www.google.com/?q=tag%20matches%20shell)

------
tamana
Are we supposed to know what "ag" is?

~~~
x0x0
If you haven't tried
[https://github.com/ggreer/the_silver_searcher](https://github.com/ggreer/the_silver_searcher)
yet you're missing out

The progression is grep with a ton of flags (or even find | xargs grep) -> ack
-> ag.

ag is faster than ack, automatically understands .gitignore, gives you
.agignore as well, and is just a really nice piece of software.

~~~
RasmusWL
ag has issues with .gitignore, as patterns are not recognized properly. The
matcher that git uses cannot be included due to licensing (see
[https://github.com/ggreer/the_silver_searcher/pull/614](https://github.com/ggreer/the_silver_searcher/pull/614)).

Does seem like there is hope, as a BSD licensed version exists. My fingers are
crossed that this is solved soon :)

~~~
pmontra
ag uses .agignore

I often need to search only a subset of the files checked into git. For
example, I don't search the min.js files but I need them for the deploys.

It would be nice if they could share the same format.
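
To show the kind of filtering I mean, a deliberately simplified Python sketch. It only matches shell-style globs against the file's base name; real .gitignore and .agignore rules are richer (negation, directory anchoring and so on), and none of that is modelled here. The function names are made up.

```python
import fnmatch
import os


def is_ignored(path, patterns):
    """True if the file's base name matches any shell-style pattern."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


def filter_paths(paths, patterns):
    """Keep only the paths that no ignore pattern matches."""
    return [path for path in paths if not is_ignored(path, patterns)]
```

With a pattern list of `["*.min.js"]`, the minified bundles drop out of the search while staying checked in.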

------
amawgad
As a long-time ag user who uses neither vim nor emacs, I salute you!

------
omegote
Not sure about this wrapper for an unknown tool, "ag". I don't know, what's
wrong with good ol' grep -Hnir ?

~~~
wowoc
Ag is way faster.

~~~
efaref
It seems an order of magnitude slower to me:

    
    
        $ time grep -r some_token .
        real	0m0.467s
        user	0m0.252s
        sys 	0m0.215s
    
        $ time ag some_token
        real	0m2.948s
        user	0m0.112s
        sys 	0m3.083s
    

(Both run twice to ensure the disk cache was warm).

Am I doing something wrong?

~~~
drcongo
Could be to do with where you're searching. The fact that ag skips everything
in your .gitignore seems to have helped when I tested. Both of these were run
at the root of my projects directory...

    
    
      $ time grep -r f_admin .
      35.87s user 6.50s system 65% cpu 1:04.47 total
    
      $ time ag f_admin
      1.51s user 4.94s system 196% cpu 3.284 total

~~~
efaref
I'm searching mostly C code. I'm in the src directory, so there's nothing
there except the source code.

