Full text searching in Info mode with Apropos

Aug 4 11
by mickey

Most Emacs hackers quickly memorize the myriad of help commands available to them in Emacs; among them, the humble M-x apropos command, a full text regexp search command that searches all known Lisp symbols (macros, functions, variables – you name it) and returns a list of matches. I should point out that — as the two are often confused — it has a closely-related sibling M-x apropos-command, bound to C-h a, that will only search interactive commands*.

What most people are not aware of is a little-known command named info-apropos, which searches the indices of every manual known to Emacs’s documentation browser, info (C-h i). Not limited to just your Emacs documentation, the info-apropos command will also search other info documentation present on your system, such as the GNU CoreUtils manuals. What sets it apart from existing info commands like C-s (isearch) is that it will give you a full list of all matches and tell you what node each one was found under. Very useful.

* – Of course, there are many more domain-specific apropos commands – use C-h a to find them :-)
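If you find yourself reaching for info-apropos a lot, a tiny wrapper can save some typing. This is only a sketch: the command name my-info-apropos-symbol-at-point is made up for illustration, and it simply hands the symbol under point to info-apropos, falling back to the normal prompt when there is none.

```elisp
(require 'info)

;; Hypothetical helper, not part of Emacs.
(defun my-info-apropos-symbol-at-point ()
  "Run `info-apropos' on the symbol at point, or prompt if there is none."
  (interactive)
  (let ((sym (thing-at-point 'symbol t)))
    (if sym
        (info-apropos sym)
      (call-interactively #'info-apropos))))
```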

Searching in Buffers with Occur Mode

Jul 20 11
by mickey

The Emacs M-x occur (also bound to M-s o) command is a useful replacement for GNU grep, when your only requirement is searching open buffers in Emacs. Like grep, the occur command will take a regular expression and print, in a separate buffer, all the lines that match the expression. One really nifty thing about occur is that it will preserve the faces (the colors, or syntax highlighting, if you will) in the displayed matches.

The default command, M-x occur, will only search the active buffer, but its cousins M-x multi-occur and M-x multi-occur-in-matching-buffers will search the buffers you specify, or all buffers that match a given regexp pattern, respectively.

There are a few helper commands that will make your life easier, such as occur-rename-buffer, which renames an *Occur* output buffer so it includes the names of the buffers it searched. Useful if you want to keep several searches around at once. This command is also bound to r in the *Occur* buffer itself.

You can also re-run the occur command by pressing g in the output buffer. (Note: this is actually a standard, of sorts, used by most interactive buffers including dired, compile and grep.)

Another useful feature is its support for the compilation mode commands next/previous-error (M-g M-n and M-g M-p respectively), as they enable you to cycle through the list of occur matches from within the source buffer itself.

In a similar vein, you can enable follow mode in the *Occur* buffer by pressing C-c C-f, and future calls to M-n and M-p in the *Occur* buffer will automatically jump to the correct match in the source buffer.
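The occur family can also be called from elisp, which is handy for searches you run all the time. A minimal sketch, assuming you want every TODO in your open elisp files; the command name and both regexps are placeholders of mine. Without its optional third argument, multi-occur-in-matching-buffers matches the first regexp against each buffer's visited file name.

```elisp
;; Hypothetical example command; adjust both regexps to taste.
(defun my-occur-todo-in-elisp-buffers ()
  "List lines containing TODO in all buffers visiting a .el file."
  (interactive)
  ;; First argument: regexp matched against the visited file name.
  ;; Second argument: regexp to look for inside the matching buffers.
  (multi-occur-in-matching-buffers "\\.el\\'" "\\bTODO\\b"))
```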

Making Occur a little more useful

My only complaint about occur is that it does not let you quickly search a set of buffers that match a specific major mode — arguably a common use case if you’re a programmer. The code seen below will search all open buffers that share the same mode as the active buffer.

(defun get-buffers-matching-mode (mode)
  "Return a list of buffers whose major mode is equal to MODE."
  (let ((buffer-mode-matches '()))
    (dolist (buf (buffer-list))
      ;; `push' is used rather than `add-to-list', which cannot see a
      ;; `let'-bound variable when lexical binding is enabled.
      (when (eq mode (buffer-local-value 'major-mode buf))
        (push buf buffer-mode-matches)))
    buffer-mode-matches))
 
(defun multi-occur-in-this-mode ()
  "Show all lines matching REGEXP in buffers with this major mode."
  (interactive)
  (multi-occur
   (get-buffers-matching-mode major-mode)
   (car (occur-read-primary-args))))
 
;; global key for `multi-occur-in-this-mode' - you should change this.
(global-set-key (kbd "C-<f2>") 'multi-occur-in-this-mode)

Repeating Commands in Emacs

Jul 15 11
by mickey

Repeating a command you just carried out is a surprisingly useful thing to do, yet most people are completely unaware that bound to C-x z is Emacs’s repeat command.

Like the . command in vi, the repeat command will repeat the last action, skipping any input events (like character input).

To save you from pressing the rather awkward keybind every time you want to repeat something, you can repeatedly press z after your first invocation to call repeat. Of course, you can also use the universal argument to repeat the command N number of times.

As I mentioned in my article on Mastering Keybindings in Emacs, you can also repeat (and edit!) complex commands like query-replace-regexp by typing C-x M-: or C-x M-ESC.

Sorting Text by Line, Field and Regexp in Emacs

May 29 11
by mickey

Sorting text is such a common operation that Emacs has several commands dedicated to it, ranging from line-based sorting to complex field sorting by regexp.

Important Points

Case Sensitivity

By default Emacs will distinguish between upper and lowercase alphabet when determining sort order, and this behavior is governed by the variable sort-fold-case. Set it to t to force Emacs to ignore case differences when sorting.
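Here is a small sketch of the difference sort-fold-case makes, using sort-lines on a throwaway buffer. The helper name my-sort-lines-in-string is hypothetical; it only exists so the two behaviors can be compared side by side.

```elisp
(require 'sort)

;; Hypothetical helper, for illustration only.
(defun my-sort-lines-in-string (text fold-case)
  "Return TEXT with its lines sorted.
Case differences are ignored when FOLD-CASE is non-nil."
  (with-temp-buffer
    (insert text)
    ;; Bind `sort-fold-case' just for this one sort.
    (let ((sort-fold-case fold-case))
      (sort-lines nil (point-min) (point-max)))
    (buffer-string)))

;; Case-sensitive: uppercase letters sort before lowercase ones.
(my-sort-lines-in-string "cherry\nBanana\napple\n" nil)
;; => "Banana\napple\ncherry\n"

;; Case-insensitive.
(my-sort-lines-in-string "cherry\nBanana\napple\n" t)
;; => "apple\nBanana\ncherry\n"
```

To make the case-insensitive behavior permanent you would put (setq sort-fold-case t) in your .emacs instead of binding it with let.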

Sorting Order

Emacs will by default use lexicographic sorting for all but the sort-numeric-fields command. Make sure you use the right command for the job if you want to sort numbers.

You can reverse the order of some sort commands by using a negative argument; for the commands where this does not work you must use M-x reverse-region.

Sorting by line

The simplest sorting routine is sort-lines and that function does pretty much what you would expect it to.

Let’s sort these names by line

Jerry
Elaine
George
Cosmo

And this is what the expected output should be:

Cosmo
Elaine
George
Jerry

Sorting by Paragraphs and Pages

To sort by paragraph you use the sort-paragraphs command. The definition of a paragraph varies by mode, but it is usually defined as anything that is separated by one or more blank lines. The variables paragraph-start and paragraph-separate control how paragraphs work.

Emacs will treat something as a page if it is delimited by the form feed character, which is ASCII 12. To sort by page use the sort-pages command.

Sorting by Fields

Sorting by a field is much akin to sorting tabulated data: you have a list of data and you wish to sort by only a subset of that data — a field.

Emacs has two commands to do this: sort-fields for most things; and sort-numeric-fields for numeric sort order. Both require a numeric argument if you want to sort by anything other than the first field, where a field is defined as anything separated by whitespace such as TAB or SPACE. If you pass a negative argument, then Emacs will count backwards when picking the field to use.

I recommend that you use the numeric sort if you intend to sort by numbers as Emacs is clever enough to detect hexadecimal (if beginning with 0x) and octal (if beginning with 0) or an entirely different base, as determined by the sort-numeric-base variable, which defaults to 10 (for decimal.)

If you don’t sort numbers using the numeric command you risk sorting your numbers the wrong way.

Sorting fields

Let’s sort by first name and then by last name.

Jerry Seinfeld
Cosmo Kramer
Elaine Benes
George Costanza

To sort by first name I type M-x sort-fields. No need for a numeric argument as it will default to one, which is the first field — the first name field.

Cosmo Kramer
Elaine Benes
George Costanza
Jerry Seinfeld

OK, so that’s nice and easy. Sorting by last name is just as easy. Type M-2 M-x sort-fields and you should see this:

Elaine Benes
George Costanza
Cosmo Kramer
Jerry Seinfeld
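The same sort can be done from elisp, where the field number is simply the first argument to sort-fields. A minimal sketch on a temporary buffer; the helper name my-sort-by-field is made up for this example.

```elisp
(require 'sort)

;; Hypothetical helper, for illustration only.
(defun my-sort-by-field (text field)
  "Return TEXT with its lines sorted by whitespace-separated FIELD."
  (with-temp-buffer
    (insert text)
    ;; Equivalent to typing M-<FIELD> M-x sort-fields on the region.
    (sort-fields field (point-min) (point-max))
    (buffer-string)))

;; Sort by the second field, the last name.
(my-sort-by-field
 "Jerry Seinfeld\nCosmo Kramer\nElaine Benes\nGeorge Costanza\n" 2)
;; => "Elaine Benes\nGeorge Costanza\nCosmo Kramer\nJerry Seinfeld\n"
```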

Sorting fields numerically

Sorting numerically with sort-numeric-fields is much the same as with sort-fields, though I will highlight why it is important to use the correct command when you want to sort numbers.

Consider the following data.

4 – Locke
8 – Reyes
15 – Ford
16 – Jarrah
23 – Shephard
42 – Kwon

I want to sort by the number, but I will do so with sort-fields and not sort-numeric-fields.

15 – Ford
16 – Jarrah
23 – Shephard
4 – Locke
42 – Kwon
8 – Reyes

Hmm. Not exactly the intended output. The sort-fields command (indeed, so would sort-lines) will sort lexicographically and not numerically.

Again, with sort-numeric-fields this time.

4 – Locke
8 – Reyes
15 – Ford
16 – Jarrah
23 – Shephard
42 – Kwon

Much better.
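To see the two commands disagree in code, here is a rough sketch that runs both over the same text. The helper name my-sort-first-field is hypothetical, and the sample data is a trimmed-down version of the list above with plain spaces as separators.

```elisp
(require 'sort)

;; Hypothetical helper, for illustration only.
(defun my-sort-first-field (text numeric)
  "Return TEXT with its lines sorted by their first field.
Compare the field as a number when NUMERIC is non-nil, as text otherwise."
  (with-temp-buffer
    (insert text)
    (if numeric
        (sort-numeric-fields 1 (point-min) (point-max))
      (sort-fields 1 (point-min) (point-max)))
    (buffer-string)))

;; Lexicographic: "15" sorts before "4" because ?1 comes before ?4.
(my-sort-first-field "15 Ford\n4 Locke\n8 Reyes\n42 Kwon\n" nil)
;; => "15 Ford\n4 Locke\n42 Kwon\n8 Reyes\n"

;; Numeric: the fields are compared as the numbers 4, 8, 15 and 42.
(my-sort-first-field "15 Ford\n4 Locke\n8 Reyes\n42 Kwon\n" t)
;; => "4 Locke\n8 Reyes\n15 Ford\n42 Kwon\n"
```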

Sorting by Regular Expression

I love this command. It’s very powerful and lets you do sorting with the precision of a regular expression.

The sort-regexp-fields command works by searching the region for everything that matches a record regexp and, for each match it finds, it looks in that record for a key regexp. The key is used to determine how to sort each record.

What this means, in practical terms, is that you can sort just a subset of your text and leave the rest untouched; for example, sort everybody’s first name but without shuffling the last name as well.

The key prompt, if left blank, will default to \&, which is the entire match string. If you have capturing groups in the record regexp, you can use the usual \N subexpression matching.

Emulating sort-lines

To emulate sort-lines you can run sort-regexp-fields with these parameters:

Regexp specifying records to sort: ^.*$
Regexp specifying key within record: \&

Complex sorting

Say you want to sort the text below by the last character in each last name.

Cosmo Kramer
Elaine Benes
George Costanza
Jerry Seinfeld

Invoke sort-regexp-fields and use the following parameters:

Regexp specifying records to sort: \w+\(\w\)$
Regexp specifying key within record: \1

The resultant output is what you would expect — almost!

Cosmo Costanza
Elaine Seinfeld
George Kramer
Jerry Benes

The sort command only sorted the last name — which is all the record regexp matched — and left the first names alone. Let’s try again with a revised record parameter:

Regexp specifying records to sort: ^.+\w+\(\w\)$
Regexp specifying key within record: \1

Now the output is correct:

George Costanza
Jerry Seinfeld
Cosmo Kramer
Elaine Benes

So regexp sorting is really powerful but can introduce subtle errors you may not spot right away. Always match the entirety of each unit (each record) and never do partial matches unless that is what you want, of course. Use the subexpression matches to pick out the actual keys you want to sort by.
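The corrected sort can also be written as a function call, which is a good way to keep a tricky pair of regexps around for reuse. A sketch only: my-sort-by-last-letter is a name I made up, and note that backslashes must be doubled inside a Lisp string.

```elisp
(require 'sort)

;; Hypothetical helper, for illustration only.
(defun my-sort-by-last-letter (text)
  "Return TEXT with its lines sorted by the final letter of each line."
  (with-temp-buffer
    (insert text)
    ;; Arguments: REVERSE, RECORD-REGEXP, KEY-REGEXP, BEG, END.
    ;; The record is the whole line; the key is group 1, its last letter.
    (sort-regexp-fields nil "^.+\\w+\\(\\w\\)$" "\\1"
                        (point-min) (point-max))
    (buffer-string)))

(my-sort-by-last-letter
 "Cosmo Kramer\nElaine Benes\nGeorge Costanza\nJerry Seinfeld\n")
;; => "George Costanza\nJerry Seinfeld\nCosmo Kramer\nElaine Benes\n"
```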

Conclusion

Sorting in Emacs is really powerful and a very useful tool if you do any sort of data scrubbing or manipulation. But beware the differences between lexicographic and numeric sorting when you work with numbers, and double-check the regexps you use when you sort by regexp fields.

re-builder: the Interactive regexp builder

Apr 12 11
by mickey

I doubt it’s a well-kept secret that Emacs has a regexp “helper” called M-x re-builder. But if you haven’t heard about it before, Emacs’s re-builder lets you interactively build a regular expression and see what it matches on the screen. It’ll even uniquely color capturing groups so you can tell them apart.

Working with multiple files in dired

Mar 25 11
by mickey

Here’s another frequent workflow question that needs answering:

How do you work on/with multiple files in bulk, and what if the files you want to edit are split over multiple directories?

The simplest solution, when the files are all in the same directory, is to use M-x dired (C-x d) and mark the files you want to work on. I’ll wax lyrical about dired’s amazing features in another article, so I’ll just talk about working with files en masse today.

So working with files in a single directory is nice and easy, but what if the files you want to change are not in the same directory?

Surprisingly, it is actually very, very simple. Dired works by querying ls (or an emulated equivalent if it is not available) and formatting the printed output from ls. In fact, the GNU version of ls comes with a switch -D specifically designed to format the output to suit dired (though it’s not used in newer Emacsen.)

Now that you know how dired works, it’s not such a far stretch to imagine that dired is capable of acting on any number of files, regardless of where they are on your filesystem — provided dired knows where they are, of course.

The GNU Toolchain

A lot of the external commands invoked in Emacs will involve the GNU CoreUtils, FileUtils and FindUtils packages (ls, find, mv, cp, etc.) so if you’re on Windows make sure you install the Win32 ports of the previously mentioned packages as a lot of stuff in Emacs won’t work without them.

If you’re on a platform where you don’t have GNU CoreUtils (or a commercial equivalent) then there is some hope, as an elisp version is available.

The Find Dired library

The command find-dired will use find to match the files and ls to format them so dired can understand it. It’s pretty bare-bones and it lets you change the syntax for find to suit your immediate needs.

Generally, though, I find find-name-dired to be more useful for day-to-day use when all I want is to feed it a single string to match against.
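Both commands are ordinary functions, so a search you repeat often can be wrapped in a command of its own. A minimal sketch, assuming you want every elisp file below a directory; the command name my-find-elisp-files and the *.el pattern are just examples.

```elisp
(require 'find-dired)

;; Hypothetical example command.
(defun my-find-elisp-files (dir)
  "Open a dired buffer listing every *.el file below DIR."
  (interactive "DSearch under directory: ")
  ;; `find-name-dired' takes a directory and a shell wildcard pattern.
  ;; With `find-dired' you would pass raw find arguments instead,
  ;; e.g. (find-dired dir "-type f -name \"*.el\"").
  (find-name-dired dir "*.el"))
```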

By default Emacs will pass -exec to find and that makes it very slow. It is better to collate the matches and then use xargs to run the command. To do so, add this to your .emacs:

(require 'find-dired)
(setq find-ls-option '("-print0 | xargs -0 ls -ld" . "-ld"))

The Find Lisp library

The Find Lisp library is very similar to the Find Dired library, but instead of calling out to the external find tool, the find lisp library emulates a very basic find in elisp.

The find lisp library uses Emacs’s regular expression engine and that means you cannot use wildcards like *.foo; instead, you have to write \.foo$. That may or may not be an issue for you, but it is worth keeping in mind.

One advantage the find lisp library does have is that it is very fast compared to find dired (though it will not run asynchronously like calling out to find would.)

To find stuff with find lisp type find-lisp-find-dired. To see only directories use find-lisp-find-dired-subdirectories. Yeah. Quite a mouthful.
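As with find dired, you can call the find lisp commands from your own code. A small sketch under the same assumptions as before (the command name is hypothetical); the thing to notice is that the pattern is an Emacs regexp, not a shell wildcard.

```elisp
(require 'find-lisp)

;; Hypothetical example command.
(defun my-find-lisp-elisp-files (dir)
  "Open a dired buffer of files below DIR whose names end in .el."
  (interactive "DSearch under directory: ")
  ;; A regexp, so "\\.el\\'" rather than "*.el".
  (find-lisp-find-dired dir "\\.el\\'"))
```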

Final Thoughts

I haven’t covered the deeper points of dired, but I hope to do that soon enough; dired is a complex beast and it has an incredible number of features that, when combined with elisp and Emacs itself, make it the best file manager in the world.

Removing blank lines in a buffer

Mar 16 11
by mickey

This is a frequent question so I figured I’d mention the solution here:

You want to remove all empty (blank) lines from a buffer. How do you do it? Well, it’s super easy.

Mark what you want to change (or use C-x h to mark the whole buffer) and run this:

M-x flush-lines RET ^$ RET

And you’re done. So what does that mean? Well, M-x flush-lines will flush (remove) lines that match a regular expression, and ^$ contains the meta-characters ^ for beginning of line and $ for end of line. Ergo, if the two meta-characters are next to each other, it must be a blank line.

We can also generalize it further and remove lines that may have whitespace (only!) characters:

M-x flush-lines RET ^\s-*$ RET

In this case \s- is the syntax class (type C-h s to see your buffer’s syntax table) for whitespace characters. The * meta-character, in case you are not a regexp person, means zero or more of the preceding character.
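If you would rather do this from elisp, flush-lines accepts the region boundaries as optional arguments. A minimal sketch on a temporary buffer; the helper name my-remove-blank-lines is made up for illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-remove-blank-lines (text)
  "Return TEXT without its empty and whitespace-only lines."
  (with-temp-buffer
    (insert text)
    ;; Same regexp as above, with the backslash doubled for a Lisp string.
    (flush-lines "^\\s-*$" (point-min) (point-max))
    (buffer-string)))

(my-remove-blank-lines "a\n\n  \t\nb\n")
;; => "a\nb\n"
```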

Update — Pete Wilson asks: “How do you collapse multiple lines into one blank line?”.

That’s a bit harder, mostly because flush-lines only works well on whole, single lines. For multi-line processing you have two choices: you can abuse regexp, or you can use a macro. It’s fairly easy to do it with regexp in this case, but for more complex data-scrubbing I would use a macro; nevertheless, I will do it both ways*.

*I’m pretty sure my macro/regexp examples are general enough to work in all cases; but let me know if they aren’t

For the regexp approach I will use C-M-% (query-replace-regexp) and because I have to use a literal newline character I will use Emacs’s quoted-insert command, bound to C-q. So to insert a newline, you would type C-q C-j.

The ^J represents the literal newline or line feed character (see ASCII Control Characters on Wikipedia for more information).

So the text we want to search for looks like this:

Search For: ^^J\{2,\}

Replace With: ^J

So how does it work? Well, we tell Emacs to search for any two or more newlines that are at the beginning of a string — where each line is considered a string by Emacs — and because we search for two or more we skip the ones that only have a single newline. So if there are 10 newlines in a row, we replace them all in one fell swoop with a single newline. You can omit the replacement newline to remove them altogether!
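The same search and replace can be wrapped in a small command so you do not have to type the control characters each time. This is a sketch, not a polished tool: my-collapse-blank-lines is a hypothetical name, it works on the whole buffer, and in a Lisp string the newline is written as \n instead of C-q C-j.

```elisp
;; Hypothetical example command.
(defun my-collapse-blank-lines ()
  "Collapse every run of two or more blank lines into a single blank line."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; Two or more newlines starting at the beginning of a line.
    (while (re-search-forward "^\n\\{2,\\}" nil t)
      (replace-match "\n"))))
```

A lone blank line has only one newline after the start of the line, so it never matches and is left alone.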

The other way is very similar and uses a keyboard macro, C-M-r and delete-blank-lines bound to C-x C-o. This approach is more complicated than it really ought to be, because delete-blank-lines will annoyingly (in this case — it’s a useful feature otherwise!) convert multiple blank lines into a single blank line (good), and remove single blank lines altogether (bad.)

To make the macro, go to the end of the buffer with M->, press F3 to begin recording, and then type C-M-r and in the isearch prompt enter:

Regexp ISearch backward: ^^J\{2,\}

Press return to go to the first match and press C-x C-o. Now press F4 to stop recording and you’re done with the macro. Press C-u 0 C-x e to repeat the macro until the search fails, which fixes all remaining instances, and that’s it: you’re done.

Exercise to the reader: Why did I search in reverse with C-M-r instead of using C-M-s?

What’s New in Emacs 23.3

Mar 10 11
by mickey

Update 10/3/11: Emacs 23.3 is officially out

Emacs 23.3 is out now. Get the source archives here or the compiled Windows binaries here.

What’s New in Emacs 23.3

Here are a few interesting changes. The full set can be read by typing C-h n.

You can allow inferior Python processes to load modules from the current directory by setting python-remove-cwd-from-path to nil.

This is very useful if you work with Python. If you set that variable to nil you can force Emacs to add the current directory to sys.path, meaning you can import the modules in that path without having to add it manually. The docstring correctly points out that this is a potential security risk, so do keep that in mind.

New VC command vc-log-incoming, bound to C-x v I. This shows a log of changes to be received with a pull operation. For Git, this runs “git fetch” to make the necessary data available locally; this requires version 1.7 or newer.

New VC command vc-log-outgoing, bound to C-x v O. This shows a log of changes to be sent in the next commit.

Good news for DVCS users.

The g key in VC diff, log, log-incoming and log-outgoing buffers reruns the corresponding VC command to compute an up to date version of the buffer.

A useful update; now we don’t have to kill the buffer and re-run the command in the original buffer.

Special markup can be added to log-edit buffers. You can add headers specifying additional information to be supplied to the version control system. For example:

Author: J. R. Hacker
Fixes: 4204
Actual text of log entry…

Bazaar recognizes the headers “Author”, “Date” and “Fixes”. Git, Mercurial, and Monotone recognize “Author” and “Date”. Any unknown header is left as is in the message, so it is not lost.

Again, very useful if you use Emacs’s own VC facility to commit files.

smie.el is a generic navigation and indentation engine. It takes a simple BNF description of the grammar, and provides both sexp-style navigation (jumping over begin..end pairs) as well as indentation, which can be adjusted via ad-hoc indentation rules.

Now this is very interesting. A generic indentation and navigation module is a most useful addition to Emacs. Currently, as far as “generic” goes, I suppose we were limited to abusing syntax tables and complex functions like parse-partial-sexp and the other helper functions in the syntax.el library. But now we can write a more generic navigation/indentation engine and possibly gain other benefits from using it as well.

How this fits in with Emacs’s long-term plan of integrating CEDET and the Semantic Bovinator lexer/parser/parser-generator remains to be seen.

The SMIE (Simple-Minded Indentation Engine) will no doubt be a useful tool for elisp hackers and I wonder if it is powerful enough to indent things like Python (or, ugh, all the many flavors of C styles) without resorting to nasty hacks.

You can read more about it in the file itself — see M-x find-library smie — or its newly-minted info manual at M-: (info "(elisp) SMIE").

Using the commandline network utilities from Emacs

Mar 2 11
by mickey

Unbeknownst to many, Emacs comes with a full suite of wrappers around the common GNU network utilities.

Most of the utilities are just simple wrappers around their command-line equivalents, but in full technicolor; some, like the nslookup support, also add full Emacs comint support.

Another useful feature is the built-in ffap support (it means find file at point): when you invoke the net utils below interactively, Emacs will try to determine if the point is on a hostname or IP and default to that.

The net utils library was written with the GNU libraries in mind, so Windows users may find the support a bit lacking. But you can always download the Win32 ports.

Here’s a list of utilities Emacs supports; invoke with M-x. You may have to configure them to your liking, and you can do that by invoking M-x customize-group RET net-utils RET.

Command                              Description
ifconfig and ipconfig                Runs ifconfig or ipconfig
iwconfig                             Runs the iwconfig tool
netstat                              Runs the netstat tool
arp                                  Runs the arp tool
route                                Runs the route tool
traceroute                           Runs the traceroute tool
ping                                 Runs ping, but on most systems it may run indefinitely; adjust ping-program-options.
nslookup-host                        Runs nslookup in non-interactive mode.
nslookup                             Runs nslookup in interactive mode in Emacs as an inferior process
dns-lookup-host                      Looks up the DNS information for an IP or host using host.
run-dig and dig                      Invokes dig in interactive mode as an inferior process
ftp                                  Very simple wrapper around the commandline tool ftp. You are probably better off with TRAMP for all but low-level system administration.
smbclient and smbclient-list-shares  Runs smbclient as an inferior process or lists a host’s shares.
finger                               Runs the finger tool
whois and whois-reverse-lookup       Runs the whois tool but tries to guess the correct WHOIS server. You may have to tweak whois-server-tld and whois-server-list or set whois-guess-server to nil
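As an example of the kind of tweak you may need, here is one way to stop ping from running forever. It assumes your ping understands the -c (count) switch, as the GNU/Linux and BSD versions do; the value 4 is arbitrary.

```elisp
;; Assumption: the system ping accepts "-c COUNT".
;; `ping-program-options' is a list of strings passed to the ping program.
(setq ping-program-options '("-c" "4"))
```

You can set the same option through M-x customize-group RET net-utils RET if you prefer not to edit your .emacs by hand.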

How to mark a buffer as “not modified”

Feb 26 11
by mickey

You can tell Emacs to set a buffer as not modified (even though it may well be) by pressing M-~, which runs M-x not-modified. This will obviously suppress any save prompts for that file, at least until you do something that makes it become modified again, so do be careful.
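In elisp the usual way to clear the flag is set-buffer-modified-p. The sketch below only demonstrates the flag flipping on a temporary buffer; the function name my-modified-flag-demo is made up for this example.

```elisp
;; Hypothetical demo function, for illustration only.
(defun my-modified-flag-demo ()
  "Return (BEFORE . AFTER), the modified flag around `set-buffer-modified-p'."
  (with-temp-buffer
    (insert "unsaved text")            ; inserting text marks the buffer modified
    (let ((before (buffer-modified-p)))
      (set-buffer-modified-p nil)      ; what M-~ does to the current buffer
      (cons before (buffer-modified-p)))))
```

Calling (my-modified-flag-demo) should give you a non-nil BEFORE and a nil AFTER.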