
A Quick Introduction to GREP Search in InDesign & InCopy

Adobe InDesign (and its editorial counterpart, InCopy) has robust search features that make copy editing game text easier, but they take more training to use effectively than standard text searching. This document gives a basic introduction to the advanced search method called GREP¹, in which you use special patterns called “regular expressions” to find groups of similar matches at once.

These instructions give a quick walkthrough of constructing a GREP search, so that you can get started with GREP right away without getting bogged down in detailed explanations and numerous options. Once you’ve used GREP a few times, you’ll have the context for those details and options to make sense.

What Makes GREP Different

Unlike standard searching, GREP allows for sophisticated search terms involving special symbols that stand for multiple characters. With normal searching, if you wanted to find all instances of “mage” so you could change them to “wizard,” you would have to do the following:

  • Search on “mage” with the Whole Word option selected, so that you avoid false positives such as “damage,” and replace those instances manually after inspecting each use. As always, even if you’re certain about a change, be extremely cautious about using the Replace All feature.
  • Search on “mages” with the Whole Word option selected, repeated as above, to catch plurals without catching false positives such as “damages.”
  • Search for “mage” without the Whole Word option in order to hunt down any odd game terminology (such as “magelike”). This would also flag false positives (again, “damage”), so extra care is needed on this step.
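To make the three passes concrete, here is a rough sketch in Python rather than InDesign. The sample sentence and the crude tokenizer are assumptions made for illustration; InDesign's Whole Word option may treat word edges somewhat differently.

```python
import re

# Hypothetical sample text, not taken from a real project.
text = "The mage dealt damage. Two mages studied a magelike creature."

# Crude tokenizer: every run of letters counts as one word.
words = re.findall(r"[A-Za-z]+", text)

# Pass 1: "mage" with Whole Word selected.
whole_mage = [w for w in words if w == "mage"]

# Pass 2: "mages" with Whole Word selected.
whole_mages = [w for w in words if w == "mages"]

# Pass 3: "mage" without Whole Word, which also flags "damage".
anywhere = [w for w in words if "mage" in w]

print(whole_mage)
print(whole_mages)
print(anywhere)
```

The third pass is the one that needs extra care: it returns the genuine hits and the false positive side by side.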


Constructing a GREP Search

Because this is a search we know how to do using the normal method, it makes a good example for using GREP. First, we’ll switch to using a GREP search by selecting “GREP” at the top of the search box. (To go back to a standard text search, select “Text” instead. If your search isn’t doing what you think it should be doing, first check to make sure you’re in the right mode.)

Then, understand what you want to find—which in this exercise are both “mage” and some words that use “mage” as a root, like “mages” and “magelike” (but not “magic”). In GREP, that search term is:

\<mage[a-z]*\>


This search term probably looks like gibberish to you. It breaks down as follows.

  • The first two characters (\<) are a special code that tells InDesign to start at the beginning of a word, which is defined as a letter that has white space or punctuation before it, or a letter at the start of a story. This prevents a false positive on “damage,” since that word doesn’t start with “mage.”
  • The next four characters (mage) are the literal term we want to find. It works just as if you typed it into the normal search field.
  • The next five characters ([a-z]) are a special set of codes that tells InDesign to find any character from “a” to “z,” inclusive. The brackets indicate that the next character in the search can be any one of several options. The hyphen between a and z indicates a range of characters. If you wanted to find only vowels, for example, you would instead use [aeiou]. If you wanted to find numbers, you would use [0-9].²
  • The next character (*) is a special character that indicates you want zero or more of the most recent character you’re searching for. Because what we’re looking for is “any letter,” the asterisk tells InDesign that we want the search results to include any letters that follow “mage,” including no letters at all. Note: if we wanted at least one such character rather than allowing zero, we would use + instead, like so: [a-z]+.
  • The last two characters (\>) mark the end of a word, defined as a letter followed by white space or punctuation, or a letter at the end of a story. Because hyphens trigger a word boundary (as all punctuation does), this finds the “mage” in “mage-like.” This is important to note, because a document may be inconsistent in its hyphenation of such terms.
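If you want to try the pieces outside InDesign, here is a minimal sketch using Python's re module. It makes one loud assumption: Python has no \< or \> codes, so \b (a generic word boundary) stands in for both. The sample sentence is made up, and Python's engine is not the one InDesign uses, so treat this as an approximation.

```python
import re

# \b is a stand-in for InDesign's \< and \> (an assumption; Python lacks those codes).
MAGE_WORD = re.compile(r"\bmage[a-z]*\b")       # zero or more letters after "mage"
MAGE_PLUS_MORE = re.compile(r"\bmage[a-z]+\b")  # at least one letter after "mage"

# Hypothetical sample sentence.
sample = "A mage and two mages met a magelike, mage-like foe that did damage."

star_matches = MAGE_WORD.findall(sample)
plus_matches = MAGE_PLUS_MORE.findall(sample)

print(star_matches)
print(plus_matches)
```

The second pattern swaps * for +, so the bare “mage” drops out of its results while “mages” and “magelike” remain.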

What This Does & Doesn’t Find

Combined, this tells InDesign to find any whole word that starts with “mage” and is followed by any combination of lowercase letters, including no letters at all. It finds the following, among other things:

  • mage
  • mages
  • magelike
  • magery
  • magemaster
  • mageeeee
  • magejasdhnufnelnafe
  • mage-like

It doesn’t find:

  • damage
  • mage0
  • Mage
  • MAGE

(Technically, it only finds the “mage” part of “mage-like,” but that is sufficient for this introduction. As you learn more about GREP, you’ll discover how to make it find all of “mage-like” without losing correct matches or causing false positives.)
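The two lists above can be checked with a short Python sketch. As an assumption, \b again stands in for InDesign's \< and \>, which Python's re module does not have, so edge cases may differ from what InDesign reports.

```python
import re

# \b approximates InDesign's \< and \> (an assumption for this sketch).
MAGE_WORD = re.compile(r"\bmage[a-z]*\b")

candidates = [
    "mage", "mages", "magelike", "magery", "magemaster", "mageeeee",
    "magejasdhnufnelnafe", "mage-like", "damage", "mage0", "Mage", "MAGE",
]

found = {}   # word -> the part of it the pattern actually matched
missed = []  # words with no match at all
for word in candidates:
    match = MAGE_WORD.search(word)
    if match:
        found[word] = match.group(0)
    else:
        missed.append(word)

for word, hit in found.items():
    print(f"{word!r} -> matched {hit!r}")
print("not found:", missed)
```

Recording the matched text alongside each word makes the “mage-like” caveat visible: the word counts as found, but only its first four letters are matched.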

The Double-Edge of Precision

At this point, you should be asking yourself, “Wait, it won’t find ‘Mage’? Why?” This leads us to the first rule of working with GREP searches.

GREP is utterly precise. It doesn’t take kindly to sloppy search terms.

The precision is the power behind GREP, but also a pitfall for new users and anyone writing a search term in haste. To find the exact match of “Mage” along with the matches we’ve already found, we introduce a small change:

\<[Mm]age[a-z]*\>

By replacing m with [Mm], we tell InDesign that we want to find a word that begins with either an uppercase “M” or lowercase “m.” The rest of the word is expected to be in lowercase. If we want to search for any occurrence of “mage” regardless of case—Mage, MAGE, MaGeRy, mageLIKE, etc.—we would instead use this term:

\<(?i)mage[a-z]*\>

The (?i) tells InDesign to treat everything that follows that code as case-insensitive. The placement of this indicator is important; if you were to put it before the word boundary part of the search term, GREP would interpret the whole term differently, and not find instances such as “Mage.” (The detailed reason for this is outside the scope of this introduction, but once you get comfortable with GREP searching, take a few minutes to research the answer for yourself. You’ll learn a lot more about GREP as you do.)
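Here is a rough Python comparison of the two case-handling variants. Two assumptions apply: \b stands in for InDesign's \< and \>, and the case-insensitive part is written as the scoped group (?i:...) because recent versions of Python reject a (?i) placed in the middle of a pattern. The sample string is invented for the example.

```python
import re

# Variant 1: only the first letter may be uppercase or lowercase.
FIRST_LETTER_EITHER = re.compile(r"\b[Mm]age[a-z]*\b")

# Variant 2: everything inside the (?i:...) group ignores case.
ANY_CASE = re.compile(r"\b(?i:mage[a-z]*)\b")

# Hypothetical sample string.
sample = "Mage MAGE MaGeRy mageLIKE damage mage"

first_letter_hits = FIRST_LETTER_EITHER.findall(sample)
any_case_hits = ANY_CASE.findall(sample)

print(first_letter_hits)
print(any_case_hits)
```

Neither variant picks up “damage,” because the word boundary check at the front is unaffected by case handling.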

End of This Lesson

Our exercise for finding cases of “mage” that we would like to change to “wizard” is done. If you think that this is more complicated than need be for such a simple search, you’d be mostly right, which leads to our second rule about using GREP.

Use GREP only when you need to. A powerful tool isn’t necessarily the best tool in all circumstances.

Practical GREP Searches

We use the following GREP searches on our projects.

Find all the XXs in a document: (?i)xx+
This uses our case-insensitive option again, searching for any place where we have more than one x together. It isn’t restricted to whole words; we would find places where we have “page xxx” and “XXXNEEDS EXAMPLEXXX.”
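This one carries over to Python's re module unchanged, so it is easy to try. The sample text below is an assumption made up for the sketch.

```python
import re

# Same term as above: case-insensitive, two or more x's in a row.
PLACEHOLDER = re.compile(r"(?i)xx+")

# Hypothetical sample text with real placeholders and some lone x's.
sample = "See page xxx and XXXNEEDS EXAMPLEXXX; the x-axis and box are fine."

placeholders = PLACEHOLDER.findall(sample)
print(placeholders)
```

Single x's, as in “x-axis” and “box,” are skipped because the term requires at least two in a row.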


Find punctuation for a given font: \p{punctuation}
By itself, this finds punctuation in all cases. To restrict this to a given font, use the Find Format section of the search dialog.
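Python's re module has no \p{punctuation} code, so the sketch below is only a loose analogue. It assumes that “punctuation” means any character whose Unicode general category starts with “P”; InDesign's own definition may differ. The helper name punctuation_in is hypothetical, and the font restriction has no equivalent here because it lives in the Find Format settings.

```python
import unicodedata

def punctuation_in(text):
    """Return each character whose Unicode category starts with 'P' (punctuation)."""
    return [ch for ch in text if unicodedata.category(ch).startswith("P")]

# Hypothetical sample text with straight and curly punctuation.
marks = punctuation_in("Wait, what? “Mage” (again)!")
print(marks)
```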


Find cases where we are using hyphens around numbers when we should be using en-dashes: (\d* *- *\d+|\d+ *-[ \d]+)
This is an especially complicated example whose explanation is outside of this introductory document’s scope. However, the portion with the parentheses and the pipe symbol (|) is worth noting: this is two search terms in one, and it tells InDesign to find either one or the other.

You also see a variant code for digits: \d instead of [0-9]. It’s briefly introduced in footnote 2 below.
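The hyphen term above also runs as written in Python's re module, which makes for a quick way to see the either/or behavior. The sample sentence is an assumption made up for the sketch, and InDesign may differ from Python on edge cases.

```python
import re

# The same two-in-one term: either alternative may produce the match.
HYPHEN_RANGE = re.compile(r"(\d* *- *\d+|\d+ *-[ \d]+)")

# Hypothetical sample sentence.
sample = "See pages 12-15, levels 3 - 5, and the well-known rule."

ranges = [m.group(0) for m in HYPHEN_RANGE.finditer(sample)]
print(ranges)
```

The hyphen in “well-known” is left alone because neither alternative can match without a digit next to the hyphen.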


Don’t Use GREP for Search & Replace… Yet

GREP replacing is a much more complicated process that is easy to get wrong, so this document only covers searching. Once you’re intimately familiar with GREP searching and understand the nature of its precision, only then should you use GREP replacing.

Want to Know More?

There are various resources to help you learn about GREP in InDesign, including:

Get This as a PDF

Download this post as a three-page PDF I laid out in InDesign. Maybe hand it to someone who could use a quick intro to GREP searching.

– Ryan

1 It’s not important, but if you’re curious about what “GREP” stands for, it’s a legacy acronym from the 1970s Unix era for “Globally search for a Regular Expression and Print.” If you want to know more trivia, visit the Wikipedia page: en.wikipedia.org/wiki/Grep

2 There are multiple ways of finding any letter or any number. [\l\u] means “find any single lowercase or uppercase letter,” and \d means “find any single digit.” As you use GREP, you’ll find there are multiple ways to handle many searches.


One Response to A Quick Introduction to GREP Search in InDesign & InCopy

  1. Kevin Richey says:

    I love grep for search and replace. I don’t have InDesign but I use Vim and command-line tools a lot.