Select-String: The grep of PowerShell

When you're writing PowerShell code and need to search for text inside a single string or an entire text file, where do you turn? If you've spent much time on Linux, you're probably familiar with the popular grep utility. grep lets you search text with a variety of options, but it isn't built into Windows. Are we out of luck? No. We've got PowerShell's Select-String cmdlet.

Using PowerShell for a Typical Grep Task

Let's say you've got a big string containing various employee names and addresses. Unfortunately, this string isn't in any well-known structure, so you're forced to pull out all of the employee names via text parsing. How would you make that happen? First, let's start with the example string we'll be using.

||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ
--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV
==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA

I've assigned this string to a variable called $employees. Before trying to grab every employee name, I'll first make sure I've got the Select-String syntax right. To do that, I'll search for one literal employee name using the Pattern parameter.
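The article doesn't show how $employees was assigned, so here is one possible way, assuming a single-quoted here-string. The point of the sketch is that the variable holds one string object rather than one object per line, which matters for the first search.

```powershell
# Assumption: the sample text is assigned with a single-quoted here-string.
# Single quotes keep the content literal, so nothing inside is expanded.
$employees = @'
||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ
--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV
==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA
'@

# A here-string is a single [string], even though it spans three lines.
$employees.GetType().Name   # String
@($employees).Count         # 1
```

Whether the lines inside a here-string end in `n or `r`n depends on how the script file itself was saved.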

PS> $employees | Select-String -Pattern 'Adam Bertram'

||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ
--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV
==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA

Notice that Select-String returned something, so it found a match; otherwise it would have returned nothing. But it returned the entire string. Why's that? Select-String treated the whole string as a single input object, so a match anywhere in it returns all of it. We first need to separate each of these lines into its own string. Since each employee reference is on a new line, I can break them up by splitting this string on the newline character (`n).

PS> $employees = $employees -split "`n"
PS> $employees | Select-String -Pattern 'Adam Bertram'

||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ

Now notice that it's just returning a single line. We're getting closer! Next, I need to figure out how to return all employee lines. To do this, I need to find a common pattern that every line shares.
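One caveat on the split step: splitting on "`n" alone assumes Unix-style line endings. If the text came from a Windows file, each piece would keep a trailing carriage return. A minimal sketch of a more tolerant split, using a made-up $text value purely for illustration:

```powershell
# Hypothetical sample mixing Windows (`r`n) and Unix (`n) line endings.
$text = "||Adam Bertram|| NJ`r`n--||Joe Jonesy||-- NV`n==|Suzie Shoemaker|== CA"

# -split takes a regular expression; \r?\n matches either line ending.
$lines = $text -split '\r?\n'

$lines.Count   # 3
$lines | Select-String -Pattern 'Joe Jonesy'
```

Because the -split operator treats its right-hand side as a regex by default, no extra switch is needed here.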

Finding Patterns with PowerShell Select-String

It looks like each employee name is enclosed by at least one | character on each side. We can use this pattern in the Pattern parameter on Select-String. Also, since each employee's first and last name is separated by a space, we can account for this as well.

I'll now represent this pattern as a regular expression, which Select-String gladly accepts in its Pattern parameter. The pipe has to be escaped as \| because an unescaped | means "or" in a regex.

PS> $employees | Select-String -Pattern '\|\w+ \w+\|'

||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ
--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV
==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA

Notice now that Select-String has returned every line again, this time by matching the regular expression. Next, I need to parse out the employee names themselves. At this time, I don't need the address for each. To do this, I'll reference the Matches property on each match object that Select-String returns.

PS> $employees | Select-String -Pattern '\|\w+ \w+\|' | foreach {$_.Matches}

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 1
Length   : 14
Value    : |Adam Bertram|

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 3
Length   : 12
Value    : |Joe Jonesy|

Groups   : {0}
Success  : True
Name     : 0
Captures : {0}
Index    : 2
Length   : 17
Value    : |Suzie Shoemaker|

We're getting closer! I now see that the Value property contains the employee names I need, but it's still got those pipe characters surrounding them. This is because the regex match was the employee name including the pipe characters.
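As a quick stopgap, and not the approach this article settles on, the surrounding pipes could simply be trimmed off the matched value with the .NET String.Trim method. This sketch redefines $employees as an array so it runs on its own:

```powershell
# Sample lines from the article, one string per line.
$employees = @(
    '||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ'
    '--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV'
    '==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA'
)

# Trim('|') strips leading and trailing | characters from each matched value.
$trimmed = $employees |
    Select-String -Pattern '\|\w+ \w+\|' |
    ForEach-Object { $_.Matches[0].Value.Trim('|') }

$trimmed
```

This only works because the unwanted characters sit at the very ends of the match; it would not help if the delimiters differed on each side or appeared mid-match.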

Regex and Select-String in PowerShell

We still need to include the pipe characters in the search, but we don't want to return them as matches. How would we do that? One way is to use regular expression groups. Regex groups are represented by parentheses surrounding the part of the match you'd like to return. In this case, I'll enclose the part of the regex representing just the employee first and last name and try again.

PS> $employees | Select-String -Pattern '\|(\w+ \w+)\|' | foreach {$_.Matches}

Groups   : {0, 1}
Success  : True
Name     : 0
Captures : {0}
Index    : 1
Length   : 14
Value    : |Adam Bertram|

Groups   : {0, 1}
Success  : True
Name     : 0
Captures : {0}
Index    : 3
Length   : 12
Value    : |Joe Jonesy|

Groups   : {0, 1}
Success  : True
Name     : 0
Captures : {0}
Index    : 2
Length   : 17
Value    : |Suzie Shoemaker|

Hmm... it still shows the Value with the pipe characters. But look at the Groups property now. Rather than just showing 0, it now shows 0 and 1. Group 0 is always the entire match, and group 1 is the part captured by the parentheses. To view this group, I'll add the reference in our foreach loop again. Since each Groups property is a collection that can be indexed like an array, I can reference the element at index 1 by putting the index in brackets and then referencing its Value property.

PS> $employees | Select-String -Pattern '\|(\w+ \w+)\|' | foreach {$_.Matches.Groups[1].Value}
Adam Bertram
Joe Jonesy
Suzie Shoemaker
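The same lookup can also be sketched with a named capture group, which some find easier to read than a numeric index. The group name "name" is my own choice for illustration, and $employees is redefined as an array so the snippet is self-contained:

```powershell
# Sample lines from the article, one string per line.
$employees = @(
    '||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ'
    '--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV'
    '==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA'
)

# (?<name>...) is .NET regex syntax for a named group; the group can then be
# looked up by name instead of by position.
$names = $employees |
    Select-String -Pattern '\|(?<name>\w+ \w+)\|' |
    ForEach-Object { $_.Matches[0].Groups['name'].Value }

$names
```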

You can now see that I've pulled out each of the employee names from the string! There's a lot more that Select-String can do, so I'd also suggest running help Select-String -Detailed for a full breakdown.
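Since the introduction also mentioned searching an entire text file, here is a minimal sketch of the same extraction using Select-String's Path parameter. The temp file name is hypothetical and exists only for this example. When reading from a file, Select-String works line by line, so no manual split is needed.

```powershell
# Hypothetical temp file, created only for this illustration.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'employees-example.txt'

@(
    '||Adam Bertram|| 2122 Acme Ct, Atlantic City, NJ'
    '--||Joe Jonesy||-- 555 Lone St, Las Vegas, NV'
    '==|Suzie Shoemaker|== 6783 Main St, Los Angelas, CA'
) | Set-Content -Path $path

# -Path makes Select-String read the file itself, one line at a time.
$fileNames = Select-String -Path $path -Pattern '\|(\w+ \w+)\|' |
    ForEach-Object { $_.Matches[0].Groups[1].Value }

$fileNames

# Clean up the example file.
Remove-Item -Path $path
```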

 
