Wednesday, March 21, 2012

FOR command: The DOS Sonic Screwdriver

When I was younger I enjoyed the British TV show "Doctor Who".  The imports we got in the States were the seasons where Tom Baker starred as the eponymous, Dalek-fighting Time Lord.  The shows were entertaining but had the deus ex machina of the sonic screwdriver that the Doctor would use for everything from picking locks to picking up women. ("Yes, that is a sonic screwdriver in my pocket and I am happy to see you.")  I used to dream of having a single tool that could do all kinds of useful things.

Then I found the FOR command in Windows XP and my dreams were made reality -- in a clunky, geeky, command-line sort of way.  (Adult life never really turns out like you imagined it when you were a kid.)  What does the FOR command do?

Runs a specified command for each file in a set of files.

FOR %variable IN (set) DO command [command-parameters]
  %variable  Specifies a single letter replaceable parameter.
  (set)      Specifies a set of one or more files.  Wildcards may be used.
  command    Specifies the command to carry out for each file.
  command-parameters
             Specifies parameters or switches for the specified command.

So (set) is one or more filenames (which may include wildcards) on which we can perform a command.
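
A quick sketch to try at the command prompt. The *.txt pattern is only a placeholder I picked for illustration, and inside a batch file the percent sign would need to be doubled (%%F).

```batch
REM Print the name of every .txt file in the current directory.
REM The @ stops cmd from echoing each ECHO command before it runs.
FOR %F IN (*.txt) DO @ECHO %F
```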

Well that's fine for working on a bunch of files but what if I need to work on a bunch of directories?

FOR /D %variable IN (set) DO command [command-parameters]
    If set contains wildcards, then specifies to match against directory
    names instead of file names.
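
For instance, something like this at the command prompt should list only the folders in the current directory (a minimal sketch; the wildcard is my choice):

```batch
REM With /D the wildcard matches directory names instead of file names.
FOR /D %D IN (*) DO @ECHO %D
```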

What if I need to recursively look through all of these directories?

FOR /R [[drive:]path] %variable IN (set) DO command [command-parameters]
    Walks the directory tree rooted at [drive:]path, executing the FOR
    statement in each directory of the tree.  If no directory
    specification is specified after /R then the current directory is
    assumed.  If set is just a single period (.) character then it
    will just enumerate the directory tree.
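
Two small sketches of the recursive form, run from the command prompt. The *.log pattern is just an example of mine.

```batch
REM Walk the tree under the current directory and print each .log file
REM with its full path.
FOR /R %F IN (*.log) DO @ECHO %F

REM With a lone period as the set, enumerate the directories themselves.
REM Each path is printed with the period appended to it.
FOR /R %D IN (.) DO @ECHO %D
```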

But wait, I remember using a FOR command in BASIC to loop through a bunch of integers.

FOR /L %variable IN (start,step,end) DO command [command-parameters]
    The set is a sequence of numbers from start to end, by step amount.
    So (1,1,5) would generate the sequence 1 2 3 4 5 and (5,-1,1) would
    generate the sequence (5 4 3 2 1)
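
The first sequence from the help text can be tried directly at the command prompt:

```batch
REM Counts from 1 to 5 in steps of 1, printing one number per line.
FOR /L %N IN (1,1,5) DO @ECHO %N
```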

That's all good stuff, if fairly pedestrian.  But the next option is the one I use all the time and starts to crack open the sonic screwdriver capability of the FOR command.

FOR /F ["options"] %variable IN (file-set) DO command [command-parameters]
FOR /F ["options"] %variable IN ("string") DO command [command-parameters]
FOR /F ["options"] %variable IN ('command') DO command [command-parameters]

    filenameset is one or more file names.  Each file is opened, read
    and processed before going on to the next file in filenameset.

These seem like different functions but they get grouped together because they provide similar input and use the same ["options"] (discussed below).  But look at what is available now.  I can provide a set of one or more files, each of which will be opened, parsed line by line, and acted upon.  I can provide a string that will get parsed and acted upon.  But most importantly I can provide a command and have the output of that command parsed and acted upon.  The command can be a native DOS function or some other command line utility that produces output.  This is huge! 
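
Here is a rough sketch of the three input forms side by side, with no options yet, so each one just hands back the first word of every line. The file names.txt is hypothetical; hostname is a standard Windows command I picked because its output is a single line.

```batch
REM File set: read names.txt (a made-up file) line by line.
FOR /F %A IN (names.txt) DO @ECHO %A

REM String: parse the text inside the double quotes. Prints: hello
FOR /F %A IN ("hello world") DO @ECHO %A

REM Command: run hostname and parse whatever it prints.
FOR /F %A IN ('hostname') DO @ECHO %A
```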

Before we get to some nifty screwdriving let's take a look at the options.

        eol=c           - specifies an end of line comment character
                          (just one)
        skip=n          - specifies the number of lines to skip at the
                          beginning of the file.

These are easy to grasp.  Pick one character to mark comments: lines that begin with it are ignored (the default is a semicolon), which lets you annotate the input file, although you can use it for other purposes.  Skip past the first one or more lines of the input to avoid processing header information.  It would be handy if there were a way to mark the end of the input file but it is up to us to handle that in our code.
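
A small sketch of both options together. The file servers.txt is hypothetical: I am assuming one header line, then one server name per line, with comment lines that start with #.

```batch
REM skip=1 drops the header line; eol=# ignores lines that begin with #.
REM FOR /F also skips blank lines on its own.
FOR /F "skip=1 eol=#" %S IN (servers.txt) DO @ECHO Server: %S
```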

        delims=xxx      - specifies a delimiter set.  This replaces the
                          default delimiter set of space and tab.

Use one or more characters to parse the input.  This lets me process a comma separated .csv file but with a little creativity you can do much more.
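
A minimal illustration using a string instead of a real .csv file (the values are made up):

```batch
REM Split on commas instead of spaces and tabs.
REM Only the first token comes back by default, so this prints: alpha
FOR /F "delims=," %A IN ("alpha,beta,gamma") DO @ECHO %A
```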

        tokens=x,y,m-n  - specifies which tokens from each line are to
                          be passed to the for body for each iteration.

So from the input I get one or more lines of text that are going to be parsed by breaking them into tokens based on the delimiters specified.  For example parsing the string "A B C D" will produce 4 tokens, one for each letter.  In the FOR command I specify a variable (%X in the following examples) that by default will be assigned the first token (A).  But if I want the variable to be assigned the third token I would specify "tokens=3" and my variable %X will have the value C.

I can also generate multiple variables that follow in alphabetical sequence and assign them values based on the tokens I select.  So if I specify "tokens=2,4" the variable %X will have the value B and the variable %Y will have the value D.  Or I can specify a range so "tokens=2-4" will make %X=B, %Y=C, and %Z=D.

I can also use an * to generate one final variable whose value will be the rest of the unparsed line of text.  So "tokens=1,2*" will make %X=A, %Y=B, and %Z=C D.  Using "tokens=1*" will make %X=A and %Y=B C D.  "Tokens=*" will prevent parsing completely and make %X=A B C D.
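
These cases are easy to check at the command prompt by feeding the string "A B C D" straight to FOR /F. The square brackets in the fourth line are only there so the edges of each value are visible.

```batch
REM Third token only. Prints: C
FOR /F "tokens=3" %X IN ("A B C D") DO @ECHO %X

REM Second and fourth tokens. Prints: B D
FOR /F "tokens=2,4" %X IN ("A B C D") DO @ECHO %X %Y

REM A range of tokens. Prints: B C D
FOR /F "tokens=2-4" %X IN ("A B C D") DO @ECHO %X %Y %Z

REM Two tokens plus the rest of the line. Prints: [A] [B] [C D]
FOR /F "tokens=1,2*" %X IN ("A B C D") DO @ECHO [%X] [%Y] [%Z]

REM The whole line in one variable. Prints: A B C D
FOR /F "tokens=*" %X IN ("A B C D") DO @ECHO %X
```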


This is another example with the same tokens= values described above.  Notice in the last 2 examples there is no token to assign to the variables at the end so the ECHO command just displays the variable name as if it were a string. 

An important point about the FOR command is that because it can return a sequence of variables it only allows single character variable names.  I tend to start with %A so I can get as many tokens as I might need.  If I need to nest FOR commands in a pipeline I usually start later in the alphabet for the FOR commands later in the chain to avoid collision.

Now scroll back up a bit and notice that the IN part of the FOR /F command can be a command, a string, or a file set.  And notice that the string to be parsed will be in double quotes.  But what if I have a file set that has a filespec that contains spaces?  Well, those filespecs will have to be enclosed in double quotes, otherwise they will be seen as different filespecs.  But if the filespecs are enclosed in quotes won't they be confused for strings?  Hmmmm... why, yes they would.  Well how do we get around this problem?  I'm glad you asked.

        usebackq        - specifies that the new semantics are in force

This option will use the backquote to specify an executable command, a single quote to specify a string, which leaves double quotes available for enclosing file names that contain spaces.  Those guys at Microsoft think of everything, don't they?
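
A sketch of all three quoting styles under usebackq, for the command prompt. The file "my servers.txt" is hypothetical; ver is the built-in command that prints the Windows version.

```batch
REM Double quotes now mean a file name, so a name with spaces works.
FOR /F "usebackq" %S IN ("my servers.txt") DO @ECHO %S

REM Backquotes now mean a command whose output gets parsed.
FOR /F "usebackq tokens=*" %L IN (`ver`) DO @ECHO %L

REM Single quotes now mean a literal string. Prints: B
FOR /F "usebackq tokens=2" %T IN ('A B C D') DO @ECHO %T
```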

So now we have a bunch of arrows in our quiver.  Let's go shoot some stuff.

Who is your computer talking to?  The netstat command will show all ports your computer has open:
I have my browser open to google.com, so the IPv4 addresses are Google's servers.  Let's parse that output and see how the traffic gets from my computer to Google.

FOR /F "skip=4 tokens=1-4" %A in ('netstat -n') do IF %D==ESTABLISHED 
   FOR /F "delims=:" %X in ('echo %C') do tracert %X

This all goes on a single command line but word wraps in the box above.  In the first FOR command I am skipping the first 4 lines so I can ignore the column headings.  I get 4 tokens broken up by white space.  I am only interested in established connections so my IF checks %D, the fourth token, that shows the connection state.  If netstat tells me the connection is ESTABLISHED then I will parse the third token (variable %C) which has the IP address and port.  I use a second FOR command to echo the IP:port value and have FOR split the string at the colon to get just the IP address. I use that as the parameter for the tracert command which shows each hop on the way from my PC to google.

Well, that is interesting but not terribly useful.  Here is something I have actually used at my job.  I needed a way to get the list of users from a domain global group.  The NET GROUP command was selected because it is available on every platform.  But the output from the command lists the users in three columns, which isn't useful if you need to do something else with the information like drop it into a spreadsheet or pipe the account names into another command.  So I gave the requestors this:

@ECHO OFF
IF %1.==. GOTO :Done


FOR /F "skip=8 tokens=1-3*" %%I IN ('net group %1 /domain') DO CALL :DumpEm %%I %%J %%K
GOTO :Done

:DumpEm
IF %1.==The. GOTO :Done
ECHO %1
IF NOT %2.==. ECHO %2
IF NOT %3.==. ECHO %3

:Done

I made the font small on the FOR command so it would fit into a single line to avoid confusion.  In this FOR command I'm skipping the header lines and grabbing the three columns of names.  (I will gloss over how I handle the input to the batch file for now.  That will be the subject of a future post.)

One thing to note is that because this is run from a batch file and not the command line the variables in the FOR command have to use two percent signs (%%).  In the IF statements the first thing I do is see if we are at the end of the output from the NET GROUP command, which is the line that starts with "The".  For the other IF statements I only ECHO output if there is data.  (Sometimes you will see examples like IF NOT "%1"=="" but really all the IF command does is compare two strings.  If my variable has no value then "%1" resolves to "" and %1. resolves to a lone period.  Either way I have verified the empty string, and since I am more efficient when I type less, I use the technique shown in my batch file.)


This shows the results of a normal NET GROUP command to compare it to the results of the ShowUsers.cmd file.  So I achieved the goal of skipping the header and footer and getting all the names in a single column.  Mission accomplished.

One final example that shows the directory parsing capabilities of the FOR command.  On our NTFS file servers I was asked to dump the Access Control Lists (ACLs for you cool guys, "who has access to what" if you are in upper management).  I just needed a simple list for the top level folders so I came up with this.

:: Show the ACLs for the top level folders on each file server data drive
::

SETLOCAL
SET AdmFS=\\FS0111\E$ \\FS0113\E$ \\FS0115\E$
SET FN=FileServerACLs.txt

:: DEL has no /Y switch; /Q deletes without prompting
IF EXIST %FN% DEL /Q %FN%

FOR %%A in (%AdmFS%) DO FOR /D %%B in (%%A\*.*) DO CACLS "%%B" >> %FN%

I can't show you the output but I will describe what happens.  The variable %AdmFS% lists the administrative shares on three file servers.  The variable %FN% has the name of the output file which gets cleared by the DEL command each time the script runs.

Now I chain together two FOR commands.  The first one parses %AdmFS% to call the second FOR command once for each file server.  The second FOR command lists the top level folders.  Because the folder names may contain spaces the variable from the second FOR command is enclosed in quotes so CACLS will correctly process its value.  The results of CACLS are redirected using >> so they always append to the output file.

The output from CACLS isn't pretty and fortunately I wasn't asked the follow up of having to list all the users in all of the groups that were output.  If I were working this project today I would use icacls.exe instead because the output is cleaner and the utility has more features.  And I would use PowerShell instead of DOS because it provides more capabilities for parsing results and creating friendly reports.
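
For what it's worth, here is a sketch of how the same walk might look with icacls swapped in for CACLS. I have not run this against those servers; the server names and output file name are simply reused from the script above.

```batch
@ECHO OFF
REM Sketch only: the same two nested FOR commands, calling ICACLS instead of CACLS.
SETLOCAL
SET AdmFS=\\FS0111\E$ \\FS0113\E$ \\FS0115\E$
SET FN=FileServerACLs.txt

REM Start with a fresh output file on each run.
IF EXIST %FN% DEL /Q %FN%

REM Outer FOR: one pass per file server share.
REM Inner FOR /D: one pass per top level folder on that share.
FOR %%A IN (%AdmFS%) DO FOR /D %%B IN (%%A\*) DO ICACLS "%%B" >> %FN%
ENDLOCAL
```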

Didn't I say I would talk about PowerShell in the first blog post?  Isn't it about time I started?

1 comment:

  1. http://www.futureoftech.msnbc.msn.com/technology/futureoftech/scientists-create-working-sonic-screwdriver-726186#

    Somebody invented an actual sonic screwdriver
