COUNTCHARS - Count characters in text files.
Syntax:
COUNTCHARS /C:x-y /CP:n /O /P /R /RO /S /U /V /W /X filespec…
/C:x-y | specify a range of characters to count |
/CP:n | interpret non-Unicode input text using code page n |
/O | sort by frequency |
/P | page output |
/R | report counts for ranges as well as individual characters |
/RO | report range counts only, not counts of individual characters |
/S | search in subdirectories for matching files |
/U | force characters to uppercase |
/V | do not automatically merge overlapping ranges |
/W | do not report count of ‘other’ characters |
/X | do not report total characters count |
/ASCII | short for /C:0-127 |
/BMP | short for /C:0-0xFFFF |
/HI | short for /C:0x10000-0x10FFFF |
… | Range options are also supported. |
Input filenames may be specified on the command line, or text may be
redirected or piped into COUNTCHARS. If you want to pipe to COUNTCHARS,
remember that pipes open a new shell. To pipe to a plugin command, you must
either ensure that the plugin is loaded in the transient shell, e.g. by
installing the .DLL file in the shell’s PlugIns directory; or else use
temporary files or an in-process pipe.
You may specify more than one filename; wildcards and directory aliases are
supported. You can search recursively into subdirectories for matching files
with /S. @File lists and internet files are supported. You may also specify
CLIP: to read text from the clipboard.
Specify ranges of characters to count with /C:x-y. The start and end
characters x and y may be given as decimal, hexadecimal with a leading 0x,
or as literal characters:
rem These three are all the same:
countchars /c:65-90 myfile.txt
countchars /c:0x41-0x5a myfile.txt
countchars /c:A-Z myfile.txt
To specify a literal digit, wrap it in apostrophes:
countchars /c:'0'-'9' myfile.txt
You may specify up to 32 ranges. If you do not specify any ranges, the
default is /C:0-127 (ASCII characters).
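The endpoint notations described above can be illustrated with a short Python sketch. It is not COUNTCHARS's own parser; the function names parse_endpoint and parse_range are hypothetical, and the handling of unusual input (for example a literal hyphen as an endpoint) is an assumption made only for this illustration.

```python
def parse_endpoint(token):
    """Return the Unicode code point named by one end of a range."""
    # A single character wrapped in apostrophes is taken literally: '0' -> 48
    if len(token) == 3 and token[0] == "'" and token[2] == "'":
        return ord(token[1])
    # Hexadecimal with a leading 0x: 0x41 -> 65
    if token[:2].lower() == "0x":
        return int(token[2:], 16)
    # Plain decimal digits: 65 -> 65
    if token.isascii() and token.isdigit():
        return int(token)
    # Any other single character is a literal: A -> 65
    if len(token) == 1:
        return ord(token)
    raise ValueError("unrecognised range endpoint: " + repr(token))


def parse_range(text):
    """Split 'x-y' at the first hyphen and parse both ends (sketch only)."""
    start, _, end = text.partition("-")
    return parse_endpoint(start), parse_endpoint(end)


if __name__ == "__main__":
    for spec in ("65-90", "0x41-0x5a", "A-Z", "'0'-'9'"):
        print(spec, parse_range(spec))
```

Under these assumptions the first three specifications resolve to the same pair of code points, and the apostrophes keep the digits 0 and 9 from being read as the decimal values 0 and 9.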
All values, both in character ranges and in COUNTCHARS’s reports, refer to
Unicode code points. If the text uses an 8-bit or OEM encoding, the values
reported are the values of the Unicode characters that the OEM characters
are translated into, not the OEM character values.
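To see why the reported values can differ from the raw byte values, here is a minimal Python sketch. It uses Python's built-in cp437 and cp1252 codecs as stand-ins for two code pages; whether COUNTCHARS's own translation tables match Python's exactly is an assumption, and the helper name code_points is hypothetical.

```python
def code_points(raw, encoding):
    """Decode bytes with the given code page and list the resulting code points."""
    return [ord(ch) for ch in raw.decode(encoding)]


if __name__ == "__main__":
    raw = b"\x82"  # one byte whose meaning depends on the code page
    for encoding in ("cp437", "cp1252"):
        values = code_points(raw, encoding)
        print(encoding, ["U+%04X" % v for v in values])
```

The same byte, 0x82, decodes to U+00E9 under code page 437 and to U+201A under code page 1252, so a count keyed by Unicode code point lands in a different place than a count keyed by the byte value.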
How many letters are in Engine Summer.txt?
countchars /c:A-Z /u /ro "Engine Summer.txt"
File "C:\Bin\JPSDK\TextUtils\Engine Summer.txt" :
0041 - 005A : 343
Other : 161
TOTAL : 504
/C:A-Z defines a range of characters from A to Z. /U converts lowercase
letters to uppercase so that they are counted in the same range. /RO reports
only the total number of characters in the range; we want the total number
of letters, not the number of As, Bs, Cs, and so on. There are 343 letters
in this file.
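The arithmetic behind a report like this can be sketched in Python. This is an illustration of the idea only, not the plugin's implementation: the function name count_in_range is hypothetical, and the case-folding rule (fold a character only when its uppercase form is a single character) is an assumption chosen so the total stays equal to the input length.

```python
def count_in_range(text, lo, hi, fold_upper=False):
    """Return (in_range, other, total) for code points lo..hi inclusive."""
    in_range = 0
    total = 0
    for ch in text:
        if fold_upper:
            upper = ch.upper()
            # Assumption: skip folds that would expand to several characters
            if len(upper) == 1:
                ch = upper
        total += 1
        if lo <= ord(ch) <= hi:
            in_range += 1
    return in_range, total - in_range, total


if __name__ == "__main__":
    sample = "Engine Summer"
    print(count_in_range(sample, ord("A"), ord("Z")))
    print(count_in_range(sample, ord("A"), ord("Z"), fold_upper=True))
```

Without folding, only the two capital letters in the sample fall inside A-Z; with folding, all twelve letters do, which mirrors the role /U plays in the example above.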
How many Cyrillic letters? Most Cyrillic letters fall in the range of U+0400 to U+04FF:
countchars /c:0x0400-0x04ff /ro "Engine Summer.txt"
File "C:\Bin\JPSDK\TextUtils\Engine Summer.txt" :
0400 - 04FF : 0
Other : 504
TOTAL : 504
Mr. Crowley is not writing in Russian.