LineBreakInfo
LineBreakInfo is a Total Commander content plugin (WDX) that provides information about file contents, such as the line break type, the type of BOM (Byte Order Mark), and the number of each type of line break character(s) (CR/LF/CRLF/FF/VT). Like any content plugin, it can be used in TC's custom columns, search function, tooltips, multi-rename tool and so on. See the TC Wiki for more information.
1. Features
- Determine type of line breaks in a file, i.e. N/A, CR (Carriage Return), LF (Line Feed), CRLF, FF (Form Feed), VT (Vertical Tab) or Mixed
- Determine type of BOM (Byte Order Mark) in a file, i.e. None, UTF-8, UTF-16 LE/BE, UTF-32 LE/BE
- Provides the number of CR, LF, FF, VT and combined CRLF sequences as well as the number of "binary" (non-printable) bytes
- Supports Unicode and long paths (> 259 characters)
2. System Requirements
- Windows 2000 or later, both 32 and 64 bit
- Total Commander 7.50 or later, both 32 and 64 bit
3. Plugin settings
3.1. Location of plugin settings files
As a content plugin (WDX) this plugin returns a so-called detect string to TC. This string is saved in wincmd.ini and
can be edited and customized if necessary. See next section below.
The plugin has additional settings which are saved in a different file. If you want to change any plugin settings you can
do so in either
- LineBreakInfo.ini in the plugin's directory, or
- contplug.ini in the directory where wincmd.ini is located (default).
The first option is good for portable mode; the second is useful on systems where Total Commander is installed in a directory that users can't write to (like %ProgramFiles%).
Important: If LineBreakInfo.ini exists in the plugin's directory it takes precedence over contplug.ini!
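The precedence rule above can be sketched in a few lines of Python. This is only an illustration of the lookup order described here, not the plugin's actual code; the function name `resolve_settings_file` and both parameters are hypothetical.

```python
from pathlib import Path


def resolve_settings_file(plugin_dir: Path, wincmd_ini_dir: Path) -> Path:
    """Return the settings file that takes effect (hypothetical helper)."""
    portable = plugin_dir / "LineBreakInfo.ini"
    # LineBreakInfo.ini in the plugin's directory wins whenever it exists.
    if portable.is_file():
        return portable
    # Otherwise fall back to contplug.ini in the directory of wincmd.ini.
    return wincmd_ini_dir / "contplug.ini"
```

If settings changes in contplug.ini seem to have no effect, a check like this shows whether a LineBreakInfo.ini in the plugin's directory is shadowing them.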
3.2. Detect string
By default this plugin returns the following string to TC:
n_detect="SIZE > 0"
where n is the number assigned by TC to the plugin. This prevents TC from showing plugin values for directories, where none of the plugin fields are of any use: directories don't have any file content, so there are no line breaks and no BOMs. The drawback is that zero byte files won't be handled by this plugin either. That's an acceptable compromise, because the plugin fields aren't very useful for such files anyway.
If you want TC to show values for zero byte files, you can remove the detect string entirely. To do that, open wincmd.ini in your favorite editor, look for the plugin number assigned by TC in section [ContentPlugins], then look for the detect string starting with that same number, and remove that line. Note that opening a new TC instance (or a TC restart) is required for any changes to take effect.
The TC Wiki article ContentGetDetectString is a good starting point if you need to make major changes to the detect string.
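If you would rather script the removal than edit wincmd.ini by hand, the steps could look roughly like the Python sketch below. It assumes that the plugin is registered in [ContentPlugins] as a line of the form `n=<path to plugin>` whose path contains "LineBreakInfo", and that the detect string is on a line starting with `n_detect=`. The function name `remove_detect_string` is hypothetical. It works on the file's text only, so reading and writing wincmd.ini (and making a backup first) is left to you.

```python
import re


def remove_detect_string(ini_text: str, plugin_name: str = "LineBreakInfo") -> str:
    """Return ini_text without the plugin's n_detect line (hypothetical helper)."""
    lines = ini_text.splitlines(keepends=True)

    # First pass: find the number TC assigned to the plugin.
    number = None
    in_section = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("["):
            in_section = stripped.lower() == "[contentplugins]"
            continue
        if in_section:
            match = re.match(r"(\d+)=(.*)", stripped)
            if match and plugin_name.lower() in match.group(2).lower():
                number = match.group(1)
                break
    if number is None:
        return ini_text  # plugin not found, leave everything untouched

    # Second pass: drop the matching n_detect line inside the same section.
    prefix = f"{number}_detect="
    result = []
    in_section = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("["):
            in_section = stripped.lower() == "[contentplugins]"
        elif in_section and stripped.startswith(prefix):
            continue
        result.append(line)
    return "".join(result)
```

As with a manual edit, a new TC instance (or a TC restart) is needed before the change takes effect.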
3.3. The settings in detail
The settings are explained in the LineBreakInfo.sample.ini file, but they're also listed here for reference.
Section [LineBreakInfo]
Setting and default | Description |
---|---|
AnalyzeBytes = 4096 | Defines the number of bytes to read from each file. Increasing this value can make the detection more accurate, but it will make it slower at the same time. The actual number of bytes read from a file can go above that value under certain conditions - see the information in chapter Number of analyzed bytes below. The hard limit for this setting imposed by the plugin is currently 20 MiB (20*1024*1024 bytes). |
FieldValuesInBackground = 0 | Specifies whether or not field values are returned in a background thread. 1 - Return field values in background thread. This makes TC more responsive in a custom columns view, especially for large files, but it also adds some processing and communication overhead which can slow things down for small files. 0 - Return field values in foreground thread. This is probably faster for small files. |
CacheSize = 4000 | Set the maximum number of items to cache in memory to allow TC fast access to the plugin field values. Values equal to or smaller than 0 are ignored. |
ClearCacheOnRefresh = 1 | 1 - Flush the cache when the cm_RereadSource command is issued in TC, e.g. by pressing Ctrl+R. This forces a refresh on all plugin field values. 0 - Don't clear the cache on cm_RereadSource. Note that, even with cache flushing disabled, items will still be removed from the cache once CacheSize is reached. |
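To illustrate how the defaults and limits from the table fit together, here is a small Python sketch that reads the [LineBreakInfo] section from INI text. It is not the plugin's code. Two behaviors are assumptions: that an AnalyzeBytes value above the hard limit is clamped to 20 MiB, and that a CacheSize of 0 or less falls back to the default. The names `DEFAULTS`, `MAX_ANALYZE_BYTES` and `load_settings` are hypothetical.

```python
import configparser

# Defaults as listed in the table above.
DEFAULTS = {
    "AnalyzeBytes": 4096,
    "FieldValuesInBackground": 0,
    "CacheSize": 4000,
    "ClearCacheOnRefresh": 1,
}
MAX_ANALYZE_BYTES = 20 * 1024 * 1024  # hard limit of 20 MiB


def load_settings(ini_text: str) -> dict:
    """Parse [LineBreakInfo] and apply defaults and limits (illustrative only)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    settings = dict(DEFAULTS)
    if parser.has_section("LineBreakInfo"):
        for key in DEFAULTS:
            if parser.has_option("LineBreakInfo", key):
                settings[key] = parser.getint("LineBreakInfo", key)
    # Assumption: values above the hard limit are clamped to it.
    settings["AnalyzeBytes"] = min(settings["AnalyzeBytes"], MAX_ANALYZE_BYTES)
    # Values <= 0 are ignored; using the default instead is an assumption.
    if settings["CacheSize"] <= 0:
        settings["CacheSize"] = DEFAULTS["CacheSize"]
    return settings
```

Settings that are missing from the file simply keep the defaults shown in the table.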
4. Information on how the plugin works
4.1. Number of analyzed bytes
The plugin reads up to a specific number of bytes from a given file, which is determined by the AnalyzeBytes setting. However, the plugin will read a few more bytes from a particular file in case a CR (Carriage Return) is found within a certain number of bytes at the end of the buffer. That number depends on the file:
- up to one byte for files detected as UTF-8 as well as files without a BOM
- up to three bytes for files detected as UTF-16
- up to seven bytes for files detected as UTF-32
This way the plugin tries to find a corresponding LF character for the last CR, and in doing so match the number of CRs and LFs. If the number of CRs doesn't match the number of LFs, files may be detected as having mixed line breaks while in reality they don't.
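The following Python sketch shows one possible reading of this behavior. It assumes that the plugin looks for a CR byte (0x0D) in the last 1, 3 or 7 bytes of the buffer and, if one is found, reads up to the same number of additional bytes. The plugin's real logic may differ in its details. `TAIL_WINDOW` and `read_for_analysis` are hypothetical names, and `unit_width` stands for the size of one code unit in bytes (1 for UTF-8 or no BOM, 2 for UTF-16, 4 for UTF-32).

```python
import io

CR = 0x0D
# Window sizes from the list above, keyed by code unit width in bytes.
TAIL_WINDOW = {1: 1, 2: 3, 4: 7}


def read_for_analysis(stream, analyze_bytes: int, unit_width: int) -> bytes:
    """Read analyze_bytes, plus a few extra bytes if the buffer ends near a CR."""
    buf = stream.read(analyze_bytes)
    window = TAIL_WINDOW[unit_width]
    # Assumption: a CR near the end may be the first half of a CRLF pair,
    # so read a little further to pick up the matching LF.
    if CR in buf[-window:]:
        buf += stream.read(window)
    return buf


# A buffer limit of 3 would cut "ab\r\n" between CR and LF without the extra read.
data = read_for_analysis(io.BytesIO(b"ab\r\ncd"), analyze_bytes=3, unit_width=1)
print(data)  # b'ab\r\n'
```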
4.2. Line Break Types
The plugin provides several fields which count the various line break types. These types are:
Abbr. | Type | Hex code |
---|---|---|
LF | Line Feed | 0x0A |
VT | Vertical Tab | 0x0B |
FF | Form Feed | 0x0C |
CR | Carriage Return | 0x0D |
CRLF | CR followed by LF | 0x0D 0x0A |
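As a rough illustration of how such counts can lead to a line break type, here is a Python sketch for single-byte data (UTF-8 or no BOM). It is not the plugin's implementation. In particular, it assumes that the CR and LF counts include the bytes that belong to CRLF pairs, and that a file is "Mixed" as soon as more than one kind of line break remains after pairing. Both function names are hypothetical.

```python
def count_line_breaks(data: bytes) -> dict:
    """Count the byte values from the table above (single-byte data only)."""
    return {
        "CR": data.count(b"\x0d"),
        "LF": data.count(b"\x0a"),
        "CRLF": data.count(b"\x0d\x0a"),
        "FF": data.count(b"\x0c"),
        "VT": data.count(b"\x0b"),
    }


def classify(data: bytes) -> str:
    """Return N/A, CR, LF, CRLF, FF, VT or Mixed (assumed classification rule)."""
    counts = count_line_breaks(data)
    kinds = {
        # CRs and LFs that are not part of a CRLF pair.
        "CR": counts["CR"] - counts["CRLF"],
        "LF": counts["LF"] - counts["CRLF"],
        "CRLF": counts["CRLF"],
        "FF": counts["FF"],
        "VT": counts["VT"],
    }
    present = [name for name, n in kinds.items() if n > 0]
    if not present:
        return "N/A"
    if len(present) == 1:
        return present[0]
    return "Mixed"
```

With this rule, a buffer that is cut off right after a CR (for example `b"a\r\nb\r"`) ends up as "Mixed", which is the situation the extra bytes described in Number of analyzed bytes are meant to avoid.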
4.3. BOM detection
Only the following BOM types are currently detected:
BOM type | Hex representation |
---|---|
UTF-8 | EF BB BF |
UTF-16 LE | FF FE |
UTF-16 BE | FE FF |
UTF-32 LE | FF FE 00 00 |
UTF-32 BE | 00 00 FE FF |
All other BOM types are not supported!
The plugin checks files for a BOM (Byte Order Mark), but it doesn't check for the encoding of any characters within a given file. This means that UTF-16 and UTF-32 files without a BOM will be detected as "Mixed" or "N/A".
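A BOM check along the lines of the table above can be sketched in Python as follows. This is a minimal illustration, not the plugin's code, and `detect_bom` is a hypothetical name. The order of the checks is a design choice of this sketch: the UTF-32 LE signature starts with the same two bytes as UTF-16 LE, so the longer signatures are tested first.

```python
# Signatures from the table above, longest first so that
# UTF-32 LE (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
BOMS = [
    ("UTF-32 LE", b"\xff\xfe\x00\x00"),
    ("UTF-32 BE", b"\x00\x00\xfe\xff"),
    ("UTF-8", b"\xef\xbb\xbf"),
    ("UTF-16 LE", b"\xff\xfe"),
    ("UTF-16 BE", b"\xfe\xff"),
]


def detect_bom(data: bytes) -> str:
    """Return the BOM type found at the start of data, or "None"."""
    for name, signature in BOMS:
        if data.startswith(signature):
            return name
    return "None"


print(detect_bom(b"\xff\xfeh\x00i\x00"))      # UTF-16 LE
print(detect_bom("hi".encode("utf-16-le")))   # None (no BOM present)
```

The second call shows the limitation described above: UTF-16 text without a BOM is not recognized as UTF-16.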
5. Known issues and limitations
5.1. Known Issues
- Currently there are no known issues.
5.2. Limitations
- The plugin can't retrieve or provide values for files that it isn't allowed to read due to their access permissions.
6. Frequently Asked Questions
Why does the plugin report X line breaks for a file when I know it has many more?
Since this plugin doesn't read whole files (because that would be awfully slow), the number of line breaks is limited by the number of bytes read from each file, which is determined by the AnalyzeBytes setting. Files larger than that can indeed have many more line breaks, but the plugin won't count them.
Why does the plugin detect a file as "Mixed" or "N/A" when I know it's UTF-16/UTF-32?
The file most likely doesn't have a BOM. See BOM detection above.
7. License
This software is provided "as is". No warranty provided. You use this program at your own risk. The author will not be responsible for data loss, damages, etc. while using or misusing this software.
The software must not be modified, you may not decompile or disassemble it.
This plugin is copyrighted freeware.
8. Thanks to
- Christian Ghisler, the author of Total Commander, for developing this great program that I use every day
- The members of the Delphi-PRAXiS forum who helped me understand, fix and optimize some things
9. Contact
If you have found a bug or have a suggestion, improvement, criticism or translation, you can contact me, Dalai, in English or German, at:
Mail: dalai82@gmx.net
Please put "LineBreakInfo" somewhere in the subject.
There is a discussion thread in the official TC forum which can be used, too: https://ghisler.ch/board/viewtopic.php?t=80528