Trying to understand RLI and RLI2 better
The purpose of rli is to diff two lists and output only the lines from the new list that aren't already in the old one. It's not a dedupe tool.

For general dedupe, sort -u is your go-to. I use this alias (adjust the parameters to your hardware):

Code:
LC_ALL=C sort --parallel=4 -S 4000M -T /storage-hdd/tmp/ -u

What people usually do is run sort -u on both lists before using them as input to rli (or rather rli2, which assumes its inputs are already sorted and uniq'd).
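To make that workflow concrete, here is a rough sketch. The file names are made up for illustration, and I'm using comm -23 as a stand-in for the final step because it performs the same kind of "lines only in the first sorted file" comparison with standard tools; it is not rli2 itself, so check rli2's own usage output for its actual arguments.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample lists (names and contents are illustrative only).
printf 'password\nletmein\npassword\nhunter2\n' > new.txt
printf 'letmein\nqwerty\n' > old.txt

# Sort and dedupe both inputs in byte order.
LC_ALL=C sort -u new.txt > new.sorted
LC_ALL=C sort -u old.txt > old.sorted

# Stand-in for the diff step: keep lines that appear only in new.sorted.
# Both inputs must be sorted in the same locale for comm to work.
LC_ALL=C comm -23 new.sorted old.sorted > only_new.txt

cat only_new.txt
```

Keeping LC_ALL=C on every step matters: if the sort order and the comparison order disagree, the comparison can silently miss lines.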

If your original inputs are in frequency order and you want to keep that order (sorting would destroy it), you can use rli instead, but the inputs should at least be deduped first.
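One common way to dedupe without sorting is the awk idiom below, which keeps the first occurrence of each line and preserves the original order. This is my suggestion rather than anything specific to rli, and the file names are hypothetical. Note that awk holds every unique line in memory, so it may not suit very large lists.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical frequency-ordered list with repeats.
printf 'password\n123456\npassword\nqwerty\n123456\n' > freq.txt

# Print a line only the first time it is seen; order is preserved.
awk '!seen[$0]++' freq.txt > freq.dedup

cat freq.dedup
```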


Messages In This Thread
RE: Trying to understand RLI and RLI2 better - by royce - 12-25-2017, 08:57 PM