Since the key word used here was fast, I would recommend something like Boyer Moore. And if it's more than one sequence to look for, Aho-Corasick.
Wow, I didn't know about Boyer-Moore and Aho-Corasick, but as I see I go in right direction, because what I tried to do was similar to BM
Typical implementations of these algorithms should already include stream functionality.
Do you know some? I found for strings, but as I understand is applicable for any data where need to find sequence.
You should eliminate object
access there by copying the values to local variables.
For performance?
Delphi-Quellcode:
function OffsetsLookup(
const AStream: TStream;
const APattern: TBytes): TOffsets;
overload;
var
Position, Size, Offset: int64;
PatternLen, MaxPosition: Int64;
Memory: TMemoryStream;
MemoryPtr: Pointer;
Buffer: Byte;
begin
SetLength(Result, 0);
Memory := TMemoryStream.Create;
try
Memory.CopyFrom(AStream, 0);
Memory.Position := 0;
// Work only with variables to eliminate Object access
MemoryPtr := Memory.Memory;
Position := Memory.Position;
Size := Memory.Size;
PatternLen := Length(APattern);
// No calculation of end condition
MaxPosition := Size - PatternLen + 1;
while Position < MaxPosition
do
begin
Offset := Position;
Buffer := Byte(Pointer(Longint(MemoryPtr) + Offset)^);
// ????? :(
end;
finally
Memory.Free;
end;
end;
And don't know how to operate on raw binary data. I found these implementations for strings:
Code:
http://www.delphigeist.com/2009/10/boyer-moore-horspool-algorithm.html
http://www.delphigeist.com/2011/05/boyer-moore-horspool-return-all.html <-- this looking good
http://www.delphigeist.com/2010/04/boyer-moore-horspool-in-delphi-2010.html
Don't know how to apply for streams in above function
BTW: what's the difference between Boyer-Moore and Boyer-Moore-Horspool? I read in Wikipedia, but don't understan, for me is the same